[bt-devel] [ bibletime-Bugs-1619594 ] Use of wildcards not consistent
SourceForge.net
noreply at sourceforge.net
Fri Mar 27 03:03:03 MST 2009
Bugs item #1619594, was opened at 2006-12-20 16:37
Message generated for change (Comment added) made by eelik
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=100954&aid=1619594&group_id=954
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Frontend / Search dialog
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Wolfgang Stradner (ewst)
Assigned to: Joachim Ansorg (joachim)
Summary: Use of wildcards not consistent
Initial Comment:
The wildcards ? and * can be used in BT (BT 1.6.2) with the clucene search engine (0.9.16a).
Here are some test results:
Searching in GermElb1905/Matthew for:
- her : finds all occurrences of the word her : OK
- her* : finds
-- Herrn, Herodes, hervorkommen : OK
-- welcher, himmlischer: not OK
- *her : does not find anything
- he*r : finds
-- himmlischer, her, Heuchler, Herr : OK
-- herabgestiegen, herzu, Herodes : not OK
- h?er : finds all occurrences of hier, but they are not marked in yellow (by contrast, in Mat 28:6 it marks her in yellow)
- her? : finds
-- Herr : unmarked
-- her : marked (I would not expect this hit, as I understand ? to be a one-character joker, in contrast to *, which can stand for 0, 1 or more characters, as the following example shows)
- ?ehr : finds
-- mehr, sehr : OK (but they should be marked)
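
For reference, the expected behaviour can be written down as a small
sketch. It assumes whole-word, case-insensitive matching where ? stands
for exactly one character and * for zero or more. It uses Python's
fnmatch module as a stand-in and is not clucene code. The word list
contains only words quoted in this report, and expected_hits is a
hypothetical helper name.

```python
from fnmatch import fnmatchcase

# Words quoted in the report (GermElb1905/Matthew).
WORDS = ["her", "Herr", "Herrn", "Herodes", "hervorkommen", "welcher",
         "himmlischer", "Heuchler", "herabgestiegen", "herzu", "hier",
         "mehr", "sehr"]

def expected_hits(pattern, words=WORDS):
    """Whole-word, case-insensitive match: '?' = one char, '*' = zero or more."""
    lowered = pattern.lower()
    return [w for w in words if fnmatchcase(w.lower(), lowered)]

for pattern in ["her*", "he*r", "h?er", "her?", "?ehr"]:
    print(pattern, "->", expected_hits(pattern))
```

Under these assumptions her? would return only Herr, her* would exclude
welcher and himmlischer, and he*r would exclude herabgestiegen, herzu and
Herodes. Under this strict reading himmlischer would not match he*r
either, since it does not begin with he.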
----------------------------------------------------------------------
>Comment By: Eeli Kaikkonen (eelik)
Date: 2009-03-27 10:03
Message:
There has been some discussion about prepended (leading) joker marks. We
need that feature for some languages, but it doesn't currently exist in
clucene. There's not much we can do unless someone volunteers to change
clucene. Note also that even if clucene enabled prepended joker marks with
the existing indexes, such searches would be very slow. For a quick search
we would need another, reversed index.
See
https://sourceforge.net/tracker/?func=detail&aid=2097655&group_id=954&atid=350954
for discussion about clucene features.
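
The reversed-index idea mentioned above can be illustrated with a minimal
sketch. It is plain Python, not clucene code, and the helper names
build_reversed_index and suffix_search are hypothetical. Each term is
stored reversed, so a query such as *her becomes a prefix lookup for
"reh" in a sorted list, which is the kind of lookup a term index handles
quickly.

```python
from bisect import bisect_left

def build_reversed_index(terms):
    """Sorted list of reversed, lower-cased terms (hypothetical helper)."""
    return sorted({t.lower()[::-1] for t in terms})

def suffix_search(rev_index, suffix):
    """Answer a '*suffix' query as a prefix scan over the reversed terms."""
    prefix = suffix.lower()[::-1]
    i = bisect_left(rev_index, prefix)  # first entry >= prefix
    hits = []
    while i < len(rev_index) and rev_index[i].startswith(prefix):
        hits.append(rev_index[i][::-1])  # un-reverse for display
        i += 1
    return sorted(hits)

index = build_reversed_index(
    ["her", "Herr", "welcher", "himmlischer", "mehr", "sehr"])
print(suffix_search(index, "her"))  # ['her', 'himmlischer', 'welcher']
```

The cost is a second index of roughly the same size as the first, which
is the trade-off referred to above.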
The inline * works correctly if it means 0 or more characters. To ordinary
users this of course looks wrong, but this too depends on clucene.
Someone could check whether clucene supports a "1 or more characters"
behaviour.
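
To make the two readings of * concrete, here is a small sketch that
translates a wildcard pattern into a regular expression. The star_min
parameter is a hypothetical switch for illustration only; whether clucene
offers anything comparable is exactly what would need checking.

```python
import re

def wildcard_to_regex(pattern, star_min=0):
    """Translate '?' and '*' into a regex.

    star_min=0: '*' means zero or more characters.
    star_min=1: '*' means one or more characters (hypothetical option).
    """
    star = ".*" if star_min == 0 else ".+"
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(star)
        elif ch == "?":
            parts.append(".")  # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)

def matches(pattern, word, star_min=0):
    """True if the whole word matches the wildcard pattern."""
    return wildcard_to_regex(pattern, star_min).fullmatch(word) is not None

print(matches("he*r", "her", star_min=0))   # True
print(matches("he*r", "her", star_min=1))   # False
print(matches("he*r", "Herr", star_min=1))  # True
```

With the "0 or more" reading, he*r finding her is correct. With the
"1 or more" reading it would not be a hit.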
----------------------------------------------------------------------
Comment By: Jonathan Marsden (jmarsden)
Date: 2009-03-27 08:51
Message:
Per http://clucene.wiki.sourceforge.net/Official_CLucene_FAQ
some of the examples given in this report are invalid.
In particular, wildcards may not be placed at the start
of a word. So both *her and ?ehr are invalid.
The other unexpected results are still happening in Bibletime
2.0 alpha3 for me on Ubuntu 8.10 Intrepid x64.
The way he*r matches herabgestiegen and so forth is
clearly incorrect. While the examples used are simple
and so easy to spot, the real danger here is that a user
could rely on a complex search without realizing that one
or more completely incorrect results are being returned.
----------------------------------------------------------------------