[jsword-devel] Lucene index help
DM Smith
dmsmith at crosswire.org
Thu Nov 4 09:44:44 MST 2010
On 11/04/2010 11:26 AM, Martin Denham wrote:
> Does anybody know any reason why a search for 'blessed' does not
> return any search results in ESV but searching for 'bless' works perfectly?
As best as I can tell, it is a mismatch between the index and the
library. I just did the same search as you and got the same results as
you. Then I deleted the index and rebuilt it, and it started to work.
>
> When I download BibleDesktop (JSword) generated indexes to And Bible
> I have noticed that some searches like 'blessed' stop working but I
> can't figure out what the problem is and would appreciate some
> pointers as to areas to look.
It is critical that the same jsword.jar is used to build the index and
to search it.
In this case the problem is that stemming has been introduced in the
newest version of JSword. This invalidates old indexes, but there is not
a mechanism to know that. Well, there is, but it is not complete.
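To illustrate how a stemming mismatch can produce exactly this symptom,
here is a toy sketch. It is not JSword or Lucene code: toyStem is a
hypothetical stand-in for a real stemmer, and the scenario (index built
with stemming, searched without) is only an assumption about one way the
two sides can disagree.

```java
import java.util.HashSet;
import java.util.Set;

public class StemMismatchDemo {
    // Hypothetical stand-in for a real stemmer: strips a trailing "ed" only.
    static String toyStem(String token) {
        return token.endsWith("ed") && token.length() > 4
                ? token.substring(0, token.length() - 2)
                : token;
    }

    // Collects the terms that would be stored in the index,
    // optionally stemming each token first.
    static Set<String> buildIndex(String text, boolean stem) {
        Set<String> terms = new HashSet<>();
        for (String tok : text.toLowerCase().split("\\W+")) {
            if (tok.isEmpty()) {
                continue;
            }
            terms.add(stem ? toyStem(tok) : tok);
        }
        return terms;
    }

    // Looks up a single-word query, optionally stemming it first.
    static boolean search(Set<String> index, String query, boolean stem) {
        String q = query.toLowerCase();
        return index.contains(stem ? toyStem(q) : q);
    }

    public static void main(String[] args) {
        // Index built WITH stemming: "blessed" is stored as "bless".
        Set<String> index = buildIndex("Blessed are the meek", true);

        // Searched WITHOUT stemming: the literal term "blessed" is absent.
        System.out.println(search(index, "blessed", false));
        // The literal term "bless" is present, so this query still works.
        System.out.println(search(index, "bless", false));
        // Searched WITH stemming: both sides agree and the query matches.
        System.out.println(search(index, "blessed", true));
    }
}
```

The point is only that the index and the query have to pass through the
same processing, which is why the same jsword.jar has to be on both sides.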
>
> I have checked that the correct Analyzer is being used but I am not
> sure what else to check or if the 'blessed'/'bless' issue might point
> to a specific problem area.
The analyzer is merely a chain of a tokenizer and a bunch of filters.
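As a rough picture of what "a chain" means here, the sketch below uses
plain Java rather than the Lucene classes. The class name, the method
names and the stop word list are all made up for illustration; a real
analyzer would add further filters (such as a stemmer) to the same chain.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class AnalyzerChainSketch {
    // Hypothetical stop word list, chosen only for this example.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("are", "the"));

    // Tokenizer: splits the text on runs of non-word characters.
    static List<String> tokenize(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.split("\\W+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    // Filter 1: lower-cases every token.
    static List<String> lowerCase(List<String> in) {
        List<String> out = new ArrayList<>();
        for (String t : in) {
            out.add(t.toLowerCase(Locale.ROOT));
        }
        return out;
    }

    // Filter 2: drops stop words.
    static List<String> removeStopWords(List<String> in) {
        List<String> out = new ArrayList<>();
        for (String t : in) {
            if (!STOP_WORDS.contains(t)) {
                out.add(t);
            }
        }
        return out;
    }

    // The "analyzer": the tokenizer followed by each filter in order.
    static List<String> analyze(String text) {
        return removeStopWords(lowerCase(tokenize(text)));
    }

    public static void main(String[] args) {
        System.out.println(analyze("Blessed are the meek"));
    }
}
```

Checking that the same analyzer class is used is therefore not enough if
the filters inside it differ between the two versions of the library.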
>
> The plan is to download pre-created indexes to And Bible and in theory
> those indexes should be generated by JSword but currently And Bible
> can only use indexes it creates itself or which have been created by
> CLucene/Sword.
The indexes that CLucene/Sword creates are not compatible with JSword and
haven't been for many releases (and vice versa). For the most part they
work for English, but as you found out, "most part" isn't good enough.
The biggest problem is that CLucene development is stagnant and far
behind that of Java Lucene. The second problem is that unless a
versioning mechanism is added to SWORD, the SWORD indexes will not
improve or gain new features.
We improved JSword w/o a versioning mechanism and are suffering the
problems. It needs to be fixed with the next point release.
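For what a versioning mechanism might look like, here is a minimal sketch.
It is not what JSword does or will do: the file name "index.version", the
key "analyzer.version" and the class itself are hypothetical. The idea is
simply to write a stamp when the index is built and to refuse (or rebuild)
when the stamp does not match what the searching library expects.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

public class IndexVersionStamp {
    // Hypothetical names, chosen only for this example.
    static final String STAMP_FILE = "index.version";
    static final String KEY = "analyzer.version";

    // Called by the index builder after the index has been written.
    static void writeStamp(Path indexDir, String version) {
        Properties p = new Properties();
        p.setProperty(KEY, version);
        try (OutputStream out = Files.newOutputStream(indexDir.resolve(STAMP_FILE))) {
            p.store(out, "written when the index is built");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Called by the searcher before opening the index.
    static boolean isCompatible(Path indexDir, String expected) {
        Path stamp = indexDir.resolve(STAMP_FILE);
        if (!Files.exists(stamp)) {
            return false; // unstamped index: treat it as stale
        }
        Properties p = new Properties();
        try (InputStream in = Files.newInputStream(stamp)) {
            p.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return expected.equals(p.getProperty(KEY));
    }

    // Helper for the demo: creates an empty directory to stand in for an index.
    static Path newTempIndexDir() {
        try {
            return Files.createTempDirectory("toy-index");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path dir = newTempIndexDir();
        System.out.println(isCompatible(dir, "2")); // no stamp yet
        writeStamp(dir, "2");
        System.out.println(isCompatible(dir, "2")); // stamp matches
        System.out.println(isCompatible(dir, "1")); // stamp differs
    }
}
```

With something like this, an index built before stemming was introduced
would be detected as stale instead of silently returning wrong results.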
>
> All advice, opinions, and comments are appreciated.
>
> Many thanks
> Martin
In His Service,
DM