[jsword-devel] Lucene index help

DM Smith dmsmith at crosswire.org
Thu Nov 4 09:44:44 MST 2010


On 11/04/2010 11:26 AM, Martin Denham wrote:
> Does anybody know any reason why a search for 'blessed' does not 
> return any search results in ESV but searching for 'bless' works perfectly?

As best I can tell it is a mismatch between the index and the 
library. I just did the same search as you and got the same results as 
you. Then I deleted the index and rebuilt it, and the search started to 
work.

>
> When I download BibleDesktop (JSword) generated indexes to And Bible 
> I have noticed that some searches like 'blessed' stop working but I 
> can't figure out what the problem is and would appreciate some 
> pointers as to areas to look.

It is critical that the same jsword.jar is used to build the index and 
to search it.

In this case the problem is that stemming has been introduced in the 
newest version of JSword. This invalidates old indexes, but there is no 
mechanism to detect that. Well, there is, but it is not complete.
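
To illustrate why a stemming mismatch produces exactly this symptom, 
here is a minimal sketch. It does not use Lucene or JSword at all: the 
class name, the toy "strip a trailing ed" stemmer, and the verse key 
"v1" are all assumptions made up for the example. The index is built 
with stemming on, then searched with stemming off and on.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy illustration only; not JSword or Lucene code.
class StemMismatchDemo {
    // Hypothetical stemmer: strips a trailing "ed" from longer words.
    // The real stemmer used by JSword is far more thorough.
    static String toyStem(String term) {
        return term.length() > 4 && term.endsWith("ed")
                ? term.substring(0, term.length() - 2)
                : term;
    }

    // Build an inverted index (term -> verse keys), optionally stemming.
    static Map<String, List<String>> buildIndex(Map<String, String> verses, boolean stem) {
        Map<String, List<String>> index = new HashMap<>();
        for (Map.Entry<String, String> e : verses.entrySet()) {
            for (String raw : e.getValue().toLowerCase().split("\\W+")) {
                if (raw.isEmpty()) {
                    continue;
                }
                String term = stem ? toyStem(raw) : raw;
                index.computeIfAbsent(term, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        return index;
    }

    // Look up a single query term, optionally stemming it first.
    static List<String> search(Map<String, List<String>> index, String query, boolean stem) {
        String term = query.toLowerCase();
        if (stem) {
            term = toyStem(term);
        }
        return index.getOrDefault(term, new ArrayList<>());
    }

    public static void main(String[] args) {
        Map<String, String> verses = new HashMap<>();
        verses.put("v1", "Blessed are the poor in spirit");

        // Index built WITH stemming: it stores "bless", never "blessed".
        Map<String, List<String>> index = buildIndex(verses, true);

        System.out.println(search(index, "blessed", false)); // []   (mismatch)
        System.out.println(search(index, "bless", false));   // [v1]
        System.out.println(search(index, "blessed", true));  // [v1] (matching code)
    }
}
```

The searcher that does not stem looks for the literal term "blessed", 
which the stemmed index never stored, so only 'bless' finds anything.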

>
> I have checked that the correct Analyzer is being used but I am not 
> sure what else to check or if the 'blessed'/'bless' issue might point 
> to a specific problem area.

The analyzer is merely a chain of a tokenizer and a bunch of filters.
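
As a rough picture of what "a chain" means here, the sketch below 
models an analyzer as one tokenizer followed by an ordered list of 
filters. This is plain Java, not the Lucene Analyzer API; ToyAnalyzer, 
its filters, and the stop-word list are hypothetical names invented for 
the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.UnaryOperator;

// Toy model of "tokenizer + filters"; not the Lucene API.
class ToyAnalyzer {
    private final List<UnaryOperator<List<String>>> filters;

    ToyAnalyzer(List<UnaryOperator<List<String>>> filters) {
        this.filters = filters;
    }

    // Tokenizer: split on runs of non-word characters.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.split("\\W+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Filter: lower-case every token.
    static List<String> lowerCase(List<String> in) {
        List<String> out = new ArrayList<>();
        for (String t : in) {
            out.add(t.toLowerCase());
        }
        return out;
    }

    // Filter: drop a few (assumed) stop words.
    static List<String> dropStopWords(List<String> in) {
        Set<String> stop = new HashSet<>(Arrays.asList("the", "a", "and"));
        List<String> out = new ArrayList<>();
        for (String t : in) {
            if (!stop.contains(t)) {
                out.add(t);
            }
        }
        return out;
    }

    // Run the tokenizer, then each filter in order.
    List<String> analyze(String text) {
        List<String> tokens = tokenize(text);
        for (UnaryOperator<List<String>> f : filters) {
            tokens = f.apply(tokens);
        }
        return tokens;
    }

    public static void main(String[] args) {
        List<UnaryOperator<List<String>>> chain = new ArrayList<>();
        chain.add(ToyAnalyzer::lowerCase);
        chain.add(ToyAnalyzer::dropStopWords);
        ToyAnalyzer analyzer = new ToyAnalyzer(chain);
        System.out.println(analyzer.analyze("The LORD bless you")); // [lord, bless, you]
    }
}
```

Adding or removing one filter (a stemmer, say) changes the terms that 
come out of the chain, so checking only that the same Analyzer class is 
in use does not show that the index and the searcher agree.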

>
> The plan is to download pre-created indexes to And Bible and in theory 
> those indexes should be generated by JSword but currently And Bible 
> can only use indexes it creates itself or which have been created by 
> CLucene/Sword.

The indexes that CLucene/Sword creates are not compatible with JSword 
and haven't been for many releases (and vice versa). For the most part 
they work for English, but as you found out "most part" isn't good 
enough.

The biggest problem is that CLucene development is stagnant and far 
behind that of Java Lucene. The second problem is that unless a 
versioning mechanism is added to SWORD, the SWORD indexes will not 
improve or gain new features.

We improved JSword without a versioning mechanism and are suffering the 
resulting problems. It needs to be fixed with the next point release.
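
One possible shape for such a versioning mechanism, sketched under 
stated assumptions: write a small marker file next to the index when it 
is built, and compare it when the index is opened. This is not how 
JSword does or will do it; the file name "index.version", the version 
strings, and the class name are all hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of an index version check; not JSword's actual mechanism.
class IndexVersionCheck {
    // Hypothetical marker file stored inside the index directory.
    static final String MARKER = "index.version";

    // Called by whatever builds the index.
    static void writeMarker(Path indexDir, String version) throws IOException {
        Files.write(indexDir.resolve(MARKER), version.getBytes(StandardCharsets.UTF_8));
    }

    // Called before searching. A missing marker is treated as stale,
    // because indexes built before the mechanism existed have none.
    static boolean isCompatible(Path indexDir, String expected) throws IOException {
        Path marker = indexDir.resolve(MARKER);
        if (!Files.exists(marker)) {
            return false;
        }
        String found = new String(Files.readAllBytes(marker), StandardCharsets.UTF_8).trim();
        return found.equals(expected);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("toy-index");
        System.out.println(isCompatible(dir, "stemming-1")); // false: no marker yet
        writeMarker(dir, "stemming-1");
        System.out.println(isCompatible(dir, "stemming-1")); // true
        System.out.println(isCompatible(dir, "stemming-2")); // false: rebuild needed
    }
}
```

When the check fails, the application can delete and rebuild the index 
rather than silently returning incomplete results.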

>
> All advice, opinions, and comments are appreciated.
>
> Many thanks
> Martin

In His Service,
     DM


