[sword-devel] indexed search discrepancy
DM Smith
dmsmith at crosswire.org
Sun Aug 30 07:06:09 MST 2009
On Aug 30, 2009, at 9:08 AM, Jonathan Morgan wrote:
> Just on the topic of stop words, I think that it is worth considering
> including stopwords. They can be important in quite a few Bible
> searches (ones that spring to mind are "Son of man", "Son of God" and
> "the son").
I agree. JSword uses the SimpleAnalyzer because it preserves the
entire content of the Bible, including stopwords.
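To illustrate why this matters, here is a rough sketch in plain Python
(not Lucene itself). It assumes an analyzer that lowercases and splits
at non-letters, which is roughly what SimpleAnalyzer does, and compares
it with the same tokens after stop-word removal. The stop list and the
function names are made up for the example.

```python
import re

# Hypothetical stop list for illustration; not the list any real engine ships.
STOP_WORDS = {"a", "an", "and", "of", "the", "to"}

def simple_tokens(text):
    """Lowercase and split at non-letters, keeping every word."""
    return [t.lower() for t in re.findall(r"[^\W\d_]+", text)]

def stopped_tokens(text):
    """Same tokens, but with the stop words dropped."""
    return [t for t in simple_tokens(text) if t not in STOP_WORDS]

if __name__ == "__main__":
    for phrase in ("Son of man", "Son of God", "the son"):
        print(phrase, simple_tokens(phrase), stopped_tokens(phrase))
```

With the stop words dropped, "the son" is reduced to the single term
"son", so the index can no longer tell it apart from any other use of
"son".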
I had submitted a patch that did this, and it was rejected because,
without a versioning system for each generated index, it did not
preserve backward compatibility.
As to using a simple incrementing number to represent the version of
the index, this may not be adequate. It is sufficient only if the user
has no control over the index and the front-end or engine ignores,
discards, or automatically upgrades any index whose version number
does not match that of the engine.
If the user is given any control over the index, or the front-end is
given any indication of what is in the index, a single number is not
sufficient. Further, once we get to analyzers per language, each
feature needs a version number as well.
Very messy.
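As a rough sketch of what per-feature versioning might look like,
assume each index stores a small table of feature names and version
numbers next to it. The feature names, numbers, and function below are
hypothetical and do not describe anything Sword or JSword actually
stores.

```python
# Hypothetical feature tables; names and numbers are invented for the example.
ENGINE_FEATURES = {
    "format": 2,        # on-disk layout understood by the engine
    "analyzer.en": 3,   # version of the English analyzer
    "stopwords": 0,     # 0 = stop words kept in the index
}

def stale_features(index_features, engine_features):
    """Return the names of features whose version is missing or different."""
    return sorted(
        name
        for name, version in engine_features.items()
        if index_features.get(name) != version
    )

if __name__ == "__main__":
    # An index built with the same format and analyzer, but with stop words dropped.
    index_features = {"format": 2, "analyzer.en": 3, "stopwords": 1}
    stale = stale_features(index_features, ENGINE_FEATURES)
    print("rebuild needed:", bool(stale), stale)
```

A single number would have to change for every combination of these
features, which is why it stops being adequate once the user or the
front-end can influence what goes into the index.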
The solution we have for BibleDesktop/JSword is simply to tell the
user that, if search does not perform as expected, they should delete
the index and rebuild it. Not at all a good solution, but we've not
had any complaints.
In Him,
DM