[sword-devel] indexed search discrepancy (and sword 1.6.0+dfsg-2)

Jonathan Marsden jmarsden at fastmail.fm
Sat Aug 29 23:34:15 MST 2009


Matthew Talbert wrote:

> OK, here are results. All tests are done with my previous changes; the
> only difference is the first index has stop words, the second doesn't.

> Module   With stop words   Without stop words
> KJV      7.3MB             6.3MB
> Finney   654KB             518KB
> ESV      5.9MB             5.0MB

So roughly 20% extra.  I see no reason not to go for it -- but then, I'm
a desktop user with a monstrous 640GB hard drive :)  Are there
situations and systems where this would be a significant issue?
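
For anyone who wants to check the "roughly 20%" figure, here is a small sketch of the arithmetic using only the sizes Matthew quoted. The variable and function names are mine, purely for illustration.

```python
# Index sizes quoted above, as (with stop words, without stop words).
# Units differ per module (MB or KB), but each ratio is unit-free.
sizes = {
    "KJV": (7.3, 6.3),      # MB
    "Finney": (654, 518),   # KB
    "ESV": (5.9, 5.0),      # MB
}


def overhead_percent(with_stop, without_stop):
    """Extra space used by the index that keeps stop words, in percent."""
    return (with_stop - without_stop) / without_stop * 100


overheads = {name: overhead_percent(w, wo) for name, (w, wo) in sizes.items()}

for name, pct in overheads.items():
    print(f"{name}: {pct:.1f}% larger with stop words")
```

On these three modules the overhead comes out between about 16% and 26%, so "roughly 20%" is a fair summary.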

> For those wondering why a search for "the lord" doesn't segfault, it's
> only when you search for a stop word alone that there is a segfault.
> If you want to talk about confusing users, the current system would
> seem illogical (I searched for "god is" and got nothing??).

Agreed.  Unless the 20% extra space requirement is really an issue in
some circumstances, it looks like the right approach would be to just
index everything, and so get more correct search results.
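
To illustrate why a search like "god is" could come back empty, here is a toy sketch of one plausible mechanism. It assumes stop words are dropped when the index is built but the query is still matched word for word. That is my assumption, not a description of what SWORD's indexed search actually does internally. The stop-word list, the document ids and the text are all made up for the example.

```python
# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "is", "a", "of", "and"}


def build_index(docs, drop_stop_words):
    """Build a toy inverted index: word -> set of document ids."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            if drop_stop_words and word in STOP_WORDS:
                continue  # the word never reaches the index
            index.setdefault(word, set()).add(doc_id)
    return index


def search_all(index, query):
    """Return ids of documents containing every query word (AND search)."""
    result = None
    for word in query.lower().split():
        postings = index.get(word, set())
        result = postings if result is None else result & postings
    return result or set()


# Toy documents, not real module content.
docs = {
    "doc1": "god is love",
    "doc2": "the lord is my shepherd",
}

without_stop = build_index(docs, drop_stop_words=True)
with_stop = build_index(docs, drop_stop_words=False)

hits_without = search_all(without_stop, "god is")  # "is" was never indexed
hits_with = search_all(with_stop, "god is")        # every word is indexed

print("index without stop words:", sorted(hits_without))
print("index with stop words:   ", sorted(hits_with))
```

In this sketch the index without stop words finds nothing for "god is" because "is" has no entry, while the index that keeps every word finds doc1. That matches the kind of confusion Matthew describes, whatever the real cause turns out to be.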

Jonathan


