[jsword-devel] Searching

Eric Galluzzo jsword-devel@crosswire.org
27 Apr 2003 18:51:05 -0400


On Sun, 2003-04-27 at 13:10, Joe Walker wrote:
> 
> Hi,
> 
> Am I right in thinking that SWORD implements search by doing a linear 
> scan of the whole Bible for each search?

I'm afraid I don't know the answer to this one. ;)

> Back in the ProjectB days I wrote a search engine, and I believe that we 
> could fairly simply reuse this code to add indexed searching to Sword 
> modules, maybe with a linear scan while the index is being built?
> 
> Is there any expertise in search out there?

Well, I have basically none; however, couldn't we just use something
like Lucene (http://jakarta.apache.org/lucene/) to do all the hard work
for us?  Then we could take advantage of all their expertise, as well as
all those fancy queries that they support (e.g. X within four words of
Y).  It's pretty extensible, so we could do things like filtering by
book, chapter, and verse just by adding fields to the index.
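
To make the "fields" idea concrete, here is a rough sketch of an index that
records word positions plus a book field for each verse.  This is plain Java,
not Lucene's API; the names VerseIndex, add, search, and within are made up
for illustration, and a real index would need proper tokenizing and storage.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Toy positional inverted index. NOT Lucene's API; every name here is hypothetical.
final class VerseIndex {
    // term -> (verse reference -> word positions of that term in the verse)
    private final Map<String, Map<String, List<Integer>>> postings = new HashMap<>();
    // verse reference -> book; stands in for an extra indexed "field"
    private final Map<String, String> bookOf = new HashMap<>();

    void add(String book, String verseRef, String text) {
        bookOf.put(verseRef, book);
        int position = 0;
        // Crude tokenizer: lowercase, split on anything that is not a-z.
        for (String word : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (word.isEmpty()) continue;
            postings.computeIfAbsent(word, k -> new TreeMap<>())
                    .computeIfAbsent(verseRef, k -> new ArrayList<>())
                    .add(position);
            position++;
        }
    }

    private Map<String, List<Integer>> lookup(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT),
                Collections.<String, List<Integer>>emptyMap());
    }

    // Verses containing the term, optionally restricted to one book (null = any book).
    List<String> search(String term, String book) {
        List<String> hits = new ArrayList<>();
        for (String verseRef : lookup(term).keySet()) {
            if (book == null || book.equals(bookOf.get(verseRef))) hits.add(verseRef);
        }
        return hits;
    }

    // Verses where termA and termB occur within maxDistance words of each other.
    List<String> within(String termA, String termB, int maxDistance) {
        List<String> hits = new ArrayList<>();
        Map<String, List<Integer>> b = lookup(termB);
        for (Map.Entry<String, List<Integer>> entry : lookup(termA).entrySet()) {
            List<Integer> positionsB = b.get(entry.getKey());
            if (positionsB != null && anyWithin(entry.getValue(), positionsB, maxDistance)) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }

    private static boolean anyWithin(List<Integer> a, List<Integer> b, int maxDistance) {
        for (int pa : a) {
            for (int pb : b) {
                if (Math.abs(pa - pb) <= maxDistance) return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        VerseIndex index = new VerseIndex();
        index.add("Gen", "Gen 1:1", "In the beginning God created the heaven and the earth.");
        index.add("John", "John 1:1",
                "In the beginning was the Word, and the Word was with God, and the Word was God.");
        System.out.println(index.search("beginning", "John")); // prints [John 1:1]
        System.out.println(index.within("word", "god", 2));    // prints [John 1:1]
    }
}
```

Building the index costs one pass over the text, after which each query only
touches the postings for its own terms instead of scanning the whole Bible.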

And if we get some nice tagged Greek texts, we might even be able to
support the "fancy" searches that the nicer Bible packages offer, such as
"find me an aorist subjunctive 'baptizo' which has a 'de' right before it,
and which is within five words of any form of the word 'sozo'."  Of course,
if we don't have tagged Greek texts, we might be able to do this with a
fancy stemmer, but that sounds complicated... ;)  I'm not sure whether
Lucene actually supports all of this, but I do know it supports the
"within X words of" operator, and we could probably extend it as needed.

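For what it's worth, the tagged-text search could look something like the
sketch below once every word carries a lemma and some grammatical tags.  The
Token class, the tag names, and the sample tokens are all invented for
illustration (the tokens are not a real verse), and none of this is Lucene
or SWORD code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of a morphologically tagged text; not a SWORD or Lucene type.
final class TaggedSearch {
    static final class Token {
        final String lemma;       // dictionary form, e.g. "baptizo"
        final List<String> tags;  // grammatical tags, e.g. "aorist", "subjunctive"

        Token(String lemma, String... tags) {
            this.lemma = lemma;
            this.tags = Arrays.asList(tags);
        }
    }

    // Returns each position i where token i has the given lemma and all the given
    // tags, token i-1 has lemma `before`, and some other token with lemma `near`
    // lies within `window` words of i.
    static List<Integer> find(List<Token> tokens, String lemma, List<String> tags,
                              String before, String near, int window) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 1; i < tokens.size(); i++) {
            Token t = tokens.get(i);
            if (!t.lemma.equals(lemma) || !t.tags.containsAll(tags)) continue;
            if (!tokens.get(i - 1).lemma.equals(before)) continue;
            int from = Math.max(0, i - window);
            int to = Math.min(tokens.size() - 1, i + window);
            for (int j = from; j <= to; j++) {
                if (j != i && tokens.get(j).lemma.equals(near)) {
                    hits.add(i);
                    break;
                }
            }
        }
        return hits;
    }

    // Invented token sequence, for demonstration only.
    static List<Token> sample() {
        return Arrays.asList(
                new Token("ean", "conjunction"),
                new Token("de", "particle"),
                new Token("baptizo", "verb", "aorist", "subjunctive"),
                new Token("kai", "conjunction"),
                new Token("pisteuo", "verb", "aorist", "subjunctive"),
                new Token("sozo", "verb", "future", "passive"));
    }

    public static void main(String[] args) {
        List<Integer> hits = find(sample(), "baptizo",
                Arrays.asList("aorist", "subjunctive"), "de", "sozo", 5);
        System.out.println(hits); // prints [2]
    }
}
```

Matching on lemmas rather than surface forms is what would make "any form of
the word" work without a stemmer, which is why the tagged texts matter.
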
	- Eric