[sword-devel] Search Engine

Harry Plantinga sword-devel@crosswire.org
Mon, 23 Apr 2001 14:26:55 -0400


Sorry for replying to my own message, but let me throw out an
idea to this list for building a search engine quickly.

A.  Use the index-building code below directly.  I think it's
written in C.  It's very fast, handles ranked and boolean queries,
builds compact indexes, etc.  GPL.

B.  The 'mg' version has a command-line, text-based interface
for searching.  Use the same code to read the index, but add a
GUI.

Voila, an open-source search engine that can be used by Sword,
but also by many other projects, CD-ROMs, etc.  If the GUI
were done in Java it could be cross-platform as well.

Actually, I'm sure there would be other complications -- in the
case of the 'mg' software, filtering out <tags> before building
the index, for example.  In the case of the Greenstone version,
disentangling the search code from the rest of the project and
modifying the interface.  Returning excerpts with the search terms
highlighted.  But this code might be a good place to start.
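
To make two of those complications a bit more concrete, here is a
rough sketch in Python of stripping tags before indexing and of
returning an excerpt with the search term marked.  None of this is
taken from mg or Greenstone; the names strip_tags and excerpt, the
width parameter, and the '*' marker are all made up for illustration.

```python
import re
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects character data and drops the markup around it."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(markup):
    """Return the text of `markup` with <tags> removed (hypothetical helper)."""
    parser = _TextOnly()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)


def excerpt(text, term, width=30, mark="*"):
    """Return the text around the first whole-word match of `term`,
    with every match inside that window wrapped in `mark`.
    Returns None if the term does not occur (hypothetical helper)."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    first = pattern.search(text)
    if first is None:
        return None
    start = max(0, first.start() - width)
    end = min(len(text), first.end() + width)
    out = []
    pos = start
    # Match against the full text so a word cut off by the window
    # edge is never mistaken for a whole-word hit.
    for m in pattern.finditer(text):
        if m.start() < start:
            continue
        if m.end() > end:
            break
        out.append(text[pos:m.start()])
        out.append(mark + m.group(0) + mark)
        pos = m.end()
    out.append(text[pos:end])
    return "".join(out)


plain = strip_tags("<p>In the <b>beginning</b> was the Word</p>")
print(plain)
print(excerpt(plain, "word", width=8))
```

A real version would feed the stripped text to the indexer and would
probably cut excerpts at word or sentence boundaries rather than at a
fixed character count; a GUI would use bold or color instead of '*'.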

The source code download page is 
http://www.cs.waikato.ac.nz/~nzdl/gsdl-docs/Download.html

-Harry

 
> FYI, there's excellent open-source full-text search engine code 
> available.  In its older incarnation it's called "mg" and it is
> associated with the book "Managing Gigabytes ...".  Text only.
> 
> In its newer incarnation it's a part of the New Zealand Digital
> Library Project (aka Greenstone), www.nzdl.org.  This version
> does handle HTML and other formats intelligently, I believe.
> 
> -Harry