[sword-devel] Search Engine
Harry Plantinga
sword-devel@crosswire.org
Mon, 23 Apr 2001 14:26:55 -0400
Sorry for replying to my own message, but let me throw out an
idea to this list for building a search engine quickly.
A. use the index-building code below directly. I think it's
written in C. It's very fast, handles ranked and boolean queries,
builds compact indexes, etc. GPL.
B. the 'mg' version has a command-line text-based interface
for searching. Use the same code to read the index, but add a
GUI.
Voila, an open-source search engine that can be used by Sword,
but also for many other projects, CD-ROMs, etc. If the GUI
were done in Java it could be cross-platform as well.
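
To make the split between A and B concrete, here is a toy sketch in
Python. It is not the mg or Greenstone code and has nothing to do with
their index format; every name in it is made up for illustration.
build_index stands in for step A, and search_all / search_ranked stand
in for the index-reading code a GUI would call in step B.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Step A (toy version): map each term to {doc_id: occurrence count}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][doc_id] = count
    return index


def search_all(index, query):
    """Boolean AND: ids of the documents that contain every query term."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], {}))
    for term in terms[1:]:
        result &= set(index.get(term, {}))
    return result


def search_ranked(index, query):
    """Ranked: order documents by total occurrences of the query terms.

    This is a deliberately crude score (raw counts, ties broken by id),
    chosen only to keep the sketch short.
    """
    scores = Counter()
    for term in tokenize(query):
        for doc_id, count in index.get(term, {}).items():
            scores[doc_id] += count
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in ordered]
```

A front end, whether command-line or GUI, would only need the two
search functions and never touch build_index, which is the point of
reusing the same index-reading code in B.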
Actually, I'm sure there would be other complications -- in the
case of the 'mg' software, filtering out <tags> before building
the index, for example. In the case of the Greenstone version,
disentangling the search code from the rest of the project and
modifying the interface. Returning excerpts with the search terms
highlighted. But this code might be a good place to start.
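
Two of these complications, filtering out <tags> and returning
highlighted excerpts, can be sketched in a few lines of Python. This is
only an illustration under simple assumptions: the function names are
hypothetical, the tag filter is a naive regular expression (it does not
handle comments or a ">" inside an attribute value), and the highlight
marker is just an asterisk.

```python
import re

# Naive pattern: anything from "<" up to the next ">" counts as a tag.
TAG_RE = re.compile(r"<[^>]*>")


def strip_tags(text):
    """Replace each <tag> with a space, then collapse runs of whitespace."""
    return " ".join(TAG_RE.sub(" ", text).split())


def excerpt(text, term, width=30, mark="*"):
    """Return a short excerpt around the first whole-word match of term.

    The match is wrapped in `mark`, with up to `width` characters of
    context on each side. Returns None if the term does not occur.
    """
    match = re.search(r"\b" + re.escape(term) + r"\b", text, re.IGNORECASE)
    if match is None:
        return None
    start = max(0, match.start() - width)
    end = min(len(text), match.end() + width)
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return (prefix + text[start:match.start()]
            + mark + match.group(0) + mark
            + text[match.end():end] + suffix)
```

The tag filter would run before the text is handed to the indexer, and
the excerpt function would run on each hit at display time, so neither
needs to know anything about the index itself.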
The source code download page is
http://www.cs.waikato.ac.nz/~nzdl/gsdl-docs/Download.html
-Harry
> FYI, there's excellent open-source full-text search engine code
> available. In its older incarnation it's called "mg" and it is
> associated with the book "Managing Gigabytes ...". Text only.
>
> In its newer incarnation it's a part of the New Zealand Digital
> Library Project (aka Greenstone), www.nzdl.org. This version
> does handle HTML and other formats intelligently, I believe.
>
> -Harry