[sword-devel] Re: [bt-devel] fast search
David Kahn
sword-devel@crosswire.org
Sun, 04 Feb 2001 14:44:58 -0500
Hi. I have been watching the forum here for a little while, reluctant
to get involved right now with development because of other
commitments. However, I am also interested in developing a fast search
engine. I have some specific ideas for lightning-fast searching
that would be interesting to work on. Maybe I could share these ideas
at least.
I think the fastest word search method would be something like the
"Rushmore" engine that FoxPro used. It consists of a binary "mask" file
for each possible search key. Each bit would correspond to one record,
or in the case of scriptures, one verse. One would mean that the word
occurs in that verse and zero would mean it does not occur. The mask
file would be a binary image of these ones and zeroes. This would work
with the scriptures because the Bible is a fixed size, so once that
index exists on a given word for a given version, it never changes.
These would be about 4K(?) each (one bit for each of roughly 31,000
verses). These masks would be "or"ed and "and"ed together to give any
search combination. Only the more commonly
used words would be indexed with a mask. Words that are used fewer than
a few hundred or thousand(?) times could have a more compact index (a
series of numbers) which could be quickly expanded in memory into a
mask.
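
To make the idea concrete, here is a rough sketch in Python. It is only an illustration of the mask approach described above, not code from Sword or from FoxPro. The four-verse corpus and the names build_masks, expand, and hits are made up for the example, and a Python integer stands in for the binary mask file.

```python
# Toy corpus: list index = verse number (hypothetical data, not a real module).
VERSES = [
    "in the beginning god created the heaven and the earth",
    "and the earth was without form and void",
    "and god said let there be light",
    "and god saw the light that it was good",
]


def build_masks(verses):
    """Return {word: mask}, where bit i of mask is 1 if the word is in verse i."""
    masks = {}
    for i, text in enumerate(verses):
        for word in set(text.split()):
            masks[word] = masks.get(word, 0) | (1 << i)
    return masks


def expand(verse_numbers):
    """Expand a compact list of verse numbers (for a rare word) into a mask."""
    mask = 0
    for i in verse_numbers:
        mask |= 1 << i
    return mask


def hits(mask):
    """List the verse numbers whose bits are set in the mask."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


masks = build_masks(VERSES)

# "and" the masks: verses containing both words.
god_and_light = masks["god"] & masks["light"]

# "or" the masks: verses containing either word.
earth_or_light = masks["earth"] | masks["light"]

print(hits(god_and_light))   # [2, 3]
print(hits(earth_or_light))  # [0, 1, 2, 3]
```

A rare word would be stored as the short list passed to expand() and only turned into a mask when a search needs it.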
The result of each search would be a mask which could be saved and used
repeatedly.
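
A saved result could be as simple as the packed bits themselves. The sketch below assumes 31,102 verses (the count varies by version and versification) and uses made-up helper names; the bytes it produces could be written to a file as-is.

```python
NUM_VERSES = 31102  # assumed verse count; differs between versions


def mask_to_bytes(mask, num_verses=NUM_VERSES):
    """Pack a result mask into a fixed-size byte string, one bit per verse."""
    return mask.to_bytes((num_verses + 7) // 8, "little")


def mask_from_bytes(data):
    """Load a previously saved mask back into an integer."""
    return int.from_bytes(data, "little")


# Pretend an earlier search matched verses 0, 1 and 3.
saved = mask_to_bytes(0b1011)
print(len(saved))  # 3888 bytes for the assumed verse count

# Later: reuse the saved result to narrow a new search (verses 1 and 2).
restored = mask_from_bytes(saved)
narrowed = restored & 0b0110
print(bin(narrowed))  # 0b10, i.e. only verse 1 is in both results
```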
I don't think you could beat that approach for speed and compactness.
Any thoughts?
-David Kahn
Trevor Jenkins wrote:
>
> On Tue, 30 Jan 2001, Joachim Ansorg <jansorg@gmx.de> wrote:
>
> > > Fast searching is still a todo item, so hopefully someone will take it
> > > up as their project.
> > >
> > > Any takers? :)
> >
> > Maybe Trevor??
>
> Yes. Now that my ME is going away I can begin to pick up the things that
> got dropped over the last 15 months. But I have to introduce them back
> gradually. Give me a little while to get the latest source tar-ball and
> look at what you (Troy) put in. I also need time to retrieve my email off
> my old Apple PowerBook, which I was using then; I recall that there were
> some discussions on some of the issues.
>
> But consider me in.
>
> Regards, Trevor
>
> British Sign Language is not inarticulate handwaving; it's a living language.
> Support the campaign for formal recognition by the British government now!
>
> --
>
> <>< Re: deemed!