[sword-devel] Compressed modules (was fast search)
David J. Orme
sword-devel@crosswire.org
Thu, 22 Feb 2001 10:33:29 -0500
David Twyerould wrote:
> <<snip>>...on the subject of inverted
> indexes, my understanding is that the Online Bible format is in fact an
> inverted index with punctuation. From memory I think there are three parts
> to it: the words themselves (word list), the word index, and the verse list
> (i.e., a sequential list of word numbers). The verse text is recreated on
> the fly as required. The index is compressed, as is the word list.
>
> The advantage is that you have minimal disk space plus high speed
> searching. The time taken to reconstruct the verse text is negligible on
> all but the very slowest machines, and even then it's not usually a problem.
My opinion is that we /need/ to do this if we want Sword to be able to
effectively support PDAs and other machines with limited storage space.
What would be involved in doing this? I'd like to take a crack at it.
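
To make the three-part layout described in the quoted message concrete (word
list, word index, verse list), here is a minimal Python sketch. It is only an
illustration of the general idea, not the actual Online Bible format, which is
not specified in this thread. All names (build_index, search, rebuild, VERSES)
are hypothetical, compression is left out entirely, and the spacing rule used
when rebuilding text is a simplifying assumption that only handles punctuation
that follows a word.

```python
import re

# A token is a run of word characters or a single punctuation mark, so
# punctuation gets its own entry in the word list (assumed tokenization).
TOKEN = re.compile(r"\w+|[^\w\s]")

# Tiny sample corpus, for illustration only.
VERSES = {
    "Gen 1:1": "In the beginning God created the heaven and the earth.",
    "Gen 1:3": "And God said, Let there be light: and there was light.",
}


def build_index(verses):
    """Build the word list, the inverted word index, and the verse list."""
    word_list = []    # word number -> token
    word_ids = {}     # token -> word number (helper for lookups)
    word_index = {}   # word number -> verse references containing the token
    verse_list = {}   # verse reference -> sequence of word numbers
    for ref, text in verses.items():
        ids = []
        for token in TOKEN.findall(text):
            if token not in word_ids:
                word_ids[token] = len(word_list)
                word_list.append(token)
            wid = word_ids[token]
            ids.append(wid)
            refs = word_index.setdefault(wid, [])
            if not refs or refs[-1] != ref:  # record each verse only once
                refs.append(ref)
        verse_list[ref] = ids
    return word_list, word_ids, word_index, verse_list


def search(word, word_ids, word_index):
    """Return the verses containing `word` (case-sensitive exact match)."""
    wid = word_ids.get(word)
    return [] if wid is None else list(word_index[wid])


def rebuild(ref, word_list, verse_list):
    """Recreate the verse text on the fly from its word numbers."""
    out = []
    for wid in verse_list[ref]:
        token = word_list[wid]
        # Assumed rule: a space goes before words, none before punctuation.
        if out and re.match(r"\w", token):
            out.append(" ")
        out.append(token)
    return "".join(out)
```

A search is then a single dictionary lookup rather than a scan of the text, and
the verse text itself is never stored, only the sequence of word numbers. A
real module would also need to handle opening quotes, apostrophes, and
capitalization, and would compress the index and word list as the quoted
message describes.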
Best,
David Orme
(Agenda PDA front-end maintainer)