[jsword-devel] Search and its bugs

DM Smith dmsmith555 at gmail.com
Fri Apr 8 04:52:10 MST 2005


I've narrowed down some of the search bugs. It seems that the tokenizer 
is not producing the correct stream of tokens.
Specifically, the algorithm that uses the tokens goes something like this:

while there are command tokens at the beginning of the stream
do
    get the next command token
    have that command consume word tokens until it reaches a
    terminating condition
done
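
To make the loop concrete, here is a rough Java sketch of it. This is 
not the JSword code: CommandLoopSketch, Token and run are names made up 
for illustration, and the terminating condition (stop at the next 
command token or at the end of the stream) is an assumption.

```java
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch of the loop described above; none of these names are JSword's.
final class CommandLoopSketch {

    /** A token is either a command token or a plain word token. */
    static final class Token {
        final boolean command;
        final String text;

        Token(boolean command, String text) {
            this.command = command;
            this.text = text;
        }
    }

    /**
     * Runs the loop and returns a log of which command consumed which words.
     * Tokens that are not consumed stay in the stream.
     */
    static List<String> run(Deque<Token> stream) {
        List<String> log = new ArrayList<>();
        // while there are command tokens at the beginning of the stream
        while (!stream.isEmpty() && stream.peek().command) {
            Token cmd = stream.poll();
            List<String> words = new ArrayList<>();
            // Assumed terminating condition: the next token is a command,
            // or the stream is exhausted.
            while (!stream.isEmpty() && !stream.peek().command) {
                words.add(stream.poll().text);
            }
            log.add(cmd.text + " -> " + words);
        }
        return log;
    }
}
```

In this sketch, if the stream does not start with a command token the 
outer loop never runs and every token is left unconsumed, so the result 
depends entirely on the tokenizer putting command tokens in the right 
places.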

The problem with +[mat-rev]"bread of life" is that it produces a token 
stream where +[mat-rev] is not followed by a command token.

In looking at this I noticed what looks like a design problem. 
Consistently, elsewhere in JSword, an interface defines a wall that 
BibleDesktop and JSword do not look behind. However, that is not the 
case for searching.

jsword.book.search
    provides the interfaces for Search and Index and factories to get 
implementations
jsword.book.search.basic
    provides abstract/partial implementation of the interfaces
jsword.book.search.parse
    provides an implementation of Searcher
jsword.book.search.lucene
    provides an implementation of Indexer

Based upon this, I would have expected that no code outside of that 
package would directly use jsword.book.search.parse code.

The reason I noticed this was that I wanted to create another searcher 
and get it from the search factory. (Start with a copy and fix bugs, 
while retaining the ability to use BibleDesktop by changing the 
factories properties.)

What is being used are the syntax elements, to programmatically construct 
a search. I'm thinking that we need YAI (yet another interface) for 
SearchSyntax. This would be able to:
1) decorate individual words and phrases with appropriate syntax elements.
    SearchSyntax ss = SearchSyntaxFactory.getSearchSyntax();
    String decorated = ss.decorate(SyntaxType.STARTS_WITH, "bread of life");
    decorated = ss.decorate(SyntaxType.FIND_ALL_WORDS, "son of man");
    decorated = ss.decorate(SyntaxType.FIND_STRONG_NUMBERS, "1234 5678");
    decorated = ss.decorate(SyntaxType.BEST_MATCH, "....");
    decorated = ss.decorate(SyntaxType.PHRASE_SEARCH, "....");
    ...

2) create a token stream from a string.
    Token[] tokens = ss.tokenize("search string");
    or
    TokenStream tokens = ss.tokenize("search string");
    or
    ...

3) serialize a token stream to a string.
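
As a starting point for discussion, the three capabilities could be 
sketched as one Java interface with a toy implementation. Everything 
here is hypothetical: the method signatures, the use of List<String> for 
the token stream, and especially the syntax characters in 
SimpleSearchSyntax, which are invented for illustration and are not the 
current JSword syntax. The factory is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the proposed interface. Only a subset of the
// SyntaxType values mentioned above is modelled.
enum SyntaxType { FIND_ALL_WORDS, PHRASE_SEARCH, STARTS_WITH }

interface SearchSyntax {
    /** 1) Decorate words or a phrase with the syntax for the given type. */
    String decorate(SyntaxType type, String text);

    /** 2) Create a token stream from a string. */
    List<String> tokenize(String search);

    /** 3) Serialize a token stream back to a string. */
    String serialize(List<String> tokens);
}

// Toy implementation; the "+", "*" and quote decorations are made up.
final class SimpleSearchSyntax implements SearchSyntax {

    @Override
    public String decorate(SyntaxType type, String text) {
        String[] words = text.trim().split("\\s+");
        switch (type) {
            case FIND_ALL_WORDS:
                // Prefix every word with "+".
                return "+" + String.join(" +", words);
            case PHRASE_SEARCH:
                // Wrap the whole phrase in double quotes.
                return "\"" + String.join(" ", words) + "\"";
            case STARTS_WITH:
                // Append "*" to every word.
                return String.join("* ", words) + "*";
            default:
                throw new IllegalArgumentException("Unsupported type: " + type);
        }
    }

    @Override
    public List<String> tokenize(String search) {
        // Naive whitespace split; a real tokenizer would keep quoted
        // phrases and ranges together.
        String trimmed = search.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    @Override
    public String serialize(List<String> tokens) {
        return String.join(" ", tokens);
    }
}
```

With something like this in place, callers would only see SearchSyntax, 
and a different searcher could supply its own implementation without 
other code reaching into its package.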

Input desired!


