[jsword-devel] Search and its bugs
DM Smith
dmsmith555 at gmail.com
Fri Apr 8 04:52:10 MST 2005
I've narrowed down some of the search bugs. It seems that the tokenizer
is not producing the correct stream of tokens.
Specifically, the algorithm that uses the tokens goes something like this:
    while there is a command token at the beginning of the stream
    do
        get the next command token
        have that command consume word tokens until it reaches a
        terminating condition
    done
The problem with +[mat-rev]"bread of life" is that it produces a token
stream in which +[mat-rev] is not followed by a command token.
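A minimal Java sketch of the loop described above, to make the failure mode concrete. Every name here (Token, WordToken, CommandToken, TokenLoop, maxWords) is hypothetical and is not taken from the jsword.book.search.parse code; in particular, the "terminating condition" is modelled as a simple word limit, and the quoted phrase is treated as a single word token, both of which are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical token model for illustration; not JSword's actual classes. */
interface Token {}

/** A plain word (or range, or phrase) that a command is expected to consume. */
final class WordToken implements Token {
    final String text;
    WordToken(String text) { this.text = text; }
}

/** A command that consumes word tokens up to an assumed limit. */
final class CommandToken implements Token {
    final String symbol;
    final int maxWords; // assumed stand-in for the "terminating condition"
    final List<String> consumed = new ArrayList<>();

    CommandToken(String symbol, int maxWords) {
        this.symbol = symbol;
        this.maxWords = maxWords;
    }
}

final class TokenLoop {
    /**
     * Runs the loop and returns the index of the first token left
     * unprocessed; a value below stream.size() means stranded tokens.
     */
    static int run(List<Token> stream) {
        int i = 0;
        // while there is a command token at the beginning of the stream
        while (i < stream.size() && stream.get(i) instanceof CommandToken) {
            CommandToken command = (CommandToken) stream.get(i++);
            // have that command consume word tokens until it terminates
            while (i < stream.size()
                    && stream.get(i) instanceof WordToken
                    && command.consumed.size() < command.maxWords) {
                command.consumed.add(((WordToken) stream.get(i++)).text);
            }
        }
        return i;
    }
}
```

Under these assumptions, a stream of a "+" command (limit 1), the word [mat-rev] and the word "bread of life" stops at index 2: the range is consumed, but the phrase is stranded because no command token precedes it.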
In looking at this I noticed what looks like a design problem.
Elsewhere in JSword, an interface consistently defines a wall that
BibleDesktop and JSword do not look behind. In searching, however, this
is not the case.
jsword.book.search
    provides the interfaces for Search and Index, and factories to get
    implementations
jsword.book.search.basic
    provides abstract/partial implementations of the interfaces
jsword.book.search.parse
    provides an implementation of Searcher
jsword.book.search.lucene
    provides an implementation of Indexer
Based upon this I would have expected that no code (outside of the
package) would have directly used jsword.book.search.parse code.
The reason I noticed this was that I wanted to create another searcher
and get it from the search factory. (Start with a copy and fix bugs,
while retaining the ability to use BibleDesktop by changing the
factories properties.)
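As a rough illustration of what swapping searchers through the factory's properties could look like, here is a small Java sketch. It is not JSword's actual factory code: the Searcher interface shown, the two implementations, the SearcherFactory class and the property key are all hypothetical stand-ins, and the use of reflection on a class name is an assumption about how such a factory might work.

```java
import java.util.Properties;

/** Hypothetical stand-in for the Searcher interface. */
interface Searcher {
    String name();
}

/** Stands in for the existing parser-based implementation. */
class ParserSearcher implements Searcher {
    @Override
    public String name() { return "parser"; }
}

/** Stands in for a copied implementation where bugs get fixed. */
class ExperimentalSearcher implements Searcher {
    @Override
    public String name() { return "experimental"; }
}

final class SearcherFactory {
    /** Hypothetical property key naming the implementation class. */
    static final String KEY = "Searcher.default";

    private SearcherFactory() {}

    /** Instantiates whichever class the properties name, by reflection. */
    static Searcher create(Properties props) {
        String className = props.getProperty(KEY, ParserSearcher.class.getName());
        try {
            return (Searcher) Class.forName(className)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create searcher " + className, e);
        }
    }
}
```

This only works cleanly if callers depend on the interface alone; any code that reaches directly into one implementation's package defeats the swap.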
What is being used directly are the syntax elements, to programmatically
construct a search. I'm thinking that we need YAI (yet another interface)
for SearchSyntax. This would be able to:
1) decorate individual words and phrases with appropriate syntax elements.
    SearchSyntax ss = SearchSyntaxFactory.getSearchSyntax();
    String decorated = ss.decorate(SyntaxType.STARTS_WITH, "bread of life");
    decorated = ss.decorate(SyntaxType.FIND_ALL_WORDS, "son of man");
    decorated = ss.decorate(SyntaxType.FIND_STRONG_NUMBERS, "1234 5678");
    decorated = ss.decorate(SyntaxType.BEST_MATCH, "....");
    decorated = ss.decorate(SyntaxType.PHRASE_SEARCH, "....");
    ...
2) create a token stream from a string.
    Token[] tokens = ss.tokenize("search string");
or
    TokenStream tokens = ss.tokenize("search string");
or
    ...
3) serialize a token stream to a string.
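The three capabilities listed above could be gathered into one interface roughly as follows. This is only a sketch: the method signatures beyond decorate and tokenize, the use of List<String> as the token type, the PlaceholderSearchSyntax class and the TYPE(text) decoration format are all assumptions made for illustration, not the real search syntax.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** The syntax types named in the proposal. */
enum SyntaxType {
    STARTS_WITH, FIND_ALL_WORDS, FIND_STRONG_NUMBERS, BEST_MATCH, PHRASE_SEARCH
}

/** Sketch of the proposed interface; signatures are assumptions. */
interface SearchSyntax {
    /** 1) Decorate words or phrases with syntax elements. */
    String decorate(SyntaxType type, String text);

    /** 2) Create a token stream from a string. */
    List<String> tokenize(String search);

    /** 3) Serialize a token stream back to a string. */
    String serialize(List<String> tokens);
}

/** Placeholder implementation; the TYPE(text) format is invented. */
final class PlaceholderSearchSyntax implements SearchSyntax {
    @Override
    public String decorate(SyntaxType type, String text) {
        return type.name() + "(" + text + ")";
    }

    @Override
    public List<String> tokenize(String search) {
        String trimmed = search.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        // Whitespace splitting only; a real tokenizer would know the syntax.
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    @Override
    public String serialize(List<String> tokens) {
        return String.join(" ", tokens);
    }
}

/** Callers obtain the syntax here and never name the implementation. */
final class SearchSyntaxFactory {
    private SearchSyntaxFactory() {}

    static SearchSyntax getSearchSyntax() {
        return new PlaceholderSearchSyntax();
    }
}
```

With tokenize and serialize on the same interface, a useful check for any implementation is that serializing a tokenized string gives back an equivalent search string.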
Input desired!