[jsword-devel] New Search Syntax

Joe Walker joseph.walker at gmail.com
Wed Apr 13 00:20:28 MST 2005


Some thoughts:
Is the intention of RangeNotation that it should be strictly a VerseRange, 
i.e. a set of contiguous verses in the Bible? If so, we lose the ability to 
restrict a search with something like [ Dan, Rev ], which we could do if it 
were a Passage.

I've been surprised before by the lack of i18n in Lucene. I notice that 
Google does allow word-based connectors (as well as punctuation ones), but 
only if they are in capitals. I looked because I was going to ask "do we 
need word-based connectors?". Not that we need them just because Google has 
them. I still tend to think that word-based connectors create more problems 
than they solve:
- Does [ Gen to Mal ] now mean the same as [ Gen - Mal ]?
- Which words do we allow for "Joseph - Mary": "Joseph without Mary" or 
"Joseph but not Mary"?
- What then if someone wants to use "without" as a search term?
Do we have an option in Lucene to use only punctuation based connectors?
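
The outer syntax in the proposal quoted below (RangeRequest and Blur) is 
already punctuation only, so the trivial non-recursive parser it mentions 
could look something like this. This is a minimal sketch in Java, not JSword 
code: ParsedSearch, RangeMode and SearchSyntaxSketch are names made up for 
illustration, the Lucene parts are passed through untouched as strings, and 
a Blur is assumed to be a stand-alone ~ or ~N token with whitespace on both 
sides.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical result holder, made up for this sketch (not a JSword class). */
class ParsedSearch {
    enum RangeMode { NONE, OR, RESTRICT, REMOVE }

    RangeMode rangeMode = RangeMode.NONE;
    String rangeNotation;  // text between [ and ], left for JSword to interpret
    String left;           // the only LuceneSearch, or the one before the Blur
    String right;          // the LuceneSearch after the Blur, else null
    boolean blurred;       // true when a Blur token was present
    int blurSize = -1;     // N from ~N; -1 means "use the blur size from options"
}

class SearchSyntaxSketch {
    // Assumption: a Blur is ~ or ~N with whitespace on both sides, so a ~
    // attached to a term or phrase (roam~, "a b"~3) is left for Lucene.
    private static final Pattern BLUR = Pattern.compile("(?<=\\s)~(\\d*)(?=\\s)");

    static ParsedSearch parse(String request) {
        ParsedSearch result = new ParsedSearch();
        String rest = request.trim();

        // Optional leading RangeRequest: [ .. ], +[ .. ] or -[ .. ]
        int open = -1;
        if (rest.startsWith("[")) {
            open = 0;
            result.rangeMode = ParsedSearch.RangeMode.OR;
        } else if (rest.startsWith("+[")) {
            open = 1;
            result.rangeMode = ParsedSearch.RangeMode.RESTRICT;
        } else if (rest.startsWith("-[")) {
            open = 1;
            result.rangeMode = ParsedSearch.RangeMode.REMOVE;
        }
        if (open >= 0) {
            int close = rest.indexOf(']', open);
            if (close < 0) {
                throw new IllegalArgumentException("missing ] in range request");
            }
            result.rangeNotation = rest.substring(open + 1, close).trim();
            rest = rest.substring(close + 1).trim();
        }

        // Optional Blur splitting the remainder into two Lucene searches
        Matcher m = BLUR.matcher(rest);
        if (m.find()) {
            String size = m.group(1);
            result.left = rest.substring(0, m.start()).trim();
            result.right = rest.substring(m.end()).trim();
            result.blurred = true;
            if (!size.isEmpty()) {
                result.blurSize = Integer.parseInt(size);
            }
            if (m.find()) {
                throw new IllegalArgumentException("at most one Blur is allowed");
            }
        } else {
            result.left = rest;
        }
        return result;
    }
}
```

With this, parse("+[Gen - Mal] Joseph ~5 Mary") gives a RESTRICT range of 
"Gen - Mal", the two Lucene searches "Joseph" and "Mary", and a blur of 5. 
No words are reserved, so "without" stays an ordinary search term.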

Joe.


On Apr 11, 2005 11:53 PM, DM Smith <dmsmith555 at gmail.com> wrote:
> 
> I am thinking of the following for a new search syntax:
> SearchRequest -> LuceneSearch
>               or CompoundSearch
>               or RangeRequest LuceneSearch
>               or RangeRequest CompoundSearch
> 
> LuceneSearch -> defined by lucene, not repeated here
> 
> CompoundSearch -> LuceneSearch Blur LuceneSearch
> 
> RangeRequest -> [ RangeNotation ]  /* all these verses are ORed with the rest */
>              or +[ RangeNotation ] /* the search is restricted to these verses */
>              or -[ RangeNotation ] /* these verses are removed from the search results */
> 
> RangeNotation -> anything that JSword can handle as a range
> 
> Blur -> ~  /* Uses the blur size as given in options */
>      or ~N /* Finds verses within N verses of each other */
> (Currently a ~ b finds b near a. I think this is counterintuitive.
> Should it be the other way?)
> 
> Things to note:
> A search can have at most one RangeRequest. This will actually be a
> search modifier applied to the results of the search.
> A search can have at most one Blur.
> Each part of the search request is expected to be separated by
> whitespace from the others.
> 
> While the search syntax can be defined recursively, I think such
> complexity does not add a lot of value for Blur and RangeRequest.
> 
> Without recursion in this syntax, a trivial parser can be written.
> 
> One further thing, Lucene uses English connectors. Should we entertain
> the internationalization of these?
>