[jsword-devel] Range searches

Chris Burrell chris at burrell.me.uk
Sat Apr 26 06:19:09 MST 2014


So I was thinking of extracting the range, i.e. rewriting the query if
the range isn't at the start.

I don't think that would take much work. We would simply look for the
range anywhere in the string and replace it with "".
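
A minimal sketch of that idea, for illustration only. It assumes a
hypothetical ANY_RANGE_PATTERN (RANGE_PATTERN without the leading ^) and a
hypothetical extract() helper; neither is existing JSword code. It also
replaces the range with a single space rather than "", so that the words on
either side of a mid-query range do not run together.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RangeExtractor {
    // Hypothetical variant of RANGE_PATTERN without the leading ^ anchor,
    // so that find() can locate the range anywhere in the query.
    private static final Pattern ANY_RANGE_PATTERN =
            Pattern.compile("\\s*([-+]?)\\[([^\\[\\]]+)\\]\\s*");

    /** Returns {modifier, range, remaining query}, or null if there is no range. */
    static String[] extract(String sought) {
        Matcher m = ANY_RANGE_PATTERN.matcher(sought);
        if (!m.find()) {
            return null;
        }
        // Cut the first range out of the query; the space keeps neighbouring words apart.
        String rest = (sought.substring(0, m.start()) + " " + sought.substring(m.end())).trim();
        return new String[] { m.group(1), m.group(2), rest };
    }

    public static void main(String[] args) {
        String[] parts = extract("Abraham +[Gen]");
        System.out.println(parts[0] + " | " + parts[1] + " | " + parts[2]); // + | Gen | Abraham
    }
}
```

Only the first range is extracted here; a query with several ranges would
need a decision on whether to reject it or combine them.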

Several times now I've had to debug code to work out why a query wasn't
parsing (poor logging on my side), which suggests that it isn't obvious to
the end user that the range has to be at the beginning (or at the end).

What do you think?
Chris



On 26 April 2014 13:54, DM Smith <dmsmith at crosswire.org> wrote:

> I think that "leading" is clear. It is also "a" range. Feel free to
> improve the comment.
>
> The technical constraint is that the range is not part of the general
> search; it is applied afterwards. Allowing it at the end would also be
> reasonable.
>
> We don't allow for:
> ([GEN] Aaron) OR (([MATT] Jesus) AND ([MATT] Mary))
> That is, get all the Aaron references from Genesis and the Matthew
> references that contain both Jesus and Mary.
>
> That would take a lot of work.
>
> We also allow one ~n (where n is a number). This will split the search
> into two parts and do a verse proximity search.
> "Aaron ~5 Moses" means get all the verses containing Aaron that are
> within 5 verses of those mentioning Moses.
> Basically ~5 Moses is converted to a range. The search result for Moses is
> blurred by 5 and then intersected with Aaron search results.
>
> I might have the logic backward. It may be that Aaron ~5 is converted to
> the range.
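
The blur-and-intersect step described above can be sketched roughly as
follows. This is an illustration under loud assumptions: verses are modelled
as plain integer ordinals, book and chapter boundaries are ignored, and the
blur() and near() helpers are hypothetical names, not JSword's own key or
passage API.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

class ProximitySketch {
    /** Expands every verse ordinal by n verses on each side (the "blur"). */
    static Set<Integer> blur(Set<Integer> verses, int n) {
        Set<Integer> blurred = new TreeSet<>();
        for (int v : verses) {
            for (int i = Math.max(0, v - n); i <= v + n; i++) {
                blurred.add(i);
            }
        }
        return blurred;
    }

    /** Verses in 'left' that lie within n verses of some verse in 'right'. */
    static Set<Integer> near(Set<Integer> left, Set<Integer> right, int n) {
        Set<Integer> result = new TreeSet<>(left);
        result.retainAll(blur(right, n)); // intersect with the blurred range
        return result;
    }

    public static void main(String[] args) {
        // Made-up verse ordinals, purely for illustration.
        Set<Integer> aaron = new TreeSet<>(Arrays.asList(10, 40, 100));
        Set<Integer> moses = new TreeSet<>(Arrays.asList(13, 95));
        System.out.println(near(aaron, moses, 5)); // [10, 100]
    }
}
```

Which side gets blurred matters for the result: the verses returned always
come from the side that is not blurred.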
>
> But "Aaron~0.5 Moses" is a Lucene search that applies a fuzzy search to
> Aaron, with 0.5 as the minimum similarity.
>
> In Him,
> DM Smith
>
> On Apr 26, 2014, at 8:29 AM, Chris Burrell <chris at burrell.me.uk> wrote:
>
> The code in question:
>
> Matcher rangeMatcher = RANGE_PATTERN.matcher(sought);
>
>     /**
>      * The pattern of a range. This is anything that is contained between a
>      * leading [] (but not containing a [ or ]), with a + or - optional prefix,
>      * perhaps surrounded by whitespace.
>      */
>     private static final Pattern RANGE_PATTERN =
>             Pattern.compile("^\\s*([-+]?)\\[([^\\[\\]]+)\\]\\s*");
>
> The comment doesn't seem to make clear it should be at the start... Not
> sure if there is a technical constraint...
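
To make the effect of the leading ^ concrete, here is a small check that uses
the same expression as RANGE_PATTERN. The hasLeadingRange() helper is a
hypothetical name, and the use of find() is an assumption; the calling code
may use a different Matcher method.

```java
import java.util.regex.Pattern;

class RangePatternDemo {
    // Same expression as the RANGE_PATTERN quoted in this thread.
    static final Pattern RANGE_PATTERN =
            Pattern.compile("^\\s*([-+]?)\\[([^\\[\\]]+)\\]\\s*");

    /** True only when the query starts with a range, because of the ^ anchor. */
    static boolean hasLeadingRange(String sought) {
        return RANGE_PATTERN.matcher(sought).find();
    }

    public static void main(String[] args) {
        System.out.println(hasLeadingRange("+[Gen] Abraham")); // true
        System.out.println(hasLeadingRange("Abraham +[Gen]")); // false
    }
}
```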
> Chris
>
>
>
>
> On 26 April 2014 13:28, Chris Burrell <chris at burrell.me.uk> wrote:
>
>> Hello
>>
>> Just wondering if there is any particular reason for the range pattern to
>> be forced to be the starting block of the query...
>>
>> i.e.
>>
>> +[Gen] Abraham (works)
>> Abraham +[Gen] (doesn't work).
>>
>> Chris
>>
>>