[jsword-devel] Range searches

DM Smith dmsmith at crosswire.org
Sat Apr 26 06:26:00 MST 2014


That'd be fine.

On Apr 26, 2014, at 9:19 AM, Chris Burrell <chris at burrell.me.uk> wrote:

> So I was thinking of extracting the range... i.e. rewriting the query if the range isn't at the start.
> 
> I don't think that would take much work. We would simply look for the range anywhere in the string and replace it with "".
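
A minimal sketch of that rewrite, assuming the same groups as RANGE_PATTERN but without the leading ^. The names RangeExtractor, ANYWHERE_RANGE and extract are hypothetical and are not part of JSword. One deliberate difference from the proposal: the range is replaced with a single space rather than "", so that the terms on either side of it don't run together.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RangeExtractor {
    // Hypothetical unanchored variant of RANGE_PATTERN: same groups, no leading ^.
    private static final Pattern ANYWHERE_RANGE =
            Pattern.compile("\\s*([-+]?)\\[([^\\[\\]]+)\\]\\s*");

    /** Returns {modifier, range, remaining query}, or null if the query has no range. */
    static String[] extract(String sought) {
        Matcher m = ANYWHERE_RANGE.matcher(sought);
        if (!m.find()) {
            return null;
        }
        // Cut out the first range found and keep the surrounding terms separated.
        String rest = (sought.substring(0, m.start()) + " " + sought.substring(m.end())).trim();
        return new String[] { m.group(1), m.group(2), rest };
    }

    public static void main(String[] args) {
        String[] parts = extract("Abraham +[Gen] Isaac");
        // prints: + | Gen | Abraham Isaac
        System.out.println(parts[0] + " | " + parts[1] + " | " + parts[2]);
    }
}
```

Only the first range is extracted here; a query with several ranges would still need a decision on whether to reject it or merge them.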
> 
> I've now had to debug the code several times to work out why a query wasn't parsing (poor logging on my side), which shows that it isn't obvious to the end user that the range has to be at the beginning (or at the end).
> 
> What do you think?
> Chris
> 
> 
> 
> On 26 April 2014 13:54, DM Smith <dmsmith at crosswire.org> wrote:
> I think that "leading" is clear. It is also "a" range. Feel free to improve the comment.
> 
> The technical constraint is that the range is not part of the general search; it is applied to the results afterwards. Allowing it at the end would also be reasonable.
> 
> We don't allow for:
> ([GEN] Aaron) OR (([MATT] Jesus) AND ([MATT] Mary))
> which would get all the Aaron references from Genesis plus the Matthew references that contain both Jesus and Mary.
> 
> That would take a lot of work.
> 
> We also allow one ~n (where n is a number). This will split the search into two parts and do a verse proximity search.
> Aaron ~5 Moses means get all the verses containing Aaron that are within 5 of those mentioning Moses. 
> Basically, ~5 Moses is converted to a range: the search result for Moses is blurred by 5 verses and then intersected with the Aaron search results.
> 
> I might have the logic backward. It may be that Aaron ~5 is converted to the range.
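
A rough sketch of the blur-and-intersect idea, assuming verses are modelled as plain integer ordinals. ProximitySketch, blur and near are hypothetical names used for illustration, not JSword's own classes, and the sketch ignores book and chapter boundaries. Which side gets blurred doesn't change the outcome's meaning much: either way a verse is kept only if it lies within n verses of a hit for the other term.

```java
import java.util.Set;
import java.util.TreeSet;

final class ProximitySketch {
    /** Expands every hit by n verses on either side, clamped to [0, maxOrdinal]. */
    static Set<Integer> blur(Set<Integer> hits, int n, int maxOrdinal) {
        Set<Integer> blurred = new TreeSet<>();
        for (int hit : hits) {
            int from = Math.max(0, hit - n);
            int to = Math.min(maxOrdinal, hit + n);
            for (int v = from; v <= to; v++) {
                blurred.add(v);
            }
        }
        return blurred;
    }

    /** "left ~n right": the left hits that fall within n verses of any right hit. */
    static Set<Integer> near(Set<Integer> left, Set<Integer> right, int n, int maxOrdinal) {
        Set<Integer> result = new TreeSet<>(left);
        result.retainAll(blur(right, n, maxOrdinal));
        return result;
    }
}
```

For example, with Aaron hits at ordinals 10, 50 and 100 and Moses hits at 14 and 106, near(aaron, moses, 5, 1000) keeps only 10, since 100 is six verses away from 106.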
> 
> But Aaron~0.5 Moses (no space before the ~) is a Lucene search that applies a fuzzy match to Aaron with a minimum similarity of 0.5. Boosting a term uses ^ rather than ~.
> 
> In Him,
> 	DM Smith
> 
> On Apr 26, 2014, at 8:29 AM, Chris Burrell <chris at burrell.me.uk> wrote:
> 
>> The code in question:
>> 
>> Matcher rangeMatcher = RANGE_PATTERN.matcher(sought);
>> 
>>     /**
>>      * The pattern of a range. This is anything that is contained between a
>>      * leading [] (but not containing a [ or ]), with a + or - optional prefix,
>>      * perhaps surrounded by whitespace.
>>      */
>>     private static final Pattern RANGE_PATTERN = Pattern.compile("^\\s*([-+]?)\\[([^\\[\\]]+)\\]\\s*");
>> 
>> The comment doesn't seem to make clear it should be at the start... Not sure if there is a technical constraint...
>> Chris
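
A small check of what the pattern quoted above accepts, assuming the matcher is consulted with find(). RangePatternDemo and hasLeadingRange are hypothetical names for illustration only; the pattern string itself is copied unchanged from the snippet. Because of the leading ^, only a range at the very start of the query (after optional whitespace) matches.

```java
import java.util.regex.Pattern;

final class RangePatternDemo {
    // Copied from the quoted code; the leading ^ anchors the match to the start of the query.
    static final Pattern RANGE_PATTERN =
            Pattern.compile("^\\s*([-+]?)\\[([^\\[\\]]+)\\]\\s*");

    /** True when the query starts with an optional +/- followed by a [range]. */
    static boolean hasLeadingRange(String sought) {
        return RANGE_PATTERN.matcher(sought).find();
    }

    public static void main(String[] args) {
        System.out.println(hasLeadingRange("+[Gen] Abraham")); // true
        System.out.println(hasLeadingRange("Abraham +[Gen]")); // false
    }
}
```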
>> 
>> 
>> 
>> 
>> On 26 April 2014 13:28, Chris Burrell <chris at burrell.me.uk> wrote:
>> Hello
>> 
>> Just wondering if there is any particular reason for the range pattern to be forced to be the starting block of the query...
>> 
>> i.e. 
>> 
>> +[Gen] Abraham (works)
>> Abraham +[Gen] (doesn't work).
>> 
>> Chris
>> 
>> 
