[jsword-devel] False search hits with certain locales
Martin Denham
mjdenham at gmail.com
Wed Feb 8 13:05:24 MST 2012
I found the problem:
Rev.Full = Johannes\u2019 openberring
\u2019 is a typographic apostrophe (right single quotation mark), so the
parser was reading Johannes' openberring 22:8 but stopping at the
apostrophe. That left just 'Johannes', which of course matched the whole
of John.
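
To illustrate the failure mode, here is a minimal sketch. It is not JSword's
actual parsing code; the class and method names are hypothetical, and the
"stop at the first non-letter" rule is only an assumption about the kind of
tokenizing that would produce this behaviour:

```java
public class ApostropheDemo {

    /**
     * Hypothetical naive parser: treats the book name as ending at the
     * first character that is not a letter.
     */
    static String naiveBookName(String ref) {
        int i = 0;
        while (i < ref.length() && Character.isLetter(ref.charAt(i))) {
            i++;
        }
        return ref.substring(0, i);
    }

    /**
     * Flags localized book names containing anything other than letters,
     * digits or whitespace, e.g. the U+2019 apostrophe in the nn name.
     */
    static boolean hasPunctuation(String name) {
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (!Character.isLetterOrDigit(c) && !Character.isWhitespace(c)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String ref = "Johannes\u2019 openberring 22:8";
        // U+2019 is punctuation, not a letter, so the name is cut short.
        System.out.println(naiveBookName(ref)); // prints "Johannes"
        System.out.println(hasPunctuation("Johannes\u2019 openberring")); // prints "true"
    }
}
```

A check like hasPunctuation could be run over the localized book names to
spot other locales with similar names.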
Best regards
Martin
On 8 February 2012 19:23, Martin Denham <mjdenham at gmail.com> wrote:
> I have just noticed that I have not fixed the problem. I am now getting
> an error on the final hit 'Key can't be a passage' - I don't know what that
> means:
> 02-08 19:05:34.105: I/System.out(22191): 129 found:Johannes’ openberring
> 1:1 docid=30681 docbase=0 key.card:1 res.card=129
> 02-08 19:05:34.105: I/System.out(22191): 130 found:Johannes’ openberring
> 1:4 docid=30684 docbase=0 key.card:1 res.card=130
> 02-08 19:05:34.105: I/System.out(22191): 131 found:Johannes’ openberring
> 1:9 docid=30689 docbase=0 key.card:1 res.card=131
> 02-08 19:05:34.145: I/System.out(22191): JSword:Key can't be a passage:
> Johannes’ openberring 22:8
> 02-08 19:05:34.155: I/System.out(22191): 132 found:Johannes’ openberring
> 22:8 docid=31071 docbase=0 key.card:1 res.card=131
>
> To log the cardinality I just added a println in the VerseCollector as
> below:
> Key key = VerseFactory.fromString(doc.get(LuceneIndex.FIELD_KEY));
> results.addAll(key);
> System.out.println(++count + " found:" + key.getName()
>         + " docid=" + docId + " docbase=" + docBase
>         + " key.card:" + key.getCardinality()
>         + " res.card=" + results.getCardinality());
>
> The problem is I can't see the bug on Windows, only when running on my
> Android phone, so I am not sure anybody without an Android will be able to
> reproduce the problem easily.
>
> Martin
>
> On 8 February 2012 19:04, DM Smith <dmsmith at crosswire.org> wrote:
>
>> I've been trying to get to it, but haven't been able to do so. I'd be
>> interested in your code to log the cardinality.
>> -- DM
>>
>>
>> On 02/08/2012 01:54 PM, Martin Denham wrote:
>>
>> I don't know what is going on, but I have done more analysis and found a
>> fix for Nynorsk. I think it is also affecting other locales such as
>> Japanese, which I can't explain.
>>
>> Test: search for 'John' in NT in And Bible with locale set to nn
>> Result: 1389 hits including every verse in the gospel of John
>> Observation: I logged the cardinality of the results var in
>> VerseCollector and you can see that it jumps from 131 to 1389 on the last
>> hit in Rev.22.8:
>> 02-08 18:18:15.895: I/System.out(21945): 127 found:Apostelgjerningane
>> 19:4 docid=27575 docbase=0 key.card:1 res.card=127
>> 02-08 18:18:15.905: I/System.out(21945): 128 found:Galatarane 2:9
>> docid=29073 docbase=0 key.card:1 res.card=128
>> 02-08 18:18:15.905: I/System.out(21945): 129 found:Johannes’ openberring
>> 1:1 docid=30681 docbase=0 key.card:1 res.card=129
>> 02-08 18:18:15.915: I/System.out(21945): 130 found:Johannes’ openberring
>> 1:4 docid=30684 docbase=0 key.card:1 res.card=130
>> 02-08 18:18:15.915: I/System.out(21945): 131 found:Johannes’ openberring
>> 1:9 docid=30689 docbase=0 key.card:1 res.card=131
>> 02-08 18:18:15.965: I/System.out(21945): 132 found:Johannes’ openberring
>> 22:8 docid=31071 docbase=0 key.card:1 res.card=1389
>>
>> Other words in Rev 22 seem to have the same effect e.g. month, behold,
>> am,...
>>
>> The fix for nn was to change
>> Rev.Short=Op
>> to
>> Rev.Short=JoOp
>>
>> Any idea what is happening? I tried to write a JUnit test on my PC but
>> couldn't get it to fail on Windows.
>>
>> I am using revision 2195 of JSword, which is before the AV changes.
>>
>> Thanks
>> Martin
>>
>>
>> On 2 February 2012 11:20, DM Smith <dmsmith at crosswire.org> wrote:
>>
>>> I'm trying to see what is happening. It doesn't make sense to me
>>> either.
>>>
>>> Cent from my fone so theer mite be tipos. ;)
>>>
>>> On Jan 27, 2012, at 9:44 AM, Martin Denham <mjdenham at gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> I have received this error report for And Bible<http://code.google.com/p/and-bible/issues/detail?id=87> which
>>> has confused me. I would be grateful for any suggestions wrt what might be
>>> happening.
>>>
>>> A simple test I have tried:
>>>
>>> - Set locale to de or en
>>> - Search for 'John' in ESV
>>> - Works fine
>>> - Set locale to nn (Norsk Nynorsk)
>>> - Search for 'John' in ESV
>>> - Every verse of John is returned in the result list
>>>
>>> Thanks
>>> Martin
>>>
>>> _______________________________________________
>>> jsword-devel mailing list
>>> jsword-devel at crosswire.org
>>> http://www.crosswire.org/mailman/listinfo/jsword-devel