[sword-devel] Soft hyphens?

DM Smith dmsmith at crosswire.org
Sat Apr 1 07:38:35 MST 2017


Can Lucene code be improved?
Short answer: No.
Long answer: I’ve suggested improvements in the past, back when it was felt that JSword and SWORD should be able to use the same Lucene indexes. Going from memory, the argument against any change was that a mechanism would be needed to detect when an existing index is invalid so that it could be rebuilt.
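
To illustrate the kind of mechanism meant here, the following is a minimal sketch, not anything that exists in SWORD or JSword. It assumes a small stamp file stored next to the index that records which version of the tokenization rules built it. The names ANALYZER_VERSION, STAMP_NAME, write_stamp and index_is_stale are all hypothetical.

```python
import json
import tempfile
from pathlib import Path

ANALYZER_VERSION = 2              # hypothetical: bump whenever tokenization rules change
STAMP_NAME = "index-stamp.json"   # hypothetical file kept alongside the index files


def write_stamp(index_dir, version=ANALYZER_VERSION):
    """Record which analyzer version built the index in index_dir."""
    Path(index_dir, STAMP_NAME).write_text(json.dumps({"analyzer_version": version}))


def index_is_stale(index_dir, version=ANALYZER_VERSION):
    """Return True if the index has no readable stamp or was built by another version."""
    stamp = Path(index_dir, STAMP_NAME)
    if not stamp.is_file():
        return True  # an index built before stamps existed is treated as invalid
    try:
        recorded = json.loads(stamp.read_text()).get("analyzer_version")
    except (ValueError, AttributeError):
        return True  # unreadable or malformed stamp: rebuild to be safe
    return recorded != version


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as index_dir:
        if index_is_stale(index_dir):
            # ... rebuild the index here, then record the version used ...
            write_stamp(index_dir)
        print("stale after rebuild:", index_is_stale(index_dir))
```

A front end would run such a check before searching and rebuild the index whenever it reports a mismatch.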

The other problem with soft hyphens is that Lucene is not the only search mechanism in SWORD.

BTW, the same problem applies to other intra-word zero-width code points.
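
As a rough illustration of why these code points matter for search, here is a small sketch in plain Python. It does not use Lucene or SWORD code. The list of code points and the helper names strip_zero_width and naive_tokens are assumptions chosen for the example, and the tokenizer is a deliberately naive stand-in for a real analyzer.

```python
import re

# Assumed set of intra-word, zero-width code points. This is illustrative only;
# ZWNJ and ZWJ carry meaning in some scripts, so a real filter may need to be
# script-aware rather than stripping them everywhere.
INTRA_WORD_ZERO_WIDTH = (
    "\u00ad",  # SOFT HYPHEN
    "\u200c",  # ZERO WIDTH NON-JOINER
    "\u200d",  # ZERO WIDTH JOINER
    "\u2060",  # WORD JOINER
)
_STRIP_TABLE = {ord(ch): None for ch in INTRA_WORD_ZERO_WIDTH}


def strip_zero_width(text):
    """Remove the assumed zero-width code points from text."""
    return text.translate(_STRIP_TABLE)


def naive_tokens(text):
    """Lowercase and split on anything that is not a word character."""
    return re.findall(r"\w+", text.lower())


word = "Jeru\u00adsalem"  # "Jerusalem" with a soft hyphen inside the word
print(naive_tokens(word))                    # the word breaks at the soft hyphen
print(naive_tokens(strip_zero_width(word)))  # the word is indexed whole
```

The same stripping would have to be applied to both the indexed text and the query, and to every search mechanism, not only the Lucene one.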

Another problem with the StandardAnalyzer is that it is English-centric, with a bit of support for Latinate languages. E.g., it does not handle Thai, which does not separate words with whitespace.

But let’s not go there. ;)

DM

> On Apr 1, 2017, at 10:12 AM, David Haslam <dfhmch at googlemail.com> wrote:
> 
> Interesting.
> 
> Question prompted by an addition to /Tentative suggestions/ in 
> https://crosswire.org/wiki/CrossWire_KJV#KJV_module:
> 
> Can the Lucene code be improved ?
> 
> David
> 
> 
> 
> --
> View this message in context: http://sword-dev.350566.n4.nabble.com/Soft-hyphens-tp4657045p4657048.html
> Sent from the SWORD Dev mailing list archive at Nabble.com.
> 



