[jsword-devel] Chinese and Chinese Simplified

Chris Burrell chris at burrell.me.uk
Thu Aug 29 11:27:54 MST 2013

My suggestion was to index both forms of each word at index time.

But doing it at search time would also work, and could be better. It would
line up with how Lucene recommends implementing synonym tables. In this case
you would do it in an analyser.

The latter would also preserve term frequency etc., and allow it to be
turned on and off.
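
A rough sketch of the search-time idea, expanding a term into the
IndexField:(xyz OR mapping-for-xyz) form mentioned in the quoted message
below. Everything here is an assumption for illustration: the class name,
the "content" field, and the three-entry mapping tables are hypothetical,
and this is neither JSword code nor the java-zhconverter API. A real
version would more likely sit in a Lucene analyser and use a full
conversion table.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: builds a query string that matches a term in either
// its simplified or its traditional form.
final class ChineseQueryExpander {

    // Tiny illustrative tables (assumption): ma "horse", guo "country",
    // ai "love". A real table would come from a conversion library.
    private static final Map<Character, Character> SIMP_TO_TRAD =
            Map.of('\u9A6C', '\u99AC', '\u56FD', '\u570B', '\u7231', '\u611B');
    private static final Map<Character, Character> TRAD_TO_SIMP =
            Map.of('\u99AC', '\u9A6C', '\u570B', '\u56FD', '\u611B', '\u7231');

    // Maps each char through the table; unmapped chars pass through unchanged.
    // Works per UTF-16 char, so supplementary characters are not handled.
    static String convert(String text, Map<Character, Character> table) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            char mapped = table.getOrDefault(c, c);
            sb.append(mapped);
        }
        return sb.toString();
    }

    // Returns field:term when no mapping applies, otherwise
    // field:(form1 OR form2 ...), with the user's own form listed first.
    static String expand(String field, String term) {
        Set<String> forms = new LinkedHashSet<>();
        forms.add(term);
        forms.add(convert(term, SIMP_TO_TRAD));
        forms.add(convert(term, TRAD_TO_SIMP));
        if (forms.size() == 1) {
            return field + ":" + term;
        }
        return field + ":(" + String.join(" OR ", forms) + ")";
    }

    public static void main(String[] args) {
        // Simplified "ai guo" expands to include the traditional form.
        System.out.println(expand("content", "\u7231\u56FD"));
        // A term with no mapped characters is left alone.
        System.out.println(expand("content", "God"));
    }
}
```

The one-to-one table sidesteps the point that simplified to traditional is
not always unique; handling that would mean mapping a character to several
candidates and adding each resulting form to the OR group.
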
On 26 Aug 2013 15:43, "Sijo Cherian" <sijo.cherian at gmail.com> wrote:

> Chris,
> Is your suggestion to automatically add mapped characters to the query
> only? E.g. given a user's simplified query xyz, make the query
> IndexField:(xyz OR traditional-mapping-for-xyz)
> thanks
> /sijo
> On Sat, Aug 24, 2013 at 6:28 AM, Chris Burrell <chris at burrell.me.uk> wrote:
>> Hi
>> Just wondering - do we want to allow Chinese to be searchable regardless
>> of whether it is traditional or simplified?
>> There is a small library (https://code.google.com/p/java-zhconverter/)
>> which does the conversion. My understanding is that traditional to
>> simplified is unique, and simplified to traditional is not always unique
>> (although the library I think only gives you one option).
>> My thought is that for Chinese versions, we add the
>> simplified/traditional options in the same field of the index so that we
>> can prevent user errors (I've found myself raising a bug against Simplified
>> Chinese not working, but it was because I had a traditional Chinese version
>> selected). We would never display these mappings, just use them to return
>> results.
>> Just a thought.
>> Chris
>> _______________________________________________
>> jsword-devel mailing list
>> jsword-devel at crosswire.org
>> http://www.crosswire.org/mailman/listinfo/jsword-devel
> --
> Regards,
> Sijo