[sword-devel] DevTools:ICU & Normalization?

David Haslam dfhmch at googlemail.com
Fri Oct 28 08:28:16 MST 2011


FYI.  As a result of my posts in their forum arising from this topic,
DataMystic have just released v8.9.8 of TextPipe.

The release notes include:

* Updated internal PCRE (Pattern Matching) engine to v8.13 and support for
Unicode 6.0.0.
* Updated Unicode internal libraries to support Unicode 4.1 for
Normalization etc.

I have confirmed that TextPipe now normalizes Burmese script to NFC with
results identical to BabelPad's.
As an avid user of TextPipe Standard edition, I find this a nice step
forward.

Our *BurJudson* module was made with the source text normalized according
to an earlier version of Unicode.

Unless one specifies otherwise (by means of the -N switch), osis2mod
performs normalization to NFC.

I would therefore recommend that precompiled SWORD utilities (especially
those for Windows) should be built such that they adhere to the latest
Unicode standard for Normalization.
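
To make the point concrete, here is a minimal sketch of NFC normalization
and of checking which Unicode version a normalizer follows. It uses
Python's standard unicodedata module purely as a stand-in; it is not how
osis2mod or ICU is implemented, and the helper names (to_nfc,
needs_normalizing) are my own. A Latin example is used because it is easy
to read; the same calls apply to Burmese text.

```python
import unicodedata


def to_nfc(text: str) -> str:
    """Return text normalized to NFC (canonical composition)."""
    return unicodedata.normalize("NFC", text)


def needs_normalizing(text: str) -> bool:
    """Return True if normalizing to NFC would change the text."""
    return to_nfc(text) != text


# Unicode version of the tables bundled with this Python build.
# A tool built against older tables knows nothing about characters
# that were added to the standard later.
UNICODE_VERSION = unicodedata.unidata_version

decomposed = "e\u0301"  # 'e' followed by COMBINING ACUTE ACCENT
composed = "\u00e9"     # precomposed LATIN SMALL LETTER E WITH ACUTE

if __name__ == "__main__":
    print("Unicode tables:", UNICODE_VERSION)
    print("decomposed needs normalizing:", needs_normalizing(decomposed))
    print("composed needs normalizing:", needs_normalizing(composed))
```

Running the same check with each tool in a toolchain is one way to see
whether they agree on the normalized form of a given source text.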

Likewise, front-end developers may have something to gain by pursuing this
topic further. ICU has implications during module search: the search
string ought to be normalized the same way the module was, so that the two
match.
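
A rough sketch of the search issue, again using Python's unicodedata as a
stand-in for ICU. The verse key, the sample text and the function
find_verses are hypothetical and have nothing to do with the SWORD API;
the only assumption is that the module text was stored in NFC.

```python
import unicodedata
from typing import Dict, List


def nfc(text: str) -> str:
    """Normalize text to NFC."""
    return unicodedata.normalize("NFC", text)


def find_verses(module: Dict[str, str], query: str) -> List[str]:
    """Return keys of entries containing the query, compared in NFC."""
    needle = nfc(query)  # bring the query into the module's form
    return [key for key, text in module.items() if needle in text]


# Hypothetical module content, already stored in NFC.
module = {"Example 1:1": nfc("caf\u00e9 au lait")}

# The same word as typed by a user whose input method emits
# a base letter plus a combining accent.
typed_query = "cafe\u0301"

if __name__ == "__main__":
    # A raw substring test misses the entry...
    print(typed_query in module["Example 1:1"])
    # ...while the normalized search finds it.
    print(find_verses(module, typed_query))
```

Normalizing only the query is enough here because the module is assumed
to be in NFC already; if the module and the front-end were normalized
under different Unicode versions, newer characters could still differ.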

David




