[sword-devel] Hebrew Unicode Normalization?

David Haslam dfhmch at googlemail.com
Tue May 20 11:12:23 MST 2014

Please can anyone shed any light on this version history for the WLC module?

The phrase "undid NFC normalization" in May 2009 is what prompts me to ask.

History_1.9=Updated to WLC 4.18 (2013-10-11)
History_1.8=Added pe & samekh 'paragraph' marks to text (2012-04-17)
History_1.7=Updated to WLC 4.14, corrected some spurious markup (2012-04-14)
History_1.6=Updated source attribution, undid NFC normalization, placed DH
identification notes inline, fixed spacing problem (2009-05-26)
History_1.5=Updated to WLC 4.10 with additions from www.tanach.us, using
native versification
History_1.4=Corrected internal markup to conform to OSIS 2.1.1 schema,
changed About markup to RTF (2008-07-02)
History_1.3=Fixed the conf
History_1.2=Bugfix for textual errors. Re-added setumot and paraschot, even
though their presence in L is not verified, according to Kirk Lowery. Fixed
transcription note values. Included morphological segmentation in
preliminary markup. Added xml:lang="en" to notes. Update to newer version
History_1.1=Update to newer version (wlc43-20050319) of the WLC from WHI;
Bugfixes in the conversion program that caused textual errors (thanks to
Chris Kimball); Fixed one footnote text template.
History_1.0=First public version.
Description=Westminster Leningrad Codex

The latest version of the WLC module text is normalized to NFC.

By comparison, the OSHB module is also NFC, whereas the MapM module is not
normalized.
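
For anyone wanting to check a module's text themselves, here is a minimal
Python sketch using only the standard unicodedata module. The helper name
satisfied_forms and the two sample strings are my own illustration; they are
not taken from any module source. A string counts as being in a given form
if normalizing it to that form leaves it unchanged.

```python
import unicodedata

def satisfied_forms(text, forms=("NFC", "NFD")):
    """Return the normalization forms that `text` already satisfies."""
    # A string is in form F exactly when normalizing to F changes nothing.
    return [f for f in forms if unicodedata.normalize(f, text) == text]

# Hypothetical samples: bet + dagesh + patah, with the marks in two orders.
dagesh_first = "\u05D1\u05BC\u05B7"  # bet, dagesh, patah
patah_first = "\u05D1\u05B7\u05BC"   # bet, patah, dagesh

print(satisfied_forms(dagesh_first))
print(satisfied_forms(patah_first))
```

In practice one would feed this the exported verse text rather than a
three-character sample.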

Further considerations:

A bit of digging unearthed the following relevant information.

The specified font in the WLC conf is Ezra SIL.

The web page for Ezra SIL fonts includes this rather obscure note:


* Follows the recommendations for character order and encoding determined by
a group of font developers during discussions in May 2003. This does not
follow canonical order. Texts which have been converted to NFC or NFD
canonical order will not display correctly with these fonts.
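
To illustrate what "canonical order" does to pointed Hebrew, here is a small
Python sketch. It assumes, purely for illustration, a mark order with the
dagesh typed before the vowel; whether that matches the May 2003
recommendations exactly is my assumption, not something stated in the note.
The helper name marks_with_classes is hypothetical. Normalization sorts
adjacent combining marks by their canonical combining class, so the dagesh
(class 21) ends up after the patah (class 17).

```python
import unicodedata

def marks_with_classes(text):
    """List each code point with its canonical combining class."""
    return [(f"U+{ord(ch):04X}", unicodedata.combining(ch)) for ch in text]

# Assumed input order: bet, dagesh, patah (dagesh before the vowel).
dagesh_first = "\u05D1\u05BC\u05B7"
normalized = unicodedata.normalize("NFC", dagesh_first)

print(marks_with_classes(dagesh_first))
print(marks_with_classes(normalized))
print(normalized == dagesh_first)
```

This only shows that the stored sequence changes under NFC; it says nothing
by itself about how any particular font renders either sequence.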

Best regards,


View this message in context: http://sword-dev.350566.n4.nabble.com/Hebrew-Unicode-Normalization-tp4653971.html
Sent from the SWORD Dev mailing list archive at Nabble.com.
