[sword-devel] NFC Normalization and osis2mod

DM Smith dmsmith555 at yahoo.com
Wed Feb 20 20:06:27 MST 2008


I've added a -n flag to osis2mod that will normalize UTF-8 to NFC,
which we've agreed on as the standard for UTF-8 modules.
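
For anyone unfamiliar with NFC, here is a minimal sketch of what the
normalization does to a UTF-8 byte string. It uses Python's standard
unicodedata module, not Sword's UTF8NFC filter, and the helper name
to_nfc is made up for illustration only.

```python
import unicodedata


def to_nfc(utf8_bytes: bytes) -> bytes:
    """Decode UTF-8, normalize the text to NFC, and re-encode as UTF-8.

    Hypothetical helper for illustration; this is not Sword's UTF8NFC code.
    """
    text = utf8_bytes.decode("utf-8")
    return unicodedata.normalize("NFC", text).encode("utf-8")


# 'e' followed by COMBINING ACUTE ACCENT (decomposed form)
decomposed = "e\u0301".encode("utf-8")
# LATIN SMALL LETTER E WITH ACUTE (precomposed form)
composed = "\u00e9".encode("utf-8")

# NFC composes the pair into the single precomposed character ...
assert to_nfc(decomposed) == composed
# ... and leaves text that is already NFC unchanged.
assert to_nfc(composed) == composed
```

The second assertion is the property my test below exercises: input
that is already NFC should come out byte-for-byte identical.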

I used Sword's UTF8NFC filter to do the work, but found that it was
buggy, leaving trailing garbage on some verses.

I have created a patch for both (osis2mod and UTF8NFC) at
www.crosswire.org/~dmsmith/nfcPatch.txt and would greatly appreciate
some more testing of it.

My test was fairly trivial. I took an OSIS file with limited UTF-8,
already in NFC, ran it through osis2mod with and without the -n flag,
and then compared the two files. Before I fixed UTF8NFC there were
differences. After fixing UTF8NFC, there were none.

All that this shows is that it does not corrupt a UTF-8 file that is
already good NFC.
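
A stronger test would feed osis2mod -n some input that is deliberately
not NFC and confirm that what comes out is. As a rough sketch, assuming
the verse text has already been pulled out of the module as lines of
decoded text (how that is done is left open here), a check like the
following would flag any line that is still not normalized. The
function names is_nfc and non_nfc_lines are hypothetical, and the
check relies on Python's unicodedata rather than on Sword.

```python
import unicodedata


def is_nfc(text: str) -> bool:
    """Return True if the text is unchanged by NFC normalization."""
    return unicodedata.normalize("NFC", text) == text


def non_nfc_lines(lines):
    """Return (line number, line) pairs for every line that is not NFC.

    Line numbers start at 1. Hypothetical helper for illustration.
    """
    return [(n, line) for n, line in enumerate(lines, start=1)
            if not is_nfc(line)]


# Made-up sample data: the last entry uses 'e' + COMBINING ACUTE ACCENT,
# so it is the only one that should be reported.
sample = ["In the beginning", "caf\u00e9", "cafe\u0301"]
for number, line in non_nfc_lines(sample):
    print(f"line {number} is not NFC: {line!r}")
```

An empty result on the output of a run with -n, for input that was
reported as non-NFC beforehand, would be better evidence that the
flag does its job.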

Many thanks in advance.

DM