[sword-devel] NFC normalization and osis2mod
DM Smith
dmsmith555 at gmail.com
Wed Feb 20 19:09:39 MST 2008
I've added a -n flag to osis2mod that will normalize UTF-8 to NFC,
which we've agreed on as the standard for UTF-8 modules.
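
For anyone unfamiliar with NFC, here is a minimal sketch of what the normalization does, written in Python with the standard unicodedata module. It is only an illustration, not the code used by osis2mod or by the UTF8NFC filter, and the helper name to_nfc is made up for this example.

```python
import unicodedata

def to_nfc(text: str) -> str:
    """Return text in Unicode Normalization Form C (hypothetical helper)."""
    return unicodedata.normalize("NFC", text)

# 'e' followed by COMBINING ACUTE ACCENT (two code points, decomposed form)
decomposed = "e\u0301"
# LATIN SMALL LETTER E WITH ACUTE (one code point, composed form)
composed = "\u00e9"

# NFC composes the pair into the single precomposed character.
assert to_nfc(decomposed) == composed

# The UTF-8 encoding shrinks from 3 bytes to 2 as a result.
assert len(decomposed.encode("utf-8")) == 3
assert len(to_nfc(decomposed).encode("utf-8")) == 2
```

Both strings render identically, which is why a module that mixes the two forms can break searching and comparison.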
I used Sword's UTF8NFC filter to do the work, but found that it was
buggy, leaving trailing garbage on some verses.
I have created a patch for both at www.crosswire.org/~dmsmith/nfcPatch.txt
and would greatly appreciate some more testing of it.
My test was fairly trivial. I took an OSIS file with limited UTF-8,
already in NFC, ran it through osis2mod with and without the -n flag,
and then compared the two files. Before I fixed UTF8NFC there were
differences. After fixing UTF8NFC, there were none.
All this shows is that it does not corrupt an already good NFC
UTF-8 file.
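
A stronger test would feed in text that is not yet NFC and check the result byte for byte. Below is a rough sketch of such a check in Python, using the standard unicodedata module as an independent reference. The function names are hypothetical, and the sketch assumes you can get at the verse bytes before and after filtering; it does not call Sword itself.

```python
import unicodedata

def nfc_bytes(data: bytes) -> bytes:
    """Reference NFC result for UTF-8 input, computed by Python."""
    text = data.decode("utf-8")
    return unicodedata.normalize("NFC", text).encode("utf-8")

def is_nfc(data: bytes) -> bool:
    """True if the UTF-8 input is already in NFC."""
    return nfc_bytes(data) == data

def filter_output_ok(original: bytes, filtered: bytes) -> bool:
    """True if filtered is exactly the NFC form of original.

    An exact byte comparison also catches extra bytes appended
    after the end of the verse.
    """
    return filtered == nfc_bytes(original)

# Decomposed input: 'e' + COMBINING ACUTE ACCENT in UTF-8.
verse = b"e\xcc\x81"
assert not is_nfc(verse)

# A correct filter yields the composed character U+00E9.
assert filter_output_ok(verse, b"\xc3\xa9")

# Output with a stray trailing byte is rejected.
assert not filter_output_ok(verse, b"\xc3\xa9\x00")
```

Running something like this over verses containing decomposed characters would exercise the path where the output differs from the input, which my test did not cover.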
Many thanks in advance.
DM