[sword-devel] GlobalOptionFilter=UTF8GreekAccents and non-Greek modules

David Haslam dfhmch at googlemail.com
Tue Feb 28 13:14:48 MST 2017


This is only a hunch, but I think it very likely that three more Greek
combining characters at neighbouring code points also fail to be removed by
the UTF8GreekAccents filter.

The overlooked set would then be:

U+0342	͂	COMBINING GREEK PERISPOMENI
U+0343	̓	COMBINING GREEK KORONIS
U+0344	̈́	COMBINING GREEK DIALYTIKA TONOS
U+0345	ͅ	COMBINING GREEK YPOGEGRAMMENI

So far, I've only confirmed the last of these four, a character that occurs
in SBLG_THE as well as a few other modules.
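
To check which of the four code points actually occur in a given text (for example, a plain-text export of a module), a small scan is enough. This is only a sketch in Python using the standard unicodedata module; the function name is hypothetical and nothing here touches the SWORD API.

```python
import unicodedata
from collections import Counter

# The four Greek-specific combining marks listed above.
GREEK_COMBINING = ("\u0342", "\u0343", "\u0344", "\u0345")


def count_greek_combining(text, decompose=False):
    """Count occurrences of U+0342..U+0345 in text.

    With decompose=True the text is first put into NFD, so marks hidden
    inside precomposed letters (e.g. U+1FB3) are counted as well. Note that
    NFD rewrites U+0343 and U+0344 into other code points, so those two can
    only be found with decompose=False.
    """
    if decompose:
        text = unicodedata.normalize("NFD", text)
    counts = Counter(ch for ch in text if ch in GREEK_COMBINING)
    return {
        f"U+{ord(ch):04X} {unicodedata.name(ch)}": counts[ch]
        for ch in GREEK_COMBINING
    }
```

Calling it with decompose=False reports only the marks stored as separate code points, which is the case of interest for the filter.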

I may need to construct a test module containing the other three in this
list.
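
One thing to watch when constructing such a test module: U+0343 and U+0344 have canonical decompositions (to U+0313 and to U+0308 U+0301 respectively), so any normalization step in the build would replace them before the filter ever sees them. The following Python sketch, with hypothetical names, reports which of the four marks are still present after NFC when attached to a given base letter.

```python
import unicodedata

MARKS = (0x0342, 0x0343, 0x0344, 0x0345)


def marks_surviving_nfc(base):
    """Map each code point to whether base+mark still contains it after NFC."""
    result = {}
    for cp in MARKS:
        normalized = unicodedata.normalize("NFC", base + chr(cp))
        result[f"U+{cp:04X}"] = chr(cp) in normalized
    return result
```

With a base such as beta, which has no precomposed accented forms, U+0342 and U+0345 survive; with alpha none of the four do, because NFC composes or rewrites every sequence. Test data for U+0343 and U+0344 would therefore have to bypass normalization entirely.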

Meanwhile, today I developed a mapping table to be used (e.g.) in a bespoke
TextPipe filter that can remove all Greek Accents without affecting any
other Unicode character.

The table currently has 233 characters, some of which we'll never see in
Biblical Greek text.
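
For comparison, a similar table can be derived mechanically from the Unicode decomposition data. This Python sketch is not the TextPipe table described here and will not necessarily have the same entries; the function names are hypothetical. It maps each precomposed letter in the Greek and Coptic and Greek Extended blocks to its base letter, drops the four Greek-specific combining marks, and leaves every other character alone.

```python
import unicodedata


def build_greek_accent_table():
    """Build a str.translate table that strips Greek diacritics."""
    table = {}
    # Greek and Coptic (U+0370..U+03FF) and Greek Extended (U+1F00..U+1FFF).
    for block in (range(0x0370, 0x0400), range(0x1F00, 0x2000)):
        for cp in block:
            decomposed = unicodedata.normalize("NFD", chr(cp))
            base, marks = decomposed[0], decomposed[1:]
            # Keep only "letter + combining marks"; this skips punctuation
            # and spacing accents whose decomposition is not a letter.
            if (marks
                    and unicodedata.category(base).startswith("L")
                    and all(unicodedata.combining(m) for m in marks)):
                table[cp] = base
    # Greek-specific combining marks are removed outright.
    for cp in (0x0342, 0x0343, 0x0344, 0x0345):
        table[cp] = None
    return table


GREEK_ACCENT_TABLE = build_greek_accent_table()


def strip_greek_accents(text):
    """Remove Greek diacritics without touching non-Greek characters."""
    return text.translate(GREEK_ACCENT_TABLE)
```

Generic combining marks such as U+0301 are deliberately left untouched in this sketch, since they are shared with Latin and other scripts; handling decomposed Greek text would need an extra rule that looks at the preceding base letter.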

btw. I have also included one conversion which is not an accent removal.

U+0387	·	GREEK ANO TELEIA	becomes	U+00B7	·	MIDDLE DOT

That's now the proper canonical equivalent of this archaic Greek punctuation
mark.

Our Biblical Greek modules already have MIDDLE DOT, so we're not likely to
see this replacement ever take place unless we deliberately test for it.
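
The equivalence can be checked against the Unicode data, and a deliberate test is simple to write. A minimal Python sketch, with a hypothetical helper name:

```python
import unicodedata

ANO_TELEIA = "\u0387"   # GREEK ANO TELEIA
MIDDLE_DOT = "\u00b7"   # MIDDLE DOT


def replace_ano_teleia(text):
    """Return (new_text, count) with every U+0387 replaced by U+00B7."""
    return text.replace(ANO_TELEIA, MIDDLE_DOT), text.count(ANO_TELEIA)


# U+0387 has a canonical decomposition to U+00B7, so normalizing to NFC
# (or NFD) performs the same replacement.
is_canonical_equivalent = (
    unicodedata.normalize("NFC", ANO_TELEIA) == MIDDLE_DOT
)
```

The returned count makes it easy to confirm whether a given text contained any U+0387 at all.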

Best regards,

David





--
View this message in context: http://sword-dev.350566.n4.nabble.com/GlobalOptionFilter-UTF8GreekAccents-and-non-Greek-modules-tp4656719p4656848.html
Sent from the SWORD Dev mailing list archive at Nabble.com.


