[sword-devel] GlobalOptionFilter=UTF8GreekAccents and non-Greek modules

Peter Von Kaehne refdoc at gmx.net
Tue Feb 21 03:10:05 MST 2017


Thanks David and Troy,

What is happening is this: my script tests for the presence of Greek accents by doing a before-and-after comparison using the Greek accent strip filter. This approach works beautifully for the Hebrew stuff - vowels and breathing marks. It should work for the Greek accent filter too. It does not.
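
For illustration only, here is a minimal sketch of that before-and-after test. It is written in Python rather than the Perl of confmaker.pl, and it does not call the SWORD library: strip_greek_accents() is a hypothetical stand-in for the UTF8GreekAccents filter, written under the assumption that a well-behaved filter only removes combining marks that follow a Greek base letter. The function names and the code point ranges are assumptions made for this sketch.

```python
import unicodedata


def is_greek(ch):
    """True for characters in the Greek and Coptic or Greek Extended blocks."""
    cp = ord(ch)
    return 0x0370 <= cp <= 0x03FF or 0x1F00 <= cp <= 0x1FFF


def strip_greek_accents(text):
    """Hypothetical stand-in for the filter: drop combining marks on Greek letters only."""
    out = []
    base_is_greek = False
    # Decompose first so that accents become separate combining characters.
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch):
            if base_is_greek:
                continue  # accent attached to a Greek letter: remove it
        else:
            base_is_greek = is_greek(ch)
        out.append(ch)
    # Recompose so that untouched non-Greek text compares equal to its NFC form.
    return unicodedata.normalize("NFC", "".join(out))


def needs_greek_accent_option(text):
    """Before-and-after comparison: did the strip step change anything?"""
    return strip_greek_accents(text) != unicodedata.normalize("NFC", text)


if __name__ == "__main__":
    print(needs_greek_accent_option("λόγος"))     # accented Greek
    print(needs_greek_accent_option("Bůh řekl"))  # Czech with diacritics
```

With a stand-in like this, Czech text passes through unchanged, so the comparison would not add the option to a module such as CzeCEP. Whether the real filter behaves the same way on non-Greek text is exactly what is in question here.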

The script is under sword-tools/modules/conf/confmaker.pl. Right now the Greek accents option has been commented out, so please have a look at the version svn-head-1.

I do not think I am using the filter incorrectly in my script, though of course I am keen to hear about any mistakes in my use.

I noted this a year or two ago and made a remark on the mailing list. I simply left my script as it was, since it seemed correct and, to the best of my understanding, the problem was with the library.

Peter 

> Sent: Tuesday, 21 February 2017 at 09:04
> From: "David Haslam" <dfhmch at googlemail.com>
> To: sword-devel at crosswire.org
> Subject: Re: [sword-devel] GlobalOptionFilter=UTF8GreekAccents and non-Greek modules
>
> Hi Troy,
> 
> Surely there's no doubt the module source text was correctly encoded as
> UTF-8 and normalised to NFC? 
> 
> We can examine the output of mod2imp and see that it is. Or am I missing
> something?
> 
> mod2imp doesn't change the normalisation form, and I assume it doesn't
> change the encoding either.
> 
> CzeCEP is not the only recent module to which the script has added
> GlobalOptionFilter=UTF8GreekAccents.
> FinRK was released yesterday and suffers from the same issue.
> 
> What I think has happened is this:
> 
> The Greek Accents filter was probably never adequately beta tested.
> 
> It was accepted after only being alpha tested, to see that it does remove
> Greek accents from Greek text that has some.
> 
> Nobody thought to check whether it did anything untoward on the UTF-8
> encoded text in a variety of non-Greek scripts.  The bug has gone undetected
> until yesterday. It's either a very old bug, or a library has changed
> without anyone noticing.
> 
> I understand that the Module Team's script does the following as part of the
> automation to build the module conf file:
> 
> It applies this filter, checks for change, then adds the filter line to the
> conf file if a change was detected.
> 
> Knowing this, it's not hard to see how we have ended up with a spurious
> Greek Accents filter in some recently released modules, is it?
> 
> The mopping-up containment action is to determine how many modules have been
> released with the spurious filter in the configuration file. These must each
> be corrected by removing the line, updating the version and date, and
> releasing the update.
> 
> The permanent solution should be to find out exactly how this filter works
> in detail, and rewrite it if necessary. That would require an update to
> SWORD as a significant bug fix.
> 
> The most recent mention of this filter in SWORD releases was under 1.5.10
> dated 20-Nov-2006 in which you added a further Greek accent. In fact, that's
> the only explicit mention. The string "utf8" appears earlier a few times,
> but in a more general sense.
> 
> NB. Using diatheke version 4.7, I have thoroughly tested CzeCEP for the
> four other UTF8 filters. Only GreekAccents is delinquent.
> 
> Best regards,
> 
> David
> 
> PS. If only CrossWire had a "bug bounty" scheme.... Ah, but we're a
> "non-income" organization. 
> Looking only to the heavenly reward, and the fruit of the Gospel here in
> earth. :)
> 