[sword-devel] Module .conf files, Unicode Normalization

Greg Hellings greg.hellings at gmail.com
Sat Jan 6 10:19:05 MST 2018


Why would the front end or engine need to know this information? Would it
help the front end developers or users to know it? What do we gain by
adding this? (I'm not implying it wouldn't be beneficial. But the only
thing I know about Unicode is how the different UTF encodings work, so I
have no idea what use this information could be. I also think changes to
formats and information standards should be conservative rather than liberal.)

--Greg

On Jan 6, 2018 11:01, "David Haslam" <dfhdfh at protonmail.com> wrote:

> Dear all,
>
> We've known for quite a few years that there are aspects of *Biblical
> Hebrew* that mean we should *avoid* converting the Unicode source text to
> *NFC* when we build a module.
>
> This prompts me to suggest that we ought to define a new *key* for .conf
> files.
>
> *Normalization=NFC* (this would be the default, and may be *omitted* for
> the vast majority of modules)
> *Normalization=Custom* (we should include this in certain Biblical Hebrew
> modules)
>
> This would make it clear to front-end developers and users alike that the
> source text was *not* converted to NFC during module build.
> i.e. *osis2mod* was used intentionally with the *-N* switch, in *accordance
> with the requirements of the source text provider*.
>
> The Unicode source text may already be encoded in *UTF-8*; this memo is
> *only* about normalization.
>
> In the rare event that a requirement arises for any of the other three
> normalization forms (*NFD*, *NFKC*, *NFKD*) defined by the Unicode
> Consortium, these would also be permitted values for the conf file key.
>
> A further benefit arises when a module needs to be updated.
> If the modules team sees that the .conf file includes the line
> *Normalization=Custom*
> they would be forewarned against converting to NFC through *inadvertently*
> omitting the *-N* switch during module build.
>
> *Aside*: Another language with a need for non-standard normalization is
> *Tibetan*. We don't yet have a module in that script.
>
> Best regards,
>
> David
>

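To make the normalization concern concrete, here is a minimal sketch in Python using only the standard `unicodedata` module. It shows how NFC can reorder Hebrew points attached to one consonant, and how a build script might read the proposed key. The sample sequence, the helper names (`changed_by`, `proposed_normalization`) and the `[Example]` conf snippet are illustrative assumptions. The `Normalization` key is only the proposal quoted above, not an existing .conf key.

```python
import unicodedata

# Lamed + patah + hiriq, in the order a source text provider might supply.
# (Illustrative sequence; not taken from any particular module.)
SOURCE = "\u05DC\u05B7\u05B4"


def changed_by(form: str, text: str) -> bool:
    """Return True if normalizing `text` to `form` alters its code points."""
    return unicodedata.normalize(form, text) != text


def proposed_normalization(conf_text: str) -> str:
    """Read the proposed (hypothetical) Normalization key; default to NFC."""
    for line in conf_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "Normalization":
            return value.strip()
    return "NFC"


# NFC sorts combining marks by canonical combining class, so the hiriq
# (class 14) is moved in front of the patah (class 17).
nfc = unicodedata.normalize("NFC", SOURCE)
print([f"U+{ord(c):04X}" for c in SOURCE])  # ['U+05DC', 'U+05B7', 'U+05B4']
print([f"U+{ord(c):04X}" for c in nfc])     # ['U+05DC', 'U+05B4', 'U+05B7']

# A build helper could warn before normalizing a module marked Custom.
conf = "[Example]\nNormalization=Custom\n"
if proposed_normalization(conf) != "NFC" and changed_by("NFC", SOURCE):
    print("NFC would alter this text; build with osis2mod -N")
```

The reordering is canonically equivalent as far as Unicode is concerned, but it discards the mark order supplied by the source text provider, which is the information a `Normalization=Custom` line would flag.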
