[sword-devel] support for locale codes with region/script subtags

Chris Little chrislit at crosswire.org
Sun Feb 10 03:56:20 MST 2013


Just a quick heads up:

In general, locale codes (the Lang= field of .confs) can have subtags 
that indicate region, script, etc. Ideally these should be dealt with in 
some fashion by front ends since they identify important distinctions 
(in the eyes of the module maker or publisher at least).
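
As a rough illustration of what those subtags look like to a front end, 
here is a minimal Python sketch that splits a Lang= value into its 
parts. The function name split_subtags is hypothetical and not part of 
the SWORD API, and the shape rules are a simplification (a 4-letter 
subtag is taken as script, a 2-letter or 3-digit subtag as region, and 
anything else is lumped together).

```python
def split_subtags(tag):
    """Split a language tag into language, script, region and the rest.

    Simplified on purpose: it does not validate subtags against any
    registry, it only looks at their length and character class.
    """
    parts = tag.split("-")
    result = {"language": parts[0].lower(), "script": None,
              "region": None, "other": []}
    for part in parts[1:]:
        is_script = len(part) == 4 and part.isalpha()
        is_region = ((len(part) == 2 and part.isalpha())
                     or (len(part) == 3 and part.isdigit()))
        if is_script and result["script"] is None and result["region"] is None:
            result["script"] = part.title()   # e.g. "Deva", "Latn"
        elif is_region and result["region"] is None:
            result["region"] = part.upper()   # e.g. "US"
        else:
            result["other"].append(part)      # variants, extensions, etc.
    return result
```

With this sketch, 'ur-Deva' yields language 'ur' and script 'Deva', 
while plain 'ur' yields only the language.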

When unknown subtags are encountered, it's probably best to recursively 
fall back to the tag minus its right-most subtag. For example, if 
'en-Latn-US' is unknown, fall back to 'en-Latn'. If that is unknown, 
fall back to 'en'. (Hopefully nearly all language subtags are known.)
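
The fallback described above could be sketched in Python roughly as 
follows. The names fallback_chain and resolve are hypothetical, and 
"known" stands in for whatever set of tags a front end actually 
recognizes; this is not how the library does it today.

```python
def fallback_chain(tag):
    """Yield the tag, then the tag minus its right-most subtag, and so on."""
    parts = tag.split("-")
    for end in range(len(parts), 0, -1):
        yield "-".join(parts[:end])


def resolve(tag, known):
    """Return the most specific known form of tag, or None if none is known."""
    for candidate in fallback_chain(tag):
        if candidate in known:
            return candidate
    return None
```

For 'en-Latn-US' the candidates are tried in the order 'en-Latn-US', 
'en-Latn', 'en', so the most specific known tag wins.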

We should handle this in the library, but currently don't. :(


As a specific case in point:
We now have two Urdu translations. They're the same translation and 
differ only in their script (one is Arabic, the other Devanagari). As of 
the 1.2.1 release just made, which corrected the code for the Devanagari 
version, their language codes are: ur (Urdu in Arabic script, the usual 
script for Urdu) and ur-Deva (Urdu in Devanagari script).

Possible behaviors are to categorize the ur-Deva module as belonging to 
an unknown language (bad), to fall back and categorize it as simply Urdu 
(better, but certainly confusing if the language name is written in 
Arabic and the module is itself written in Devanagari), or to categorize 
it separately as Urdu written in Devanagari (best).

For implementers who localize the language name, the name of Urdu in 
Arabic script is "اردو". In Devanagari script it is "उर्दू".
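
One way a front end might pick the localized name is to key its name 
table on the full tag and fall back subtag by subtag. This is only a 
sketch: LANGUAGE_NAMES and display_name are hypothetical, and the table 
holds just the two names given above.

```python
# Hypothetical name table, keyed by language tag.
LANGUAGE_NAMES = {
    "ur": "اردو",        # Urdu in Arabic script (the usual script)
    "ur-Deva": "उर्दू",    # Urdu in Devanagari script
}


def display_name(tag, names=LANGUAGE_NAMES):
    """Return the name for the most specific known form of tag, else None."""
    parts = tag.split("-")
    while parts:
        candidate = "-".join(parts)
        if candidate in names:
            return names[candidate]
        parts.pop()  # drop the right-most subtag and try again
    return None
```

Here display_name("ur-Deva") gives the Devanagari name, and a tag with 
an unknown extra subtag falls back to the nearest entry in the table. A 
None result corresponds to the "unknown language" case.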

--Chris


