[osis-core] Schema: type on language
Todd Tillinghast
osis-core@bibletechnologieswg.org
Sun, 19 Oct 2003 08:22:59 -0600
Chris,
Are you saying that you will not be able to sort out which of the many
forms allowed in IETF/xml:lang has been stated, and that you would like
to use <language type="...">language code</language> to help sort out
which case has been encoded, but that the values for <language> and
xml:lang would be identical?
That seems reasonable.
It also seems unfortunate that the XML/ISO standards bodies have made it
difficult to tell which standard is being used. (I am sure that with an
enumeration of all possible values you can derive which standard a value
comes from.)
I am not sure why you want to add "French", "English", and "native".
This would seem to further confuse the situation. Maybe I don't
understand how you would use them.
Relative to people using codes like "Austronesian (Other)", I think the
documentation should recommend a "concrete" language for xml:lang and
that a <language> entry for "Austronesian (Other)" would be fine to use
within <work> in addition to the "concrete" language code.
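
A rough sketch of that recommendation, reusing the Javanese example from
earlier in the thread. The type value "ISO-639-2" is only a placeholder
and is not a value confirmed by the schema; other attributes and elements
the schema may require are left out:

```xml
<!-- Sketch only: the type value below is an assumed name. -->
<osisText xml:lang="x-SIL-JVN">
  <header>
    <work>
      <!-- concrete language, identical to the xml:lang value -->
      <language>x-SIL-JVN</language>
      <!-- collective ISO code for "Austronesian (Other)" -->
      <language type="ISO-639-2">map</language>
    </work>
  </header>
</osisText>
```
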
Todd
> -----Original Message-----
> From: osis-core-admin@bibletechnologieswg.org
> [mailto:osis-core-admin@bibletechnologieswg.org] On Behalf Of
> Chris Little
> Sent: Sunday, October 19, 2003 2:25 AM
> To: osis-core@bibletechnologieswg.org
> Subject: RE: [osis-core] Schema: type on language
>
>
>
> Todd,
>
> For one, it's questionable whether we can really say any language can be
> unambiguously identified. But let's suppose we really know what English
> is and we really know that 'en' identifies it. ISO 639 does a better job
> of unambiguously identifying some languages than it does for others.
> There are a bunch of codes that describe groups of languages, such as
> "North American Indian (Other)" and "Austronesian (Other)".
>
> So, it's not quite true that Javanese has no ISO code; it's just a very,
> very ambiguous code shared with hundreds of other languages. (The code
> would be 'map' -- "Austronesian (Other)".)
>
> I think it is valuable to keep type="...", since some organizations use
> those codes themselves for various sorting purposes (e.g. the Library of
> Congress uses ISO 639-2/B and SIL uses Ethnologue codes). If they need to
> use such data, I think we should provide a place to hold it.
>
> But for interoperability, IETF/xml:lang is probably best.
>
> What are your thoughts on also adding "English", "French", & "native" to
> the types enumeration? Is that unnecessary/inappropriate?
>
>
> --Chris
>
>
> On Fri, 17 Oct 2003, Todd Tillinghast wrote:
>
> > Chris,
> >
> > If there is a way to unambiguously express ALL of the various language
> > values using xml:lang in an IETF compliant string then it would seem to
> > make sense to use that same structure for the value of <language> and
> > for xml:lang AND not have a type="..." set of enumerated types.
> >
> > Ex:
> > Javanese, for which there is no ISO code:
> > <osisText xml:lang="x-SIL-JVN">
> > and
> > <work>
> > <language>x-SIL-JVN</language>
> > </work>
> >
> > Albanian:
> > <osisText xml:lang="sq">
> > and
> > <work>
> > <language>sq</language>
> > <language>x-ISO-639-1-sq</language>
> > <language>x-ISO-639-2-T-sqi</language>
> > <language>x-ISO-639-2-B-alb</language>
> > <language>x-SIL-ALS</language>
> > </work>
> >
> > This would keep the xml:lang and <language> values consistent. It
> > would seem that we will have to enumerate the "x-" alternatives for
> > xml:lang in the documentation so we might as well use the same
> > structure both places.
> >
> > I believe that "x-" is allowed in the w3c's xml.xsd schema so the
> > above options should work. (Naturally if there is already an
> > established syntax for ISO values within xml:lang we should use it
> > rather than my x- values above.)
>
>
>
>
> _______________________________________________
> osis-core mailing list
> osis-core@bibletechnologieswg.org
> http://www.bibletechnologieswg.org/mailman/listinfo/osis-core
>