[osis-core] Schema: type on language
Todd Tillinghast
osis-core@bibletechnologieswg.org
Fri, 17 Oct 2003 11:16:30 -0600
Chris,
If there is a way to unambiguously express ALL of the various language
values using xml:lang in an IETF-compliant string, then it would seem to
make sense to use that same structure for the value of <language> and
for xml:lang AND not have a type="..." set of enumerated types.
Ex:
Javanese, for which there is no ISO code:
<osisText xml:lang="x-SIL-JVN">
and
<work>
<language>x-SIL-JVN</language>
</work>
Albanian:
<osisText xml:lang="sq">
and
<work>
<language>sq</language>
<language>x-ISO-639-1-sq</language>
<language>x-ISO-639-2-T-sqi</language>
<language>x-ISO-639-2-B-alb</language>
<language>x-SIL-ALS</language>
</work>
This would keep the xml:lang and <language> values consistent. It would
seem that we will have to enumerate the "x-" alternatives for xml:lang
in the documentation, so we might as well use the same structure in both
places.
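
As a rough illustration of what "consistent" could mean in practice, here
is a minimal Python sketch that lists xml:lang values with no matching
<language> element. The sample document layout (a <header> wrapping
<work>, no namespace) and the function name undeclared_languages are my
assumptions for illustration, not part of the schema; tags are compared
case-insensitively.

```python
import xml.etree.ElementTree as ET

# Key ElementTree uses for xml:lang (the "xml" prefix is bound to this namespace).
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def undeclared_languages(doc: str) -> set:
    """Return lowercased xml:lang values that have no matching <language> element."""
    root = ET.fromstring(doc)
    # Every <language> value declared anywhere in the document.
    declared = {el.text.strip().lower() for el in root.iter("language") if el.text}
    # Every xml:lang value used on any element, including the root.
    used = {el.attrib[XML_LANG].lower() for el in root.iter() if XML_LANG in el.attrib}
    return used - declared


# Hypothetical sample: "sq" is declared, "x-SIL-JVN" is used but not declared.
SAMPLE = """
<osisText xml:lang="sq">
  <header>
    <work>
      <language>sq</language>
      <language>x-SIL-ALS</language>
    </work>
  </header>
  <p>text <foreign xml:lang="x-SIL-JVN">text</foreign></p>
</osisText>
"""

print(undeclared_languages(SAMPLE))
```

With the same string form in both places, the check is a plain set
difference; with a separate type="..." attribute it would first have to
map each typed value onto an xml:lang string.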
I believe that "x-" is allowed in the W3C's xml.xsd schema, so the above
options should work. (Naturally, if there is already an established
syntax for ISO values within xml:lang, we should use it rather than my x-
values above.)
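
A quick sketch of how the x- forms above could be checked and taken
apart again. It assumes only the RFC 3066 syntax (a primary subtag of 1-8
letters, then subtags of 1-8 letters or digits) and a fixed list of type
names taken from this thread; the helper names is_rfc3066 and
split_private are hypothetical. It says nothing about what a particular
schema validator accepts.

```python
import re

# RFC 3066 syntax: a 1-8 letter primary subtag, then 1-8 alphanumeric subtags.
RFC3066 = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")

# Type names used in this thread (an assumed, fixed list). Longest first, so
# the hyphens inside "ISO-639-2-T" are consumed as part of the type name.
TYPES = sorted(["ISO-639-1", "ISO-639-2-T", "ISO-639-2-B", "SIL"], key=len, reverse=True)


def is_rfc3066(tag):
    """True if the whole tag matches the RFC 3066 subtag syntax."""
    return RFC3066.fullmatch(tag) is not None


def split_private(tag):
    """Split 'x-<type>-<code>' into (type, code); None if it is not one of ours."""
    if not tag.lower().startswith("x-"):
        return None
    rest = tag[2:]
    for type_name in TYPES:
        prefix = type_name + "-"
        if rest.upper().startswith(prefix):
            code = rest[len(prefix):]
            return (type_name, code) if code else None
    return None


for tag in ["sq", "x-SIL-JVN", "x-ISO-639-2-T-sqi", "x-ISO-639-2-B-alb"]:
    print(tag, is_rfc3066(tag), split_private(tag))
```

Because the set of type names is closed, the "-" inside a type name does
not make the split ambiguous, as long as no type name is a prefix of
another followed by "-".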
Todd
> -----Original Message-----
> From: osis-core-admin@bibletechnologieswg.org [mailto:osis-core-
> admin@bibletechnologieswg.org] On Behalf Of Chris Little
> Sent: Friday, October 17, 2003 10:58 AM
> To: osis-core@bibletechnologieswg.org
> Subject: RE: [osis-core] Schema: type on language
>
> Todd,
>
> xml:lang is of type xs:language, which conforms to RFC 3066 by definition
> (though, not necessarily in practice). RFC 3066 is what we mean by our
> IETF type. So xml:lang essentially always has our IETF type.
>
> This is why I would suggest we recommend in prose that all languages used
> in a document have, at least, an element with type="IETF".
>
> Now... someone should really check whether an order of precedence is
> described in RFC 3066 so that we can adopt it or describe our own to make
> even codes like "alb" non-conformant (since it has a corresponding
> ISO-639-1 equivalent). I'll check into it Sunday when I'm settled in
> Dallas, if no one beats me to it.
>
> --Chris
>
> On Fri, 17 Oct 2003, Todd Tillinghast wrote:
>
> > Chris,
> >
> > The thing I am struggling with is that as an attribute xml:lang does not
> > have a type attribute. The value of <language> is qualified by the
> > "type" attribute.
> >
> > Do we simply pre-pend the type value for xml:lang?
> > (xml:lang="x-ISO-639-2-T-alb") This seems problematic in that the "-"
> > is used in the "type" part as well as to separate the type value from
> > the language value.
> >
> > Todd
> >
> > > -----Original Message-----
> > > From: osis-core-admin@bibletechnologieswg.org [mailto:osis-core-
> > > admin@bibletechnologieswg.org] On Behalf Of Chris Little
> > > Sent: Thursday, October 16, 2003 8:33 PM
> > > To: osis-core@bibletechnologieswg.org
> > > Subject: Re: [osis-core] Schema: type on language
> > >
> > > Todd,
> > >
> > >
> > > Todd Tillinghast wrote:
> > >
> > > > Chris,
> > > >
> > > > Thanks that does make it much clearer.
> > > >
> > > > How do we interpret <language> with no "type" attribute?
> > > >
> > > > Should "type" be required or defaulted?
> > >
> > > I don't think type is necessarily required for <language>. If we like,
> > > we could make "base" the default value, since it is most generic and
> > > most common. We could require it in prose for > level 0 conformance.
> > > (As with xml:lang on <osisText>, I can live with it if others feel this
> > > should be required.)
> > >
> > >
> > > > Does this eliminate the "x-SIL-ALS" form or is that just for xml:lang?
> > >
> > > Relevant examples:
> > >
> > > >>(Albanian)
> > > >><language type="ISO-639-1">sq</language>
> > > >><language type="ISO-639-2-T">sqi</language>
> > > >><language type="ISO-639-2-B">alb</language>
> > > >><language type="SIL">ALS</language>
> > > >><language type="IETF">sq</language>
> > >
> > > I would recommend that we identify a canonical order of precedence if
> > > RFC 3066 doesn't, namely: ISO-639-1, ISO-639-2-T*, IANA, SIL, LINGUIST.
> > > That is, you should only use an SIL code if none exists in either the
> > > ISO or the IANA code lists.
> > >
> > > So, while ALS is still a valid SIL code, the IETF form should be
> > > identical to the ISO-639-1 form, since it exists.
> > >
> > > (* I chose ISO-639-2-T rather than -B because it is based on ISO-639-1
> > > whereas -B is based on MARC language codes.)
> > >
> > > > How do the xml:lang and <language> coordinate?
> > >
> > > To expand on my other reply.... xml:lang on <osisText> should match
> > > <language type="base"> (and/or in some cases, <language
> > > type="translation">). xml:lang on <foreign> and <q> should probably
> > > match with <language type="quotation">. The others are probably a little
> > > more complex to divine. But if a language is identified in an xml:lang
> > > value, there should be some corresponding <language> element in the
> > > header, and if its type="IETF", they should match. (Should we require
> > > in prose that a <language type="IETF"> element occurs for each language
> > > in order to match with xml:lang values?)
> > >
> > > Interlinears might be a little different. If the interlinear is done by
> > > using <w gloss="">, there's not necessarily going to be any indication
> > > of the interlinear's language except in the <language> element itself.
> > >
> > > --Chris
> > >
> > >
> > >
> > > _______________________________________________
> > > osis-core mailing list
> > > osis-core@bibletechnologieswg.org
> > > http://www.bibletechnologieswg.org/mailman/listinfo/osis-core