[sword-devel] TEI markup support
DM Smith
dmsmith555 at yahoo.com
Mon May 12 20:35:42 MST 2008
On May 12, 2008, at 9:29 PM, Troy A. Griffitts wrote:
> My one concern about saying that we support TEI for dictionary encoding
> is the confusion it might bring to our support of OSIS.
>
> From what I remember, the current OSIS plan is to include some set of
> TEI markup to support dictionary markup. I wonder if things like <ref>
> would be included, since OSIS already includes <reference osisRef=...>.
My 2 cents:
From what I can see there are a few differences between TEI and OSIS:
1) TEI has <ref target="xxxx"> while OSIS has <reference osisRef="xxxx">
Chris has suggested that we use OSIS markup for xxxx, since TEI does
not define the encoding of the target. With this, it would be trivial
to transform the one into the other.
Related to this, <reference> in OSIS is not free to be placed
anywhere; it is only allowed in <note>. TEI uses the <xr> element for
similar containment. Such a thing would be appropriate to add to OSIS.
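To illustrate how small that transformation could be, here is a minimal Python sketch. It assumes the target already holds an OSIS reference (as Chris suggested) and that the markup carries no XML namespace; the function name tei_ref_to_osis and the sample entry are made up for the example.

```python
import xml.etree.ElementTree as ET

def tei_ref_to_osis(root):
    """Rename TEI <ref target="..."> to OSIS <reference osisRef="...">, in place.

    Assumption: the target value is already an OSIS reference, so it is
    copied over unchanged.
    """
    # Collect the matches first so that renaming does not disturb iteration.
    for el in list(root.iter("ref")):
        el.tag = "reference"
        target = el.attrib.pop("target", None)
        if target is not None:
            el.set("osisRef", target)
    return root

# Hypothetical sample entry, not taken from any real module.
tei = ET.fromstring(
    '<entryFree>See <ref target="John.3.16">John 3:16</ref>.</entryFree>'
)
osis_xml = ET.tostring(tei_ref_to_osis(tei), encoding="unicode")
print(osis_xml)
```

A real TEI document would normally declare the TEI namespace, so the tag names would need to be qualified accordingly.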
2) Both TEI and OSIS have the <hi> element. TEI uses the attribute
rend and OSIS uses type to indicate the nature of highlighting. Again,
a simple transformation would be sufficient.
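A similar sketch for <hi>, again in Python and again without namespaces. It assumes the rend values can be carried over to type unchanged; in practice a lookup table between the two vocabularies might be needed. The function name tei_hi_to_osis and the sample text are hypothetical.

```python
import xml.etree.ElementTree as ET

def tei_hi_to_osis(root):
    """Move the TEI rend attribute of every <hi> to the OSIS type attribute."""
    for el in list(root.iter("hi")):
        rend = el.attrib.pop("rend", None)
        if rend is not None:
            # Assumption: the value (e.g. "bold") is valid as-is on the OSIS side.
            el.set("type", rend)
    return root

# Hypothetical sample entry.
tei = ET.fromstring('<entryFree><hi rend="bold">agape</hi>: love</entryFree>')
hi_xml = ET.tostring(tei_hi_to_osis(tei), encoding="unicode")
print(hi_xml)
```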
3) TEI has rich content markup for dictionary elements, such as
pronunciation, etymology, orthography, definition, senses.... From
what I was told, OSIS plans to include a set of these, though what
they are is not defined yet.
The container for a dictionary element in TEI is <entry> for
structured entries, <entryFree> for any child elements allowed, and
<superEntry> to collect entries into a larger one. In OSIS there is a
type attribute value of "entry" for <div>.
My experience with dictionaries is limited; Chris can correct me if I
am wrong. I am of the impression that <entry>, because it is so highly
structured, will find little practical use in any dictionaries we
create. When I tried to encode Lockman's NAS Lexicons my first concern
was to preserve Lockman's content as provided. At first I tried
<entry> but I had to re-arrange some of what Lockman had and I
eliminated everything that was not structural. I was able to create a
style sheet that would produce what Lockman originally had.
The problem with this approach was that such a style sheet would be
appropriate for Lockman's NAS Lexicons but probably not for other
dictionaries. My guess is that we don't want to have stylesheets on a
per module basis.
So I re-coded it using <entryFree> and even with no styling (i.e.
using the PlainFilter) the text is exactly as Lockman had it.
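A rough way to see why <entryFree> survives without styling: when the punctuation and ordering live in the document itself, dropping the tags leaves readable text. The Python sketch below only strips tags; it is not the PlainFilter, and the sample entry is invented rather than Lockman's content.

```python
import xml.etree.ElementTree as ET

# Hypothetical entry: the separators (", " and "; ") are part of the text,
# not something a style sheet has to supply.
entry = ET.fromstring(
    "<entryFree><orth>agape</orth>, <pron>ag-ah-pay</pron>; "
    "<def>love, affection</def></entryFree>"
)

# Concatenate all text nodes in document order, ignoring the markup.
plain = "".join(entry.itertext())
print(plain)  # agape, ag-ah-pay; love, affection
```

With a strictly structured <entry>, those separators would typically have to come from a style sheet instead.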
4) TEI for dictionaries does not have milestoned container elements.
They simply are not needed. OSIS allows the entry <div> to be milestoned
with sID and eID. For dictionaries, this should be discouraged.
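If we wanted to enforce that, a check along these lines could flag milestoned entries in a dictionary source. This is only a Python sketch under the assumption that entries are <div type="entry"> elements without a namespace; the function name milestoned_entries and the sample document are made up.

```python
import xml.etree.ElementTree as ET

def milestoned_entries(root):
    """Return every <div type="entry"> that carries an sID or eID attribute."""
    return [
        el
        for el in root.iter("div")
        if el.get("type") == "entry"
        and ("sID" in el.attrib or "eID" in el.attrib)
    ]

# Hypothetical sample: one milestoned entry (start and end markers)
# followed by one ordinary container entry.
doc = ET.fromstring(
    "<div>"
    '<div type="entry" sID="e1"/>first entry<div type="entry" eID="e1"/>'
    '<div type="entry">second entry</div>'
    "</div>"
)
flagged = milestoned_entries(doc)
print(len(flagged))  # 2 (the start and end milestones)
```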
Some other random thoughts/opinions:
If you consider dictionaries such as those currently in TEI, I think it
does not make sense to encode them into OSIS. For secular works such
as Webster's Dictionary, I think it makes sense to encode them in TEI
and make them available, and to put our signature in the header of the
XML with a note that it can be freely used under such-and-such a
license that requires attribution to be retained.
TEI is a well defined standard with a robust definition. My suggestion
would be for OSIS to adopt Chris' TEI schema as a part of OSIS with
minor changes.
I would also suggest that it be a separate stand-alone schema, as
there is very little overlap between the elements in a dictionary and
the elements in a Bible.
In His Service,
DM