[sword-devel] TEI Dictionaries was Re: Bible Software Review
Chris Little
chrislit at crosswire.org
Thu May 1 14:51:59 MST 2008
DM Smith wrote:
> The TEI filter does not do much styling. May I suggest that you add
> styling for the elements I've used for the NASB lexicons.
Actually... I was looking at your encoding just now and comparing it to
my own TEI. :)
We should probably figure out what we want to do regarding TEI now,
rather than later, since there isn't yet any content in the wild, aside
from the Perseus content in beta. So feel free to say, "no that's a bad
idea" to any of my suggestions below.
First, I would recommend we support P4 for backward compatibility with
Perseus content and a few other sites publishing P4 content. But I would
recommend that we only produce P5 content ourselves when converting
non-TEI content. P4 support by TEI will only extend until 2012, and P5
itself has a number of improvements, in my opinion, not the least of
which is that it is more in line with modern XML usage and schema
validation.
The P5 dictionary reference is here:
http://www.tei-c.org/release/doc/tei-p5-doc/en/html/DI.html
> Here are the ones that I use with the styling that I am using in
> BibleDesktop:
> orth bold
> pron italic
> etym haven't decided
> def italic
> usg plain
>
> Also TEI used rend and not type for the hi element.
>
> Some of these may already be handled.
I don't know the condition of our TEI filters in Sword, but I suspect
whatever is there remains rudimentary and I take from Karl's email
regarding the NAS dictionaries that I only wrote RTF filters.
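As a starting point for discussion, here is a minimal sketch of the element-to-style mapping DM lists above, written in Python rather than as an actual Sword filter. The names STYLES and render are hypothetical, the HTML tags are only stand-ins for whatever markup a frontend uses, and etym is left unstyled because no style has been decided for it.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical mapping taken from the list above: orth bold, pron italic,
# def italic. usg is plain and etym is undecided, so neither is listed.
STYLES = {"orth": "b", "pron": "i", "def": "i"}


def render(elem):
    """Return elem's text content, wrapped in an HTML tag if it has a style."""
    parts = [escape(elem.text or "")]
    for child in elem:
        parts.append(render(child))
        parts.append(escape(child.tail or ""))
    inner = "".join(parts)
    # Drop an XML namespace prefix such as "{http://www.tei-c.org/ns/1.0}".
    name = elem.tag.rsplit("}", 1)[-1]
    tag = STYLES.get(name)
    return f"<{tag}>{inner}</{tag}>" if tag else inner


entry = ET.fromstring(
    "<entry><orth>lavage</orth> <usg>med.</usg> <def>a washing</def></entry>"
)
print(render(entry))
```

Elements without an entry in the table, including <xr> and <ref>, simply have their text shown.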
> For Strong's references, I am using <ref target="id">key text</ref>.
> For BDB references, I am using <xref doc="bdb" to="id (target)">text</xref>
> (I don't expect either of these to be handled except to have their text
> shown, which is what SWORD does!)
I think the dictionary cross-reference element (P4 & P5) is just <xr>,
whereas <ref> is a more generic element found in the core module.
The example from the manual for <xr> embeds a <ref> within an <xr> thus:
<entry>
  <form>
    <orth>lavage</orth>
  </form>
  <etym>[Fr. &lt; <mentioned>laver</mentioned>;
    L. <mentioned>lavare</mentioned>,
    to wash; <xr>see <ref>lather</ref>
    </xr>].
  </etym>
</entry>
The only thing I would add to that is some typing. For "see" references,
I've been using <xr type="see">. So I think, putting it all together, we
would want: <xr type="see">see <ref target="lather">lather</ref></xr>.
We can interpret TEI in basically the same way as we do OSIS, by
expecting a "word:ref" type of target if the reference is outside the
current document/module and expecting just "ref" if it is within the
current document/module. So, in P5, I would encode your examples as:
<xr type="xref"><ref target="key text">key text</ref></xr> and
<xr type="xref"><ref target="BDB:id">text</ref></xr>
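To make the target convention above concrete, here is a minimal sketch in Python, not Sword code. The function name split_target and the module name "NASHebrew" are hypothetical, and the sketch assumes the first colon always separates the module name from the key, which would need revisiting for keys that themselves contain a colon.

```python
def split_target(target, current_module):
    """Split a <ref> target into (module, key).

    "word:ref" points into another module; a bare "ref" points into
    the current one. Assumes the first colon is the separator.
    """
    module, sep, key = target.partition(":")
    if not sep:
        # No prefix, so the reference stays within the current module.
        return current_module, target
    return module, key


print(split_target("BDB:id", "NASHebrew"))    # another module
print(split_target("key text", "NASHebrew"))  # current module
```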
> Also, can you see a way that we can combine the Greek and Hebrew
> lexicons into 1?
I thought we had decided to encode Greek and Hebrew keys as
/[GH][0-9]{4}/, that is, with a G or H prefix and 4 digits with leading
zeros.
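For illustration, a small Python sketch of that key format. The helper normalize_key is hypothetical, and accepting a lowercase prefix or unpadded digits as input is my assumption, not something we have agreed on.

```python
import re

# The agreed key format: G or H followed by exactly four digits.
KEY_RE = re.compile(r"[GH][0-9]{4}")


def normalize_key(raw):
    """Turn input such as "g25" or "H430" into a key such as "G0025"."""
    m = re.fullmatch(r"\s*([GgHh])([0-9]+)\s*", raw)
    if m is None:
        raise ValueError(f"not a Strong's number: {raw!r}")
    number = int(m.group(2))
    if number > 9999:
        raise ValueError(f"number out of range: {raw!r}")
    return f"{m.group(1).upper()}{number:04d}"


print(normalize_key("g25"))
print(normalize_key("H0430"))
```

With a language prefix on every key, Greek and Hebrew entries can share one key space without colliding.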
> I also noticed the schema for TEI dictionaries on the wiki has osisID
> and osisRef. I didn't study the schema, but at a glance I didn't see
> where or how these are used. Would you shed some light?
I'm open to suggestions for the schema as well. I put osisID and osisRef
within the att.global.linking attribute group, so they are present on
all (or at least almost all) elements. I've been thinking about whether
this is appropriate and think it may be better to only put osisRef on
<ref> or within a more limited attribute group, such as att.pointing.
--Chris