<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=UTF-8" http-equiv="Content-Type">
<title></title>
</head>
<body bgcolor="#ffffff" text="#000000">
Like I said, for what it's worth... Perhaps Lexique will give you some
food for thought. It's a nice program for producing a nicely typeset
dictionary quickly and with minimal expertise. I might create Perl
scripts to go between TEI and their format if I ever have the need.<br>
<br>
I should have started with a question. When tagging dictionaries, then,
is the onus on the module creator to add punctuation and other means of
distinguishing elements? That would certainly be simpler for the
front-end developers, but if that is the case it would be helpful to
know which elements are styled in which way so that module creators don't
double up on that sort of thing. Or if I use the <hi> element and
a stylesheet does otherwise, will it matter? Should I just "style away"
using TEI?<br>
<br>
Daniel<br>
<br>
DM Smith wrote:
<blockquote cite="mid:9CAC1D0A-8A11-4350-8EE1-6C849E8F05D7@yahoo.com"
type="cite"><br>
On May 19, 2008, at 12:41 AM, Daniel Owens wrote:
<br>
<br>
<blockquote type="cite">The HTML didn't come through very well. Here
is a screenshot of the Lexique-formatted entry:
<br>
<entry from Lexique.jpg>
<br>
Daniel
<br>
<br>
Daniel Owens wrote:
<br>
<blockquote type="cite"><br>
I have been working on some TEI dictionaries, and (this is obvious, I
know)
<br>
vanilla TEI produces very boring entries in the front-ends. I point
this out as
<br>
a preface to offering a suggestion for front-end developers preparing
to
<br>
introduce TEI support. Here is a typical TEI entry:
<br>
<br>
<entry key="an toạ">
<br>
<form><orth>an toạ</orth><pron>(phonetic
representation)</pron></form>
<br>
<gramGrp><pos>verb</pos></gramGrp>
<br>
<def>To take a seat, to be seated</def>
<br>
<eg><q>mời các vị an
toạ</q></eg><trans><tr>pray, everyone, take a
<br>
seat</tr></trans>
<br>
</entry>
<br>
<br>
Here is what it looks like in BibleTime:
<br>
<br>
AN TOẠ an toạ(phonetic representation)verb To take a seat, to be
seated mời
<br>
các vị an toạpray, everyone, take a seat
<br>
</blockquote>
</blockquote>
Is BibleTime built with the latest from SVN? If not, then it will use
the Plaintext filter. The TEI filter does stylization.
<br>
<br>
<blockquote type="cite">
<blockquote type="cite"><br>
<br>
I'm not meaning to pick on BibleTime--BibleCS only formats the part of
speech in
<br>
italics.
<br>
</blockquote>
</blockquote>
<br>
The TEI filters could stand some improvement. They style only a few
elements, and not <orth>, <pron>, ... For example,
<orth> could be bold, <pron> italic, ....
<br>
<br>
I've made some suggestions and implemented them in BibleDesktop.
<br>
<br>
So take a look at BibleDesktop for an example of what can be done.
<br>
<br>
<blockquote type="cite">
<blockquote type="cite"><br>
<br>
Here's the suggestion. Recently a friend of mine pointed me to an
SIL-developed
<br>
program that can be used to create and publish lexicons. It's called
Lexique
<br>
Pro, and you can download it at <a class="moz-txt-link-freetext" href="http://www.lexiquepro.com/download.htm">http://www.lexiquepro.com/download.htm</a>.
They use
<br>
a TeX-like method of tagging data, but there's no reason why what they
have done
<br>
can't be applied to XML data. Here is the above example formatted by
Lexique Pro:
<br>
<br>
*an toạ* /verb. /[(phonetic representation)];To take a seat, to be
seated.
<br>
*mời các vị an toạ* pray, everyone, take a seat.
<br>
<br>
Notice that they have varied the font, font size, font color, bold, and
italics
<br>
of each part of the entry so that it is easier to read. They have also
added
<br>
punctuation to separate parts of the entry.
<br>
</blockquote>
</blockquote>
<br>
The problem with <entry> as opposed to <entryFree> is that
it is difficult to encode the entry as found in the printed work.
<br>
<br>
<entry> is more like a database entry. The <entry> requires
elements to be in a particular order and nested in a particular fashion
and may not allow text in places one would want.
<br>
<br>
<entryFree> is more like a document. The elements can come in
any order, nested in any fashion and text can be interspersed as
desired. With <entryFree>, it is important that a front-end not add
punctuation, since one should assume that every "jot and tittle" is
already present in the text.
<br>
<br>
When it comes to the SWORD engine (also JSword), our filters do not
invent punctuation; they only apply styling. Nor do they reorder
content. They merely dump the text content, styled according to the
element containing it.
<br>
<br>
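A minimal sketch of that kind of filter in Python, to make the behaviour concrete. This is not the actual SWORD or JSword code: the STYLES map and the style_entry function are hypothetical names, and the choice of bold for orth and italic for pron and pos is only the suggestion made earlier in this message. It walks the entry in document order, wraps styled elements in an HTML tag, and adds no punctuation of its own.
<br>

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical style map: TEI element name -> HTML tag used to style it.
STYLES = {"orth": "b", "pron": "i", "pos": "i"}


def style_entry(xml_text):
    """Render a TEI entry as HTML: same text, same order, styling only."""

    def walk(node):
        parts = []
        tag = STYLES.get(node.tag)
        if tag:
            parts.append(f"<{tag}>")
        # Text directly inside this element, before any child element.
        parts.append(escape(node.text or ""))
        for child in node:
            parts.extend(walk(child))
            # Text that follows the child, emitted exactly as encoded.
            parts.append(escape(child.tail or ""))
        if tag:
            parts.append(f"</{tag}>")
        return parts

    return "".join(walk(ET.fromstring(xml_text)))


# Illustrative entry; the punctuation here is part of the encoded text.
sample = (
    "<entryFree><orth>an toạ</orth> <pron>(phonetic representation)</pron> "
    "<pos>verb</pos>. <def>To take a seat, to be seated</def></entryFree>"
)
styled = style_entry(sample)
```

For the sample above, styled holds the text unchanged, with orth in bold and pron and pos in italics. Elements that are not in the map, such as def, pass through as plain text.
<br>
<br>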
To properly handle <entry> and <entryFree> it is probably
necessary to note which of the two is in use, and to let that decide
whether punctuation is added.
<br>
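<br>
One way that decision could look, as a rough Python sketch rather than anything the filters currently do. The SEPARATORS table and the render function are hypothetical, and the sketch assumes that in a structured entry all of the text sits in leaf elements.
<br>

```python
import xml.etree.ElementTree as ET

# Hypothetical separators a front-end might append after each element
# of a structured <entry>; nothing here comes from SWORD itself.
SEPARATORS = {"orth": " ", "pron": " ", "pos": ". ", "def": ". "}


def render(xml_text):
    """Return plain text, adding punctuation only for a structured <entry>."""
    root = ET.fromstring(xml_text)
    if root.tag == "entryFree":
        # Document-like entry: every character is assumed to be present
        # already, so the text is passed through untouched.
        return "".join(root.itertext())
    # Database-like entry: join the element texts with separators.
    parts = []
    for element in root.iter():
        text = (element.text or "").strip()
        if text:
            parts.append(text + SEPARATORS.get(element.tag, " "))
    return "".join(parts).strip()


structured = (
    "<entry><form><orth>an toạ</orth><pron>(phonetic representation)</pron></form>"
    "<gramGrp><pos>verb</pos></gramGrp>"
    "<def>To take a seat, to be seated</def></entry>"
)
free = "<entryFree><orth>an toạ</orth>, <pos>verb</pos>. <def>To take a seat</def>.</entryFree>"
```

With these inputs, render(structured) gains spaces and full stops between the parts, while render(free) returns the text exactly as it was encoded.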
<br>
<br>
I think that as we transform e-texts into TEI, <entryFree>
will be what's used. <entry> seems more appropriate for original
works.
<br>
<br>
<br>
<blockquote type="cite">
<blockquote type="cite"> Before I heard about the upcoming
<br>
TEI support I had put together a dictionary using ThML, complete with
<br>
punctuation and line breaks to help make it easier to read the entry.
That's not
<br>
the role of the TEI xml file, though. Lexique Pro's way of handling
entries is
<br>
not the only way, but I suggest it as ONE useful way developed by
people who
<br>
deal with lexicons daily.
<br>
</blockquote>
</blockquote>
<br>
I haven't looked at Lexique, but it sounds interesting.
<br>
<br>
In Him,
<br>
DM
<br>
<br>
</blockquote>
<br>
</body>
</html>