[sword-devel] demo TEI modules
davidtroidl at aol.com
Tue Sep 18 10:32:49 MST 2007
Sorting is most likely done by Unicode order. (The Unicode character database, UnicodeData.txt, is available from http://www.unicode.org/Public/UNIDATA/.) Even to alphabetize polytonic Greek, I had to reduce each character to its lowercase, unaccented form, sort on those forms first, and then on the actual word.
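A minimal sketch of that two-level sort, in Python rather than anything from Sword or JSword. The helper names `base_form` and `sort_key` and the sample word list are made up for illustration; the approach assumes that canonical decomposition (NFD) separates the accents and breathings from the base letters, which holds for polytonic Greek.

```python
import unicodedata

def base_form(word):
    """Reduce a word to its lowercase, unaccented form."""
    # NFD splits each precomposed character into a base letter
    # followed by combining marks (accents, breathings, iota subscript).
    decomposed = unicodedata.normalize("NFD", word)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()

def sort_key(word):
    """Sort on the reduced form first, then on the actual word."""
    return (base_form(word), word)

# Illustrative sample, not taken from any module.
words = ["ὦ", "ἄνθρωπος", "λόγος", "ἀγάπη", "Θεός"]
sorted_words = sorted(words, key=sort_key)
print(sorted_words)
```

The second element of the key only matters when two words reduce to the same base form; it keeps the order deterministic in that case.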
Peace,
David
-----Original Message-----
From: DM Smith <dmsmith555 at yahoo.com>
To: SWORD Developers' Collaboration Forum <sword-devel at crosswire.org>
Sent: Tue, 18 Sep 2007 11:48 am
Subject: Re: [sword-devel] demo TEI modules
Chris,
Not that I would find any personal use in the dictionaries, I think they
are great. Hopefully, these will set the stage for OSIS to model this
into its schema.
I've got some work to do on JSword to get them to display properly in
BibleDesktop. Currently it is using a "plain text" filter. And the
performance is terrible, because of how JSword slurps the entire module
to display the entire list of words. I didn't see any TEI to HTML
filters in Sword, so I guess there is some work to do there too.
I was a bit surprised by the collation of some of the words. Any word
with an accent is collated "out of order". For example Ā is sorted after
(there are no words beginning with Z) in the Anglo-Saxon Dictionary.
It appears that it is comparing the bytes and not the characters (let
alone the code points of the characters).
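The effect is easy to reproduce outside Sword. The following Python sketch sorts three uppercase keys by their UTF-8 bytes; the keys are illustrative and I am assuming the module stores its keys as UTF-8, which is not confirmed here.

```python
# Illustrative uppercase keys; assumed to be stored as UTF-8.
words = ["ABBOD", "ĀD", "YMBE"]

# Compare the raw encoded bytes, the way a byte-wise string compare would.
by_bytes = sorted(words, key=lambda w: w.encode("utf-8"))
print(by_bytes)

# "Ā" (U+0100) encodes as the two bytes C4 80, and 0xC4 is greater than
# any ASCII letter, so the accented key lands after all the plain ones.
print("Ā".encode("utf-8").hex())
```

Note that UTF-8 preserves code point order, so sorting on code points alone gives the same result. Getting Ā next to A needs a collation key of some kind, not just a different unit of comparison.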
Also, TEI entry and entryFree and superEntry tags define the attribute
"key" to be used to control the collation of the words. Can this be
leveraged?
Related, when doing a lookup should we allow lookup without the diacritics?
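One way such a lookup could work, sketched in Python: index every headword under a folded form (no diacritics, case-insensitive) and fold the query the same way. The names `fold` and `build_index` and the headword list are hypothetical, and nothing here reflects how Sword or JSword actually store keys.

```python
import unicodedata
from collections import defaultdict

def fold(text):
    """Strip combining marks and case so that 'gōd' and 'GOD' match."""
    decomposed = unicodedata.normalize("NFD", text)
    bare = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return bare.casefold()

def build_index(headwords):
    """Map each folded form to the list of real headwords sharing it."""
    index = defaultdict(list)
    for headword in headwords:
        index[fold(headword)].append(headword)
    return index

# Illustrative headwords only.
headwords = ["ād", "ac", "āc", "god", "gōd"]
index = build_index(headwords)

# A query typed without diacritics finds both spellings.
matches = index.get(fold("GOD"), [])
print(matches)
```

Letters such as æ, þ and ð have no canonical decomposition, so this folding leaves them untouched; they would need an explicit mapping if they should match ASCII input.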
In Him,
DM
Chris Little wrote:
I posted a set of demo lexicons using TEI markup internally for people
to play with and test TEI-filters on:
BosworthToller: An Anglo-Saxon Dictionary (Old English-English)
CleasbyVigfusson: An Icelandic-English Dictionary (Old Icelandic-English)
LewisShort: A Latin Dictionary (Latin-English)
The last might be of use to folks trying to read the Vulgate, but the
first two don't have much use in Bible software. But I use them
personally and thought I might as well package them up to share as demos.
They are buggy. I know that.
I'll see about posting a Middle Liddell soon, but it's been a bit more
of a challenge because of character encoding.
--Chris
_______________________________________________
sword-devel mailing list: sword-devel at crosswire.org
http://www.crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page