[sword-devel] demo TEI modules

davidtroidl at aol.com davidtroidl at aol.com
Tue Sep 18 10:32:49 MST 2007


Sorting is most likely done by Unicode code point order. (The Unicode character database, UnicodeData.txt, is available from http://www.unicode.org/Public/UNIDATA/ .) Even to alphabetize polytonic Greek, I had to reduce each character to its lowercase, unaccented form, sort on those reduced forms first, and then on the actual word.
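For anyone who wants to try the two-level sort, here is a minimal Python sketch of the idea. It is not the code I used and not anything in Sword or JSword; the names base_form and collation_key and the sample words are made up for illustration, and it assumes that stripping combining marks after NFD decomposition is a good enough stand-in for "unaccented".

```python
import unicodedata

def base_form(word):
    """Lower-case the word and drop combining marks (accents, breathings, macrons)."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def collation_key(word):
    # Primary key: the reduced form; secondary key: the actual word,
    # so words that differ only in accents or case still sort consistently.
    return (base_form(word), word)

# Hypothetical headwords, chosen only to show the effect.
words = ["Zeta", "Āb", "Abba", "Ac"]

plain = sorted(words)                        # raw code point order
collated = sorted(words, key=collation_key)  # reduced form first, then the word

print(plain)     # Ā (U+0100) lands after Z
print(collated)  # Ā sorts with the other A words
```

Letters with no canonical decomposition (æ, þ, ð) pass through unchanged, so they would still need their own rule.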

Peace,

David



I was a bit surprised by the collation of some of the words. Any word 
with an accent is collated "out of order". For example Ā is sorted after 
Z (there are no words beginning with Z) in the Anglo-Saxon Dictionary.




-----Original Message-----
From: DM Smith <dmsmith555 at yahoo.com>
To: SWORD Developers' Collaboration Forum <sword-devel at crosswire.org>
Sent: Tue, 18 Sep 2007 11:48 am
Subject: Re: [sword-devel] demo TEI modules



Chris,
Not that I would find any personal use in the dictionaries, I think they 
are great. Hopefully, these will set the stage for osis to model this 
into its schema.
I've got some work to do on JSword to get them to display properly in 
BibleDesktop. Currently it is using a "plain text" filter. And the 
performance is terrible, because of how JSword slurps the entire module 
to display the entire list of words. I didn't see any TEI to HTML 
filters in Sword, so I guess there is some work to do there too.
I was a bit surprised by the collation of some of the words. Any word 
with an accent is collated "out of order". For example Ā is sorted after 
Z (there are no words beginning with Z) in the Anglo-Saxon Dictionary.
It appears that it is comparing the bytes and not the characters (let 
alone the code points of the characters).
Also, TEI entry and entryFree and superEntry tags define the attribute 
"key" to be used to control the collation of the words. Can this be 
leveraged?
Related, when doing a lookup should we allow lookup without the diacritics?
In Him,
DM
Chris Little wrote:
 I posted a set of demo lexicons using TEI markup internally for people 
 to play with and test TEI-filters on:

 BosworthToller:    An Anglo-Saxon Dictionary       (Old English-English)
 CleasbyVigfusson:  An Icelandic-English Dictionary (Old Icelandic-English)
 LewisShort:        A Latin Dictionary              (Latin-English)

 The last might be of use to folks trying to read the Vulgate, but the 
 first two don't have much use in Bible software. But I use them 
 personally and thought I might as well package them up to share as demos.

 They are buggy. I know that.

 I'll see about posting a Middle Liddell soon, but it's been a bit more 
 of a challenge because of character encoding.

 --Chris

 _______________________________________________
 sword-devel mailing list: sword-devel at crosswire.org
 http://www.crosswire.org/mailman/listinfo/sword-devel
 Instructions to unsubscribe/change your settings at above page

   


