[sword-devel] What character encoding should I use.

Chris Little sword-devel@crosswire.org
Tue, 30 Jul 2002 19:46:56 -0700 (MST)


On Tue, 30 Jul 2002, Steve Juranich wrote:

> I have a question about the localization stuff.  Which encoding are we to use 
> when we write the localization files?  Unicode?  Some funky M$ encoding?  As 
> soon as I figure this out I'll get started on the Vietnamese files.

The encoding is currently Codepage 1252 (essentially Latin-1/ISO 8859-1, 
but with printable characters such as curly quotes in the 0x80-0x9F range, 
where Latin-1 has control codes).  This should change shortly to UTF-8, 
but maybe not before the next release.  (I'm not sure exactly where the 
code stands at this moment.)
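
If it helps, here is a rough Python sketch (not part of SWORD; the function 
names are made up for illustration) of how one might check whether a 
translated string fits in Codepage 1252 at all, and how existing 1252 data 
could be re-encoded as UTF-8 once the switch happens.

```python
def fits_cp1252(text: str) -> bool:
    """Return True if every character in text exists in Codepage 1252."""
    try:
        text.encode("cp1252")
        return True
    except UnicodeEncodeError:
        return False


def cp1252_to_utf8(raw: bytes) -> bytes:
    """Re-encode Codepage 1252 bytes as UTF-8 (hypothetical helper)."""
    # Decode with the legacy codepage first, then encode as UTF-8.
    return raw.decode("cp1252").encode("utf-8")


if __name__ == "__main__":
    print(fits_cp1252("caf\u00e9"))
    print(fits_cp1252("Ti\u1ebfng Vi\u1ec7t"))
    print(cp1252_to_utf8(b"\x93caf\xe9\x94"))
```

A string for which fits_cp1252 returns False cannot be stored losslessly 
in a 1252 file, so it would have to wait for UTF-8 support.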

> I also 
> see that there's no Croatian Bible text available for SWORD, but there is one 
> for BibleDataBase.  I'd be happy to work on converting this as well, again as 
> soon as I hear what the preferred encoding is for the text modules.

I've looked at the Croatian Bible a few times and recall some problems 
with it.  BibleDatabase got their text from Unbound Bible.  The problem 
was either that the data had a lot of errors or that it used a non-KJV 
versification, which we can't currently handle.  But if you want to take 
a look at it, please give it a try.

--Chris