[sword-devel] for the love of unicode

Joachim Ansorg sword-devel@crosswire.org
Mon, 18 Jun 2001 21:45:34 +0200


I'd vote for a sole UTF-8 solution.
The frontends have to provide their own converters to local encodings if they 
do not support UTF-8 or Unicode.
We need _one_ standard, not more!
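
To show what such a frontend-side converter could look like, here is a minimal sketch in Python. It assumes the module text arrives as UTF-8 bytes and the frontend can only display Latin-1. The function name utf8_to_latin1 and the "?" fallback are made up for illustration and are not part of the Sword API.

```python
def utf8_to_latin1(data: bytes, fallback: str = "?") -> bytes:
    """Re-encode UTF-8 module text as Latin-1 (ISO 8859-1).

    Characters outside Latin-1 (Greek, Hebrew, CJK, ...) cannot be
    represented, so each of them is replaced by `fallback`
    (hypothetical policy; `fallback` must itself be a Latin-1 character).
    """
    text = data.decode("utf-8")
    # Latin-1 covers exactly the code points U+0000..U+00FF.
    converted = "".join(ch if ord(ch) < 256 else fallback for ch in text)
    return converted.encode("latin-1")


if __name__ == "__main__":
    print(utf8_to_latin1("K\u00e4se".encode("utf-8")))          # German text survives
    print(utf8_to_latin1("\u03bb\u03cc\u03b3\u03bf\u03c2".encode("utf-8")))  # Greek does not
```

The Greek case shows the limit of this approach: a Latin-1-only frontend loses every character it cannot display, whatever the storage format is.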

Joachim

> > > What about frontends which don't support UTF-8?
> > > IMO they can normally only display Latin-1 characters, so we should
> > > have a conversion from UTF-8 to Latin-1.
> >
> > Do you think we should write UTF-8 to x converters for all the formats
> > we use, or just Latin-1?  We could use the Greek & Hebrew ISO standards
> > too, plus KOI8 for Cyrillic translations.
> >
> > What should happen if you want to access a text that uses characters
> > outside of Latin-1, like the BHS or a Chinese/Japanese translation, from
> > iraeneus?
>
> That brings us back to the starting point: how to store the modules.
> If we decide to provide functions for locale-specific output (ISO 8859-x,
> ...) we can also store the modules in these encodings, reducing the sizes.
> If not, we shouldn't do it, not even for the Greek modules which are now
> encoded in "symbol".
>
> Chris, you may want to take a look at the Qt sources (Qt 2.3.1,
> http://www.trolltech.com). They have all the conversion tables compiled in
> and working. The class is called QTextCodec.
>
> I'd personally prefer to store the modules in their native (but
> standardized!) encodings and provide
> a) native
> b) UTF-8
> in- and output functions.
> But this will be much more complicated (what about two-byte charsets?)
> than just switching to UTF-8 and may not even be worth the effort.
>
> Martin
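
The size argument for native encodings can be checked with a short sketch. It assumes Python's built-in iso8859_7 (Greek) and koi8_r (Cyrillic) codecs as stand-ins for the native encodings discussed above. The helper name encoded_sizes is made up for illustration.

```python
def encoded_sizes(text: str, native_codec: str) -> tuple:
    """Return (bytes in the native encoding, bytes in UTF-8) for `text`."""
    return len(text.encode(native_codec)), len(text.encode("utf-8"))


if __name__ == "__main__":
    greek = "\u03bb\u03cc\u03b3\u03bf\u03c2"      # five Greek letters
    cyrillic = "\u0421\u043b\u043e\u0432\u043e"   # five Cyrillic letters
    print(encoded_sizes(greek, "iso8859_7"))
    print(encoded_sizes(cyrillic, "koi8_r"))
    print(encoded_sizes("word", "latin-1"))       # plain ASCII: no difference
```

Greek and Cyrillic letters take one byte each in their native single-byte encodings and two bytes each in UTF-8, while ASCII text is the same size in both. So the saving is real for those modules, but it is at most a factor of two.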