[bt-devel] RE: UTF-8 and new module classes
Martin Gruner
bt-devel@crosswire.org
Thu, 24 May 2001 11:24:51 +0200
> Eventually, I would like to get any modules with characters that conflict
> with UTF-8 (any characters in the range 0x80 to 0xFF) into UTF-8 so we can
> do away with the Encoding value also and just accept everything as being
> UTF-8.
I didn't get this. Please explain.
> I should also retract my previous statement that we can get rid of the Font
> value because it's just a better idea to have numerous smaller fonts with
> the correct range for a module than to have a single huge font able to
> display all Unicode glyphs.
I favor moving from the font= tag to an encoding= tag. This way we would not
have to use huge fonts, but would still have the flexibility to let the user
choose his/her font. E.g. encoding=iso8859-7 would mark a module as Greek
text. You can then either display this text directly with a single-byte
iso8859-7 font or map it into Unicode for other purposes.
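
To illustrate the "map it into Unicode" step, here is a minimal sketch in
Python. It is not Sword code; it only assumes Python's built-in iso8859_7
codec, and the sample bytes are made up for the example.

```python
# Greek text stored as single-byte ISO 8859-7 (sample data for illustration).
raw = bytes([0xCB, 0xFC, 0xE3, 0xEF, 0xF2])

# Map the single-byte encoding into Unicode code points.
text = raw.decode("iso8859_7")

# Re-encode as UTF-8 for a frontend that prefers it.
utf8 = text.encode("utf-8")

print(text)
print(len(raw), "bytes in ISO 8859-7,", len(utf8), "bytes in UTF-8")
```

The same bytes can be handed unchanged to a frontend that has an iso8859-7
font, so both display paths start from one stored encoding.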
IMO using standards is always a good way to go.
We could implement some mapping filters in Sword which map from font-specific
ASCII encodings to the correct language-specific encodings (like a
bstgreek2iso8859-7 filter), so that frontends favoring the font= solution are
still supported.
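
A rough sketch of what such a mapping filter does, again in Python rather
than as a real Sword filter. The table below is HYPOTHETICAL and covers only
four letters; it is not the actual BSTGreek font layout, and the function
name is made up for the example.

```python
# HYPOTHETICAL mapping from a font-specific ASCII layout to ISO 8859-7.
# This is NOT the real BSTGreek table, just four letters for illustration.
FONT_ASCII_TO_ISO8859_7 = bytes.maketrans(
    b"logs",
    bytes([0xEB, 0xEF, 0xE3, 0xF3]),  # lambda, omicron, gamma, sigma
)

def font_ascii_to_iso8859_7(data: bytes) -> bytes:
    """Translate byte by byte; bytes without a table entry pass through."""
    return data.translate(FONT_ASCII_TO_ISO8859_7)

converted = font_ascii_to_iso8859_7(b"logos")
print(converted.decode("iso8859_7"))
```

A real filter would need the complete table for the font in question and
would have to deal with accents and final sigma, which this sketch ignores.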
Some good links I want to recommend to you:
http://czyborra.com/
http://czyborra.com/charsets/iso8859.html
http://czyborra.com/charsets/cyrillic.html
Martin