[sword-devel] OSIS -vs- font requirements
Chris Little
chrislit at crosswire.org
Wed Oct 29 12:15:05 MST 2008
Karl Kleinpaste wrote:
> I am asking because I was looking at bible.org's NETnotes module and
> realizing how much damage is done by the fact that it contains no font
> support. In both print and HTML, NET Bible's notes use a great deal of
> font specification to provide high-quality Hebrew and Greek,
> transliteration clarity, and the interesting one-character exotic Greek
> figures used for witness names.
IIRC, the NET Bible makes use of roughly 5 fonts (in addition to
something basic for the Latin characters that make up the bulk of the text).
There are Greek and Hebrew script fonts, both easily covered by Unicode.
There are Greek and Hebrew transliteration fonts, also easily covered by
Unicode, but perhaps not as obvious to map as the Greek & Hebrew
characters themselves.
And there is an apparatus font.
The apparatus font is the only one that poses any difficulty, and even
that difficulty is minimal. The last time I did a
conversion of the NET text to a Sword-friendly format, there were only a
couple of characters used in the apparatus font that didn't have exactly
equivalent codepoints in Unicode. The remainder had codepoints that
looked decently similar, so I mapped to those as a cheat.
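
As a rough illustration of the table-driven mapping described above, here is a minimal Python sketch. The table is hypothetical and is NOT the actual Galaxie font layout; the function name `convert_run` is likewise invented for this example. It converts a run of legacy font-encoded text to Unicode and reports any characters with no mapping, which would be the candidates for a look-alike substitution or a Unicode proposal.

```python
# Hypothetical legacy-font table for illustration only.
# This is NOT the real Galaxie layout; the real tables would have to be
# built by inspecting the fonts themselves.
LEGACY_GREEK_TO_UNICODE = {
    "l": "\u03bb",  # GREEK SMALL LETTER LAMDA
    "o": "\u03bf",  # GREEK SMALL LETTER OMICRON
    "g": "\u03b3",  # GREEK SMALL LETTER GAMMA
    "s": "\u03c3",  # GREEK SMALL LETTER SIGMA
}


def convert_run(text, table):
    """Map each character through `table`.

    Returns the converted string and a sorted list of characters that had
    no entry in the table (these are passed through unchanged).
    """
    out = []
    unmapped = set()
    for ch in text:
        if ch in table:
            out.append(table[ch])
        else:
            out.append(ch)
            unmapped.add(ch)
    return "".join(out), sorted(unmapped)


converted, missing = convert_run("logos", LEGACY_GREEK_TO_UNICODE)
partial, partial_missing = convert_run("lox", LEGACY_GREEK_TO_UNICODE)
```

The list of unmapped characters is the useful part: it shows exactly which glyphs still need a decision, rather than letting them slip through silently.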
It's been a long time since I did that conversion, so there's a good
chance that all of the necessary apparatus characters now exist in
Unicode. If any remain, they should be submitted to Unicode for addition
to the repertoire. (I'm on the Unicode Technical Committee, so I'd be
happy to help write a proposal for additional characters, which could go
before the script subcommittee as soon as the February meeting.) Since I
know there was a proposal for a number of new NT apparatus symbols in
the last few years, I wouldn't be surprised if the whole text of the NET
could be encoded in Unicode now.
> I have a strong sense that OSIS is loved because it attempts to achieve
> purity by separation of structure from display. But this is an area
> where I am convinced that it is on the losing end. I'll take a stance
> of extremism, perhaps, to claim that 99% of the users of Sword Project
> applications want to read well-formatted Bibles and other content, and
> they don't care in the slightest about internal purity. How could OSIS
> (or its support in Sword) be augmented to handle this?
Using 8-bit font encodings also makes search much more difficult and
takes us away from standards. They are the wrong solution to the
problem, especially given that a much simpler solution exists that
keeps us standards-compliant.
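
To make the search problem concrete, here is a small Python sketch. It assumes a hypothetical legacy font in which the Latin letters "logos" are displayed as Greek; the `fold` helper is an invented name for this example. The stored text under a font encoding is still Latin, so a query typed in Greek cannot match it, while Unicode text can be matched even without accents.

```python
import unicodedata

# Stored as Latin letters; only *looks* Greek when a (hypothetical)
# legacy 8-bit font is applied at display time.
font_encoded = "logos"

# The same word stored as real Unicode Greek, with accent and final sigma.
unicode_text = "\u03bb\u03cc\u03b3\u03bf\u03c2"


def fold(s):
    """Strip combining marks and fold case so accented and unaccented
    forms compare equal."""
    decomposed = unicodedata.normalize("NFD", s)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


# A user types the word in Greek without the accent.
query = "\u03bb\u03bf\u03b3\u03bf\u03c2"

found_in_font_encoded = fold(query) in fold(font_encoded)
found_in_unicode = fold(query) in fold(unicode_text)
```

With font-encoded text, the search engine would need to know every font's private layout to make this work; with Unicode, standard normalization is enough.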
> Perversely, although bible.org's NET Bible modules (3 flavors) are OSIS,
> the 2 varieties of notes module are ThML. So NETnotes could preserve
> fonts in this manner, but the specifications were stripped out.
The last time I converted the NET, I moved it into ThML (since this was
pre-OSIS) but I still dumped the font attribute since it's not useful
once you move to Unicode. (By which I don't mean that you wouldn't want
to use font specification with Unicode text, but rather that the NET's
Galaxie fonts weren't mapped to Unicode, so there was no benefit to
using font specification in this case.)
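
A sketch of what dropping the font attribute during conversion might look like, in Python. It assumes an HTML-style `<font face="...">` wrapper around each run; the actual markup in the NET source may differ. The font name `LegacyGreek`, its table, and the function `drop_font_markup` are all hypothetical.

```python
import re

# Hypothetical font name and mapping table, for illustration only.
FONT_TABLES = {
    "LegacyGreek": {
        "l": "\u03bb",
        "o": "\u03bf",
        "g": "\u03b3",
        "s": "\u03c3",
    },
}

# Assumed markup shape: <font face="Name">run</font>
FONT_SPAN = re.compile(r'<font face="([^"]+)">(.*?)</font>', re.DOTALL)


def drop_font_markup(markup):
    """Replace each known font-wrapped run with its Unicode equivalent
    and remove the wrapper. Runs in unknown fonts are left untouched."""

    def replace(match):
        face, run = match.group(1), match.group(2)
        table = FONT_TABLES.get(face)
        if table is None:
            # Unknown font: keep the original markup for manual review.
            return match.group(0)
        return "".join(table.get(ch, ch) for ch in run)

    return FONT_SPAN.sub(replace, markup)


sample = 'The word <font face="LegacyGreek">logos</font> appears here.'
result = drop_font_markup(sample)

unknown = 'See <font face="Other">abc</font>.'
unknown_result = drop_font_markup(unknown)
```

Once the run is real Unicode, the wrapper carries no information the text does not already have, which is why it can be discarded.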
--Chris