[sword-devel] OSIS -vs- font requirements

Chris Little chrislit at crosswire.org
Wed Oct 29 12:15:05 MST 2008



Karl Kleinpaste wrote:
> I am asking because I was looking at bible.org's NETnotes module and
> realizing how much damage is done by the fact that it contains no font
> support.  In both print and HTML, NET Bible's notes use a great deal of
> font specification to provide high-quality Hebrew and Greek,
> transliteration clarity, and the interesting one-character exotic Greek
> figures used for witness names.

IIRC, the NET Bible makes use of roughly five fonts (in addition to
something basic for the Latin characters that make up the bulk of the text).

There are Greek and Hebrew script fonts, both easily covered by Unicode.
There are Greek and Hebrew transliteration fonts, also easily covered by 
Unicode, but perhaps not as obvious to map as the Greek & Hebrew 
characters themselves.
And there is an apparatus font.

The apparatus font is the only one that poses any difficulty, and even
there the difficulty is minimal. The last time I did a conversion of the
NET text to a Sword-friendly format, only a couple of characters used in
the apparatus font lacked exactly equivalent codepoints in Unicode. For
those few, I cheated and mapped them to codepoints that looked
reasonably similar.
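
As a rough illustration of that kind of conversion, here is a minimal
Python sketch of a byte-to-Unicode table for a legacy 8-bit apparatus
font. The byte values and the exact/approximate flags are invented for
the example; they are not the real layout of the NET's fonts, and
`APPARATUS_MAP` and `convert_apparatus` are hypothetical names. The
point is only that each legacy byte maps to a codepoint and that
look-alike substitutions can be recorded for later review.

```python
# Hypothetical table: legacy font byte -> (Unicode character, exact match?).
# The byte assignments are assumptions made up for this sketch.
APPARATUS_MAP = {
    0x41: ("\u2135", True),      # ALEF SYMBOL
    0x4D: ("\U0001D510", True),  # MATHEMATICAL FRAKTUR CAPITAL M
    0x50: ("\U0001D513", True),  # MATHEMATICAL FRAKTUR CAPITAL P
    0x2A: ("\u2020", False),     # DAGGER, used here as a look-alike stand-in
}


def convert_apparatus(raw: bytes):
    """Convert a run of apparatus-font bytes to a Unicode string.

    Returns the converted text plus a list of (offset, byte, char) tuples
    for every character that was only approximated, so that those spots
    can be revisited if a proper codepoint becomes available.
    """
    chars = []
    approximations = []
    for offset, byte in enumerate(raw):
        if byte not in APPARATUS_MAP:
            # Fail loudly rather than silently dropping an unmapped glyph.
            raise ValueError(f"no mapping for byte 0x{byte:02X} at offset {offset}")
        char, exact = APPARATUS_MAP[byte]
        chars.append(char)
        if not exact:
            approximations.append((offset, byte, char))
    return "".join(chars), approximations
```

Calling `convert_apparatus(b"AP*")` under these assumed mappings would
return the three converted characters along with one entry flagging the
dagger as an approximation.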

It's been a long time since I did that conversion, so there's a good 
chance that all of the necessary apparatus characters now exist in 
Unicode. If any remain, they should be submitted to Unicode for addition 
to the repertoire. (I'm on the Unicode Technical Committee, so I'd be 
happy to help write a proposal for additional characters, which could go 
before the script subcommittee as soon as the February meeting.) Since I 
know there was a proposal for a number of new NT apparatus symbols in 
the last few years, I wouldn't be surprised if the whole text of the NET 
could be encoded in Unicode now.

> I have a strong sense that OSIS is loved because it attempts to achieve
> purity by separation of structure from display.  But this is an area
> where I am convinced that it is on the losing end.  I'll take a stance
> of extremism, perhaps, to claim that 99% of the users of Sword Project
> applications want to read well-formatted Bibles and other content, and
> they don't care in the slightest about internal purity.  How could OSIS
> (or its support in Sword) be augmented to handle this?

Using 8-bit font encodings also makes search much more difficult and
takes us back in a direction away from standards. 8-bit font encodings
are the wrong solution to the problem, especially given that a much
simpler solution exists (encoding the text in Unicode) that keeps us
standards-compliant.
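
To make the search point concrete, here is a small Python sketch. The
legacy "keystrokes" for the Greek word and the font names are invented
for illustration and do not reflect any real font; `fold` and `find`
are hypothetical helpers. With Unicode text, a search only needs
standard normalization; with font-encoded text, the stored characters
are Latin letters whose meaning depends on font markup that a plain
search never sees.

```python
import unicodedata


def fold(text: str) -> str:
    """Drop combining marks and case so accented Greek matches unaccented."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


def find(needle: str, haystack: str) -> bool:
    """Accent- and case-insensitive substring search."""
    return fold(needle) in fold(haystack)


# The note stored as Unicode: the Greek is real Greek.
unicode_note = "The term λόγος appears here."

# The same note stored as font runs (font names and keystrokes are assumed).
legacy_runs = [
    ("Latin", "The term "),
    ("GreekFont", 'lovgo"'),
    ("Latin", " appears here."),
]
# What a search engine sees once the font markup is ignored or stripped.
flat_legacy = "".join(text for _font, text in legacy_runs)

found_in_unicode = find("λογος", unicode_note)  # True
found_in_legacy = find("λογος", flat_legacy)    # False
```

Searching the legacy form would require knowing each font's private
layout and converting the query per font, which is exactly the work a
one-time conversion to Unicode does up front.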

> Perversely, although bible.org's NET Bible modules (3 flavors) are OSIS,
> the 2 varieties of notes module are ThML.  So NETnotes could preserve
> fonts in this manner, but the specifications were stripped out.

The last time I converted the NET, I moved it into ThML (since this was 
pre-OSIS) but I still dumped the font attribute since it's not useful 
once you move to Unicode. (By which I don't mean that you wouldn't want 
to use font specification with Unicode text, but rather that the NET's 
Galaxie fonts weren't mapped to Unicode, so there was no benefit to 
using font specification in this case.)

--Chris


