[sword-devel] New Accented Greek NT with Morph

Mon Apr 25 03:31:29 MST 2005

This Accented Greek NT thing is great, and I'd like to share some thoughts 
about it. I'm not a specialist in fonts, Greek or Unicode, so be warned. I 
hope this offers some new and useful ideas, if nothing else.

First, about decomposed and precomposed characters. If the text uses 
decomposed characters, the font renderer has to compose them. Bible software 
cannot influence that: the result looks good or bad depending on the 
renderer. If the text uses precomposed characters, the renderer draws the 
glyphs straight from the font file, and it is up to the font author to make a 
good-looking glyph. Of course, even with precomposed characters the font 
file may have bugs which the renderer cannot fix, and the renderer may not 
correctly handle cases where a glyph in the font file is actually a link to 
other glyphs (this is not the same thing as de/precomposing!).

So there are many renderers, many fonts and many combinations, and any of them 
may have bugs. Using decomposed characters adds more chances of having bugs. 
Therefore I would prefer the precomposed form.
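
To make the distinction concrete, here is a minimal sketch using Python's 
standard unicodedata module (my choice for illustration; nothing here is 
Sword code). It takes one precomposed Greek Extended character and shows the 
base letter plus combining mark that a renderer would have to compose:

```python
import unicodedata

# U+1F00 GREEK SMALL LETTER ALPHA WITH PSILI, a precomposed character.
precomposed = "\u1f00"

# NFD splits it into a base letter followed by a combining mark.
decomposed = unicodedata.normalize("NFD", precomposed)

for label, text in (("precomposed", precomposed), ("decomposed", decomposed)):
    names = [unicodedata.name(ch) for ch in text]
    print(label, len(text), names)

# The two strings are canonically equivalent, but they are not equal when
# compared code point by code point.
print(precomposed == decomposed)
# NFC puts the pieces back together into the single precomposed character.
print(unicodedata.normalize("NFC", decomposed) == precomposed)
```

With precomposed text the font supplies one ready-made glyph; with 
decomposed text the renderer must position the mark over the base letter 
itself, which is where the differences between renderers come from.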

I did not know before this thread that there are free (as in freedom) Unicode 
fonts covering Greek Extended. I like FreeFont, though it has some bugs. I 
think I have found the cause of those bugs; now I have to report them.

If there are reasons for the Sword library being Free Software, there are also 
reasons for fonts being Free or Open. In my humble opinion we as individuals, 
and the Sword project as a whole, could support the FreeFont project in some 
way or another. Finding and reporting bugs is one way. It would be quite 
short-sighted to choose a good-looking but non-free font for use with Sword.

Fonts are of course not a problem for the library, but for the frontends. 
However, I think there are many developers here who work on the frontends. 
Bible software could even include the font files, which would help users 
because they would not have to find a suitable font on their system. At 
least the software developers could add pointers to the Free font files to 
the documentation.

Here is more information about FreeFont:
http://www.nongnu.org/freefont/
http://savannah.nongnu.org/projects/freefont/

I have put some screenshots on my web pages. I think they show quite clearly 
that precomposed is better than decomposed. I copied the text shown in
http://crosswire.org/study/parallelstudy.jsp?add=WHNU&add=WHAC&add=WHACD.
Unfortunately I could not get that page (the fonts) working with Konqueror or 
Firefox. The CSS is too complicated to edit by hand and makes the worst 
possible mistake usability-wise: it overrides settings which the user has 
already got right. I could use FreeFont, Gentium or some other font, and I 
think the browsers could handle them. But the CSS gives other font names, and 
either I don't have them or they don't include Greek Extended properly.

Anyway, I copied some verses to KWord and OpenOffice (I use Debian 
GNU/Linux). They render the fonts differently. Both render the precomposed 
characters well. Both have problems with decomposed characters. Look at verse 
1, Iakobos and diaspora, and verse 4, ina eete. I used two fonts, Gentium and 
FreeSerif. FreeSerif looks better. FreeFont also has sans serif and 
monospace fonts, and the sans serif looks even better, or at least is easier 
to read at small sizes.

Here are the screenshots, they are large pictures:
http://iki.fi/eelik/kwordjacobgreek.png
http://iki.fi/eelik/oojacobgreek.png

Then, about searching. If you want to search using accents, you have to 
know exactly what you want. Remember that the accents may depend on words 
other than the ones you are searching for. Also, if you don't know Greek very 
well, it might be hard to remember the accents even though you remember the 
word. Only rarely does someone really want to search for accents. Most of 
those who use Sword want to do biblical interpretation, not linguistic 
research. Therefore I think that accents should in some way or another be 
excluded from searching.
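
One way to exclude accents, sketched here with Python's standard unicodedata 
module, is to decompose both the text and the search string and drop every 
combining mark before comparing. The function names strip_accents and 
accent_insensitive_find are hypothetical, made up for this example, and are 
not part of the Sword API:

```python
import unicodedata

def strip_accents(text):
    """Return text with all combining marks (accents, breathings) removed."""
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() is non-zero for combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def accent_insensitive_find(needle, haystack):
    """True if needle occurs in haystack, ignoring accents and letter case."""
    return strip_accents(needle).lower() in strip_accents(haystack).lower()

# "Iakobos" with breathing and accent, written with escapes so that the
# exact code points are visible.
accented = "\u1f38\u03ac\u03ba\u03c9\u03b2\u03bf\u03c2"
plain = "\u03b9\u03b1\u03ba\u03c9\u03b2\u03bf\u03c2"

print(strip_accents(accented))
print(accent_insensitive_find(plain, accented))
```

Note that this also removes breathings and the iota subscript, since they 
are combining marks too. Whether that is wanted is a decision for the 
frontend.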

For the canonical New Testament the best solution might be to search with 
Strong's numbers or some equivalent. There already are modules with Strong's 
numbers and morphological tags, and the new modules also have at least the 
morphological tags. Those tags make it possible to search for any form of 
a word, and accents can be ignored. Syntactical analysis becomes possible 
too, which is no small advantage.

It is up to the frontend software to make this kind of search usable. For the 
Sword library it would be enough to offer a letter-by-letter text search and 
a search for numbers/tags.
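
As a rough sketch of what a number/tag search could look like from the 
frontend side, assume each word is available as a (surface form, Strong's 
number, morphological tag) triple. The Token type, the find_tokens function 
and the two sample words are illustrative assumptions; they are not read 
from a real module and do not reflect how Sword stores its data:

```python
from typing import NamedTuple

class Token(NamedTuple):
    surface: str  # the word as printed, accents included
    strongs: str  # Strong's number
    morph: str    # morphological tag

# Illustrative sample only, not read from a real module.
VERSE = [
    Token("\u1f38\u03ac\u03ba\u03c9\u03b2\u03bf\u03c2", "G2385", "N-NSM"),
    Token("\u03b8\u03b5\u03bf\u1fe6", "G2316", "N-GSM"),
]

def find_tokens(tokens, strongs=None, morph_prefix=None):
    """Select tokens by Strong's number and/or the start of the morph tag."""
    return [
        tok for tok in tokens
        if (strongs is None or tok.strongs == strongs)
        and (morph_prefix is None or tok.morph.startswith(morph_prefix))
    ]

# Every form of one lemma, regardless of how it is accented in the text.
print(find_tokens(VERSE, strongs="G2385"))
# Every noun in the sample, regardless of the lemma.
print(find_tokens(VERSE, morph_prefix="N-"))
```

The accents never enter the comparison, because the search keys are the 
numbers and tags rather than the Greek letters.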

If someone wants to search with Greek words and accents, the precomposed form 
would be better. I think it is faster to do a search with precomposed 
characters because there is less to compare. Only if someone wants to search 
for a word where e.g. "the last alpha may have grave OR acute" would the 
decomposed form be better. And actually even then the frontend could alter 
the search string by normalizing it and making the proper OR statements.
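
The "grave OR acute" case could be handled roughly like this: decompose the 
search word, swap the accent, and recompose each variant so that it can be 
matched against precomposed text. This is a sketch using Python's standard 
unicodedata module; acute_grave_variants is a hypothetical helper name, and 
it swaps every acute or grave in the word, not only the last one:

```python
import unicodedata

ACUTE = "\u0301"  # COMBINING ACUTE ACCENT
GRAVE = "\u0300"  # COMBINING GRAVE ACCENT

def acute_grave_variants(word):
    """Return the NFC spellings of word with acute and with grave accents."""
    nfd = unicodedata.normalize("NFD", word)
    candidates = {nfd, nfd.replace(GRAVE, ACUTE), nfd.replace(ACUTE, GRAVE)}
    return sorted({unicodedata.normalize("NFC", c) for c in candidates})

# "diaspora" written with a grave on the last alpha (U+1F70).
word = "\u03b4\u03b9\u03b1\u03c3\u03c0\u03bf\u03c1\u1f70"
variants = acute_grave_variants(word)
print(variants)

# The frontend would then search for variants[0] OR variants[1].
text = "\u03b4\u03b9\u03b1\u03c3\u03c0\u03bf\u03c1\u03ac"
print(any(v in text for v in variants))
```

One detail worth knowing: NFC turns alpha plus combining acute into U+03AC 
(alpha with tonos), not U+1F71 (alpha with oxia) from Greek Extended, so the 
module text has to be in NFC too for the comparison to work.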

Troy wrote that we could "b)NFC both the search string and the text before 
searching". But why NFC the text before searching? The text should already be 
normalized to NFC or NFD; there is no reason to offer a non-normalized 
module. The search string can be normalized to any known form, whether NFC 
or NFD, as long as the form of the text module is known. (Normalization 
forms are quite hard to understand from the Unicode documentation; as far as 
I understand, NFC is the fully precomposed form and NFD the fully decomposed 
one.)
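
In code the idea would look roughly like this, again assuming Python's 
standard unicodedata module. The search function and its module_form 
parameter are hypothetical names for this sketch; only the search string is 
normalized, and the module text is taken as it is:

```python
import unicodedata

def search(module_text, query, module_form="NFC"):
    """Look for query in module_text, which is assumed to be in module_form."""
    normalized_query = unicodedata.normalize(module_form, query)
    return normalized_query in module_text

# "ina" with rough breathing and acute, stored precomposed (NFC).
module_text = "\u1f35\u03bd\u03b1"
# The same word as a user's input method might deliver it, decomposed.
typed_query = unicodedata.normalize("NFD", module_text)

print(typed_query in module_text)        # raw comparison
print(search(module_text, typed_query))  # query normalized to NFC first
```

A module build script could check the text once with 
unicodedata.is_normalized("NFC", text) (available since Python 3.8), so that 
the search never has to touch the text itself.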

The bottom line is this:
1. Precomposed is better. I don't see any reason to use decomposed text in 
modules.
2. It would be good to support Free or Open fonts in some way or another.

-- 
Eeli Kaikkonen