[sword-devel] New Accented Greek NT with Morph
Eeli Kaikkonen
eekaikko at mail.student.oulu.fi
Mon Apr 25 03:31:29 MST 2005
This Accented Greek NT thing is great, and I'd like to share some thoughts
about it. I'm not a specialist in fonts, Greek or Unicode, so be warned. I
hope this gives some new and useful ideas, if nothing else.
First, about de/precomposed characters. If the text uses decomposed
characters, the font renderer has to compose them. Bible software cannot
influence the result; it may look good or bad depending on the renderer. If
the text uses precomposed characters, the renderer takes the glyphs straight
from the font file, and it's up to the font author to make a good-looking
glyph. Of course, even with precomposed characters the font file may have
bugs which the renderer cannot fix, and the renderer may not correctly handle
situations where a glyph in the font file is actually a link to some other
glyphs (this is not the same thing as de/precomposing!).
So there are many renderers, many fonts and many combinations, and any of them
may have bugs. Using decomposed characters adds more chances of having bugs.
Therefore I would prefer the precomposed form.
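To make the difference between the two forms concrete, here is a minimal
Python sketch using the standard unicodedata module. The sample character
(alpha with smooth breathing and acute) is my own choice for illustration and
is not taken from the module text.

```python
import unicodedata

# One precomposed code point from the Greek Extended block:
# GREEK SMALL LETTER ALPHA WITH PSILI AND OXIA.
precomposed = "\u1f04"

# Decomposed form: the base letter followed by its combining marks.
decomposed = unicodedata.normalize("NFD", precomposed)

# Composing again gives back the single code point.
recomposed = unicodedata.normalize("NFC", decomposed)

print(len(precomposed), len(decomposed))  # 1 3
for ch in decomposed:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

With the decomposed form the renderer has to stack two combining marks on the
base alpha itself; with the precomposed form it only has to look up one glyph.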
Before this thread I did not know there were free (as in free thought)
Unicode fonts covering Greek Extended. I like FreeFont, though it has some
bugs. I think I found the reason for those bugs; now I have to report them.
If there are reasons for the Sword library being Free Software, there are also
reasons for fonts being Free or Open. In my humble opinion we as individuals
and the Sword project as a whole could support the FreeFont project in some
way or another. Finding and reporting bugs is one way. It would be quite
short-sighted to choose a good-looking but non-free font for use with Sword.
Fonts are of course not the library's problem but the frontends'. However,
I think there are many developers here who work on the frontends. Bible
software could even include the font files, which would help users because
they would not have to find a suitable font on their system. At the very
least, developers could add pointers to the Free font files in the
documentation.
Here is more information about FreeFont:
http://www.nongnu.org/freefont/
http://savannah.nongnu.org/projects/freefont/
I have put some screenshots on my web pages. I think they show quite clearly
that precomposed is better than decomposed. I copied the text shown in
http://crosswire.org/study/parallelstudy.jsp?add=WHNU&add=WHAC&add=WHACD.
Unfortunately I could not get that page (the fonts) working with Konqueror or
Firefox. The CSS is too complicated to edit by hand and makes the worst
possible mistake usability-wise: it overrides settings which the user has
already got right. I could use FreeFont, Gentium or some other font, and I
think the browsers could handle them. But the CSS specifies other font names,
and either I don't have those fonts or they don't include Greek Extended
properly.
Anyway, I copied some verses into KWord and OpenOffice (I use Debian
GNU/Linux). They render the fonts differently. Both render the precomposed
characters well. Both have problems with decomposed characters. Look at verse
1, Iakobos and diaspora, and verse 4, ina eete. I used two fonts, Gentium and
FreeSerif. FreeSerif looks better. FreeFont also has sans serif and monospace
fonts, and the sans serif looks even better, or at least is easier to read at
small sizes.
Here are the screenshots, they are large pictures:
http://iki.fi/eelik/kwordjacobgreek.png
http://iki.fi/eelik/oojacobgreek.png
Then, about searching. If you want to search using accents you have to
know exactly what you want. Remember that the accents of a word may depend on
words other than the one you are searching for. Also, if you don't know Greek
very well it might be hard to remember the accents even though you remember
the word itself. Only rarely does someone really want to search for accents.
Most of those who use Sword want to do biblical interpretation, not
linguistic research. Therefore I think that accents should in some way or
another be excluded from searching.
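One way a frontend might exclude accents from searching is sketched below in
Python. This only illustrates the idea and is not how Sword does it; the
function names fold and contains are made up for this example. Both the text
and the query are decomposed, all combining marks (accents, breathings, iota
subscript) are dropped, and case is folded.

```python
import unicodedata

def fold(text):
    """Return text without combining marks and with case folded."""
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() is non-zero for combining marks such as
    # accents, breathings and the iota subscript.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped).casefold()

def contains(text, query):
    """Accent-insensitive and case-insensitive substring test."""
    return fold(query) in fold(text)

verse_start = "Ἰάκωβος θεοῦ καὶ κυρίου"
print(contains(verse_start, "ιακωβος"))
print(contains(verse_start, "κυριου"))
```

Because both sides go through the same folding, it does not matter whether
the module text is stored precomposed or decomposed.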
For the canonical New Testament the best solution might be to search with
Strong's numbers or some equivalent. There already are modules with Strong's
numbers and morphological tags, and the new modules also have at least the
morphological tags. Those tags make it possible to search for any form of
a word, and accents can be ignored. Syntactical analysis becomes
possible too, which is no small advantage.
It is up to the frontend software to make this kind of search usable. For the
Sword library it would be enough to offer letter-by-letter text search and
search for numbers/tags.
If someone wants to search with Greek words and accents, the precomposed form
would be better. I think a search over precomposed characters is faster
because there are fewer code points to compare. Only if someone wants to
search for a word where e.g. "the last alpha may have grave OR acute" would
the decomposed form be better. And actually even then the frontend could
alter the search string by normalizing it and building the proper OR
statements.
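As a rough sketch of that last idea (in Python, with made-up function names
and assuming the module text is stored in NFC), a frontend could decompose the
query, swap the last acute or grave for each alternative, recompose, and join
the results into an OR pattern:

```python
import re
import unicodedata

ACUTE = "\u0301"  # COMBINING ACUTE ACCENT
GRAVE = "\u0300"  # COMBINING GRAVE ACCENT

def acute_or_grave_variants(word):
    """Return NFC variants of word whose last acute/grave is either mark."""
    decomposed = unicodedata.normalize("NFD", word)
    idx = max(decomposed.rfind(ACUTE), decomposed.rfind(GRAVE))
    if idx == -1:
        # No acute or grave in the word: nothing to vary.
        return [unicodedata.normalize("NFC", decomposed)]
    variants = [decomposed[:idx] + mark + decomposed[idx + 1:]
                for mark in (ACUTE, GRAVE)]
    return [unicodedata.normalize("NFC", v) for v in variants]

def or_pattern(word):
    """Compile a whole-word regex matching any of the variants."""
    alternatives = "|".join(re.escape(v) for v in acute_or_grave_variants(word))
    return re.compile(r"\b(?:" + alternatives + r")\b")

# The same three-letter word with grave, a filler word, then with acute.
text = "\u03ba\u03b1\u1f76 \u03b8\u03b5\u03cc\u03c2 \u03ba\u03b1\u03af"
print(or_pattern("\u03ba\u03b1\u1f76").findall(text))
```

The text itself is never decomposed here; only the short search string is
rewritten, which keeps the precomposed module usable for this kind of search.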
Troy wrote that we could "b)NFC both the search string and the text before
searching". But why NFC the text before searching? The text should already be
normalized to NFC or NFD; there is no reason to offer a non-normalized
module. The search string can be normalized to any known form,
whether NFC or NFD, as long as the form of the text module is known.
(Normalization forms are quite hard to understand from the Unicode
documentation; I suppose NFC means the most precomposed form and NFD the most
decomposed.)
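A small Python sketch of this point, assuming each module declares its
normalization form (the module_form parameter and the function name are
hypothetical, not part of the Sword API): only the short search string is
normalized, and the module text is left untouched.

```python
import unicodedata

def search_module(module_text, module_form, query):
    """Normalize only the query to the module's known form, then search."""
    normalized_query = unicodedata.normalize(module_form, query)
    return normalized_query in module_text

# A module stored in NFD, and a query typed in precomposed form.
module_text = unicodedata.normalize("NFD", "\u1f35\u03bd\u03b1 \u1f26\u03c4\u03b5")
query = "\u1f35\u03bd\u03b1"

print(query in module_text)                       # raw comparison fails
print(search_module(module_text, "NFD", query))   # normalized query matches
```

The same function works unchanged for a module stored in NFC; only the
declared form differs.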
The bottom line is this:
1. Precomposed is better. I don't see any reason to use decomposed text in
modules.
2. It would be good to support Free or Open fonts in some way or another.
--
Eeli Kaikkonen