[jsword-devel] Hamza - was strongs search

Peter von Kaehne refdoc at gmx.net
Sun May 18 13:07:39 MST 2008


Sorry, long story, but as you will see it is related to this thread:

One of my Farsi Bible modules has a problem in BD - it shows boxes
wherever a hamza diacritic is used. A hamza is a funny little sign in
Farsi, Arabic and related scripts, which can be used as an individual
letter or be added to some letters as a diacritic.

It often means a glottal stop or - and that is the critical use for me
in Farsi - it is used to indicate a genitive "-ye" when attached to a
final "h".

As with many diacritics there are two ways of encoding it in Unicode:
either as two code points, an "h" followed by a "hamza", which the font
rendering engine then draws jointly as an "h" with a hamza above it, or
as a single code point for an "h with hamza".
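
To illustrate the two encodings, here is a minimal Python sketch. It
uses alef + hamza above as a stand-in, because that pair has a
well-known canonical composition; which code points the module actually
uses for its "h" is not stated here, so this only shows the general
mechanism, not the exact Farsi case.

```python
import unicodedata

# Assumption: alef + combining hamza above stands in for "h" + hamza.
decomposed = "\u0627\u0654"   # ARABIC LETTER ALEF + ARABIC HAMZA ABOVE
precomposed = "\u0623"        # ARABIC LETTER ALEF WITH HAMZA ABOVE

# The two spellings look the same on screen but are different strings.
same_raw = decomposed == precomposed

# Unicode normalization converts between the two forms.
composed = unicodedata.normalize("NFC", decomposed)      # two code points -> one
split_again = unicodedata.normalize("NFD", precomposed)  # one code point -> two

print(same_raw, composed == precomposed, split_again == decomposed)
```

Whether NFC composes a given "h" + hamza sequence depends on which "h"
code point is in the text, so that would need checking against the
module itself.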

Two Farsi modules use the latter option - and BD displays it fine. A
third module uses the first option and BD produces squares. (GnomeSword
renders all three fine.)

I was therefore thinking of running a search and replace to turn all
occurrences of an "h" + "hamza" sequence into a single "h with hamza",
but then stumbled when I wondered whether this would have implications
for search. And then I came home from a long weekend and found this
thread, which mentioned searches with diacritics in passing.
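
The search concern can be sketched like this: a plain substring search
misses a match when the text and the query spell the same letter
differently, while normalizing both sides to one form finds it. The
helper names (to_nfc, contains) are made up for illustration, and
nothing here reflects how BD or its search index actually treats
diacritics; alef + hamza again stands in for the "h" case.

```python
import unicodedata

def to_nfc(text: str) -> str:
    """Return text with canonically composable sequences composed."""
    return unicodedata.normalize("NFC", text)

def contains(text: str, query: str) -> bool:
    """Substring search that ignores composed/decomposed differences."""
    return to_nfc(query) in to_nfc(text)

# Hypothetical data: the text uses base letter + combining hamza,
# the query uses the single precomposed code point.
text = "\u0631\u0627\u0654\u0633"   # reh, alef, hamza above, seen
query = "\u0623"                    # alef with hamza above

print(query in text)          # False: raw code points differ
print(contains(text, query))  # True: both sides normalized first
```

If the module text were converted to the single code point form, the
same question would apply in reverse to queries typed in the
two code point form.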

It seems that my options are

1) to have graphically correct text, but not fully searchable or
2) poorly rendered text, but fully and correctly searchable.

Can you confirm whether I understood this correctly?

What are your suggestions?

Peter




