[sword-devel] Normalising on the commandline

Peter von Kaehne refdoc at gmx.net
Wed Jan 21 13:44:45 MST 2009


Peter von Kaehne wrote:
> Chris Little wrote:
> 
>> uconv -f utf-8 -t utf-8 -x NFC -o output input
> 
> Thanks a lot!

Unfortunately, I learned in the process that my problems with search are
not caused by a lack of normalisation but by inconsistent encoding:
there are three different Arabic/Farsi Unicode letters which look and
(largely) behave the same way for ی, and they cause a mess during
search. It is as if the letter I had one code point for German, a
different one for English and yet another one for French, so that a
German word typed on an English keyboard would suddenly not be found.
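
To illustrate, here is a minimal Python sketch. It assumes the three
letters in question are U+064A (Arabic Yeh), U+06CC (Farsi Yeh) and
U+0649 (Alef Maksura); that is my guess at the usual suspects, not
something checked against the module text. NFC leaves all three
untouched, so normalising alone cannot help. The fold_yeh helper and
its mapping table are hypothetical: one possible way to fold the
variants to a single code point before indexing and searching.

```python
import unicodedata

# Assumption: these are the three look-alike letters causing the trouble.
YEH_VARIANTS = ("\u064A", "\u06CC", "\u0649")

# Hypothetical folding table: map the other two variants onto U+06CC.
# This is lossy (Alef Maksura is a distinct letter in Arabic), so it is
# only suitable for building a search key, not for rewriting the text.
FOLD_TABLE = str.maketrans({"\u064A": "\u06CC", "\u0649": "\u06CC"})


def fold_yeh(text):
    """Return an NFC-normalised search key with the Yeh variants unified."""
    return unicodedata.normalize("NFC", text).translate(FOLD_TABLE)


for ch in YEH_VARIANTS:
    # NFC does not change any of the three, so they stay distinct.
    unchanged = unicodedata.normalize("NFC", ch) == ch
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} unchanged by NFC: {unchanged}")

# The same word spelled with each variant only matches after folding.
spellings = ["\u0639\u0644" + ch for ch in YEH_VARIANTS]
print(len(set(spellings)), len({fold_yeh(s) for s in spellings}))
```

The fold would have to be applied both to the indexed text and to the
search query for the two to match.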

Whoever implemented Unicode for Arabic script has a lot to answer for!!
Totally bizarre!

Peter


