[sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version
pola ashraf
5001 at hotmail.com
Mon Nov 26 00:45:42 MST 2012
Using the string comparison tool from ICU, the two strings below turn out to contain a different number of characters.
Words to compare:
يَسُوعَ
يسوع
Both are the name of Jesus Christ in Arabic, but one is shaped and the other isn't.
Words converted to hex format:
\u064a \u064e \u0633 \u064f \u0648 \u0639 \u064e
\u064a \u0633 \u0648 \u0639
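The same comparison can be reproduced without the ICU web tool. The following is only a minimal sketch in Python using the standard unicodedata module; the names vocalized, plain, code_points and combining_marks are made up for this illustration and have nothing to do with the SWORD API.

```python
import unicodedata

# The two words from above, written as escapes so the marks are visible.
vocalized = "\u064a\u064e\u0633\u064f\u0648\u0639\u064e"  # with the extra marks
plain = "\u064a\u0633\u0648\u0639"                        # base letters only


def code_points(text):
    """Return each character of text as a \\uXXXX string."""
    return [f"\\u{ord(ch):04x}" for ch in text]


def combining_marks(text):
    """Return the nonspacing marks (Unicode category Mn) found in text."""
    return [ch for ch in text if unicodedata.category(ch) == "Mn"]


if __name__ == "__main__":
    print(" ".join(code_points(vocalized)))
    print(" ".join(code_points(plain)))
    for ch in combining_marks(vocalized):
        print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

The longer word carries three extra code points, all of them nonspacing marks, which is why a plain string comparison treats the two words as different.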
That's why the search engines of some frontends don't return any results for words typed without shaping.
The suggestion is to make the index contain the shaped words plus the same words without shaping.
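As a rough illustration of this suggestion, the sketch below strips the marks from a word and returns both forms as index keys. It is an assumption about how such an index could be fed, not how SWORD or any frontend actually builds its index; strip_marks and index_keys are hypothetical names. Note that it removes every nonspacing mark (Unicode category Mn), not only the Arabic ones.

```python
import unicodedata


def strip_marks(word):
    """Drop all nonspacing marks (category Mn) and keep the base letters."""
    return "".join(ch for ch in word if unicodedata.category(ch) != "Mn")


def index_keys(word):
    """Hypothetical helper: the forms under which a word would be indexed."""
    return {word, strip_marks(word)}


vocalized = "\u064a\u064e\u0633\u064f\u0648\u0639\u064e"  # word as stored in the module
plain = "\u064a\u0633\u0648\u0639"                        # word as a user might type it

if __name__ == "__main__":
    # The typed form matches one of the keys generated from the stored form.
    print(plain in index_keys(vocalized))
```

An alternative with the same effect would be to apply strip_marks to both the indexed text and the search query, so that only one form needs to be stored.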
Comparison tool link: https://ssl.icu-project.org/icu-bin/scompare
Note: to clarify the meaning of shaping, shaping is the use of characters like the following ( ٌ ُ ٍ َ ْ ً )
These special characters are shapes. They may change the meaning of the whole word and help with correct reading, but as mentioned before, they make reading harder and cause problems with search functions.
Note: And Bible searches normally without problems, but desktop programs like Xiphos and BibleTime have this problem.
Pola
I think Arabic shapes add extra Unicode characters, and that's why the two identical words I mentioned before don't give the same results.
------------------
Any Arabic search problem is unconnected to shaping.
Modules are routinely created and stored in a normalised format; user entries, e.g. for search, are equally normalised.