[sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version

pola ashraf 5001 at hotmail.com
Mon Nov 26 00:45:42 MST 2012


Using a comparison tool from ICU, the two strings turned out to have different numbers of characters.
Words compared:
يَسُوعَ
يسوع
Both are the name of JESUS Christ in Arabic, but one is shaped and the other isn't.

The same words as Unicode code points (hex):
\u064a \u064e \u0633 \u064f \u0648 \u0639 \u064e  
\u064a \u0633 \u0648 \u0639  
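
The difference can be checked without the ICU tool. The following is only a small sketch in Python (the helper names code_points and combining_marks are made up for illustration); it lists the code points of both words and picks out the combining marks:

```python
import unicodedata

# The two words from this message, written as escapes so nothing gets lost in transit.
shaped = "\u064a\u064e\u0633\u064f\u0648\u0639\u064e"
plain = "\u064a\u0633\u0648\u0639"


def code_points(text):
    """Return each character of text as a \\uXXXX string."""
    return ["\\u%04x" % ord(ch) for ch in text]


def combining_marks(text):
    """Return the characters that Unicode classifies as combining marks."""
    return [ch for ch in text if unicodedata.combining(ch) != 0]


print(" ".join(code_points(shaped)))
print(" ".join(code_points(plain)))
print(len(shaped), len(plain), len(combining_marks(shaped)))
```

The shaped word carries three extra code points (the marks), so a plain string comparison between the two words fails.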

That's why the search engines of some frontends don't return any results for unshaped words.

The suggestion is to make the index contain the shaped words plus the same words without shaping
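
To illustrate the suggestion, here is a rough sketch in Python. It is not how SWORD or any frontend builds its index; strip_marks and build_index are hypothetical names, and the range U+064B to U+0652 is my assumption for which marks count as shaping:

```python
from collections import defaultdict

# Assumption: "shaping" means the Arabic marks U+064B..U+0652.
FIRST_MARK = 0x064B
LAST_MARK = 0x0652


def strip_marks(word):
    """Return word with the shaping marks removed."""
    return "".join(ch for ch in word if not FIRST_MARK <= ord(ch) <= LAST_MARK)


def build_index(verses):
    """Map every word, shaped and unshaped, to the verse keys containing it.

    verses is a dict of verse key -> verse text (a stand-in for a real module).
    """
    index = defaultdict(set)
    for key, text in verses.items():
        for word in text.split():
            index[word].add(key)               # the word as stored
            index[strip_marks(word)].add(key)  # the same word without shaping
    return index


verses = {"Matt 1:1": "\u064a\u064e\u0633\u064f\u0648\u0639\u064e"}
index = build_index(verses)
print(sorted(index["\u064a\u0633\u0648\u0639"]))
```

With an index like this, a search for the unshaped word finds the verse that only contains the shaped one, at the cost of a larger index.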

Comparison Tool link   https://ssl.icu-project.org/icu-bin/scompare

Note: to clarify the meaning of shaping, shaping is the usage of Characters like the following ( ٌ    ُ   ٍ   َ    ْ  ً  ) 
These special characters are shapes. They may change the meaning of the whole word and help with correct reading, but as mentioned before, they make reading harder and cause problems with search functions.

Note: And Bible searches normally without problems, but desktop programs like Xiphos and BibleTime have this problem.

Pola




I think Arabic shapes add extra Unicode characters; that's why the two identical words I mentioned before don't give the same results.

------------------
Any Arabic search problem is unconnected to shaping. 

Modules are routinely created and stored in a normalised format; user entries, e.g. for search, are equally normalised.

 		 	   		   		 	   		  

