[sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version

Chris Little chrislit at crosswire.org
Mon Nov 26 02:05:16 MST 2012


You're talking about vowels, not shaping. Shaping in Arabic changes the 
shape of the letter according to its context in the word (initial, 
medial, final, or isolated). I imagine unshaped Arabic would be very 
difficult to read. Arabic without vowel marks, on the other hand, is 
standard.

I would have thought that the indexing would have been done without 
vowels or both with and without vowels. It should be easy to recover the 
vowel-less text for indexing by applying the UTF8ArabicPoints filter.
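
As a rough illustration of the idea (this is not the UTF8ArabicPoints filter itself, only a stand-in that drops Unicode nonspacing marks), a minimal Python sketch could look like the following. The function name strip_arabic_points is hypothetical; the two strings are the code points quoted below.

```python
import unicodedata


def strip_arabic_points(text: str) -> str:
    """Drop nonspacing marks (category Mn), which include the Arabic
    vowel points U+064B..U+0652. Hypothetical helper, not SWORD code."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


# The two spellings from the quoted message, written as code points.
vowelled = "\u064a\u064e\u0633\u064f\u0648\u0639\u064e"
plain = "\u064a\u0633\u0648\u0639"

# The raw strings differ in length, so a naive comparison fails ...
lengths = (len(vowelled), len(plain))

# ... but they match once the vowel points are stripped.
matches_after_strip = strip_arabic_points(vowelled) == plain
```

Applying the same stripping to both the indexed text and the user's query is what would let either spelling find the other.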

--Chris

On 11/25/2012 11:45 PM, pola ashraf wrote:
> Using a comparison tool from ICU, the two strings turned out to have
> different numbers of characters.
> Words to compare:
> يَسُوعَ
> يسوع
> Both are the name of Jesus Christ in Arabic, but one is shaped and the
> other isn't
>
> Words converted to HEX Format
> \u064a \u064e \u0633 \u064f \u0648 \u0639 \u064e
> \u064a \u0633 \u0648 \u0639
>
> That's why the search engines of some frontends don't return any
> results for unshaped words
>
> The suggestion is to make the index contain the shaped words plus the
> same words without shaping
>
> Comparison Tool link   https://ssl.icu-project.org/icu-bin/scompare
>
> Note: to clarify the meaning of shaping, shaping is the usage of
> Characters like the following ( ٌ    ُ   ٍ   َ    ْ  ً  )
> These special characters are shapes; they may change the whole meaning
> of a word and help with correct reading, but as mentioned before, they
> make reading harder and cause problems with search functions
>
> Note: And Bible searches normally without problems, but desktop
> programs like Xiphos and BibleTime have this problem
>
> Pola
> ------------------------------------------------------------------------
>
> I think Arabic shapes add extra Unicode characters; that's why the two
> identical words I mentioned before don't give the same results
>
> ------------------
> Any Arabic search problem is unconnected to shaping.
>
> Modules are routinely created and stored in a normalised format; user
> entries, e.g. for search, are equally normalised
>
>
>