[sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version
pola ashraf
5001 at hotmail.com
Mon Nov 26 02:33:06 MST 2012
Sorry for choosing the wrong word.
This Wikipedia article covers the topic:
https://en.wikipedia.org/wiki/Arabic_diacritics
Thanks, Chris, for your reply about the filter. I don't actually have any contact details for the frontend developers to report this problem to them; I hope someone on this list can pass this discussion along to them :)
So now we know the problem and the solution.
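
To make the problem and the fix concrete, here is a minimal Python sketch. It is not the SWORD UTF8ArabicPoints filter itself; the function name strip_arabic_points and the choice of the range U+064B to U+0652 (the basic Arabic vowel marks, shadda and sukun) are assumptions made only for illustration. It uses the two words from the message quoted below and shows that they match once the marks are removed.

```python
# Hypothetical helper, not part of SWORD: drops the basic Arabic
# diacritics (U+064B..U+0652) so pointed and unpointed text compare equal.
def strip_arabic_points(text: str) -> str:
    return "".join(ch for ch in text if not ("\u064b" <= ch <= "\u0652"))


# The two spellings from the quoted message, written as code points.
pointed = "\u064a\u064e\u0633\u064f\u0648\u0639\u064e"  # with vowel marks
plain = "\u064a\u0633\u0648\u0639"                      # without vowel marks

if __name__ == "__main__":
    # A naive comparison fails because the code point sequences differ.
    print(pointed == plain)
    # After stripping the marks, both forms are identical.
    print(strip_arabic_points(pointed) == plain)
```

A frontend could apply the same kind of stripping both when building the index and to the user's search string, so that either spelling finds the verse.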
> Date: Mon, 26 Nov 2012 01:05:16 -0800
> From: chrislit at crosswire.org
> To: sword-devel at crosswire.org
> Subject: Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version
>
> You're talking about vowels, not shaping. Shaping in Arabic changes the
> shape of the letter according to its context in the word (initial,
> medial, final, or isolated). I imagine unshaped Arabic would be very
> difficult to read. Arabic without vowel marks, on the other hand, is
> standard.
>
> I would have thought that the indexing would have been done without
> vowels or both with and without vowels. It should be easy to recover the
> vowel-less text for indexing by applying the UTF8ArabicPoints filter.
>
> --Chris
>
> On 11/25/2012 11:45 PM, pola ashraf wrote:
> > Using a comparison tool from ICU, the two strings turned out to have
> > different numbers of characters
> > Words to compare
> > يَسُوعَ
> > يسوع
> > Which is the Name of JESUS Christ in Arabic but one is shaped and the
> > other isn't
> >
> > Words converted to HEX Format
> > \u064a \u064e \u0633 \u064f \u0648 \u0639 \u064e
> > \u064a \u0633 \u0648 \u0639
> >
> > That's why the search engines of some frontends don't return any
> > results for words that are not shaped
> >
> > The suggestion is to make the index contain the shaped words plus the
> > same words without shaping
> >
> > Comparison Tool link https://ssl.icu-project.org/icu-bin/scompare
> >
> > Note: to clarify the meaning of shaping, shaping is the use of
> > characters like the following ( ٌ ُ ٍ َ ْ ً )
> > These special characters are shapes. They may change the whole meaning
> > of a word and help with correct reading, but as mentioned before, they
> > make reading harder and cause problems with search functions
> >
> > Note: And Bible searches normally without problems, but desktop
> > programs like Xiphos and BibleTime have this problem
> >
> > Pola
> > ------------------------------------------------------------------------
> >
> > I think Arabic shapes add extra Unicode characters; that's why the two
> > identical words I mentioned before don't give the same results
> >
> > ------------------
> > Any Arabic search problem is unconnected to shaping.
> >
> > Modules are routinely created and stored in a normalised format; user
> > entries, e.g. for search, are equally normalised
> >
> >
> >
> > _______________________________________________
> > sword-devel mailing list: sword-devel at crosswire.org
> > http://www.crosswire.org/mailman/listinfo/sword-devel
> > Instructions to unsubscribe/change your settings at above page
> >