[sword-devel] Thai in Xiphos and BibleDesktop

Adrian Korten adrian_korten at sil.org
Fri Oct 16 21:57:45 MST 2009


Good day,

Yes, I think that it is the ICU module that makes the Thai word breaking 
work in BibleCS. I have previously inserted word break characters in the 
Bible texts because Biblical words and names are not in the ICU 
dictionary and get broken erratically. However, that led to problems 
with the search, as users could not type those characters and had to 
use spaces instead, which was unnatural for them.
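
One possible way around the search problem is to ignore the break 
characters on both sides of the comparison. The following is only a 
minimal sketch in Python, not how SWORD or JSword actually search. It 
assumes the inserted word break character is U+200B (ZERO WIDTH SPACE); 
the names strip_breaks and contains and the sample verse are 
hypothetical and exist only for illustration.

```python
# Assumption: the inserted word-break character is U+200B (ZERO WIDTH SPACE).
ZWSP = "\u200b"


def strip_breaks(text: str) -> str:
    """Remove zero-width spaces and ordinary spaces from the text."""
    return text.replace(ZWSP, "").replace(" ", "")


def contains(verse: str, query: str) -> bool:
    """Return True if the query occurs in the verse once breaks are ignored."""
    needle = strip_breaks(query)
    # An empty query never matches.
    return bool(needle) and needle in strip_breaks(verse)


# Hypothetical sample: three Thai words separated by U+200B.
verse = "พระเจ้า" + ZWSP + "ทรง" + ZWSP + "สร้าง"

print(contains(verse, "พระเจ้าทรง"))   # query typed with no break characters
print(contains(verse, "พระเจ้า ทรง"))  # query typed with an ordinary space
```

With this approach the user can type the query with or without spaces. 
The trade-off is that a match can also span a real phrase space, which 
may or may not be acceptable.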

Note that there would be similar problems with Lao and Khmer, and 
probably with Burmese, though that is even more complicated.

ak


----- Original Message -----
*From:* DM Smith <dmsmith at crosswire.org>
*Sent:* 10/16/2009 10:13:17 PM +0700


> Adrian,
> Thai falls into the "I never thought about that" category for me. 
> While I knew that Thai does not use spaces for word breaks, it never 
> occurred to me that we need smart display to find word breaks.
>
> I'll be working on that for BibleDesktop.
>
> I'm not sure that the OSIS should have unnatural zero width spaces. 
> That just seems wrong. Maybe the module should.
>
> My guess is that we need to use ICU to find the word boundaries.
>
> Regarding search, JSword does handle Thai word breaks. The next release 
> will improve that.
>
> In Him,
> DM
>
> On Oct 16, 2009, at 3:50 AM, Adrian Korten <adrian_korten at sil.org> wrote:
>
>> Good day,
>>
>> After trying my Thai OSIS test module in BibleCS, I tried the module 
>> and some other Thai ones in Xiphos (Ubuntu version) and BibleDesktop 
>> (Windows XP). The OSIS data seems to display better but it is hard to 
>> tell. Neither of these programs does Thai word breaking automatically 
>> (finding the word breaks where no spaces exist). They only break at 
>> the 'phrase' spaces, so there are many line breaks and it gets 
>> hard to read. (Would they work if zero-width spaces were inserted? I 
>> didn't have time to test this.) This is too bad because I like the 
>> clean, simple interfaces.
>>
>> And one additional note on BibleDesktop, I can change the default 
>> font for a Thai text but I could not change the size. Perhaps, this 
>> is because the program is out of date (but I don't remember the 
>> installed version). This is a problem for foreigners with Thai as the 
>> characters are small.
>>
>> ak
>>
>> _______________________________________________
>> sword-devel mailing list: sword-devel at crosswire.org
>> http://www.crosswire.org/mailman/listinfo/sword-devel
>> Instructions to unsubscribe/change your settings at above page


