[sword-devel] Soft hyphens

Cyrille lafricain79 at gmail.com
Sat Nov 4 02:33:59 MST 2017


Hi Michael,
Thank you for this information; I need to read it carefully. But could
you give me an example of a dictionary file with hyphenation?

On 03/11/2017 at 16:31, Michael H wrote:
> Hi Cyrille, 
>
> I am preparing to study breakpoints for Cebuano to produce a hunspell
> hyphenation list, but haven't completed the process of implementing
> it. I am working from 3 paper Cebuano bibles typeset at different
> times, and manually copying the existing hyphenated words into a list. 
>
> Here's my proposed process to produce a preliminary hyphenation dictionary:
> 1. Study the letters on each side of the hyphen: the whole run of
> vowels or the whole run of consonants before it, and the whole run of
> vowels or consonants after it.
>         That is, for English, a regex for measuring how often each
> letter boundary is broken looks (technically) something like:
>
> ([aeiouy]+|[bcdfghjklmnpqrstvwxz]+)\x{00AD}([aeiouy]+|[bcdfghjklmnpqrstvwxz]+)
>         where \x{00AD} is the soft hyphen (U+00AD), but you'd need to
> work out how to turn the matches from this regex into a list of
> boundary pairs and frequencies.
> 2. This should yield a list of the most common hyphenation points for
> the language.
> 3. Auto insert the Hunspell hyphenation numbering into the dictionary. 
>       For Cebuano, I am hoping to use just the letter combinations in
> the hunspell dictionary.  I have hopes that hunspell can accommodate
> this, but I haven't completed the word list to analyze yet, so my
> hopes are based only on reading the documentation about hunspell and
> hyphenation, and looking through some of the existing examples. 
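
Steps 1 and 2 of the process quoted here could be sketched roughly as follows. This is only a minimal sketch, not Michael's actual implementation: the names `boundary_pairs` and `to_pattern` are made up for illustration, the vowel and consonant sets are the English ones from the regex (a real language would need its own letters), and the sample words with their break points are invented test data. The `left1right` form in `to_pattern` assumes the TeX-style pattern notation, where an odd digit between letters marks an allowed break; whether such naive patterns are good enough for a real hyphenation dictionary is untested.

```python
import re
from collections import Counter

SOFT_HYPHEN = "\u00ad"
VOWELS = "aeiouy"                    # assumption: English vowels only
CONSONANTS = "bcdfghjklmnpqrstvwxz"  # assumption: English consonants only

# A whole run of vowels or consonants, then the soft hyphen. The run after
# the hyphen is captured inside a lookahead so that it is not consumed and
# can still serve as the left side of the next boundary in the same word.
BOUNDARY = re.compile(
    rf"([{VOWELS}]+|[{CONSONANTS}]+){SOFT_HYPHEN}(?=([{VOWELS}]+|[{CONSONANTS}]+))"
)


def boundary_pairs(words):
    """Count (letters before, letters after) for every soft hyphen found."""
    counts = Counter()
    for word in words:
        for left, right in BOUNDARY.findall(word.lower()):
            counts[(left, right)] += 1
    return counts


def to_pattern(left, right):
    """Format one boundary pair as a TeX-style pattern, e.g. 'a1b'."""
    return f"{left}1{right}"


# Invented sample: words as they might be copied from a typeset text,
# with a soft hyphen wherever the printed word was broken.
sample = ["ka\u00adbu\u00adhi", "ba\u00adlay", "pang\u00adhi\u00adta\u00adbo"]
counts = boundary_pairs(sample)

# Print the boundaries from most to least frequent.
for (left, right), n in counts.most_common():
    print(to_pattern(left, right), n)
```

Sorting by frequency gives the list of most common break points from step 2; the rarely seen pairs at the bottom of the list are the ones to check by hand before putting anything into a dictionary.
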
>
>
> On Fri, Nov 3, 2017 at 9:39 AM, Cyrille <lafricain79 at gmail.com> wrote:
>
>     It is becoming a bit difficult for me to follow this thread with all
>     these technical terms in another language :-) :-( But what I can
>     tell you is that I am very interested in a hyphenation dictionary.
>     I have already created a spelling dictionary for Kikongo
>     <https://gitlab.com/lafricain79/kituba-dic>, and have created an
>     extension for LibreOffice
>     <https://extensions.libreoffice.org/extensions/kituba-kikongo-ya-leta-dictionary>
>     and a hunspell dictionary for Linux. This now allows us to enable
>     spell checking in Kikongo/Kituba. This is all the more interesting
>     as Verbum Bible is translating the Old Testament and the Roman
>     Missal. I intend to do the same for Lingala: once the module is
>     finished, I will create a dictionary. Moreover, for Lingala
>     the hunspell .aff file already exists!
>     As for the hyphenation dictionary, I read about it and it seemed
>     like a tedious operation, so if you can offer me a way to
>     take advantage of the words that are already hyphenated, it would
>     be great.
>     I opened an issue about this on GitLab
>     <https://gitlab.com/lafricain79/LinVB/issues/11>.
>
>     Br Cyrille
>
>
>     On 03/11/2017 at 12:37, David Haslam wrote:
>>     I had similar thoughts as Michael outlined.
>>
>>     This morning, I compiled an Excel workbook tabulating the Lingala words
>>     found to contain a soft hyphen.
>>
>>     It has been attached to the issue in the GitLab repo.
>>
>>     https://gitlab.com/lafricain79/LinVB/issues/10
>>
>>     And - yes - it's not only incomplete as a dictionary, but it's also further
>>     evidence of inconsistency.
>>     The use of soft hyphens was entirely an ad hoc operation done to address
>>     contingencies.
>>
>>     As a CrossWire volunteer, I don't consider this sort of activity to be
>>     outside our purview.
>>     We're here to assist other Bible Agencies too, just as we say on our
>>     website.
>>     Or, if you like, we can count it as "going the extra mile".
>>
>>     And I'm sure that Fr Cyrille appreciates the spirit in which this service is
>>     provided.
>>
>>     Best regards,
>>
>>     David
>>
>>     _______________________________________________
>>     sword-devel mailing list: sword-devel at crosswire.org <mailto:sword-devel at crosswire.org>
>>     http://www.crosswire.org/mailman/listinfo/sword-devel
>>     Instructions to unsubscribe/change your settings at above page
