[sword-devel] Bible in Myanmar

David Haslam dfhdfh at protonmail.com
Wed May 15 06:22:28 MST 2019


Observations: (continued)

4. In addition to the reported instances of the three anomalous characters (È, Ø, ò) found after the font conversion,
there are 6 instances of the string "m;" that are probably also due to bugs in the converter.
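
For anyone wanting to repeat these counts without BabelPad, here is a minimal Python sketch. It is not the tool used for the attached analysis; the function names and the choice of Unicode ranges (the Myanmar block plus the two Myanmar Extended blocks) are assumptions made for illustration. It lists non-ASCII characters outside the Myanmar blocks in the same column order as the table in observation 1 (code point, character, count, name) and counts a literal substring such as "m;".

```python
from collections import Counter
import unicodedata

# Assumption: "Myanmar" here means these three Unicode blocks.
MYANMAR_RANGES = (
    (0x1000, 0x109F),  # Myanmar
    (0xAA60, 0xAA7F),  # Myanmar Extended-A
    (0xA9E0, 0xA9FF),  # Myanmar Extended-B
)


def is_myanmar(ch):
    """Return True if ch lies in one of the Myanmar blocks listed above."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in MYANMAR_RANGES)


def stray_characters(text):
    """List non-ASCII, non-Myanmar characters, most frequent first.

    Each row is (code point, character, count, Unicode name).
    """
    counts = Counter(
        ch for ch in text if ord(ch) > 0x7F and not is_myanmar(ch)
    )
    return [
        (f"{ord(ch):06X}", ch, n, unicodedata.name(ch, "UNKNOWN"))
        for ch, n in counts.most_common()
    ]


def count_substring(text, needle="m;"):
    """Count non-overlapping occurrences of a literal string."""
    return text.count(needle)
```

To run it against the extracted text, read contents.pp.txt with `open("contents.pp.txt", encoding="utf-8").read()` and pass the result to both functions. ASCII strings such as "m;" are not flagged by `stray_characters`, which is why they need the separate substring count.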

Best regards,

David

Sent with [ProtonMail](https://protonmail.com) Secure Email.

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, May 15, 2019 12:41 PM, David Haslam <dfhdfh at protonmail.com> wrote:

> Yep - sure - later I can do that.
>
> David
>
> Sent from ProtonMail Mobile
>
> On Wed, May 15, 2019 at 11:26, Cyrille <lafricain79 at gmail.com> wrote:
>
>> David, I have no Box account, and I don't want to create one. Can you upload it to https://framadrop.org/ instead? It's totally free and secure (and private).
>> Thank you.
>>
>> On 15/05/2019 11:46, David Haslam wrote:
>>
>>> Interim progress report.
>>>
>>> I downloaded the file Mat_utf8.zip from Cyrille's link and unzipped the contents to Mat_utf8-odt
>>>
>>> I opened the .odt file using 7-Zip from the Windows Explorer context menu, and extracted the file contents.xml
>>>
>>> I used the Notepad++ plug-in XML Tools to pretty-print the XML file and saved it as contents.pp.xml
>>> This is purely a layout change that makes the file easier to read.
>>>
>>> I viewed the .pp.xml file in BabelPad, which confirmed that the non-XML text was (mostly) Myanmar Unicode.
>>>
>>> I used a TextPipe filter to remove all XML tags, leading and trailing blanks on each line, and all blank lines.
>>> The output file is contents.pp.txt
>>>
>>> The result is readable content in Myanmar Unicode, with some English text such as "The Gospel according Matthew" near the start.
>>>
>>> The file is best viewed using BabelPad with the option Display Colours | Colour Code by Script.
>>> This shows Myanmar characters in light green, and non-Myanmar characters in other colours.
>>>
>>> Observations:
>>> 1. The font conversion to Unicode left a few scattered characters unconverted. :(
>>>
>>> 0000C8	È	18	LATIN CAPITAL LETTER E WITH GRAVE
>>> 0000D8	Ø	20	LATIN CAPITAL LETTER O WITH STROKE
>>> 0000F2	ò	3	LATIN SMALL LETTER O WITH GRAVE
>>>
>>> The complete character frequency analysis is attached.
>>>
>>> 2. A few verse numbers (?) are still present here and there.
>>> 3. The content contains section headings and parallel passage headings as well as verse text.
>>>
>>> I have just uploaded the file contents.pp.zip to a new folder in my Box account and added Cyrille & Michael as viewers.
>>>
>>> Best regards,
>>>
>>> David
>>>
>>> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
>>> On Monday, May 13, 2019 9:19 AM, Cyrille <lafricain79 at gmail.com> wrote:
>>>
>>>> Hello,
>>>> I recently received a modern Myanmar translation of the NT, Psalms and
>>>> Proverbs, with permission to create a new module.
>>>> But the problems are many... The first is getting the text.
>>>> I tried different ways, but it was produced with PageMaker!
>>>> I can get the text, but the problem is that I don't have the verse numbers,
>>>> because they sit in a parallel column next to the text, and when I copy it
>>>> I get only the biblical text.
>>>> I also have a PDF, but when I convert it to text (with pdftotext) the
>>>> columns are mixed.
>>>> Can someone help me with any ideas?
>>>> The next problem is Unicode... The text is not typed in Unicode but uses
>>>> a special font.
>>>> I can send everything you need or push it to git.crosswire.
>>>>
>>>> Thanks for help.
>>>>
>>>> sword-devel mailing list:
>>>> sword-devel at crosswire.org
>>>>
>>>> http://www.crosswire.org/mailman/listinfo/sword-devel
>>>> Instructions to unsubscribe/change your settings at above page