[sword-devel] Bible in Myanmar

Michael H cmahte at gmail.com
Mon May 13 06:57:10 MST 2019


Cyrille

LibreOffice Draw attempts to open the PageMaker file, with limited success.
But it confirms that even in the PageMaker source, the verse numbers are a
separate text stream. With this source, there is no way to copy the text
with verse numbers intact. Each book appears to be stored in its own text
stream in the PageMaker file. LO Draw isn't rendering all of the pages, only
the first 10, so I've only explored Matthew further.

Based on Matthew only, the verses all seem to end with either "-" or ";/",
which should aid in the reconstruction. I've looked through the PDF, and
visually this seems to be the case for all books as well. However, this
isn't perfect: I find 1107 of these markers in Matthew, instead of the
expected 1071 verses, a surplus of 36. But since the text stream also
contains a book introduction, this is likely easily explained. Hopefully
this gets you well down the path to creating a stream with verses.
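
A minimal sketch of that splitting step, in Python, assuming the text of one
book has already been exported from the PageMaker file as a single string in
its original (pre-Unicode) encoding. The names split_verses and surplus are
hypothetical, and the two terminators are only what I observed in Matthew;
other books may need a different list.

```python
import re

# Assumption: "-" and ";/" are the verse-ending markers seen in Matthew,
# as they appear in the raw legacy-font text (not yet converted to Unicode).
VERSE_TERMINATORS = (";/", "-")


def split_verses(stream, terminators=VERSE_TERMINATORS):
    """Cut a book's text stream into segments that each end with a terminator.

    Returns (segments, tail), where tail is any trailing text that is not
    closed by a terminator.
    """
    # Try longer terminators first so ";/" is matched as one unit.
    ordered = sorted(terminators, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(t) for t in ordered))

    segments = []
    start = 0
    for match in pattern.finditer(stream):
        segment = stream[start:match.end()].strip()
        if segment:
            segments.append(segment)
        start = match.end()
    return segments, stream[start:].strip()


def surplus(stream, expected_verses):
    """Number of segments found beyond the expected verse count."""
    segments, _tail = split_verses(stream)
    return len(segments) - expected_verses
```

A positive surplus, like the one in Matthew, points at segments that are not
verses (for example the book introduction) and need to be dropped by hand
before the segments are paired with the separate verse-number stream.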

I would NOT start from the PDF file, but from the PageMaker file. The PDF
almost certainly has a lot of text rearranging and extra characters like
page numbers and running heads. PageMaker has the book text in a single
stream, in a form that will convert to Unicode relatively easily.