<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
Thank you Michael for your help!<br>
Let me know if you manage to get something working.<br>
<br>
<div class="moz-cite-prefix">On 13/05/2019 15:57, Michael H
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAJ9hia8pYeDvfvnK2i_rnhqzk3NC53g5zAftZDkuOhaximqBhA@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div class="gmail_default"
style="font-family:garamond,serif;font-size:large">Cyrille<br>
<br>
LibreOffice Draw attempts to open the PageMaker file,
with limited success. But it confirms that even in the
PageMaker source, the verse numbers are a separate text
stream. With this source, there is no way to copy the
text with verse numbers intact. Each book appears to be
stored in its own text stream in the PageMaker file. LO
Draw isn't rendering all of the pages, only the first 10,
so I've only explored Matthew further. <br>
<br>
Based on Matthew only, the verses all seem to end with
the character "-" or the sequence ";/", which should aid
in the reconstruction. I've looked through the PDF and,
visually, this seems to be the case for all books as well.
However, this isn't perfect: I find 1107 of these
terminators in Matthew, against the expected 1071
verses. But since the text stream also includes a book
introduction, the 36 extra are likely easily explained.
Hopefully this gets you well down the path to creating a
stream with verses. <br>
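<br>
A rough sketch of that splitting step in Python, assuming the book's
text stream has already been extracted into a plain string. The names
split_verses and surplus are hypothetical, and the terminator list only
reflects what was seen in Matthew, so other books may need more
entries. <br>
<pre>
```python
import re

# Assumed terminators, taken from the observation in Matthew only.
TERMINATORS = ("-", ";/")


def split_verses(stream, terminators=TERMINATORS):
    """Split a text stream into segments that each end with a terminator."""
    # Try longer terminators first so ";/" is matched as one unit.
    ordered = sorted(terminators, key=len, reverse=True)
    pattern = "|".join(re.escape(t) for t in ordered)
    segments = []
    start = 0
    for match in re.finditer(pattern, stream):
        segment = stream[start:match.end()].strip()
        if segment:
            segments.append(segment)
        start = match.end()
    tail = stream[start:].strip()
    if tail:
        # Unterminated trailing text is kept so it can be reviewed by hand.
        segments.append(tail)
    return segments


def surplus(segments, expected_verses):
    """Number of segments beyond the expected verse count."""
    return len(segments) - expected_verses
```
</pre>
A positive surplus (36 in Matthew, if the counts above hold) marks
segments to inspect by hand, most likely in the book introduction,
before pairing the rest with the separate verse-number stream. <br>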
<br>
I would NOT start from the PDF file, but from the
PageMaker file. The PDF almost certainly has a lot of
rearranged text and extra characters like page numbers
and running heads. PageMaker has each book's text in a
single stream, in a form that will convert to Unicode
relatively easily. </div>
<div class="gmail_default"
style="font-family:garamond,serif;font-size:large"><br>
</div>
</div>
</div>
</div>
</div>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<pre class="moz-quote-pre" wrap="">_______________________________________________
sword-devel mailing list: <a class="moz-txt-link-abbreviated" href="mailto:sword-devel@crosswire.org">sword-devel@crosswire.org</a>
<a class="moz-txt-link-freetext" href="http://www.crosswire.org/mailman/listinfo/sword-devel">http://www.crosswire.org/mailman/listinfo/sword-devel</a>
Instructions to unsubscribe/change your settings at above page</pre>
</blockquote>
<br>
</body>
</html>