[sword-devel] Update on Tandroy Bible [tdx2023eb] for Madagascar

Michael Johnson Michael at eBible.org
Thu Mar 9 19:40:45 EST 2023


I have successfully extracted USFM from Steve's InDesign files for the Tandroy Bible and made a draft module (tdx2023eb) for it. The master USFM is now in a Bibledit cloud instance, where Steve can make any adjustments and from which I can keep the eBible.org copies updated.

In case anyone is curious, InDesign to USFM is NOT straightforward, but it can be done if the person creating the InDesign files is consistent enough in the way he uses styles for the different kinds of text (canonical text, verse numbers, name of Deity, etc.) and paragraphs (poetry, prose, headings, etc.). The general process is:

 1. Export InDesign tagged text from each InDesign file, verbose tags, Unicode.
 2. Convert the Unicode files (UTF-16) to UTF-8. I used UltraEdit for this, but other ways are possible.
 3. Examine the tagged text in context to determine what style names correspond to which USFM tags.
 4. Use a chain of regular expressions to convert the tags. (David H. would probably use Text Pipe for this, but I used massregex.)
 5. Edit the results to merge in more book name data (from a translator-supplied table) for book headers.
 6. Sanity check the results.
 7. Ask someone who can read the translation to check the work.
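
For anyone who would rather script steps 2, 4, and part of 6 than use an editor, here is a minimal Python sketch. It is not what I actually ran. The style names (VerseNum, Deity, Heading, Poetry, Prose) are hypothetical placeholders, and the sample assumes verbose tags of the form <ParaStyle:Name> and <CharStyle:Name>...<CharStyle:>; the real names and the real mapping to USFM markers have to come from step 3.

```python
import re

# Hypothetical style names; the real ones come from inspecting the export (step 3).
# Order matters: character styles are converted before paragraph styles.
RULES = [
    (r"<CharStyle:VerseNum>(\d+)<CharStyle:>\s*", r"\n\\v \1 "),  # verse number
    (r"<CharStyle:Deity>(.*?)<CharStyle:>", r"\\nd \1\\nd*"),     # name of Deity
    (r"^<ParaStyle:Heading>", r"\\s1 "),                          # section heading
    (r"^<ParaStyle:Poetry>", r"\\q1 "),                           # poetry line
    (r"^<ParaStyle:Prose>", r"\\p "),                             # prose paragraph
]


def decode_tagged_text(raw: bytes) -> str:
    """Step 2: decode UTF-16 (BOM-aware) and normalize line endings."""
    return raw.decode("utf-16").replace("\r\n", "\n").replace("\r", "\n")


def convert(text: str) -> str:
    """Step 4: apply each rule in order, then trim trailing spaces."""
    for pattern, replacement in RULES:
        text = re.sub(pattern, replacement, text, flags=re.MULTILINE)
    return re.sub(r"[ \t]+$", "", text, flags=re.MULTILINE)


def unmapped_tags(text: str) -> set:
    """Step 6 aid: report tags that no rule handled, for manual review."""
    return set(re.findall(r"<[^<>\n]+>", text))
```

The result of convert() can be written back out with open(path, "w", encoding="utf-8"). Leftover tags are reported rather than silently deleted, because an unmapped style usually means a rule is missing, and that is better caught before step 7.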

Note: this is a time-consuming process in which it is easy to make mistakes. Therefore, once the conversion is done, it is best to keep the USFM as the master copy. If InDesign files are needed again, they can be produced easily enough with SIL Publishing Assistant.

-- 

Aloha,
Michael Johnson
26 HIWALANI LOOP • MAKAWAO HI 96768-8747 • USA
mljohnson.org <https://mljohnson.org/> • eBible.org <https://eBible.org> • WorldEnglish.Bible <https://WorldEnglish.Bible> • PNG.Bible <https://PNG.Bible>
Signal/Telegram/WhatsApp/Telephone: +1 808-333-6921
Skype: kahunapule • Telegram/Twitter: @kahunapule • Facebook: fb.me/kahunapule <https://www.facebook.com/kahunapule>