[sword-devel] usfm2osis.py appears to be very broken

Ryan V adyeths at gmail.com
Thu Feb 23 12:59:41 MST 2017


On Thu, Feb 23, 2017 at 2:04 PM, Greg Hellings <greg.hellings at gmail.com>
wrote:

> You'll have to give more information before anyone can help you out
>

I don't need help. I am trying to make it known that there is a problem
with your usfm2osis.py conversion script so that it can be fixed.


>
> On Thu, Feb 23, 2017 at 12:46 PM, Ryan V <adyeths at gmail.com> wrote:
>
>> usfm2osis.py appears to be very broken now. I ran it on a number of
>> bibles and it failed to properly convert all of them. For the World English
>> Bible it reported the following:
>>
>>     Unhandled USFM tags: \+bk, \c, \li1<verse, \p, \p<verse, \q2, \v (7
>> total)
>>
>
> This is not broken. This is unimplemented. Tags are only implemented into
> usfm2osis.py once they are encountered. These have not been encountered in
> previous works converted using this tool. You or someone else will need to
> add support for those tags.
>

Yes, it is broken. The only usfm tag in this list that is unimplemented is
\+bk ... all of the other usfm tags are implemented and were previously
handled by the usfm2osis.py conversion script before it became broken. The
fact that both the \c and \v usfm tags appear in this list is a big
indication that there is a serious problem. Those two usfm tags are what
mark chapters and verses. And they have always been handled by the script.


>
>
>>
>> Chapter and verse numbering in the resulting osis is broken as well. It
>> looks like the following:
>>
>>     <chapter osisID="$BOOK$.1" sID="$BOOK$.1"/>
>>     <verse osisID="$BOOK$.$CHAP$.1" sID="$BOOK$.$CHAP$.1"/>
>>
>
> What language and format are your source files in? Often USFM projects
> represent a new work from a new language, and the source language names and
> abbreviations for books are not known. Perhaps the format that your books
> are in means that this information is missing from the OSIS but present in
> some other type of metadata. Somewhere, somehow the book and chapter
> information is missing from the input and thus it's not in the output. That
> doesn't necessarily mean that the utility is "very broken".
>
>
> Please give us a bit more information to work with.
>
> --Greg
>

Since I am talking about usfm2osis.py which is a usfm to osis converter, it
can be inferred that the source files are usfm. If that was not clear then
I apologize and will make it clear now. The source files are usfm. The
language would be English since I specifically mentioned the results I
posted were from the World English Bible.

The book and chapter information is missing because the current version of
usfm2osis.py is broken. Going to a version of usfm2osis.py before it became
broken produced expected results.
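
For anyone who wants to check their own output for the same symptom, here is
a minimal sketch of a scan for leftover placeholders. It assumes the only
placeholders of interest are the $BOOK$ and $CHAP$ strings shown in the output
quoted above. The names PLACEHOLDER and find_unresolved are hypothetical and
are not part of usfm2osis.py.

```python
import re

# Assumption: the unresolved placeholders look like $BOOK$ and $CHAP$,
# as seen in the broken output quoted earlier in this message.
PLACEHOLDER = re.compile(r"\$(BOOK|CHAP)\$")


def find_unresolved(osis_text):
    """Return (line_number, line) pairs that still contain a placeholder."""
    return [
        (number, line.strip())
        for number, line in enumerate(osis_text.splitlines(), start=1)
        if PLACEHOLDER.search(line)
    ]


# Two broken lines from the report plus one hand-written well-formed line.
sample = '''<chapter osisID="$BOOK$.1" sID="$BOOK$.1"/>
<verse osisID="$BOOK$.$CHAP$.1" sID="$BOOK$.$CHAP$.1"/>
<verse osisID="Gen.1.2" sID="Gen.1.2"/>'''

for number, line in find_unresolved(sample):
    print(number, line)
```

An empty result from find_unresolved would suggest that the book and chapter
IDs were filled in. It says nothing about whether the rest of the conversion
is correct.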

I was trying to test your latest version of the usfm2osis.py converter to
compare its output with my u2o converter. I discovered that your
usfm2osis.py converter was broken. I thought that those who were
responsible for it would want to know. And reporting it on this list is the
only way I know to guarantee that they see that there is a problem with it.

***

If you really want specifics, here they are. I took a few minutes to track
down when it broke and which change broke it. The breakage occurred on
November 30th, 2016, and this is the commit that broke it:

https://github.com/refdoc/Module-tools/commit/ad7fbad7e9a809eb65dea223d5174351bcccee8d

Undoing that one tiny little change makes usfm2osis.py work again.