[sword-devel] usfm to osis converter...

Chris Burrell christopher.burrell at gmail.com
Thu Jul 30 12:16:12 MST 2015

Does it support nested tags?

On a related note, if people want a USX to OSIS converter, please let me know.


-----Original Message-----
From: "David Haslam" <dfhmch at googlemail.com>
Sent: 30/07/2015 20:07
To: "sword-devel at crosswire.org" <sword-devel at crosswire.org>
Subject: Re: [sword-devel] usfm to osis converter...

Thanks, Ryan. This looks very interesting. I expect that John Austin and
others would also find it useful.

Your description (qv) of the project should grab our attention.

I wrote my own USFM to OSIS converter in python. There are several reasons
for this:

    The usfm2osis.py converter mentioned above runs way too slow on my
    computer. (It takes more than 2 minutes to process the World English
    Bible). I thought I could make one that ran faster.

    The usfm2osis.py converter source is difficult for me to read, so I'm
    unable to work on improving it. Obviously it would be better to submit
    improvements to that script, but my limitations prevent that. I think
    the biggest difficulty I have with reading the code is the huge amount
    of complicated regular expressions it uses... about 200! Which reminds
    me of a Jamie Zawinski quote.... “Some people, when confronted with a
    problem, think ‘I know, I'll use regular expressions.’ Now they have
    two problems.” (Sometimes they make sense, though. The script I wrote
    has 9 of them.)

    I wanted a converter that targeted python3. (usfm2osis.py targeted only
    python2 when I began working on my converter.)

    I wanted a converter that would be easy to update when changes are made
    to the USFM standard.

    I thought it would be a fun project. (It was!)

I've tested it with CPython 2.7.6 and CPython 3.4.0 and it works fine in
both of those versions of Python. (This script works with pypy, pypy3, and
jython 2.7.0 as well, but they are significantly slower at running this
script than CPython. I haven't tested it with IronPython as I don't have
that implementation of the Python language.) It is public domain. You may do
whatever you wish with the code.

It's quite fast. For example, it only takes about 10 seconds to process the
World English Bible on my computer. That's about a 90% reduction in
processing time compared with usfm2osis.py in my testing. The output
validates against the OSIS 2.1.1 schema. No markup errors are reported by
osis2mod when generating modules for any of the bibles that I have access to
at this time.
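
As a rough check of the "about 90%" figure, the arithmetic can be sketched as below. The inputs are assumptions: 120 seconds is only a lower bound for "more than 2 minutes", 10 seconds stands in for "about 10 seconds", and the helper name percent_reduction is made up for this illustration.

```python
def percent_reduction(old_seconds, new_seconds):
    """Return the percentage by which run time dropped from old to new."""
    return 100.0 * (old_seconds - new_seconds) / old_seconds

# Assumed figures, not measurements: 120 s is a lower bound for
# "more than 2 minutes", and 10 s stands in for "about 10 seconds".
old_run = 120.0
new_run = 10.0

reduction = percent_reduction(old_run, new_run)
speedup = old_run / new_run

print("reduction: %.1f%%, speedup: %.0fx" % (reduction, speedup))
```

With those assumed inputs the reduction comes out a little above 90%, so "about 90%" is, if anything, a conservative summary; a longer usfm2osis.py run would only push the figure higher.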


Best regards,


View this message in context: http://sword-dev.350566.n4.nabble.com/usfm-to-osis-converter-tp4654838p4654840.html