[sword-devel] USFM conformance in usfm2osis.py

Chris Little chrislit at crosswire.org
Tue Jul 31 19:16:51 MST 2012


My new usfm2osis.py script is progressing quite nicely. I've got it 
generating valid OSIS from one Bible that uses a very minimal set of 
USFM elements. At the moment, I'm working to make it process all tags 
present within the USFM versions of the WEB and RV, and this has raised 
an issue.

I've been working primarily with the USFM reference from UBS ICAP, 
treating it as a sort of specification. My question is: should this new 
utility accept USFM that does not conform to the reference at UBS ICAP?

Should it accept & interpret USFM tags that are not present in the 
reference?

One specific example is that the WEB uses \fqa*, which is obviously 
intended as an end-tag version of \fqa (used to mark alternate 
translations). But the USFM reference does not identify this as a valid 
end-tag, by my reading.
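
To make the question concrete, here is a minimal sketch of how such tags could be spotted. It is not code from usfm2osis.py: the KNOWN_MARKERS set is a hypothetical, deliberately tiny subset standing in for the full list in the USFM reference, and the sample line is invented for illustration.

```python
import re

# Hypothetical, deliberately tiny subset of markers. A real list would be
# transcribed from the USFM reference, not guessed at like this one.
KNOWN_MARKERS = {"id", "c", "v", "p", "f", "f*", "fr", "ft", "fq", "fqa"}

# A backslash, lowercase letters, optional digits, optional closing asterisk.
MARKER_RE = re.compile(r"\\([a-z]+[0-9]*\*?)")


def unknown_markers(text):
    """Return the sorted, de-duplicated markers in text not in KNOWN_MARKERS."""
    return sorted(set(MARKER_RE.findall(text)) - KNOWN_MARKERS)


# Invented sample line, not taken from the WEB.
sample = r"\v 1 In the beginning\f + \fr 1:1 \fqa Or, \fqa* when God began\f*"
print(unknown_markers(sample))  # ['fqa*']
```

With a complete marker list, a report like this would show how much non-conformant markup a given text actually contains before deciding how to handle it.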

So... should we...

a) Make the new utility accept non-conformant USFM (from the perspective 
of the USFM reference). I'm leery of this, since one of my reasons for 
writing the new utility was to keep it pristinely spec-conformant, and I 
have a feeling we might start incorporating tags and syntax that are 
less obviously interpretable than \fqa*.

b) Write a separate utility to convert common and interpretable 
non-conformant tags/syntax to conformant markup.

c) Add a command-line switch to usfm2osis.py so that it performs a 
pre-processing step of making non-conformant tags/syntax into conformant 
markup. (This would be the same as option b, but would place everything 
in a single utility.)

d) Punt on the issue, and let those performing conversion deal with 
non-conformant markup on a case-by-case basis.
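
For options b and c, the pre-processing step might look roughly like the sketch below. Everything in it is an assumption made for illustration: the --normalize switch name, the REWRITES table, and in particular the choice to rewrite \fqa* as \ft, which is one possible reading of the intent and not something the USFM reference prescribes.

```python
import argparse
import re

# Hypothetical rewrite table: (pattern for a non-conformant marker, replacement).
# Rewriting \fqa* as \ft is an assumption, not a rule from the USFM reference.
REWRITES = [
    (re.compile(r"\\fqa\*[ ]?"), "\\ft "),
]


def normalize(text):
    """Apply every rewrite in REWRITES to text and return the result."""
    for pattern, replacement in REWRITES:
        # A function is used as the replacement so that re.sub does not
        # interpret the backslash in the replacement as an escape sequence.
        text = pattern.sub(lambda match, r=replacement: r, text)
    return text


def build_parser():
    """Build a parser with a hypothetical --normalize switch (option c)."""
    parser = argparse.ArgumentParser(description="USFM pre-processing sketch")
    parser.add_argument(
        "--normalize",
        action="store_true",
        help="rewrite known non-conformant markers before conversion",
    )
    return parser


def preprocess(text, argv):
    """Return text, normalized only when --normalize appears in argv."""
    args = build_parser().parse_args(argv)
    return normalize(text) if args.normalize else text


# Invented sample line, not taken from the WEB.
sample = r"\f + \fr 1:1 \fqa Or, \fqa* when God began\f*"
print(preprocess(sample, ["--normalize"]))
```

Run as a standalone script, normalize() alone would be option b. Called from the converter behind the switch, it is option c, and the default path stays strictly conformant.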


--Chris


