[sword-devel] usfm2osis.py
Chris Little
chrislit at crosswire.org
Sun Aug 5 00:25:29 MST 2012
On 08/04/2012 10:10 PM, Robert Hunt wrote:
> On 05/08/12 00:15, Chris Little wrote:
>> Bug reports are welcome if you try it, but this is still largely
>> untested stuff, so expect bugs.
>>
>>
>> The other script in the above directory can be used to identify all of
>> the USFM tags used in a set of files and will specify which of them
>> are unknown to the USFM 2.35 reference.
> I'm not sure how to submit bug reports, but in testing this on our
> in-progress translation I get:
>
> From: usfmtags.py
>
> Known USFM Tags: \b, \bk, \bk*, \c, \f, \f*, \fq, \fr, \ft, \h, \id,
> \ide, \io1, \io2, \ior, \ior*, \iot, \ip, \is, \it, \it*, \li, \m,
> \mr, \ms, \mt, \mt1, \mt2, \nb, \p, \q, \q1, \q2, \q3, \r, \s, \s2,
> \s3, \tc1, \tcr2, \tr, \v, \x, \x*, \xo, \xt
> Unrecognized USFM Tags:
>
> which is correct, but from usfm2osis.py I get:
>
> Encoding unknown, processing as UTF-8.
> Encoding unknown, processing as UTF-8.
> Unhandled USFM tags: \n, \o1, \o2, \or, \or*, \ot, \p, \v (8 total)
> Consider using the -r option for relaxed markup processing.
>
> which are all false errors. The n is actually nb in the USFM, and the
> others are all from introduction tags, i.e., io1 io2, ior, etc.
Thanks Robert, it does help tremendously. You're welcome to file reports
in the MODTOOLS project of our bug tracker, as well. (Here's the report
for this bug: http://www.crosswire.org/bugs/browse/MODTOOLS-32)
I realized earlier today that I've badly bungled handling of all the \i-
introduction elements, so that bit needs to be redone completely.
I couldn't guess why it's missing \nb, since that one is treated like
every other paragraph type and I have actually tested that one. Maybe the
problem will become apparent when I finish the test suite.
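
For illustration only, here is a minimal sketch of the kind of tag inventory described above: collect every USFM marker in a text as a whole token and report the ones missing from a known set. It is not the code of usfmtags.py or usfm2osis.py, and it is not a diagnosis of the bug. The names KNOWN_TAGS, TAG_RE, find_tags and unknown_tags are hypothetical, the known set is a small hand-picked subset rather than the USFM 2.35 list, and \zz is a made-up marker. The point is that matching whole markers keeps \nb and \io1 intact instead of reporting fragments such as \n or \o1.

```python
import re

# Hypothetical, heavily abbreviated set of known markers (assumption:
# the real USFM 2.35 reference defines many more than these).
KNOWN_TAGS = {
    r"\id", r"\c", r"\v", r"\p", r"\nb",
    r"\io1", r"\io2", r"\ior", r"\ior*", r"\iot", r"\ip",
}

# A marker is a backslash, one or more lowercase letters, optional
# digits, and an optional '*' that closes a character-level marker.
TAG_RE = re.compile(r"\\[a-z]+[0-9]*\*?")


def find_tags(text):
    """Return the set of whole markers that occur in text."""
    return set(TAG_RE.findall(text))


def unknown_tags(text, known=KNOWN_TAGS):
    """Return the markers in text that are not in known, sorted."""
    return sorted(find_tags(text) - known)


# Made-up sample text; \zz is deliberately not a known marker.
sample = (
    "\\id GEN\n"
    "\\io1 Creation \\ior 1:1-2:3\\ior*\n"
    "\\c 1\n"
    "\\p\n"
    "\\v 1 In the beginning\n"
    "\\nb\n"
    "\\v 2 And the earth\n"
    "\\zz custom\n"
)

if __name__ == "__main__":
    print("Tags found:", ", ".join(sorted(find_tags(sample))))
    print("Unrecognized:", ", ".join(unknown_tags(sample)))
```

Because the pattern consumes all the letters of a marker before stopping, \nb can never be split into \n plus a stray b, and the \i-prefixed introduction markers stay whole.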
--Chris