[sword-devel] usfm2osis.pl
David Haslam
dfhmch at googlemail.com
Tue Jul 10 02:46:58 MST 2012
Thanks for all the clarifications. I have no disagreements with any of the
points raised since yesterday.
And yes - we encounter all sorts of unexpected characters when examining
received SFM files.
The set I was analyzing yesterday was replete with U+202F Narrow no-break
space characters.
One of these was misused as the delimiter within a verse range tag, where
there should have been a hyphen-minus (U+002D).
usfm2osis.pl may or may not catch all these kinds of errors.
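For anyone wanting a quick pre-check of their own, here is a rough sketch in Python (not part of usfm2osis.pl or any SWORD tool). It lists suspect characters such as U+202F and flags \v tags where two verse numbers are joined by something other than a hyphen-minus. The SUSPECT set, the BAD_RANGE pattern and both function names are my own assumptions for illustration; adjust them to the files you actually receive.

```python
import re
import unicodedata

# Illustrative (assumed) set of characters worth flagging in received SFM text.
SUSPECT = {"\u202f", "\u00a0", "\u2009", "\u200b"}

# A \v tag whose two verse numbers are joined by a single character that is
# neither a hyphen-minus nor ordinary ASCII whitespace. ASCII classes are used
# on purpose: Python's \s would also match U+202F and hide the problem.
BAD_RANGE = re.compile(r"\\v +([0-9]+)([^0-9 \t\n\-])([0-9]+)")


def find_suspects(text):
    """Return (line, column, code point, Unicode name) for each suspect character."""
    hits = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in SUSPECT:
                name = unicodedata.name(ch, "UNKNOWN")
                hits.append((line_no, col, f"U+{ord(ch):04X}", name))
    return hits


def find_bad_ranges(text):
    """Return (first verse, delimiter code point, second verse) for odd \\v ranges."""
    return [
        (m.group(1), f"U+{ord(m.group(2)):04X}", m.group(3))
        for m in BAD_RANGE.finditer(text)
    ]
```

Read each SFM file as UTF-8 text and pass its contents to both functions. The range check will also flag other delimiters such as a comma, so treat its output as a list of places to inspect by eye rather than as definite errors.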
This is also why it's generally a very good idea to run Dirk's
GoBibleCreatorUSFMPreprocessor utility (http://gbcpreprocessor.codeplex.com/)
to do a few checks first, even though in theory it might be seen as
off-topic for SWORD.
Sometimes this will catch items that, once fixed, make the subsequent
processing steps less of a hassle when making a module.
NB. If it crashes while looking for "versification issues", then use trial
and error to isolate which SFM file[s] contain the cause of the crash.
Report discoveries like this back to the issues list on CodePlex.
David
--
View this message in context: http://sword-dev.350566.n4.nabble.com/usfm2osis-pl-tp4650500p4650514.html