<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Dec 29, 2014, at 2:28 PM, Robert Hunt &lt;<a href="mailto:hunt.robertj@gmail.com" class="">hunt.robertj@gmail.com</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class="">
<meta content="text/html; charset=utf-8" http-equiv="Content-Type" class="">
<div text="#000000" bgcolor="#FFFFFF" class="">
On 30/12/14 06:29, Peter von Kaehne wrote:<br class="">
<blockquote cite="mid:1419874157.2199.8.camel@gmx.net" type="cite" class="">
<pre wrap="" class="">It is very well written and neatly done and does its job with near
perfection. I would welcome contributions to it, as long as they are
equally well done. </pre>
</blockquote>
Just for your info: usfm2osis.py basically treats each USFM book as
a huge hunk of text to which it does a large number of global text
substitutions. Although this, in fact, does make it a very neat and
tidy program, I don't think it's nearly perfect. The main
disadvantage of using this method can be expressed as two results to
the user (and I think these are quite serious defects in terms of
reliable module making as other threads attest):<br class="">
<ol class="">
<li class="">Certain errors or non-conformities in the USFM are not even
detected (e.g., when \d is used as a paragraph type marker with
verses logically "inside" the \d marker which is not actually
documented [nor banned] in the USFM specification)<br class=""></li></ol></div></div></blockquote><div><br class=""></div>I can understand that detecting this is beyond what usfm2osis can be expected to do. That said, the bigger problem is that a global search and replace over multiple patterns fundamentally assumes that USFM tags appear in the same order as the corresponding OSIS elements. A global search and replace simply cannot create the proper nesting of logical elements that OSIS requires. (I haven’t looked at the program, so I’m taking your word that this is what it does.)</div><div><br class=""></div><div>Right now there are some reported problems with the placement of verse starts and ends.</div><div><br class=""></div><div>OSIS provides milestoned elements so that competing markup can overlap (e.g. document structure (BSP) vs. biblical versification (BCV) vs. quotations). But it also permits overlap between markup that shouldn’t overlap.</div><div><br class=""></div><div>I’ve been thinking for quite a while that we need more than validation against the OSIS schema. I think much of this could be accomplished by having two OSIS schemas: one for BSP and another for BCV.</div><div><br class=""><blockquote type="cite" class=""><div class=""><div text="#000000" bgcolor="#FFFFFF" class=""><ol class="" start="1"><li class="">
<br class="">
</li>
<li class="">If there is an error, the program is completely unable to give
the user any indication of where (e.g., line number or
chapter/verse) the error occurs because it has absolutely no
concept of "position within the file".</li></ol></div></div></blockquote><div><br class=""></div>Having the notion of “position within the file” is useful for debugging, too.</div><div><br class=""></div><div><blockquote type="cite" class=""><div class=""><div text="#000000" bgcolor="#FFFFFF" class=""><ol class="" start="1">
</ol><p class="">Perhaps this is accounted for by running some other program first
to thoroughly check that the formation of the USFM is within the
expected/programmed range???<br class="">
</p><p class="">Robert.<br class="">
</p>
<br class="">
</div>
</div></blockquote></div><br class=""></body></html>