[sword-devel] A call for Python programmers ...
DM Smith
dmsmith at crosswire.org
Mon Dec 29 14:01:57 MST 2014
> On Dec 29, 2014, at 2:28 PM, Robert Hunt <hunt.robertj at gmail.com> wrote:
>
> On 30/12/14 06:29, Peter von Kaehne wrote:
>> It is very well written and neatly done and does its job with near
>> perfection. I would welcome contributions to it, as long as they are
>> equally well done.
> Just for your info: usfm2osis.py basically treats each USFM book as a huge hunk of text to which it does a large number of global text substitutions. Although this, in fact, does make it a very neat and tidy program, I don't think it's nearly perfect. The main disadvantage of using this method can be expressed as two results to the user (and I think these are quite serious defects in terms of reliable module making as other threads attest):
> Certain errors or non-conformities in the USFM are not even detected (e.g., when \d is used as a paragraph type marker with verses logically "inside" the \d marker which is not actually documented [nor banned] in the USFM specification)
I can understand that this is outside the range of what usfm2osis can be expected to do. That said, the bigger problem is that a global search and replace of multiple patterns rests on the fundamental assumption that the order of USFM tags is the same as the order of OSIS elements. A global search and replace simply cannot create the proper nesting of logical elements that OSIS requires. (I haven’t looked at the program, so I’m taking your word that this is what the program does.)
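To make the nesting problem concrete, here is a minimal sketch in Python. The two substitutions in `naive_convert` are hypothetical and are not taken from usfm2osis.py; the element names are simplified stand-ins rather than real OSIS. It only shows that when a verse runs across a paragraph break, turning each USFM marker into a container tag yields XML that is not well-formed.

```python
import re
import xml.etree.ElementTree as ET

# A verse that continues past a paragraph break, which is common in USFM.
USFM = "\\p\n\\v 1 First half of verse one\n\\p\nsecond half. \\v 2 Verse two."

def naive_convert(usfm):
    """Hypothetical global substitutions; NOT what usfm2osis.py actually does."""
    # Every \p closes the previous paragraph and opens a new one.
    text = re.sub(r"\\p\s*", "</p><p>", usfm)
    # Every \v N closes the previous verse and opens a new one.
    text = re.sub(r"\\v (\d+) ", r'</verse><verse n="\1">', text)
    # Drop the stray leading close tags, then close whatever is still open.
    text = text.replace("</p>", "", 1).replace("</verse>", "", 1)
    return "<div>" + text + "</verse></p></div>"

def is_well_formed(xml):
    """Return True if the string parses as XML."""
    try:
        ET.fromstring(xml)
        return True
    except ET.ParseError:
        return False

print(naive_convert(USFM))
print(is_well_formed(naive_convert(USFM)))
```

The output contains `<p><verse n="1">...</p>`, so the paragraph closes while the verse is still open. The same substitutions give well-formed output when every verse stays inside one paragraph, which is why this kind of defect can go unnoticed on simple input.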
Right now there are some reported problems with the placement of verse starts and ends.
OSIS allows for milestoned elements to allow for overlapping of competing markup (e.g. document structure (BSP) vs biblical versification (BCV) vs quotations). But it also allows for overlapping of markup that shouldn’t overlap.
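As a rough illustration of milestones, the fragment below marks verse boundaries with empty `<verse sID="..."/>` and `<verse eID="..."/>` elements, so a verse can cross a paragraph boundary and the XML stays well-formed. The OSIS namespace and most attributes are omitted for brevity, and `unmatched_milestones` is a hypothetical helper of my own, not part of any existing tool. It sketches the kind of check that goes beyond schema validation: every start milestone should have a matching end.

```python
import xml.etree.ElementTree as ET

# Simplified OSIS-like fragment (namespace omitted): verse 1 spans two paragraphs.
OSIS_FRAGMENT = """
<div>
  <p><verse osisID="Gen.1.1" sID="Gen.1.1"/>First half of verse one</p>
  <p>second half.<verse eID="Gen.1.1"/>
     <verse osisID="Gen.1.2" sID="Gen.1.2"/>Verse two.<verse eID="Gen.1.2"/></p>
</div>
"""

def unmatched_milestones(xml):
    """Return eID values that were never opened, then sID values never closed."""
    open_ids = []
    problems = []
    # iter() walks the tree in document order, which is what pairing needs.
    for elem in ET.fromstring(xml.strip()).iter("verse"):
        sid, eid = elem.get("sID"), elem.get("eID")
        if sid is not None:
            open_ids.append(sid)
        elif eid is not None:
            if eid in open_ids:
                open_ids.remove(eid)
            else:
                problems.append(eid)
    return problems + open_ids

print(unmatched_milestones(OSIS_FRAGMENT))
```

A fragment with a missing end milestone still parses as XML, so only a check like this one would flag it.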
I’ve been thinking for quite a while that we need more than validation against the OSIS schema. I think this can largely be accomplished by having two OSIS schemas: one for BSP and another for BCV.
>
> If there is an error, the program is completely unable to give the user any indication of where (e.g., line number or chapter/verse) the error occurs because it has absolutely no concept of "position within the file".
Having the notion of “position within the file” is useful for debugging, too.
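A small sketch of what position tracking could look like, assuming a line-by-line scan rather than whole-file substitution. The function name `check_verse_order` and the single rule it checks (verse numbers must increase within a chapter) are my own illustration, not something usfm2osis.py provides. It keeps the current line number and chapter/verse while scanning, so each message can say where the problem is.

```python
import re

# Matches a USFM marker such as \c or \v and, if present, the number after it.
MARKER = re.compile(r"\\([a-z]+)(?:\s+(\d+))?")

def check_verse_order(usfm):
    """Return messages, with positions, for verses that are out of order."""
    errors = []
    chapter, last_verse = None, 0
    for line_no, line in enumerate(usfm.splitlines(), start=1):
        for match in MARKER.finditer(line):
            marker, number = match.group(1), match.group(2)
            if marker == "c" and number:
                # A new chapter resets the verse counter.
                chapter, last_verse = int(number), 0
            elif marker == "v" and number:
                verse = int(number)
                if chapter is None:
                    errors.append(
                        f"line {line_no}: verse {verse} appears before any chapter marker"
                    )
                elif verse <= last_verse:
                    errors.append(
                        f"line {line_no}: {chapter}:{verse} follows {chapter}:{last_verse}"
                    )
                last_verse = verse
    return errors

sample = "\\c 1\n\\p\n\\v 1 One.\n\\v 3 Three.\n\\v 2 Two.\n"
for message in check_verse_order(sample):
    print(message)
```

The same loop is a natural place to hang other checks, since the line number and the current chapter and verse are always at hand.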
> Perhaps this is accounted for by running some other program first to thoroughly check that the formation of the USFM is within the expected/programmed range???
> Robert.