[sword-devel] OSIS 2.0.1 modules available

Chris Little sword-devel@crosswire.org
Thu, 05 Feb 2004 15:18:04 -0800


Michael Paul Johnson wrote:
>>Looks good.  I saw just a few issues that need some correction.  The
>>most important is that <verse> eIDs need a value matching the
>>preceding sID on another <verse> element.  I think this is the only
>>issue that actually violates the spec.
> 
> Oops! Sorry about that. I have corrected the error in my source code 
> that did that, and will be uploading updates when I can. (I'm trying 
> not to be envious of broadband Internet connections available all over 
> the USA & other more developed nations.)

Don't be envious of me; all I have is 9.6 kbps at the moment. :)

> Of course, this does bring up a question. Should overlapping verses 
> ever be allowed? I would hope not, but the syntax would seem to allow 
> it. Perhaps something should be said in the documentation about that. 
> Actually, the contents of sID and eID markers on verse elements are
> entirely redundant (assuming you don't overlap verses), but someone 
> might actually look at them, so I would rather have them be useful. My 
> intention was to make them the same as the osisID of the first verse 
> of the verse bridge set (which is the only verse in the case of most 
> normal verses), as you suggested.

In most cases, you won't have overlapping verses.  However, if you mark 
multiple versification schemes in a single document, you can have 
overlapping verse containers.  There may even be some cases where you 
find verses embedded within other verses, in the deuterocanonicals, 
depending on how you do translation.
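
For anyone who wants to check their own files for the eID problem, here is a rough Python sketch, not anything official.  It assumes verses are marked with milestone-style <verse sID="..."/> and <verse eID="..."/> pairs, and it deliberately allows overlapping verses, since those can be legitimate.  The function name and the sample fragment are made up for illustration.

```python
import xml.etree.ElementTree as ET

def check_verse_milestones(osis_xml):
    """Return a list of problems found in <verse sID/eID> milestone pairing."""
    problems = []
    open_ids = []  # sIDs seen but not yet closed, in document order
    for elem in ET.fromstring(osis_xml).iter():
        # Drop any "{namespace}" prefix so the check works with or without one.
        if elem.tag.rsplit("}", 1)[-1] != "verse":
            continue
        sid, eid = elem.get("sID"), elem.get("eID")
        if sid is not None:
            if sid in open_ids:
                problems.append(f"duplicate sID {sid!r}")
            open_ids.append(sid)
        if eid is not None:
            if eid in open_ids:
                # Any open verse may close here, so overlap is allowed.
                open_ids.remove(eid)
            else:
                problems.append(f"eID {eid!r} has no preceding sID")
    problems.extend(f"sID {sid!r} is never closed" for sid in open_ids)
    return problems

# Hypothetical fragment: the second verse ends with the wrong eID.
SAMPLE = """<div type="book" osisID="Gen">
  <verse osisID="Gen.1.1" sID="Gen.1.1"/>first verse text<verse eID="Gen.1.1"/>
  <verse osisID="Gen.1.2" sID="Gen.1.2"/>second verse text<verse eID="Gen.1.3"/>
</div>"""

for problem in check_verse_milestones(SAMPLE):
    print(problem)
```

On the sample, this reports the stray eID and the sID that is never closed.  Because an eID may close any verse that is still open, a second versification scheme overlapping the first would pass the check.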

>>Aside from that:
>>The book <div> elements should have an osisID attribute where you
>>used scope.
> 
> I'll add an osisID attribute to those and leave the scope. Redundancy 
> is obviously not a problem in OSIS. I rather think it is regarded as a 
> virtue. <grin>

scope and the osisID mean slightly different things (though both would
be accurate in this case).  So it's not exactly redundancy (which is not
regarded as a virtue) but verbosity (which might be :).

>>The code for English is "en".  You can use "ENG" in the <language
>>type="SIL"> element, however.  (This isn't yet clear from the manual,
>>of course, but I expect the final version of the manual will cover
>>this area adequately.)
> 
> 
> I did use "en" for English texts in <osisText osisIDWork="WEB" 
> osisRefWork="Bible" xml:lang="en">,  but since I am most interested in 
> minority languages without two-letter codes, I'd prefer to stick with 
> the SIL Ethnologue codes wherever practical. For now, "ENG" is good in 
> the language element. The type is supplied, so it is not ambiguous. If 
> I nudge people towards supporting Ethnologue language codes, that 
> would be a good thing.

I saw an "x-SIL-ENG" in the <osisText> element of the WEB.  I'd like to 
release a nice, complete (at least for the language portion of the code) 
list of what code to use for which language by the time we have OSIS 2.1 
ready.  If everyone kind of sticks to that, we'll all be speaking the 
same language.

Since SIL (and maybe LINGUIST) codes are being adopted into ISO, the 
whole issue should go away relatively soon.

>>Various other issues, like the format of the <identifier type="OSIS">,
>>are in flux, and will probably be defined in OSIS 2.1 or the final
>>manual.  (My current best guess at the value is
>>"Bible.en.Rainbow_Ministries.WEB.2004-01-22".)
> 
> 
> Actually, that should be "Rainbow_Missions" instead of 
> "Rainbow_Ministries" for the publisher name. That is easy to adjust, 
> as it is just a constant in the GBF -> OSIS converter code.

Sorry, my mistake.  But again... this format is not set in stone yet.
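
If it helps converter authors in the meantime, the best-guess value could be taken apart roughly as below.  This is only a sketch: it assumes five dot-separated fields, and the field names (work_type, language, publisher, name, date) are my own labels for illustration, not anything the spec defines.

```python
from typing import NamedTuple

class WorkIdentifier(NamedTuple):
    # Field names are illustrative guesses, not defined by OSIS.
    work_type: str
    language: str
    publisher: str
    name: str
    date: str

def parse_identifier(value):
    """Split a best-guess identifier value into its five dot-separated fields."""
    parts = value.split(".")
    if len(parts) != 5:
        raise ValueError(f"expected 5 dot-separated fields, got {len(parts)}: {value!r}")
    return WorkIdentifier(*parts)

ident = parse_identifier("Bible.en.Rainbow_Missions.WEB.2004-01-22")
print(ident.publisher, ident.date)
```

A publisher name containing a dot would break this, which is one more reason the format still needs to be pinned down.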

>>>If you care to alter the <q> marker and quote marks to strictly
>>>comply with the OSIS 2 documentation, then you face the following
>>>difficulties:
>>>
>>>1. You MUST provide additional information outside of the OSIS
>>>standard to the users of OSIS text that allows the punctuation to
>>>be EXACTLY recreated as in the original text. The rules of this
>>>recreation and the exact markers used are different for different
>>>languages, different dialects, and even for different translations
>>>within the same dialect. They aren't even the same for all of the
>>>texts above. If you use the <q> marks in the KJV to generate red
>>>text, that is OK, but if you generate quotation marks, you are
>>>changing the text. The KJV has no quotation marks, nor does the ASV.
>>
>>I was sympathetic with this position, since it really does make 
>>conversion from other formats easier, but using <q> is undeniably 
>>better.
> 
> 
> I still deny that it is better. I remain unconvinced that use of <q> 
> to generate punctuation should be mandatory. Maybe I just don't like 
> computer geeks telling linguists & Bible translators what to do. Maybe 
> I have some valid reasons that you should consider.
> 
> I do concede that it is good to allow <q> to be used to generate 
> quotation marks where it makes sense -- and in some places it makes 
> lots of sense. I still disagree that it should be mandatory. I might 
> want to use this feature if I were drafting an entirely new 
> translation in OSIS (or something that converted more directly to 
> OSIS, which is more likely), and if I had software in hand to insert 
> the quotation marks the way they should go for this language and 
> style. I still think that once that insertion was done, I would prefer 
> to distribute the resulting text with quotation marks already 
> generated, and <q> tags, if present, serving only to indicate who the 
> speaker was. That way OSIS readers don't have to know all language & 
> style rules pertaining to punctuation for every language (not likely 
> to happen, really), and OSIS doesn't have to be extended to specify 
> all of these rules.

I think most of us who've worked on the standard are only part-time 
computer geeks.  We manage to pack an unusually high percentage of 
linguists and people who work in translation into the room when we talk 
about this stuff.  (I don't know that we have any translators, but we
have translation checkers from UBS, folks from IPUB at SIL, and such.)

Using typographic quotes isn't invalid.  It just fails to meet one of 
the lower levels of conformance.
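
To make the trade-off concrete, here is a small Python sketch of what a reader has to do when quotation marks are generated from <q> elements.  It is an illustration under assumptions, not how any existing front end works: the style table is hypothetical, the sample sentence is invented, and the who attribute is used only to show that speaker information can ride along on the same element.

```python
import xml.etree.ElementTree as ET

# Hypothetical style table: (open, close) marks per nesting depth, cycling.
US_ENGLISH = [("\u201c", "\u201d"), ("\u2018", "\u2019")]

def render(elem, marks, depth=0):
    """Flatten an element to text, wrapping each <q> in the marks for its depth."""
    is_quote = elem.tag == "q"
    inner_depth = depth + 1 if is_quote else depth
    parts = [elem.text or ""]
    for child in elem:
        parts.append(render(child, marks, inner_depth))
        parts.append(child.tail or "")  # text that follows the child element
    text = "".join(parts)
    if is_quote and marks:
        open_mark, close_mark = marks[depth % len(marks)]
        text = open_mark + text + close_mark
    return text

verse = ET.fromstring(
    '<verse>He said, <q who="Jesus">Go and tell them, <q>Peace</q></q> and left.</verse>'
)
print(render(verse, US_ENGLISH))  # double marks outside, single marks inside
print(render(verse, []))          # a style with no quotation marks at all
```

An empty style table covers texts like the KJV, where the <q> elements can still drive red-letter display without adding punctuation.  The hard part Michael describes is everything this sketch leaves out: a real style table per language and per translation.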

> Of course, only a hard-core computer geek would manually edit 
> OSIS Scripture texts (i.e., for a new translation) with nothing but a
> text editor, so I'll wait to see if anyone generates a Scripture 
> editor that generates OSIS text that is easier to use than the current 
> alternatives. 

There's going to be an MS Word 2003-based OSIS export facility 
(developed by CCEL).  Also, an upcoming version of Paratext should 
support OSIS export (and notably, should support at least some degree of 
note type specification, since USFM has the facility for that).

--Chris