[sword-devel] Re: [osis-editors] OSIS 2.0.1 modules updated

Michael Paul Johnson sword-devel@crosswire.org
Wed, 17 Mar 2004 10:55:00 +1000


At 06:46 17-03-04, Patrick Durusau wrote:
>Michael,
>
>Appreciate all the work you have done with OSIS and even more with the
>Bible but there is a remark below that puzzles me:
>
>Michael Paul Johnson wrote:
>> We have no plans to support the use of <q> replacing typographic quote
>> characters for any of our uses of OSIS, because the current standard
>> provides no way to ensure that the generated punctuation would be
>> correct for each translation and language. The OSIS standard should
>> represent the Bible text, not change it.
>>
>>
>
>Curious how you see the current standard as providing "no way to ensure
>that the generated punctuation would be correct for each translation and
>language."?
>
>Generation of punctuation, as opposed to encoding quotations, is really
>a matter of the rendering software and not the OSIS encoding. I don't
>think you could do it in XSLT (perhaps but I don't think so) but if you
>want continuation quotes, for example, if a <q/> as a milestone has
>opened, and has not yet closed, if you want continuation quotes at the
>beginning of each <verse> before the closing <q/> milestone, you can
>certainly do so.
>
>Is there some case in particular that is a problem? Realizing that it is
>going from one language to another where handling of quotations gets
>real messy. Best we can do is mark the quotes accurately and without
>ambiguity.

Let me try to explain again what I meant. In short, I don't believe that
OSIS provides enough information to the processors of OSIS texts to
reliably regenerate the correct punctuation with regard to quotation marks.
That is because it does not.

It is not good enough to "allow" the use of different characters for
quotation marks. It is not good enough to "allow" the generation of
quotation continuation reminders at the beginnings of verses and/or
paragraphs. These rules MUST be specified OR the quotations MUST be already
in place, rendered correctly. Otherwise, you are inviting programmers to
change the punctuation in Bible texts. That may not bother you, but it
strikes me as being WRONG. Let the Bible translators and publishers control
the punctuation for each language and each translation. Please.

Not every language uses the same rules for quotation punctuation as English.
Not even every dialect of English, nor even every modern English Bible
translation uses the same rules. Even languages that use mostly the same
alphabet and punctuation as English may use different quotation marks. If I
were implementing an OSIS reader right now, and desired to faithfully
reproduce the punctuation of the original translation based on what was in
the OSIS text alone, I could not. Not even for modern English.

Take a good look at a printed NASB. Note that continuation opening quotes
are present at the beginning of every verse while a quote is open.
Take a good look at a printed NIV. Note that continuation opening quotes
are present at the beginning of every paragraph when needed, but not at the
beginning of every verse. Now consider the Spanish "La Biblia de las
Américas." Note that it doesn't usually use quotation marks, but it does
mark Jesus' words in red. It uses colons, capitalization, and other hints
to indicate quotations. The Spanish RVA uses quotation marks for some
quotations, but not for all of them. The Bargam (Madang Province, PNG) New
Testament does not use quotation marks like English, but the Borong (Morobe
Province, PNG) New Testament uses quotation marks in the same manner as the
English NIV.

ALL of those cases present problems for using the <q> element as the current
revision of OSIS defines it.
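
To make the NASB/NIV difference concrete, here is a minimal Python sketch of
what a reader would have to know before it could generate continuation
quotes. The policy names ("every-verse", "every-paragraph", "none") and the
paragraph/verse data layout are invented for illustration only. As far as I
can tell, nothing in the current OSIS standard records which policy a given
translation follows.

```python
OPEN, CLOSE = "\u201c", "\u201d"  # English-style curly quotes (an assumption)

POLICIES = ("every-verse", "every-paragraph", "none")  # hypothetical names


def render_quote(paragraphs, policy):
    """Render one long quotation that spans several verses and paragraphs.

    paragraphs: list of paragraphs, each a list of (verse_number, text)
                pairs that all lie inside a single open quotation.
    policy:     "every-verse"     repeats the opening mark at each verse
                                  (NASB-like),
                "every-paragraph" repeats it at each paragraph only
                                  (NIV-like),
                "none"            generates no quotation marks at all.
    """
    if policy not in POLICIES:
        raise ValueError(f"unknown policy: {policy!r}")
    rendered = []
    for paragraph in paragraphs:
        pieces = []
        for index, (verse, text) in enumerate(paragraph):
            starts_paragraph = index == 0
            if policy == "every-verse" or (
                policy == "every-paragraph" and starts_paragraph
            ):
                text = OPEN + text
            pieces.append(f"{verse} {text}")
        rendered.append(" ".join(pieces))
    if policy != "none" and rendered:
        rendered[-1] += CLOSE  # the quotation closes only once, at the end
    return rendered


if __name__ == "__main__":
    sample = [[(1, "A."), (2, "B.")], [(3, "C.")]]
    for name in POLICIES:
        print(name, render_quote(sample, name))
```

The same marked-up text gives three different printed results depending on a
setting that has to come from somewhere outside the text.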

This is NOT acceptable to me. I think I'm pretty reasonable, and I like to
use standards in a standard way, but if OSIS stays the same, I will never
use it exactly as it was specified. If one of the most active proponents
of your standard feels that way, maybe you should look at the problem
again?

For now, I will continue to recommend that everyone embed correct
punctuation directly in the text of OSIS documents and use <q> in a
nonstandard manner to mark Jesus' words, when desired, like the following
quote.

<q sID="Matt.3.15.1" who="Jesus" type="x-doNotGeneratePunctuation" />“Allow
it now, for this is the fitting way for us to fulfill all
righteousness.”<q eID="Matt.3.15.1" />

If anyone ignores the type="x-doNotGeneratePunctuation", and generates
punctuation from the markers anyway, they will get double punctuation. This
is not good. I'm hoping you will change the standard to something that we
can actually use and feel good about.
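
The double-punctuation risk can be sketched in a few lines of Python. This
is only an illustration, assuming a simplified verse fragment with no XML
namespace and a made-up render() function; it is not how any particular
OSIS reader works.

```python
import xml.etree.ElementTree as ET

OPEN, CLOSE = "\u201c", "\u201d"
NO_PUNCT = "x-doNotGeneratePunctuation"  # the nonstandard type value above

# Simplified fragment: real OSIS files declare a namespace, omitted here.
SAMPLE = (
    '<verse><q sID="Matt.3.15.1" who="Jesus" '
    'type="x-doNotGeneratePunctuation" />'
    "\u201cAllow it now.\u201d"
    '<q eID="Matt.3.15.1" /></verse>'
)


def render(fragment, honor_type=True):
    """Return the verse text, generating quote marks at q milestones.

    With honor_type=True, a start milestone typed NO_PUNCT (and its
    matching end milestone) generates nothing, because the marks are
    already in the text. With honor_type=False the type is ignored.
    """
    root = ET.fromstring(fragment)
    suppressed = set()  # sID values whose marks must not be generated
    out = [root.text or ""]
    for element in root:
        if element.tag == "q":
            start, end = element.get("sID"), element.get("eID")
            if start is not None:
                if honor_type and element.get("type") == NO_PUNCT:
                    suppressed.add(start)
                else:
                    out.append(OPEN)
            elif end is not None and end not in suppressed:
                out.append(CLOSE)
        out.append(element.tail or "")  # text following the milestone
    return "".join(out)


if __name__ == "__main__":
    print(render(SAMPLE))                    # marks appear once
    print(render(SAMPLE, honor_type=False))  # marks appear twice
```

A reader that knows about the type value leaves the embedded marks alone; one
that does not doubles them.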

The <q> marker as you have defined it has merit when generating quotation
punctuation in the first place in a new translation. After that is done, it
has no merit, at least for any application that I am concerned with:
translation, typesetting, and electronic distribution. The only reason I
use it at all is that you provided no other way to mark Jesus' words for a
"red letter edition" of a Bible. Granted, the words of Jesus were not so
marked in the original manuscripts, and some people argue that they should
not be, but you simply won't get widespread acceptance of a Scripture
interchange standard unless you support this traditional feature. People
who read such texts can freely choose to use red ink or not, as far as I'm
concerned, but the markup should be there and accurate for those who choose
to use it.

I hope that now you understand why my OSIS conversion software inserts
some disclaimers in the revision description element.

   <revisionDesc resp="Rainbow Missions, Inc. http://RainbowMissions.org">
    <date>2004-03-14T12.25.09</date>
    <p>
This draft version of the World English Bible is substantially complete in
the New Testament, Genesis, Exodus, Job, Psalms, Proverbs, Ecclesiastes,
Song of Solomon, and the “minor” prophets. Editing continues on the other
books of the Old Testament. Apocrypha books in this file are still in rough
draft form.
</p>
    <p>Converted ..\..\web.gbf in GBF to web.osis.xml in
an XML format that attempts to comply with OSIS 2.0 using gbf2osis.exe.
(Please see http://ebt.cx/translation/ for links to this software.)</p>
    <p>GBF and OSIS metadata fields do not exactly correspond to each other, so
the conversion is not perfect in the metadata. However, the Scripture portion
should be correct.</p>
    <p>No attempt was made to convert quotation marks to structural markers
using q or speech elements, because this would require language and
style-dependent processing. In English texts, the hard part is figuring out
what ’ means.
The other difficulty is that I am not yet convinced that the proper
punctuation marks would be reconstituted by software that reads OSIS files.</p>
    <p>The output of gbf2osis marks Jesus' words in a non-standard way using the q
element AND quotation marks if they were marked with FR/Fr markers in the GBF
file. The OSIS 2.0 specification requires that quotation marks be stripped out,
and reinserted by software that reads the OSIS files when q elements are used.
To convert this to an OSIS 2.0 file, you must either remove all q elements,
remove the quotation marks around Jesus' quotes, or convince the keepers of the
standard to change the standard.</p>
    <p>OSIS does not currently support footnote start anchors. Therefore, these
start anchors have been represented with milestone elements, in case someone
might like to use them, for example, to start an href element in a conversion
to HTML.</p>
    <p>Traditional psalm book titles are rendered as text rather than titles, because
the title element does not support containing transChange elements, as would be
required to encode the KJV text using OSIS title elements.</p>
    <p>The schema location headers were modified to use local copies rather than the
standard locations so that these files could be validated and used without an
Internet connection active at all times (very important for the developer's
remote island location), but you may wish to change them back.</p>
   </revisionDesc>
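
For the first of the three options named in that disclaimer, removing the q
elements, something like the following Python sketch might do. It assumes
that q is only ever used in the empty milestone form shown earlier
(<q sID=... /> and <q eID=... />). The function name is made up, and a real
converter would probably want a proper XML parser rather than a regular
expression.

```python
import re

# Matches empty-element q milestones such as <q sID="..." who="Jesus" />.
# The \b keeps it from matching other element names that begin with "q".
# Container-style <q>...</q> elements are deliberately not handled.
Q_MILESTONE = re.compile(r"<q\b[^>]*/>")


def strip_q_milestones(osis_text):
    """Remove q milestone elements, leaving the embedded punctuation."""
    return Q_MILESTONE.sub("", osis_text)


if __name__ == "__main__":
    line = (
        '<q sID="Matt.3.15.1" who="Jesus" '
        'type="x-doNotGeneratePunctuation" />'
        "\u201cAllow it now.\u201d"
        '<q eID="Matt.3.15.1" />'
    )
    print(strip_q_milestones(line))
```

This keeps the translators' punctuation but gives up the red-letter markup,
which is the trade-off described above.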


I hope this helps. :-)

Your fellow servant of Jesus Christ,
Michael
