[osis-editors] Re: [IPS] Re: OSIS: history, quotation marks,
any other issues?
Kahunapule Michael P. Johnson
Michael_Paul_Johnson at sil.org
Fri Mar 18 21:05:24 MST 2005
Doug Higby wrote:
>Dear Michael and Todd,
>
>Forgive me for entering into a rather complex discussion with a rather simple concern about quotations.
>
>
Actually, your simple concern is very important, and it is one that I
have considered. The other concerns I have in mind are:
* versatile reformatting of punctuation to display both "verse list" and
"paragraph oriented" Scriptures from the same source
* automatic adjustment of quotation marks when pulling a quote
containing quotes from a Scripture portion
* initial generation of quotation punctuation in a specific Scripture
translation, then optionally "freezing" the results.
Balance those with the following concerns:
* keeping OSIS readers simple by not requiring intimate knowledge of
every language and style on Earth.
* honoring the punctuation decisions of existing translations and
publishers, and respecting their copyrights
* facilitating automated, lossless conversion between significant Bible
interchange formats.
There is a way to address both sets of concerns. This is not an
either-or proposition. The markup used should be flexible enough to deal
with the applications it is used for, and the languages and styles it is
used to encode.
>Michael, you insist that quotations are punctuation that is an integral part of the scripture text and should not be coded as anything but actual text.
>
That is only partially correct. Punctuation, including quotation
punctuation, should always be allowed to be encoded as actual text.
There are times when it makes sense to encode quotations with markup.
> And the OSIS standard, as I surmise, is coding them as markup rather than text.
>
>
That is what I understand.
I would like to allow you or any other Bible translator to use markup to
generate quotation punctuation as you see fit, and to be able to do it
differently for different situations where it is appropriate for your
language.
I strongly object to requiring that quotation punctuation always be
coded as markup and never as actual text, for all languages, all
translations, and all time. I insist on having the OPTION of coding
punctuation as actual text, and not as markup. Unmodified OSIS is
unsuitable for my applications (including most SIL applications) because
of this restriction.
I would prefer to be able to use markup to indicate quotation
punctuation for your case and for the others that I mentioned, when the
translator wants to do that.
When the Scripture file moves from the translator to the publisher, then
I would like for the publisher to be able to determine from the markup
itself how to correctly generate quotation punctuation without reference
to another document; if not in your source document, at least in a
processed generated document used for exchange purposes.
The OSIS standard, at least as of the last time that I looked, does not
allow a document both to carry quotation punctuation as part of the text
AND to mark quotations that may be desirable to render with a different
character style. It has other minor defects that I could live with, but
this one is fatal. That means that I am opposed to using unmodified OSIS
until this defect is corrected.
A good Bible XML markup schema would:
* Allow 100% exact specification of where every single punctuation mark
goes, including quotation punctuation if desired by the translators,
without reference to any external style sheets.
* Allow markup for quotation starts and quotation ends, and even for
quotation reminders (which might be automatically generated), for the
purpose of generating quotation punctuation in different circumstances.
The automatic generation need not be included in the markup standard if
the results of that process can be unambiguously encoded.
* Allow for 100% automatic lossless conversion to and from USFM.
Unmodified OSIS does only one out of those three, unless this has
changed very recently. It isn't a matter of implementation. It is a
matter of what is possible to implement given the USFM and OSIS standards.
The best solution I could imagine is embodied in USFX
(http://ebt.cx/usfx/) already. In that case, quotation start, quotation
end, and quotation reminder markup exists, but (1) it is not mandatory
to use, (2) those elements act as containers for the actual quotation
punctuation to be used at that point, and (3) it is easy to run a
separate process to generate or regenerate quotation marks in any style
you please, embedding the results in your document in such a way that
not every USFX reader has to understand the punctuation rules of your
language. Only the one you use to generate or regenerate your
punctuation does.
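To make the container idea concrete, here is a minimal Python sketch.
It is not USFX: the element names qs and qe, the level and frozen
attributes, and the STYLES table are all hypothetical, chosen only for
illustration. The point it tries to show is that a language-aware tool
can rewrite the punctuation held inside the markers, while an ordinary
reader just emits the text it finds and needs no rules at all.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, NOT the real USFX schema: <qs> marks a quotation
# start and <qe> a quotation end. Each marker CONTAINS the punctuation
# to display at that point (possibly nothing).
SAMPLE = (
    '<p>Peter said: <qs level="1">\u2014</qs>'
    'You are the Messiah, the Son of God.<qe level="1"></qe></p>'
)

# Hypothetical rule tables a translator might supply:
# style name -> nesting level -> (opening mark, closing mark).
STYLES = {
    "dash": {1: ("\u2014", "")},
    "guillemets": {1: ("\u00ab\u00a0", "\u00a0\u00bb")},
}


def regenerate(root, style):
    """Rewrite the punctuation inside each marker, in place.

    Markers carrying frozen="true" (a made-up attribute standing in for
    a translator's intentional exception) are left untouched.
    """
    for el in root.iter():
        if el.tag in ("qs", "qe") and el.get("frozen") != "true":
            opener, closer = STYLES[style][int(el.get("level", "1"))]
            el.text = opener if el.tag == "qs" else closer
    return root


def plain_text(root):
    """What a simple reader does: emit the text and ignore the tags."""
    return "".join(root.itertext())
```

Calling regenerate(root, "guillemets") on the parsed sample and then
plain_text(root) would switch the displayed marks from the dash system
to the open/close system without the reader knowing either rule set.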
(Note: USFX isn't intended to do everything that OSIS can do, but it can
do everything that USFM can do plus a few things OSIS can do but USFM
can't do. I didn't invent USFX to compete with OSIS, but to solve a
problem that even slightly modified OSIS couldn't solve. Indeed,
authoring in USFX then converting to OSIS would be a good way to produce
OSIS files for a lot of people, as it can handle pretty much everything
the ordinary working linguist would use, and it is much simpler.)
Here is the essential difference between the religion of OSIS and my
rather pragmatic view of the many uses of a Scripture interchange format
file: I don't trust programmers, publishers, and people who don't even
speak the language of a Scripture translation to always generate the
correct quotation punctuation from markup. I don't believe that a few
simple rules are sufficient for people other than the translators to get
it right. There are stylistic decisions and exceptions to rules that are
intentionally made. For example, even in the extremely simple case of
the World English Bible, there are at least two intentional exceptions
to the rules in the quotation punctuation checking program that I wrote.
I do, however, trust the translators to provide a set of rules to
generate quotation punctuation from markup, or even multiple sets of
rules as options. I want the translators to be able to intentionally
specify exceptions to the rules, if necessary. I want to have the
translators' rules and exceptions "stick" when they pass the Scripture
file on to others. I don't want those others to have to know or
understand all the rules and exceptions, but to be able to simply read
the markup and display the results, and have them be right according to
the translator.
OSIS could be easily modified to do that. I would not be so vociferous
about this problem, had they done so many moons ago when I first brought
this problem to light.
In short, I'm on your side, Doug, but I am still opposed to using OSIS
as it is currently specified.
Does that make sense?
>Here is a case for you to consider:
>
>I am publishing the New Testament in Fulfulde over the next month in Dallas, and we have adopted a quotation system that follows the French system for direct speech. We mark direct speech with an m-dash at the beginning of the line as in:
>
>Peter said:
>--You are the Messiah, the Son of God.
>
>I am not confident that we will keep this form of punctuation, and some day, when we print the entire Bible, we may want to switch to using the angle brackets that both open and close the punctuation. Both are acceptable forms of punctuation.
>
>I would much rather have my data stored in a format where the markup was aware of where the quote started and where it stopped. The system I am currently using is opened with the m-dash, but can be closed by any number of format markers. Some format markers allow the quotation to continue with a new paragraph with no additional markers other than that the new paragraph is indented to the same level as the one above. I know there are other complex quotation systems for another reason too. If you go to the quotation checking utility built into Paratext, you will find that they have to determine the following information to see if a quote is closed properly or not:
>
>Data fields:
>Quotes:
>Quotes in Quotes:
>Quotes in Quotes in Quotes:
>Continue quotes (are quotes continued at each new paragraph?)
>Continue quotes in quotes
>Continue quotes after these markers:
>
>It isn't worth explaining the purpose of all these fields, except to say that they help Paratext check whether quotes have been properly terminated and marked.
>
>As complex as this model is, it can't handle my quotation system, and I have to check quotes by hand to a large degree.
>
>
This is a good argument against using markup to automatically generate
quotation marks, isn't it? I understand that you want to mark the
quotation start and stop points, but do you want some programmer in
India who never met you to write the rules for placing your quotation marks?
>To me, the markup language would be the ideal place to signal when a quote starts and when it stops. If the markup language permitted this, I would be able to switch from one quotation system to another, based on the media.
>
>
Yes, IF you were able to write the rules easily enough and embed them
into every OSIS reader on the planet.
>Example: The Parole de Vie French translation used the m-dash system for its New Testament, but when the Old Testament came out, they switched to an open/close system using angle brackets. They probably did this because the text had to be set in a smaller point size for the whole Bible, and also because they needed to conserve space there, since the m-dash quote system they used created a lot of white space that couldn't be afforded in the printed whole-Bible edition.
>
>I can't argue that the quotation system is an integral part of the text. I would instead argue that the quotation marking system is part of the markup language. The benefits of such are:
>
>1. Software can easily check the integrity of the quotation system, which is overly complex to accomplish with existing USFM.
>
>
You could check the open/close quote matching and nesting, but you
couldn't check the punctuation any more easily than you can now.
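As a rough illustration of the part that markup does make easy, here is
a small Python sketch of a matching and nesting check. The event format
(a list of ("start" or "end", level) pairs in document order) is my own
assumption for the example, not something defined by OSIS, USFM, or
Paratext. Note what it does not do: it says nothing about whether the
right marks were printed, which still needs the language's rules.

```python
def check_nesting(events):
    """Check quotation start/end events for matching and nesting.

    events: iterable of (kind, level) pairs in document order, where
    kind is "start" or "end" and level is the nesting depth (1 = outer).
    Returns a list of problem descriptions; empty means consistent.
    """
    problems = []
    stack = []  # levels of the quotations currently open
    for index, (kind, level) in enumerate(events):
        if kind == "start":
            expected = len(stack) + 1
            if level != expected:
                problems.append(
                    f"event {index}: start at level {level}, "
                    f"expected level {expected}"
                )
            stack.append(level)
        elif kind == "end":
            if not stack:
                problems.append(
                    f"event {index}: end at level {level} "
                    f"with no open quotation"
                )
            else:
                opened = stack.pop()
                if opened != level:
                    problems.append(
                        f"event {index}: end at level {level} "
                        f"closes level {opened}"
                    )
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    # Anything still on the stack was never closed.
    for level in stack:
        problems.append(f"unclosed quotation at level {level}")
    return problems
```

A quote inside a quote that is closed in order, such as start 1,
start 2, end 2, end 1, produces no problems; a lone start 1 is reported
as unclosed.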
>2. The quotation marks can be adapted to what works best with the media and format: Web page, PDA screen, Large print edition, New Testament only, Whole Bible, Passage excerpts.
>
>
Yes, IF you have the rules for your language encoded into a standardized
style sheet readable by every OSIS reader in the galaxy, OR you limit
the processing to a few OSIS processors that "understand" your language
and embed the results into as many OSIS texts as are appropriate.
The unspoken assumption in OSIS is that someone else will deal with the
quotation punctuation generation problem using a style sheet that should
be easy to generate and use. Yeah, right.
--
Kahunapule M. P. Johnson <Michael_Paul_Johnson at sil.org>
http://eBible.org/mpj/