[sword-devel] Re: [osis-editors] OSIS 2.0.1 modules updated
Michael Paul Johnson
sword-devel@crosswire.org
Thu, 18 Mar 2004 10:01:56 +1000
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
At 00:29 18-03-04, Patrick Durusau wrote:
...
> From what you say in the next paragraph, I take it that you think an
>encoding should "compel" the same rendering of quotes as the original
>text that is being encoded. Either you follow the encoding and get a
>correct result or you deviate from it and get an incorrect result. Is
>that a fair summary?
It would be more accurate to say that I insist that the encoding must
compel the exact same rendering of quotation marks (and all other
punctuation) as the original text being encoded, regardless of the
language and style. Is that asking too much? Do I need to look to
another encoding standard, instead of OSIS?
>The reason I ask is that from my text encoding background, quotes
>(separate from how they are rendered) must be encoded in a way that
>allows the user to always distinguish a quote from other quotes as well
>as other material in the text. From my perspective, I never reach the
>question of rendition until someone wants me to actually render the
>text.
From my perspective, punctuation is part of the text. This includes
quotation marks. If you claim to encode the text but don't do so in a
way that guarantees that the text can be reconstructed, including
punctuation, then you have what I call "lossy encoding." This kind of
encoding is not acceptable for Bible texts, in my opinion. I
understand your point of view. I just disagree with it.
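
To make "lossy encoding" concrete, here is a minimal sketch of the round-trip test I have in mind, written in Python. The function name is_lossless and the verse and q element names in the usage lines are illustrative assumptions only; this is not schema-valid OSIS and not part of any existing tool.

```python
import xml.etree.ElementTree as ET


def is_lossless(original_text, encoded_xml):
    """Return True if stripping all markup from encoded_xml gives back
    original_text exactly, punctuation included."""
    root = ET.fromstring(encoded_xml)
    # itertext() yields every piece of character data in document order.
    return "".join(root.itertext()) == original_text


original = "\u201cAllow it now.\u201d"
# Quotation marks kept in the text: the original can be reconstructed.
kept = '<verse><q who="Jesus">\u201cAllow it now.\u201d</q></verse>'
# Quotation marks left to the renderer: the original cannot be reconstructed
# from the character data alone.
stripped = '<verse><q who="Jesus">Allow it now.</q></verse>'

print(is_lossless(original, kept))      # True
print(is_lossless(original, stripped))  # False
```

An encoding that fails this kind of check depends on every reader's software regenerating exactly the right marks, which is the dependency I object to.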
>Are we disagreeing about where the rules for rendering should be
>placed?
We are disagreeing as to the nature of quotation mark insertion. You
seem to be of the opinion that this is a rendering issue, much as
selection of font parameters for the various kinds of encoded text
elements would be, or how verses are marked (or not marked) in the
rendered text. I strongly disagree with that perspective. Punctuation
marks, including quotation marks, are part of the Bible text. If you
want to include metadata in the XML markup about the quotations,
perhaps to enable analysis of the text or even to regenerate a
different version of the text punctuated differently, that is a
separate issue.
>Normally rendering is separated as much as possible from content, but I
>don't know of any common markup system that does that entirely.
Fine, but I don't regard punctuation marks as a rendering issue. They
are part of the text, as are periods, commas, apostrophes, colons, em
dashes, etc. Each of those could be represented with markup, too, but
I don't see any advantage to doing that, either.
>One possibility, depending upon the degree of rendering information you
>wish to embed, would be to use PIs (processing instructions) but I don't
>have time to cover that in detail at the moment.
I don't want to treat quotation marks as a rendering issue. Therefore,
encoding the rules into the OSIS text could never be more than a lame
work-around, at best.
>That different languages follow different quote rendering traditions is
>really beside the point. We have a text in a particular language and
>that is the language in which it will be rendered.
No, it is NOT beside the point. I deal with texts in a multitude of
languages, and I want to keep the text and rendering style information
separate, but I also want to specify with 100% accuracy exactly where
every quotation mark of any kind goes.
>OK, I have the conclusion but not the "why" that underlies it.
You now have the "why." You may disagree with it, but you have it.
This is a major issue, as far as I'm concerned. I refuse to use OSIS
in the way you envision it, with quotation marks inserted only at
rendering time. Period. You have given me no good reason to do
otherwise, and the reasons you have given are ones with which I
disagree.
>True, you would need a separate stylesheet for each edition, but I am
>assuming enough variation in rendering for that to be necessary in any
>event.
Putting the quotation mark rendering rules in the style sheets is not
acceptable to me. Even if you wanted to edit the way a text was
punctuated, for example to turn NASB-style quotations and paragraphs
to NIV-style quotations and paragraphs, I think that should be a
totally separate process.
>Note that the case we have not discussed is the rendering of multiple
>overlapping and sometimes nested quotes. That I concede is a problem and
>one that we have not entirely addressed. Whether that should be by
>encoding (strictly speaking), PIs or simply stylesheets is an open
>issue.
That is an issue that is not even a concern when the quotation marks
are treated as text.
>> This is NOT acceptable to me. I think I'm pretty reasonable, and I like
>> to use standards in a standard way, but if OSIS stays the same, I will
>> never use it exactly as it was specified. If one of your most active
>> proponents of your standard feels that way, maybe you should look at the
>> problem again?
>>
>:-) Appreciate the support and we really do want a solution that works.
OK, then make it so. :-)
>> For now, I will continue to recommend that everyone embed correct
>> punctuation directly in the text of OSIS documents and to use <q> in a
>> nonstandard manner to mark Jesus' words, when desired, like the
>> following quote.
>>
>> <q sID="Matt.3.15.1" who="Jesus" type="x-doNotGeneratePunctuation"
>> />“Allow it now, for this is the fitting way for us to fulfill all
>> righteousness.”<q eID="Matt.3.15.1" />
>>
>> If anyone ignores the type="x-doNotGeneratePunctuation", and generates
>> punctuation from the markers anyway, they will get double punctuation.
>> This is not good. I'm hoping you will change the standard to something
>> that we can actually use and feel good about.
>>
>Don't understand the necessity for the type="x-doNotGeneratePunctuation"
>attribute?
That is because the punctuation is already rendered correctly in the
text, and the q element is only there to facilitate the rendition of
"red letter" editions for those who want to do so. This attribute is
there to remind the user of the text that additional punctuation marks
are not to be inserted, here. See the example, above, and note that
the quotation marks are already in the text AND a q element (in
milestone format) surrounds that same quotation. The attribute is
necessary because I chose not to remove the existing quotation marks,
for the reasons I already gave you above.
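
A minimal sketch, in Python, of how reading software might honor that attribute. It assumes q milestones in the form shown in my example (sID/eID attributes on empty elements) and uses regular expressions rather than a real XML parser; render_plain and its parameters are names made up for illustration, not part of OSIS or of any existing program.

```python
import re

# Matches a milestone-style q element: <q sID="..." ... /> or <q eID="..." />.
Q_MILESTONE = re.compile(r'<q\s+([^>]*?)\s*/>')
ATTR = re.compile(r'(\w+)="([^"]*)"')


def render_plain(fragment, open_mark="\u201c", close_mark="\u201d"):
    """Replace q milestones with quotation marks, except for quotes that
    carry type="x-doNotGeneratePunctuation", whose marks are already in
    the text."""
    suppressed = set()  # sIDs of quotes that opted out of generated marks

    def replace(match):
        attrs = dict(ATTR.findall(match.group(1)))
        if "sID" in attrs:
            if attrs.get("type") == "x-doNotGeneratePunctuation":
                suppressed.add(attrs["sID"])
                return ""
            return open_mark
        if "eID" in attrs:
            # The end milestone only repeats the ID, so look up the start.
            return "" if attrs["eID"] in suppressed else close_mark
        return ""

    return Q_MILESTONE.sub(replace, fragment)


marked = ('<q sID="Matt.3.15.1" who="Jesus" type="x-doNotGeneratePunctuation" />'
          '\u201cAllow it now.\u201d<q eID="Matt.3.15.1" />')
print(render_plain(marked))  # the text with its own marks, nothing added
```

Software that skips the type check would emit an opening and closing mark around text that already has them, which is the double punctuation I warned about.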
>> The <q> marker as you have defined it has merit when generating
>> quotation punctuation in the first place in a new translation. After
>> that is done, it has no merit, at least for any application that I am
>> concerned with: translation, typesetting, and electronic distribution.
>
>OK, but again you are telling me what you concluded but not why?
All I ask of OSIS is that it be a good standard Bible text interchange
format. To do that, it must be able to represent the entire Bible
text, including punctuation. It would be nice if it were more elegant
and efficient, but inelegance and inefficiency are not that important.
Lossless encoding is essential. If it can't do that, then it is not
useful to me or the organizations I work with, at least not for the
applications I would consider using it for.
The wording of the current documentation for OSIS pretty much demands
that I treat quotation marks as a rendering issue instead of part of
the text. I am unwilling to do that. I would rather introduce a
competing standard than do that. Treating quotation marks as rendering
issues may make sense when dealing with one or even a small number of
languages, but the very idea of doing so is repulsive to me when
dealing with any significant fraction of the world's languages.
>Not trying to be pushy but I think we can work together towards a
>solution if we can illustrate the problem as I have tried to show a
>solution above. There may be reasons why a particular solution does not
>appeal to you or work in a given context but that again is something we
>can address.
If you really want to provide a solution that works, then alter the
OSIS specification to allow it to be used in the manner that I'm using
it.
>Note that truly random typographic markup, quotes or other markers that
>occur in a manner that cannot be described in terms of the structure of
>the text, cannot be encoded or rendered using a stylesheet aside from
>use of PIs or specific stylesheet instructions that address those
>elements.
So why would you want to force me to do it that way? That is very
convoluted and doesn't even directly address the main issue.
>It is a fundamental limitation of XML that it cannot, without use of one
>of the mechanisms I mentioned (PIs/specific element styles by ID),
>reproduce random typography. It may be very important and significant
>typography but structured markup is ill-suited to that purpose. Emphasis
>on the fact we can do it, question is how important is it?
So don't do that. Let the punctuation be in the text.
>Get the same issue with academics and XSL-FO. Question there is that a
>text may look "better" with hand inserted micro-spaces between letters
>for an ancient text. Well, do you want to pay someone $20/hour (or more,
>I'm guessing) to typeset 200 pages of text or do you want me to spend 60
>minutes setting up an XSL-FO stylesheet that allows you to render it
>over and over, even after every correction? Is it as good as hand
>typesetting? No, but then it is far cheaper and allows for revision up
>to the point we ship to the printer. Suppose you can guess which one I
>advocate. :-)
This is a totally unrelated issue, at least to my way of thinking.
>> <revisionDesc resp="Rainbow Missions, Inc. http://RainbowMissions.org">
>> <date>2004-03-14T12.25.09</date>
>> <p>
>> This draft version of the World English Bible is substantially complete
>> in the New Testament, Genesis, Exodus, Job, Psalms, Proverbs,
>> Ecclesiastes, Song of Solomon, and the “minor” prophets. Editing
>> continues on the other books of the Old Testament. Apocrypha books in
>> this file are still in rough draft form.
>> </p>
>> &lt;p&gt;Converted ..\..\web.gbf in GBF to web.osis.xml in an XML
>> format that attempts to comply with OSIS 2.0 using gbf2osis.exe.
>> (Please see http://ebt.cx/translation/ for links to this
>> software.)&lt;/p&gt;
>> &lt;p&gt;GBF and OSIS metadata fields do not exactly correspond to each
>> other, so the conversion is not perfect in the metadata. However, the
>> Scripture portion should be correct.&lt;/p&gt;
>> &lt;p&gt;No attempt was made to convert quotation marks to structural
>> markers using q or speech elements, because this would require language
>> and style-dependent processing. In English texts, the hard part is
>> figuring out what ’ means. The other difficulty is that I am not yet
>> convinced that the proper punctuation marks would be reconstituted by
>> software that reads OSIS files.&lt;/p&gt;
>> &lt;p&gt;The output of gbf2osis marks Jesus' words in a non-standard way
>> using the q element AND quotation marks if they were marked with FR/Fr
>> markers in the GBF file. The OSIS 2.0 specification requires that
>> quotation marks be stripped out, and reinserted by software that reads
>> the OSIS files when q elements are used. To convert this to an OSIS 2.0
>> file, you must either remove all q elements, remove the quotation marks
>> around Jesus' quotes, or convince the keepers of the standard to change
>> the standard.&lt;/p&gt;
>> &lt;p&gt;OSIS does not currently support footnote start anchors.
>> Therefore, these start anchors have been represented with milestone
>> elements, in case someone might like to use them, for example, to start
>> an href element in a conversion to HTML.&lt;/p&gt;
>> &lt;p&gt;Traditional psalm book titles are rendered as text rather than
>> titles, because the title element does not support containing
>> transChange elements, as would be required to encode the KJV text using
>> OSIS title elements.&lt;/p&gt;
>> &lt;p&gt;The schema location headers were modified to use local copies
>> rather than the standard locations so that these files could be
>> validated and used without an Internet connection active at all times
>> (very important for the developer's remote island location), but you may
>> wish to change them back.&lt;/p&gt;
>> </revisionDesc>
>>
>
>I recall some recent discussion of footnote start anchors but don't have
>it at my fingertips. Can you say a few words about that?
I thought that I already did that. I look at a footnote as a note that
pertains to either a range of text or a point in the text. This note
may be rendered at the bottom of the page, in a pop-up window, or
whatever, but it contains information about the main text that may be
helpful but that is not part of the main text. Since the note may
pertain to a range of text, I have found it useful to mark the
beginning of the text with a "begin reference" marker (<RB> in GBF),
then mark the end of the text with an element containing the note
itself. This way, it is easy to render the text to which the footnote
pertains as a hyperlink. Also, if you wanted to treat footnotes like
the JPS Tanach did in print (with superscripted markers at both
places), you can. OSIS has no equivalent marker, so I put in a generic
milestone. In rendering footnotes, if there is no beginning footnoted
text marker, I just render it as a point, for example as a hyperlinked
asterisk pointing to the note. In print rendering, I usually ignore
the first marker, but could do something with it.
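
A minimal sketch of that rendering rule in Python, for conversion to HTML. The token format (tuples tagged "text", "start", and "note") and the function name render_footnotes are assumptions made for illustration; they are not GBF or OSIS structures. The sketch also assumes footnoted ranges either nest or do not overlap, and it leaves the note body to be rendered elsewhere.

```python
from html import escape


def render_footnotes(tokens):
    """tokens: a list of ("text", s), ("start", note_id), or
    ("note", note_id, body) tuples in document order."""
    out = []       # finished HTML pieces
    pending = {}   # note_id -> index in `out` where the footnoted text begins
    for token in tokens:
        kind = token[0]
        if kind == "text":
            out.append(escape(token[1]))
        elif kind == "start":
            pending[token[1]] = len(out)
        elif kind == "note":
            note_id = token[1]
            link = '<a href="#' + escape(note_id) + '">'
            if note_id in pending:
                # A start marker exists: hyperlink the whole footnoted range.
                begin = pending.pop(note_id)
                out[begin:] = [link + "".join(out[begin:]) + "</a>"]
            else:
                # No start marker: render the note as a point.
                out.append(link + "*</a>")
    return "".join(out)


ranged = [("text", "In the "), ("start", "n1"), ("text", "beginning"),
          ("note", "n1", "Or: at first")]
print(render_footnotes(ranged))  # In the <a href="#n1">beginning</a>
```

A print renderer could simply drop the "start" tokens and place a superscript marker where each "note" token occurs.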
This isn't a major problem, just a feature that I miss that would be
easy to supply. I can live with my current work-around in the current
OSIS version just fine.
>Appreciate the disclaimer with information on what to change but I still
>don't see the language dependency as being a problem. That happened when
>the translation was made so I think we agree that the quotation style
>from that perspective is fixed.
We obviously don't agree on this point. I guess the next question is
"Can you humor me and allow the OSIS specification to be flexible
enough to accommodate my needs as well as yours, even if you disagree
with my philosophy of quotation mark rendering?"
I suppose I can always just modify the standard to my own taste (which
I have sort of done) or generate a competing one (which I actually
have a private draft of), but I would rather see us come to some kind
of consensus. What is the point of a standard if it isn't really
standard?
May God bless you with wisdom and insight.
Michael
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (MingW32)
Comment: http://eBible.org/mpj/gpg.htm
iD8DBQFAWObnRI/gxxfXR7sRAj7kAKCe4pTUp2g4/liQvySzspgEBQhQGACeLG4y
Owzv1F9ZeQPnuGHBguTrXdY=
=sESR
-----END PGP SIGNATURE-----