[osis-editors] Re: [IPS] Fw: OSIS criticisms
Patrick Durusau
patrick at durusau.net
Thu Feb 16 11:58:44 MST 2006
Michael,
Kahunapule Michael Johnson wrote:
> Patrick Durusau wrote:
>
>> Michael,
>>
>> Just so you know, the ABS posted an uncorrected version of the latest
>> Users Manual. I should have the corrected HTML version done tomorrow
>> and the PDF version by the first of next week. Why they post things
>> without asking the persons responsible for it, I have no explanation.
>>
>> Nothing seriously different, some typos in the examples, etc., but
>> they are the sort of thing I wanted to correct before it was posted.
>
> I seriously hope you have more corrections than typos. If not, then I
> have been very accurate in my criticisms of OSIS, even in its current
> state. Nevertheless, I want OSIS and its documentation to be as good
> as it can be, no matter what level of support or resistance I decide
> to lend to OSIS.
>
> My comments are below, intended to be constructive. They include both
> big and little things, things that are "no-brainer" corrections to
> typos, etc., and things that are subject to debate. Take them for what
> they are worth. I felt I owed it to you to at least do a careful
> reading, and to do so again when you say the document is ready for it.
> I'd be glad to preview a draft or two, if you like, if there are
> substantial changes, at least in the areas that interest me.
>
> Most of the following is pretty routine proofreading, except for the
> part about "<q>".
>
> May God bless you with wisdom as you edit.
>
>
> Review comments on OSIS Schema 2.1.1 downloaded 14 Feb 2006
>
>
> Forward
>
> Typo: Bullet for Nathan Miles: change “no only” to “not only”
>
> Typo: last sentence: missing “is” in “In a time when faith __ often
> treated with disdain,”
>
+1
>
> 1. Introduction
>
> Typo: Page 10 “the end of a paragraph with <p>.” missing a /
>
+1
>
> 2. Getting Started
>
> Believability issue: first paragraph, claim that anyone who can type
> using a basic word processor or text-editing program can use OSIS? You
> obviously don't know the same users at this level as I know. I think
> you are setting unrealistic expectations. I think people who can
> understand and edit HTML can probably, with effort, learn to use OSIS
> the hard way, but people who struggle to write an email or type a
> simple essay are not to be expected to manually edit OSIS.
>
If you mean looking at the entire schema in an XML editor, you may be
right.

But I could set someone to encoding a Bible with no more than 10-12
elements and they would do a pretty good job. True, they might not be
marking everything, but one of the failures of a lot of markup projects
is the notion that a single person has to do every part of the text. It
is far better to have a rough encoding done by someone who marks
paragraphs, quotes, divs, those sorts of "big" structures, and then
allow others to do more detailed markup. Such a process also gives you
more eyes looking at the text.

Which your comments demonstrate is a good thing! (Good catch on the
missing / in the <p> in the midst of a lot of prose.)
>
> 5. XML and OSIS declarations
>
> Implementation issue: page 17 “OSIS is developing a schema for
> declaring versification systems formally, and for declaring some
> systems in terms of others. This will enable programs to map between
> systems.” In the mean time, we have no standard way of dealing with
> these minor variations?
>
Yes and no. OSIS can handle variant versifications by simply encoding
them.
But no, we have not issued a canonical mapping that would enable
automated processing of texts for display of the "same" material from
different Bibles.
> Note to self: set canonical attribute to true in Psalm titles in
> sample OSIS converter.
>
>
> 7.2.1.1 Title Placement
>
> Observation: I didn't expect to see layout-specific markup in OSIS. No
> real harm here except in added complexity and blurring the lines
> between content and formatting.
>
Symptom of having developed OSIS not in isolation but listening to the
needs of real users, like those at SIL.
Had I designed it on my own for my use, it probably would be
substantially different than it is today. But others bring different
skill sets and interests than I do, and any schema that is going to work
for more than one person or group is going to reflect that influence.
And yes, from a purist standpoint it does blur the lines between content
and formatting. But purity without users is really fairly marginal.
>
> 7.2.4 Date
>
> Observation: the multiple calendar capability is an example of
> unnecessary complexity coupled with incomplete specification. I would
> think that it is reasonable for an OSIS processor to only interpret
> ISO dates, and leave the rest for uninterpreted display only or just
> ignore them.
>
>
> 7.2.11 Format
>
> Does this element serve any useful purpose? Isn't an OSIS document
> always text/xml? If so, why clutter the document with such obvious
> self-reference?
>
Refers to the original text, the one you are working to encode. More of
an interest for scholarly users.
>
> 7.2.15.5 refSystem
>
> What value is repeating this information here?
>
Sorry, missed me, what is being repeated?
>
> 7.5 The <div> Element
>
> It quickly becomes obvious that OSIS is designed for all kinds of
> things that are not Bibles. This is good, I suppose, if you are
> interested in encoding those other things. It does bring along a lot
> of excess complexity baggage when dealing with Bibles, though.
>
Perhaps, but I have heard that ministers write sermons and want to
include Bible portions, Bible Societies produce less than whole Bibles
and want to include Bible portions, scholars write commentaries and want
to include Bible portions, etc., etc.
All of those communities have a legitimate interest in Bibles and there
isn't any reason, at least that I have seen, to discount their needs in
favor of focusing only on producing printed Bibles.
> 8.1 <p>
>
> Illustration on page 42 is missing.
>
+1 I think this is fixed in the most recent one, but I will check after
I output the PDF.
>
> 8.2 <q>
>
> The wording of this section contradicts email communications that I
> have received from OSIS committee members (most recently Chris Little)
> saying that it is OK to leave the quotation punctuation in the text of
> the Bible AND use <q>.
>
> Chris Little said: “You want preservation of presentational forms of
> quotation marks. This was addressed at length already, yet manages to
> earn about half of the text of your complaints against OSIS. I'm sorry
> it's not codified in prose for you yet, but everyone at the last OSIS
> TC meeting was in agreement that it was already part of the standard
> (<q n="'">-type encoding).” He also said “The OSIS Manual (2.1 draft,
> Appendix K, Conformance Requirements) doesn't actually specify
> anything about marking quotations with <q> vs. ". If it does, in a
> later version, I would expect to see that conformance requirement
> appearing at either level 2 or 3. (I would say it is at least implicit
> in the level 3 requirement.) So, while it may be arguably "improper to
> put quotation punctuation directly in the text", it is certainly not a
> conformance requirement, meaning that documents which fail to do so
> are poor OSIS, but OSIS nonetheless.”
>
> I have been told that neither UBS nor SIL supports using markup for
> quotations, but that they remain in the text.
>
> In past interactions with OSIS committee members, I have been told in
> no uncertain terms that to be valid OSIS, all of the quotation marks
> should be automatically derived from <q> or <speech> markup, and never
> placed in the text, and that this was the superior way to do things
> for a variety of reasons that you all seem very convinced of but which
> many people (including me) strongly disagree with. It isn't that we
> don't understand your preferences. We just disagree.
>
> Troy Griffiths said: “Hello my friend. It's good to hear from you. It
> seems like 2/3rds of your issues with OSIS are having to do with <q>.
> May I suggest patience to review what comes out of the last OSIS
> meeting back in December. We had a good hard look at practical uses of
> <q>, and believe me, your concerns have been heard.” I appreciate his
> assurance, but it is contradicted by the text of this portion of the
> document. You said that only some minor changes would be made to what
> was posted. I regard this as major, and always will.
>
> Now, what if I want to leave quotation punctuation in the text, not
> converting it to markup, and I also want to mark direct quotations of
> Jesus Christ so that they may optionally be rendered as red or not. (I
> know that there are differences of opinion on if this is a good idea
> or not, but for now please just accept that there are some Bible
> translators and publishers who want to do that-- some that I don't
> want to ignore.) It looks like <q who="Jesus">...</q> (or the
> milestone form of the same) is the only way you provide to do that. I
> should be able to reliably communicate to the OSIS reader that
> additional quotation marks are not to be generated by using an empty n
> attribute, thus: <q who="Jesus" n="" sID="someuniqueidentifier"/>...<q
> who="Jesus" n="" eID="someuniqueidentifier"/>. This usage is the best
> solution given to me so far, and it also is directly contradicted by
> the first sentence (and more) of the <q> section of the OSIS
> documentation.
>
> The following would best be left out of the documentation, unless your
> purpose is to antagonize a large part of your intended user
> constituency: “There exists truly exceptional circumstances where
> automatic generation of quotes may not meet the needs of some Bible
> encoding efforts. To meet that need, the *q* element has a *marker*
> attribute where the original quotation marker can be recorded.
> Frequent use of this attribute may indicate a lack of attention to
> technical solutions that provide a greater consistency in the
> rendering of biblical texts.” First of all, from my perspective, those
> cases where it is best to not use automatically generated quotes are
> the majority of cases, not something truly exceptional. This isn't
> just me. This is the people I work with who do Bible translation in
> the field. Second, this contradicts what Chris Little told me by
> giving a different attribute to use to preserve or suppress quotation
> punctuation generation. Frankly, I don't care which attribute you use
> for this purpose as much as I care that you document its meaning
> clearly and that it can be used for what I want to do with it.
>
> It seems to me that peaceful coexistence can be found within OSIS
> between those who want to generate OSIS texts where quotation
> punctuation is generated on-the-fly, and those (like me) who do not
> believe that OSIS specifies how, where, and what quotation marks to
> generate for each language and translation well enough to trust this
> generation on-the-fly to generic OSIS readers, and who do not see any
> value in re-encoding existing texts where the quotation punctuation is
> already in the text. That peaceful coexistence depends on clearly
> stating ways to mark quotations (or not mark them) such that there is
> never an ambiguity when or if a compliant OSIS reader should generate
> quotation marks, what kind it should produce, and where they should
> be. Empty marker or n attributes may be a good way to say “don't
> generate quotation punctuation-- this <q> or <speech> marker is just
> here to help identify the speaker for red letter edition or search
> purposes”. A non-empty marker would indicate the exact quotation
> punctuation to use, of course, and non-use of <q> or <speech> should
> be OK where the required punctuation, if any, is already in the text.
>
> If you have a better idea that does not involve trying to convert
> people to the religion of the alleged superiority of markup over
> punctuation in the text, which still allows the followers of that
> religion to mark up and interpret text that way, and which does not
> place undue burdens on conversions of legacy encoded Scripture texts,
> let's discuss it.
>
> Maybe the best thing to do with this section is to erase it and
> rewrite it. Don't forget the example figure this time. Try not to
> antagonize any of your intended constituency in the process. If you
> don't get this right, you provide proof that pretty much everything I
> have said negative about OSIS is still true. Please don't do that. I
> would much rather issue a retraction.
>
So your request is that we simply give both solutions and not offer a
preference?
You questioned OSIS getting close to formatting under title placement
but fail to recognize that not using quotation marks is as much a part
of "pure" markup as not talking about title placement.
Perhaps we are both being inconsistent. ;-)
Let me think about it. Since you have the mechanism you want, and I have
the one I prefer, would it help if I simply state what the options are?
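
A rough sketch of how a reader could honor both options at once. It
assumes the convention proposed in this thread, which the manual does not
yet confirm: a non-empty marker or n attribute gives the literal mark to
print, an empty one means "generate no punctuation", and an absent one
leaves the choice to the reader. The function name and the default mark
are hypothetical.

```python
import xml.etree.ElementTree as ET


def opening_mark(q, default="\u201c"):
    """Return the opening quotation mark a reader would emit for a <q>.

    Assumed convention (from this thread, not from the schema):
      - marker/n present and non-empty: use that text literally
      - marker/n present but empty:     emit nothing
      - neither attribute present:      fall back to the reader's default
    """
    for attr in ("marker", "n"):
        if attr in q.attrib:
            return q.attrib[attr]
    return default


# Encoder records the original mark in the n attribute.
with_mark = ET.fromstring('<q who="Jesus" n="\'">sample text</q>')
# Punctuation stays in the text; <q> only identifies the speaker.
suppressed = ET.fromstring('<q who="Jesus" n="">"sample text"</q>')
# No hint given; the reader generates its own default mark.
plain = ET.fromstring('<q>sample text</q>')

for element in (with_mark, suppressed, plain):
    print(repr(opening_mark(element)))
```

The point of checking for the attribute's presence rather than its
truthiness is that an empty string and a missing attribute mean different
things under this convention.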
>
> 8.5 Examples of Notes
>
> More illustrations are missing. I'll not comment on these individually
> any more, as picture loss seems to be a global problem.
>
Hmmm, I will have to check on that because the copy a friend was
proofing from is fine. Or at least he sent it to me with his annotations
written in, some type of notebook software, and all the images are there.
>
> 10. Elements that cross other elements
>
> The idea of letting the encoder choose when to make certain entities
> containers and when to make the same entities milestones is an
> interesting choice. It feels like an afterthought, and it makes OSIS
> decoders more complex. However, there is no way to “fix” this without
> compromising on backward compatibility. In the case of a new standard,
> I would argue against this method. Instead, I would rather divide
> those hierarchies found in Biblical texts into classes that do not
> interfere with each other, choose one of them (namely
> Book/Chapter/Verse OR Book/Section/Paragraph|stanza/verse) to use XML
> containers, and the others to use milestones. This would enable
> simpler milestone end processing, and eliminate the need for some of
> the milestone end markers. All of that is just academic for this
> document, though, unless you define OSIS subsets (like maybe OSIS Best
> Practice) that allow for simplified OSISBP readers that need not
> address all possible combinations of milestones and containers for the
> same information.
>
> The “For future reference” restriction on forbidding unnecessary
> crossing of hierarchy boundaries adds complexity to OSIS encoders, but
> might reduce complexity slightly for decoders. Is this restriction
> worth it? I don't know, but it looks ugly either way.
>
The alternative was to duplicate all the elements that could be
milestones as milestones. Or have one really hairy milestone element
with more options than you could shake a stick at.
Nothing prevents a project, without modifying the schema, from choosing
a set of elements to always be milestone forms. In fact, in a book the
MLA is supposed to publish sometime fairly soon, I have an essay on
doing precisely that in order to prevent markup errors.
The unnecessary crossing causes problems for XSLT stylesheets.
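
To make the milestone bookkeeping concrete, here is a minimal sketch of
the pairing check a decoder or a project-level validator might run. It
assumes only the sID/eID attributes discussed above; the function name is
hypothetical, and the sample fragment omits the namespace and the other
attributes a real OSIS document would carry.

```python
import xml.etree.ElementTree as ET


def unmatched_milestones(root):
    """Return the sID/eID values that lack a partner on the same element name."""
    open_ids = {}   # sID value -> tag of the element that opened it
    problems = []
    for el in root.iter():          # document order
        sid, eid = el.get("sID"), el.get("eID")
        if sid is not None:
            open_ids[sid] = el.tag
        if eid is not None:
            # An end milestone must close a start milestone of the same tag.
            if open_ids.pop(eid, None) != el.tag:
                problems.append(eid)
    # Anything still open never received its end milestone.
    return problems + list(open_ids)


# A verse and a quotation that both cross a paragraph boundary.
sample = ET.fromstring(
    '<div>'
    '<p><verse sID="Gen.1.1"/>text <q sID="q1" who="God"/>speech</p>'
    '<p>more speech<q eID="q1"/><verse eID="Gen.1.1"/></p>'
    '</div>'
)
broken = ET.fromstring('<div><verse sID="Gen.1.1"/>text</div>')

print(unmatched_milestones(sample))
print(unmatched_milestones(broken))
```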
>
> 13.10 Name
>
> I was surprised to see <divineName> applied to “God” instead of just
> to the Tetragrammaton and its translations. I foresee some
> inconsistency in markup here.
>
>
> ? osis.osisText.div.title
>
> Documentation for this <title> seems to be sparse, not in the Table of
> Contents, or missing. There is documentation for
> osis.osisCorpus.header.work.title, though. Using the same name for
> both of these distinct elements may be confusing.
>
> When added, it would be good to note that the current OSIS schema
> seems to allow (among other things) <transChange>, thus correcting a
> defect in earlier versions of OSIS.
>
Sort of a What's New list?
>
> 18. Conclusion
>
> Observation: “The theme or motto of OSIS has been ‘A Common Format for
> Many Visions.’ It will have succeeded only when you are able to
> express your vision of the Bible using OSIS.” This is true. So far, I
> can't... but maybe when the <q> mess is cleared up and documented
> clearly as being cleared up?
>
But I think you agree that you can express your quotations as you want,
but object to the way I said it. Yes?
>
> Appendix C.1
>
> Missing: Designators for Daniel (Greek) and Esther (Greek), which are
> essentially those two books with the Apocryphal additions integrated
> into them in the normal reading order, as you would find them in the
> NRSV with Apocrypha and some “Catholic” Bibles. (Not a big deal to
> SIL, but if you are including common Apocrypha/Deuterocanonical books,
> you might consider at least getting the most popular ones in there.)
>
Sorry, AddDan and AddEsth don't cover that? Serious question, people
call them different things.
>
> Appendix D
>
> Suggestion: might it make sense to suggest a way to construct standard
> OSIS codes for Bible Editions for minority language translations from
> a three-letter language code plus the date of publication and an
> optional dialect name? This could even be used in some cases for
> majority languages where there are not so many translations as there
> are for English.
>
Good point.
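
One possible shape for such a code, as a sketch only: the dot-separated
layout, the function name, and the "xyz" language code are all
hypothetical and are not taken from the manual.

```python
import re


def edition_code(lang, year, dialect=None):
    """Build an edition code from a language code, year, and optional dialect."""
    if not re.fullmatch(r"[a-z]{3}", lang):
        raise ValueError("expected a three-letter lowercase language code")
    parts = [lang, str(year)]
    if dialect:
        # Drop spaces and punctuation so the dialect stays a single token.
        parts.append(re.sub(r"[^A-Za-z0-9]", "", dialect))
    return ".".join(parts)


print(edition_code("xyz", 2006))
print(edition_code("xyz", 2006, "North Coast"))
```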
>
> Appendix E
>
> Example 1: Does the International Bible Society know about this
> example? Would they approve?
>
> Example 2: Might it be better to use an example where the copyright
> owner and translation more clearly match? The British Monarch might
> not approve of this example. (You may not be living in a British
> Commonwealth country right now, but I am.)
>
Unknown if others know of these examples and whether or not they would
approve.
>
> Appendix F USFM to OSIS Mapping
>
> There is a formatting glitch on the header for USFM.
>
> Explanation for \c (chapter marker) is missing.
>
> No equivalent is given for \ie. I'm not sure why I would use \ie,
> anyway, but I presume someone at UBS knows this.
>
> The USFM marker \wj ... \wj* is missing. (<q who="Jesus">; be sure to
> discuss the effects and interactions this has on quotation punctuation
> that are not present in USFM.)
>
> Nothing listed for \xo (cross reference origin reference).
>
I will have to go through these in detail.
>
> Appendix F.1
>
> On page 141, a line break element in parentheses is interpreted as a
> line break in the manual, not displayed literally.
>
OK.
>
> Appendix L.1.1 Level 1: “Minimal OSIS document”
>
> Must declare the versification system used, but the versification
> system definition mechanism is not yet defined. Is there therefore no
> such thing as a minimal OSIS document?
>
> The true empty end element vs. start & end tags with nothing in
> between may preclude the use of some standard XML processing libraries.
>
I think that is old language on versification. Should be reference system.
>
> Appendix L.1.2 Level 2: “Basic OSIS Document”
>
> Usage of <divineName> here seems inconsistent with earlier documentation.
>
Will check.
>
> Appendix N The Bible Technology Group
>
> Thank you for the acknowledgement. If you must publish my email
> address, the one you have is the best one to publish. It is heavily
> spammed and heavily filtered, but actually still works. You might want
> to consider merging these few acknowledgments in with those in the
> Forward and dispensing with this appendix.
>
Good idea.
>
> Deprecate – Global, including Appendix O
>
> The Webster dictionary site says:
>
> deprecate: Etymology: Latin deprecatus, past participle of
> deprecari to avert by prayer, from de- + precari to pray -- more
> at PRAY <http://www.m-w.com/dictionary/pray>
> 1 a archaic: to pray against (as an evil) b: to seek to
> avert <deprecate the wrath ... of the Roman people -- Tobias Smollett>
> 2: to express disapproval of
> 3 a: PLAY DOWN <http://www.m-w.com/dictionary/play+down+>: make
> little of <speaks five languages ... but deprecates this facility --
> Time> b: BELITTLE <http://www.m-w.com/dictionary/belittle>,
> DISPARAGE <http://www.m-w.com/dictionary/disparage> <the most
> reluctantly admired and least easily deprecated of ... novelists --
> New Yorker>
>
> depreciate: Etymology: Late Latin depretiatus, past participle of
> depretiare, from Latin de- + pretium price -- more at PRICE
> <http://www.m-w.com/dictionary/price>
> transitive senses
> 1: to lower in estimation or esteem
> 2: to lower the price or estimated value of
> intransitive senses: to fall in value
>
> Normally, obsolete or discouraged markup is called “depreciated”
> instead of “deprecated.” Take your pick, but I would recommend the
> less harsh word.
>
> Note that whenever you remove any element without warning, you violate
> your promise of backward compatibility. Therefore, you should either
> not promise backward compatibility, as you do in the Forward, or not
> threaten removal of elements, as you do in Appendix O. Pick only one.
>
Actually, in markup circles "deprecated" is a term of art for features
that remain valid but whose use is discouraged.
Not to be overly legalistic, ;-) (I was a lawyer for ten years) but the
backwards compatibility promised is with *this* manual, which cautions
people to not use those elements.
I should finish the typos and some of your suggested revisions in time
to post another version tomorrow.
I won't reach all of them until another break in a current ISO editing
cycle, in about a week. So, expect fuller changes in response to your
remarks and those of others by the end of February.
Hope you are having a great day!
Patrick
> Michael
> http://kahunapule.org
>
--
Patrick Durusau
Patrick at Durusau.net
Chair, V1 - Text Processing: Office and Publishing Systems Interface
Co-Editor, ISO 13250, Topic Maps -- Reference Model
Member, Text Encoding Initiative Board of Directors, 2003-2005
Topic Maps: Human, not artificial, intelligence at work!