[osis-core] Re: Bullet Points on OSIS 1.5
Steven J. DeRose
osis-core@bibletechnologieswg.org
Tue, 10 Jun 2003 14:27:23 -0400
At 4:15 PM -0400 6/9/03, Mike Perez wrote:
>Steve-
>
>Please send me bullet points on the changes to OSIS 1.5 from OSIS 1.1.1. I
>wish to include them in an article for the BTG website.
>
>Thanks.
>
>Mike
Feel free to shorten as needed. Sorry I didn't get this to you last
night as I intended... Everybody else: skim for egregious errors,
please.
------------------------
After 3 different groups had fully marked up various Bibles, it was
found that the splitID approach to marking up elements that cross
over others is too ugly to contemplate. Discussion also showed
limited support for it among those represented at the Dallas meeting,
or others they knew of. On the other hand, there was broad agreement
that having 2 different ways to mark up crossovers (segmentation vs.
milestones) is a bad thing, because it reduces interoperability,
gives users more to learn, and makes more work for implementors. After
lengthy consideration, we dropped the splitID method.
Troy Griffitts then proposed a tweak to the syntax for the milestone
method; many in the room groaned that we hadn't thought of it before.
OSIS 1.1.1 uses 2 generic tags, <milestoneStart/> and <milestoneEnd/>;
each requires a "type" attribute whose value is always the name of
the element the milestone pair stands in for. Troy's insight
was that we can allow all of the "milestoneable" elements to occur in
empty form, and use that as the milestone. Thus, instead of the old
<milestoneStart type="q" end="xyz"/>... <milestoneEnd type="q" start="xyz"/>
we now have:
<q sID="xyz"/>...<q eID="xyz"/>
This has the advantages that we can delete 2 element types; that all
the instances of an element type, whether crossing over or not, have
the same element type name; that it's easy to teach people how to
make the crossing form of an element once they know the regular form;
that the crossing form automatically permits the right set of
attributes (which we couldn't enforce before); and more. It's also a
lot shorter and easier to read. These two changes do mean that any
existing documents that use crossing markup must be updated if they
are to conform to the new version of the spec.
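As a rough illustration of what updating an existing document might
involve, here is a minimal Python sketch that rewrites the old generic
milestone tags into the new empty-element form. It is not part of the
spec or any official tool. The function name upgrade_milestones is
made up for this example, and the pattern assumes the attributes
appear in exactly the order and quoting shown in the example above.

```python
import re

# Matches the OSIS 1.1.1 generic milestone tags as quoted above, e.g.
#   <milestoneStart type="q" end="xyz"/>  or  <milestoneEnd type="q" start="xyz"/>
# Assumption: attributes come in this order and use double quotes.
# The linking attribute name (end/start) is not checked against the tag kind.
OLD_MILESTONE = re.compile(
    r'<milestone(Start|End)\s+type="([^"]+)"\s+(?:end|start)="([^"]+)"\s*/>'
)


def upgrade_milestones(text: str) -> str:
    """Rewrite old milestoneStart/milestoneEnd tags as sID/eID empty elements."""

    def rewrite(match: re.Match) -> str:
        kind, element, ident = match.groups()
        attr = "sID" if kind == "Start" else "eID"
        return f'<{element} {attr}="{ident}"/>'

    return OLD_MILESTONE.sub(rewrite, text)


old = '<milestoneStart type="q" end="xyz"/>... <milestoneEnd type="q" start="xyz"/>'
print(upgrade_milestones(old))
```

A real conversion would more likely use an XML parser than a regular
expression, since attribute order and quoting can vary between
documents.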
Given this, it makes sense to state that best practice is to encode
in terms of books, sections, and paragraphs (and so on), and whenever
chapter/verse markers cross these, it is the chapter and verse
markers which must be turned into milestones. For editions that have
no section, paragraph, or similar markup, this poses no problem
because no crossovers occur. To accommodate such works, we added a
literal <chapter> element, and now recommend against the use of <div
type="chapter">.
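To make the crossing case concrete, the following Python sketch reads
a small fragment in which a verse starts in one paragraph and ends in
the next, and recovers the text between each sID/eID pair. It is only
an illustration under stated assumptions: the element name <verse>,
the identifiers "v1" and "v2", the sample text, and the function
names are all invented for this example, and milestones are paired
purely by identifier value.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: verse "v2" crosses a paragraph boundary,
# so the verse is the element encoded as a pair of milestones.
SAMPLE = (
    '<div>'
    '<p><verse sID="v1"/>First verse. <verse eID="v1"/>'
    '<verse sID="v2"/>Second verse starts</p>'
    '<p> and ends here.<verse eID="v2"/></p>'
    '</div>'
)


def tokens(elem):
    """Yield ('milestone', element) and ('text', str) items in document order."""
    if elem.get("sID") is not None or elem.get("eID") is not None:
        yield ("milestone", elem)
        return
    if elem.text:
        yield ("text", elem.text)
    for child in elem:
        yield from tokens(child)
        if child.tail:
            yield ("text", child.tail)


def milestone_spans(root):
    """Map each milestone identifier to the text between its sID and eID."""
    open_spans = {}  # identifier -> list of text pieces seen so far
    finished = {}
    for kind, value in tokens(root):
        if kind == "text":
            for pieces in open_spans.values():
                pieces.append(value)
        elif value.get("sID") is not None:
            open_spans[value.get("sID")] = []
        else:
            ident = value.get("eID")
            if ident not in open_spans:
                raise ValueError(f"eID {ident!r} has no matching sID")
            finished[ident] = "".join(open_spans.pop(ident))
    if open_spans:
        raise ValueError(f"unclosed milestones: {sorted(open_spans)}")
    return finished


print(milestone_spans(ET.fromstring(SAMPLE)))
```

The paragraphs stay ordinary well-formed containers, and only the
verse boundaries need the milestone form.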
We divided up the over-used osisRef attribute, so that in all cases
it now represents a true reference: for example, a link to a Biblical
passage from a sermon, commentary, or similar work. We added a
"scope" attribute for the other meaning osisRef sometimes had, which
was to identify what portion of the Bible was contained or quoted in
a given element, such as a block quote of the Bible within a
commentary. Scope can also be used in the header declaration of a
work, to identify works that are only portions of the Bible rather
than complete Bibles.
We defined a best practice for encoding time-organized works like
lectionaries, daily devotions, and so on. This does not break any
existing documents and required no schema changes, merely the
statement that the osisIDs to be used in such works are the
applicable times. We studied time syntax in standards from IETF, W3C,
ISO, and TEI, and found that none deal adequately with named times
(like "vespers"), times BCE, approximate times, time ranges, and
times with the year unspecified (as in most daily devotionals). We
developed extensions to the RFC 3339 time format to allow these
things, and will state as best practice that OSIS documents of these
types should use this time syntax.
We also made a small change to the reference syntax, which does not
break any existing usage. Most Bibles use one of a few overall
versification schemes; but most also make minor internal changes,
such as splitting some verses into "a" and "b" parts. Some Bibles may
also wish to insert identifiers for non-canonical headers, sections,
and the like. We therefore now allow individual works to freely
extend the versification scheme they use, by adding "!" and then an
identifier of their choice. It is understood that anything after the
"!" is not part of the "official" versification scheme, and is
specific to the particular work.
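A processor that wants to compare references across works can then
simply cut an identifier at the first "!". The Python sketch below
shows one way to do that. The function names and the sample
identifier "Matt.5.3!a" are illustrative assumptions, not taken from
the spec.

```python
def split_extension(osis_id: str):
    """Split an identifier into (official part, work-specific extension).

    The extension is None when the identifier contains no "!".
    """
    base, sep, extension = osis_id.partition("!")
    return base, (extension if sep else None)


def same_official_reference(a: str, b: str) -> bool:
    """True when two identifiers agree once work-specific parts are ignored."""
    return split_extension(a)[0] == split_extension(b)[0]


# Hypothetical identifiers, for illustration only.
print(split_extension("Matt.5.3!a"))
print(same_official_reference("Matt.5.3!a", "Matt.5.3!b"))
```

Cutting at the first "!" means that anything a work adds after it,
including further "!" characters, stays inside the work-specific part.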
We tightened up the rules for how poetry is marked up. The 1.1.1
rules permitted a great deal of variation, which we wished to reduce.
The rule now is that <l> is used for line breaks that are justified
on the basis of some linguistic phenomenon (for example, Hebrew
poetic parallel structures); purely typographical line breaks,
however important, are to be encoded with the <lb/> (line-break) tag,
whose "type" attribute can be used to distinguish indentation types,
preference levels for line-breaking in wide and narrow column
editions, and other matters of importance in actual layout.
The semantics and best practice for a number of other elements were
discussed and clarified, but these affect only the documentation, and
not the schema itself.
Overall, the group found excellent consensus on most of the issues
discussed, and we feel we've tightened up the expected practice quite
a bit, so as to make learning, teaching, using, and implementing OSIS
all easier.
--
Steve DeRose -- http://www.derose.net
Chair, Bible Technologies Group -- http://www.bibletechnologies.net
Email: sderose@acm.org or steve@derose.net