[osis-core] Wrapping up?

Todd Tillinghast osis-core@bibletechnologieswg.org
Mon, 22 Jul 2002 14:42:15 -0600


> >>self-identify some of the more contemporary texts that versify by
> >>paragraphs. e.g. "This paragraph is Mark 1:1-9"
> >>
> >
> >
> > In this case the "1-9" becomes a verse name in the current reference
> > system in the same way that "4" is a verse name in "Gen.1.4".  Other
> > options were discussed in recent posts.
> 
> This seems useless, unless self referencing in the same document.  If I
> have a Bible such as this installed in some Bible software application,
> and I have a commentary that has a <reference> to Mark.1.7, I would hope
> my Bible would jump to the "This paragraph is Mark 1:1-9" paragraph.  I
> don't think we decided how to do this.
> 
> In Dallas, we decided to force these types of Bibles to have multiple
> milestone starts, so we could still easily do a string-match reference
> resolution system.
> 
> e.g.
> <verseStart ref="Mark.1.1" />
> <verseStart ref="Mark.1.2" />
> ...
> 
> 
> Now that we're using containers, I'm not sure how we've decided to allow
> this.  I still think it's not a trivial jump we're making if we decide
> to allow ranges.  I'm not necessarily against it, but am concerned about
> the complexities introduced.  The multiple milestone start solution was
> brainless and made for easy implementation.  I could write XPath to
> resolve to any versification reference that this Bible claims to
> implement.  In the range solution, this is no longer true.
> 

Regardless of whether or not milestones are permitted, the same problem
exists.  (See the discussion prior to Rome regarding this issue.)  In
either case we have only a SINGLE point at which to place an identifier
or set of identifiers.  With the milestone option, there is no way of
determining which part of a paragraph belongs to which individual
identifier from a "standard" reference system.  As a result, all of the
applicable identifiers have to be assigned to a single milestone.  The
best option here would seem to be to allow osisID to be a list of
osisIDs.  This is great for the case where there is one element in the
document for many "standard" reference system identifiers.  The trouble
starts when there are many elements to one identifier in the "standard"
reference system or, even worse, many to many!  It was at that point
that I conceded my earlier position, which was similar to the one you
are taking, and I now contend that in the cases mentioned we must use
the "work specific" reference system.

(We cannot let osisID be a list of osisIDs because it needs to be a
unique identifier.)

To soothe the pain of this option, we can provide an attribute that is a
list of the ids that this verse represents.  The verse need not be the
exclusive verse with that id, NOR would the verse be expected to carry
only a single identifier from each reference system.  This would
complicate the matching of a simple string identifier with an osisID:
if the identifier is not found as an osisID, a secondary search of the
lists is required in order to find the desired element(s) with that
identifier or to determine that no such verse exists.

But this is the trouble that we are left with as a result of the
translators' determinations, AND the trouble exists regardless of the
existence of milestones.

(Declare in the header that the default reference system for this TEV
work is Bible.KJV)
<verse osisID="Bible.Todd:Mark.1.1-9" altIDs="Mark.1.1 Mark.1.2 Mark.1.3
Mark.1.4 Mark.1.5 Mark.1.6 Mark.1.7 Mark.1.8 Mark.1.9">....</verse>
<verse osisID="Mark.1.10">...</verse>
<verse osisID="Mark.1.11">...</verse>
<verse osisID="Bible.Todd:Mark.1.12a" altIDs="Mark.1.12">...</verse>
<verse osisID="Mark.1.13">...</verse>
<verse osisID="Bible.Todd:Mark.1.12b" altIDs="Mark.1.12
Mark.1.14">...</verse>
<verse osisID="Bible.Todd:Mark.1.14" altIDs="Mark.1.12
Mark.1.14">...</verse>


First, this is just an example, but we have discussed concrete examples
in the past, either on this list or in the scripture reference working
group.

Note that the logical Bible.KJV:Mark.1.12 falls into three verses with
this version and that Bible.Todd:Mark.1.14 is not only
Bible.KJV:Mark.1.14 but also part of Bible.KJV:Mark.1.12.
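
The two-step lookup described above (osisID first, then the list of
alternate ids) could be sketched roughly as follows.  This is only an
illustration in Python, not a proposed implementation: the wrapping
<div> element and the resolve() function are made up for the sketch,
altIDs is the attribute name used in the example above, and no attempt
is made to expand the default reference system prefix.

```python
import xml.etree.ElementTree as ET

# The example verses from above, wrapped in a <div> (an assumption made
# only so that the sample parses as a single XML document).
SAMPLE = """<div>
<verse osisID="Bible.Todd:Mark.1.1-9" altIDs="Mark.1.1 Mark.1.2 Mark.1.3
Mark.1.4 Mark.1.5 Mark.1.6 Mark.1.7 Mark.1.8 Mark.1.9">....</verse>
<verse osisID="Mark.1.10">...</verse>
<verse osisID="Mark.1.11">...</verse>
<verse osisID="Bible.Todd:Mark.1.12a" altIDs="Mark.1.12">...</verse>
<verse osisID="Mark.1.13">...</verse>
<verse osisID="Bible.Todd:Mark.1.12b" altIDs="Mark.1.12
Mark.1.14">...</verse>
<verse osisID="Bible.Todd:Mark.1.14" altIDs="Mark.1.12
Mark.1.14">...</verse>
</div>"""


def resolve(root, ref):
    """Return the osisIDs of the verse elements that answer to `ref`.

    Hypothetical helper: `ref` is a plain identifier such as "Mark.1.7".
    """
    verses = list(root.iter("verse"))
    # Step 1: simple string match against the unique osisID.
    exact = [v.get("osisID") for v in verses if v.get("osisID") == ref]
    if exact:
        return exact
    # Step 2: fall back to the whitespace-separated altIDs list.
    # An empty result means no such verse exists in this work.
    return [v.get("osisID") for v in verses
            if ref in v.get("altIDs", "").split()]


root = ET.fromstring(SAMPLE)
print(resolve(root, "Mark.1.10"))
print(resolve(root, "Mark.1.7"))
print(resolve(root, "Mark.1.12"))
```

Here "Mark.1.10" is found in the first step, "Mark.1.7" is found only
through the list on the first verse, and "Mark.1.12" comes back as three
elements, which is the many-to-one case.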

This ensures a unique osisID while accommodating all manner of
alternate reference systems.  Further, the milestone element can be used
to precisely mark the points where a verse starts and ends in other
reference systems.  In this case the type attribute would likely be set
to type="verseStart" and type="verseEnd".  While this would not be the
primary mechanism for finding verses, it is a supported secondary
mechanism.

Thoughts?


> 
> > The case with Mark and Mark.1 you state is an osisID, but there is
> > another case where there are divs smaller than Mark.1
> > (Mark.1.1-Mark.1.12).  In this case an osisID can not be used to
> > self-identify.  What I am proposing with the container ID is that when
> > a container can not self-id with an osisID, a range-based self-id be
> > provided.  I think that we should retain osisID as a non-range and
> > continue to provide it for the cases you state above but allow this
> > secondary self-id mechanism.
> 
> My thinking on this is that if this new range mechanism is INDEED to be
> treated as an "I am this" self-identification, that could be the target
> of a reference... if indeed we are going to allow this, and force
> support of this in implementations, then there is no difference between
> osisID and this new mechanism.  They are both doing exactly the same
> thing.  There should not be 2 mechanisms to do this exact same thing.
> We should instead just extend osisID.
> 
> Unless there is a difference between the 2, other than the granularity
> to which they can point, then the 2 should be consolidated.

The difference is that with a container there could be a simple osisID
OR a self-id that allows a range, which the osisID does not.  The point
is that there is a very strong benefit in precluding ranges in osisIDs.
Precluding them rules out self-identification by osisID for containers
other than those that are nodes of the hierarchy of the reference
system.  As we are well aware, not all nodes of the document hierarchy
will coincide with the nodes of the hierarchy of the reference system.

Why would you need a self-id for a container element that could not have
an osisID?  If the document is small (either part of a chapter, or part
of a book but larger than a chapter), then having a mechanism to
identify what is in this document would be helpful.  For containers
farther down the tree, it is beneficial to know what range each one
refers to.  This is not necessarily for the purpose of a
search/xpath/xlink but to self-identify.
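
As a rough illustration of what a range self-id buys you, here is a
small Python sketch that tests whether a plain identifier falls inside a
range such as Mark.1.1-Mark.1.12.  The function names are made up, and
the sketch assumes purely numeric chapter and verse parts within a
single book; names like "12a" or ranges that cross books would need a
real reference system behind them.

```python
def parse_id(osis_id):
    """Split 'Mark.1.12' into ('Mark', 1, 12).

    Hypothetical helper; assumes numeric chapter and verse parts.
    """
    book, *nums = osis_id.split(".")
    return (book, *[int(n) for n in nums])


def in_range(ref, range_id):
    """True if `ref` lies inside a range self-id like 'Mark.1.1-Mark.1.12'."""
    start, end = (parse_id(part) for part in range_id.split("-"))
    target = parse_id(ref)
    # Only ranges within one book are handled in this sketch.
    if not (start[0] == end[0] == target[0]):
        return False
    # Compare the (chapter, verse) tuples in document order.
    return start[1:] <= target[1:] <= end[1:]


print(in_range("Mark.1.7", "Mark.1.1-Mark.1.12"))
print(in_range("Mark.1.13", "Mark.1.1-Mark.1.12"))
```

The first call reports that Mark.1.7 is covered by the container and the
second that Mark.1.13 is not.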

Further, if a larger document is broken into pieces and a skeleton tree
that represents the whole document is created, the provision of self-ids
for all container elements gives a reliable mechanism to match the
fragment trees into the whole.  The purpose of this is to provide
access to the Bible without having to retrieve an entire version if all
that is needed is a small fragment or several small fragments.  Also, the
part that is in demand can be presented very quickly by getting the
overall tree and a fragment while the rest of the document is retrieved.
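
A minimal Python sketch of the skeleton-and-fragment idea follows.  The
attribute name selfID, the graft() function, and the sample markup are
all assumptions made for illustration; nothing here is an agreed part of
the schema.

```python
import xml.etree.ElementTree as ET

# Skeleton: every container is present but empty, carrying only its self-id.
SKELETON = """<osisText>
<div selfID="Mark.1.1-Mark.1.12"/>
<div selfID="Mark.1.13-Mark.1.20"/>
</osisText>"""

# One fragment, retrieved separately, carrying the same self-id.
FRAGMENT = """<div selfID="Mark.1.1-Mark.1.12">
<verse osisID="Mark.1.1">...</verse>
</div>"""


def graft(skeleton_root, fragment_root, attr="selfID"):
    """Replace the skeleton placeholder whose self-id matches the fragment.

    Returns True if a placeholder was found and replaced.
    """
    wanted = fragment_root.get(attr)
    for parent in skeleton_root.iter():
        for index, child in enumerate(list(parent)):
            if child.get(attr) == wanted:
                parent[index] = fragment_root
                return True
    return False


tree = ET.fromstring(SKELETON)
grafted = graft(tree, ET.fromstring(FRAGMENT))
print(grafted, len(tree[0]), len(tree[1]))
```

After the call, the first container holds the retrieved verse while the
second is still an empty placeholder waiting for its fragment.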

> 
> This is my thinking.
> 
> 
> 	Thanks for the discussion,
> 		-Troy.
>