[osis-core] osisID: Summary and Proposal
Patrick Durusau
osis-core@bibletechnologieswg.org
Sun, 30 Jun 2002 08:34:05 -0400
Greetings,
Sorry I did not get this stuff posted yesterday as promised. I got
distracted by the XSLT problem of documenting the schema (unsuccessfully
I might add) and have laid it to one side for post-OSIS 1.1 work.
I have tried to separate the issues into a series of posts, and to
summarize the various pending posts along with a suggested
syntax for the OSIS schema.
osisID: The "who am I" question
The traditional ID mechanism of SGML/XML carries with it certain
syntactic constraints (such as not beginning with a number, therefore no
1John with traditional ID) and is meant to be used with IDREF for
internal document referencing (in part, that is not the full story).
ID's must be unique in a document instance.
The decision in Rome was to abandon the traditional ID datatype so as to
avoid changing the traditional practice of writing 1John to John1, for
example.
Therefore, whatever the eventual syntax, the osisID is actually based on
xs:string, so as to allow 1John, 2Kgs, etc.
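To make the constraint concrete, here is a minimal sketch of why 1John cannot be a traditional ID. The regular expression is only an ASCII approximation of the XML name rule that the ID datatype inherits (the real rule admits many non-ASCII characters), and the function name is made up for illustration.

```python
import re

# ASCII-only approximation of the XML name rule behind the ID datatype:
# the first character must be a letter or underscore, never a digit.
# This is an assumption for illustration, not the full XML production.
ID_LIKE = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_id_like(value: str) -> bool:
    """Return True if value would pass the approximate ID name check."""
    return ID_LIKE.fullmatch(value) is not None


for candidate in ["1John", "2Kgs", "John1", "Matt.3.2"]:
    print(candidate, is_id_like(candidate))
```

Under this approximation, the traditional forms 1John and 2Kgs fail the check while John1 passes, which is the renaming the Rome decision was meant to avoid. An xs:string-based osisID has no such restriction on the first character.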
Another question that has arisen is where we use the osisID.
It has been suggested that we could use osisID on verse (Matt.1.5, etc.),
but that leaves us unable to identify larger divisions of Bibles, such
as the fifth chapter of Job (suggesting Job.5) or even an entire book (Gen).
But, since OSIS will hopefully have a range of application beyond Bible
texts, such as commentaries (both modern and ancient) as well as related
works, the osisID must be applicable to any canonical referencing system
and able to represent any level of that system. (This implies that an
osisID can occur on any element that represents a division in that
referencing system.)
At a minimum, I think that the osisID should be able to appear on any
element that represents a division in the referencing system, thereby
allowing books, chapters, verses, etc., to identify themselves within
such a system. The same would be true for Josephus or any other work
with a known referencing system.
(In a separate post I will be treating documents, like the CEV, that
reference but do not use (in my opinion) a canonical reference system.)
(Some of our confusion may be due to my conflating osisID and osisRef
syntax in the schema and I will be trying to sort that out for your
approval.)
Proposal (not all of this is new, just trying to state it all afresh and
in one place):
osisID will be based on xs:string
osisID's will be used on elements that correspond to an identified
reference system (work?)
osisID's will NOT have grain or range syntax (see next before responding)
Elements that do not correspond to a division in a reference system may
use begin/end attributes to indicate a continuous range of material
based upon a reference system (no discontinuous segments). Reasoning: in
simple applications this will resolve to the beginning reference, so the
user at least gets close to the desired material.
osisID's will use a dotted syntax that represents the reference syntax
in use.
Examples:
Matt.3.2 (refers to Book of Matthew, chapter 3, verse 2)
Matt.3 (refers to Book of Matthew, chapter 3)
Matt (refers to Book of Matthew)
Hmmm, question: How do we specify the treatment of a single token? I can
see how we would do it for Bible texts: one token = book; token + "." +
token = book.chapter; and token + "." + token + "." + token =
book.chapter.verse. But I am not sure how we can extend that to Josephus
and all the other works that people might want to cite. (This may be one of
those questions we want to punt on at the present moment and just
specify for Bible texts.)
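The Bible-only reading of the tokens could be sketched as follows. This is an illustration under stated assumptions, not part of the proposal: the level names in LEVELS and the function name are hypothetical, and nothing here says how Josephus or other works would be handled.

```python
# Hypothetical level names for Bible texts only; other works would need
# their own list, which is the open question raised above.
LEVELS = ("book", "chapter", "verse")


def parse_bible_osis_id(osis_id: str) -> dict:
    """Split a dotted osisID into book / chapter / verse tokens."""
    tokens = osis_id.split(".")
    if not 1 <= len(tokens) <= len(LEVELS):
        raise ValueError(f"expected 1 to {len(LEVELS)} tokens: {osis_id!r}")
    for token in tokens:
        # Reject empty tokens ("Matt..2") and tokens containing whitespace.
        if token == "" or any(ch.isspace() for ch in token):
            raise ValueError(f"bad token in {osis_id!r}")
    # zip stops at the shorter sequence, so "Matt.3" yields book and chapter.
    return dict(zip(LEVELS, tokens))


print(parse_bible_osis_id("Matt.3.2"))
print(parse_bible_osis_id("Matt.3"))
print(parse_bible_osis_id("Matt"))
```

Keeping the level names in one tuple means a different reference system would only need a different tuple, though whether that is enough for works like Josephus is exactly what remains undecided.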
Note that the osisID answers the question of "who am I" for an element
(or, probably more precisely, "what do I contain").
Suggested syntax:
<xs:attribute name="osisID" type="osisID" use="optional"/>
<xs:simpleType name="osisID">
  <xs:restriction base="xs:string">
    <xs:pattern value="(([^\s]*\.){0,6}([^\s]*))"/>
  </xs:restriction>
</xs:simpleType>
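As a quick way to see what the suggested pattern accepts, it can be tried out in Python. This is only an approximation: schema patterns are implicitly anchored, so fullmatch is used, and Python's \s covers more whitespace characters than the schema's. The STRICTER pattern is my own assumption of one possible tightening, not part of the proposal.

```python
import re

# The pattern as suggested above, copied verbatim.
PROPOSED = re.compile(r"(([^\s]*\.){0,6}([^\s]*))")

# Assumed alternative (not in the proposal): tokens may not be empty
# and may not themselves contain a dot.
STRICTER = re.compile(r"[^\s.]+(\.[^\s.]+){0,6}")


def matches(pattern: re.Pattern, value: str) -> bool:
    """Schema patterns must match the whole value, hence fullmatch."""
    return pattern.fullmatch(value) is not None


for value in ["Matt.3.2", "1John", "Matt 3.2", "", "a.b.c.d.e.f.g.h.i"]:
    print(repr(value), matches(PROPOSED, value), matches(STRICTER, value))
```

Because [^\s] also matches a dot, the suggested pattern does not actually limit the number of dotted tokens, and it accepts the empty string; it only rules out whitespace. If a limit of seven tokens is intended, something along the lines of the stricter variant may be needed.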
We would need to add begin/end global attributes to deal with the cases
where the element contains less (or more) than the common division of
materials.
<xs:attribute name="osisBegin" type="xs:string" use="optional"/>
<xs:attribute name="osisEnd" type="xs:string" use="optional"/>
Note that I do not suggest these be type="osisID", since they can be
used on notes for double-ended attachment, and notes may well wish to
attach to elements that can bear no legitimate osisID, such as words in
a verse. A word can have an ID (included by default for all elements)
and a note could use those ID's (subject to the usual rules for ID's to
attach themselves to a text).
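A rough sketch of how an application might read these attributes, falling back to the beginning reference when it cannot handle ranges. The note element, its attribute values, and the function name are all made up for illustration; only the attribute names osisBegin and osisEnd come from the declarations above.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: element name and values are illustrative only.
SAMPLE = (
    '<note osisBegin="Matt.3.2" osisEnd="Matt.3.4">'
    "A comment on three verses."
    "</note>"
)


def target_of(element, supports_ranges=False):
    """Return a (begin, end) pair, or None if no osisBegin is present.

    A simple application (supports_ranges=False) resolves the range to
    its beginning reference, so the user at least lands close to the
    desired material.
    """
    begin = element.get("osisBegin")
    end = element.get("osisEnd")
    if begin is None:
        return None
    if supports_ranges and end is not None:
        return (begin, end)
    return (begin, begin)


note = ET.fromstring(SAMPLE)
print(target_of(note))
print(target_of(note, supports_ranges=True))
```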
(Note that we currently have work, cite, outwork and outcite, which I
think are confusing to some degree.)
Forthcoming: osisRef, href on <a>, subject attributes on links and
other elements, etc.
Patrick
--
Patrick Durusau
Director of Research and Development
Society of Biblical Literature
pdurusau@emory.edu