[osis-core] Review of OSIS Requirements [Robin]

Robin Cover osis-core@bibletechnologieswg.org
Sat, 10 Nov 2001 10:04:08 -0600 (CST)


Lying in bed just ten minutes after sending this review (appended)
to Patrick, I realized that I should have sent the document to
the 'osis-core' mailing list.  I was one to make annoying
loud noises at the Dallas meeting about using mailing lists
rather than ad hoc recipient lists, and I just violated
my principle.  My bad.  So herewith...

As indicated to Patrick near the end of the review, I'm not
entirely comfortable sending this 'Draft 9 + 1' version out for
public release, although

1) the misgiving is *not* related to Patrick's work, but to
   inadequate collective working group effort to date
2) I understand the decision on release to belong to ABSI,
   Scholars Press, and the project director
3) deadlines were clearly announced by Patrick

I suppose I would vote for as "limited" a distribution of
the current draft as is possible from the marketing point of
view.  In particular, I would not post this version to a
public web site, even if paper copies are printed and
distributed publicly at the Denver BTWG meetings.  Inclusion of
the draft on a widely-distributed and easily-accessible CD
would also not be advisable in my view, but certainly not as
risky as putting the document into public web space.

Thanks, double-thanks to Patrick for his (usual) great efforts
at keeping things moving forward by volunteering to draft key
documents.

Some personal/professional issues have made it difficult for me
to be more engaged since October 13th, but I'll try to be
on deck here as I can.

Best wishes all,

Robin

---------- Forwarded message ----------
Date: Sat, 10 Nov 2001 00:35:08 -0600 (CST)
From: Robin Cover <robin@isogen.com>
To: Patrick Durusau <pdurusau@emory.edu>,
    Steve DeRose <Steven_DeRose@Brown.edu>
Cc: Robin Cover <robin@isogen.com>
Subject: Review of OSIS Requirements [Robin]

Comments on OSIS Requirements
Robin Cover
2001-11-09/10

Target: OSIS Requirements. BTWG Working Draft 9 November 2001
Version: http://www.sbl-site2.org/osis/05osis-requirements-20011109.html 

Quoted portions of target text preceded by ##
Item suggestions for possible revision preceded by >>

General Comments:

Thanks for your work, Patrick!  Sorry I have been so otherwise
occupied.

The draft looks very professional (screen and print).  From
five feet away it looks to me much like a W3C document.  If
the overall document structure, formatting, and boilerplate
language formularies are modeled on W3C documents, I think it
would be appropriate to explicitly acknowledge this intentional
dependency -- by attribution.

Re Patrick's note

 > Date: Tue, 06 Nov 2001 12:12:36 -0500
 > From: Patrick Durusau <pdurusau@emory.edu>
 > To: osis-core@bibletechnologieswg.org
 > Subject: [osis-core] Re: OSIS Core Requirements, Update
 > [...]
 > Robin: I know there have been some issues in the past with your being
 > identified with outside initiatives. I defaulted to placing you on the
 > document as an editor but can remove if that will cause problems for
 > you.

I think it would be a good idea to remove my name from the document.
I don't care whether I receive public credit or not, and there are
a couple tricky matters relating to my two employers which could
make my identification (with institution, or lack thereof)
problematic.  I can explain more via phone if necessary, but since
some of the long-standing OASIS/Sun/ISOGEN problems remain, it's best
not to name me on this document.  Thanks.

Of the remaining names,

#     Steve DeRose (Brown University) <Steven_DeRose@Brown.edu>
#     Troy Griffiths (Crosswire) <scribe@crosswire.org>
#     Jerry Fincher <dandjfincher@juno.com>
#     Patrick Durusau (Society of Biblical Literature) <pdurusau@emory.edu>

I think the order should be given in some way as to make sense to
a reader who knows nothing about the project and about the likely
roles played in editorial activity.  Being neither alphabetic nor
role-qualified makes the above order irregular, IMO.  If Jerry is
a full-time employee of ABSI, should not his affiliation be printed?

I don't know the history of the document in detail, but it seems
like you could put your name on it as 'Editor' and name the others
as 'Contributors' if this is accurate to the situation; or list
you and Steve as co-editors.

Seems to me that the people named as 'Editors' should have the
chance to review and 'sign off' on the draft before it is committed
in print as a "public working draft."

Sub "Status" section, I'd vote for a line/field similar to that used in
ISO documents, like "Distribution: BTWG Participants".  This would not
preclude reading by others, but would alleviate some of the burden
of authorship to make this a context-neutral, background-provided
document.  If a CEO somewhere picks it up and finds some gaps, it would
be because he's not up to speed with the BTWG activities/conversations.

#1. Overview

# This document contains the requirements

>> perhaps "contains the initial/draft requirements"

#  OSIS 1.0 is a base encoding
>> "base encoding" could be misunderstood, or simply not understood
because "encoding" means a thousand things; perhaps similar to
what is stated in the final para of this section,
e.g., "OSIS 1.0 will define a basic vocabulary and grammar model..."

# The purpose of this requirements document is to chart the scope

>> The purpose of this requirements document is to chart
   a proposed scope of OSIS 1.0...
   
# Readers, critics and commentators should feel free to contribute
#  their remarks
>> Reviewers should feel free to provide feedback using...

# (XSEM and Dennis Drescher)
>> (XSEM, directed by Dennis Drescher)

# and Chinese Christian Markup Language (CCML)
>> omit?  In my review work, CCML played no role sufficient to
warrant being "notably" mentioned as background.  At this juncture
I would also feel comfortable making reference to TEI, since the
TEI Guidelines are behind LOGOS and XSEM at many points, and
were strongly in the minds of (most of) the WG members doing
the DTD review.

# 2. Terminology

The language MUST, SHOULD, MAY seems to me a bit formal and heavyweight
for the document at this stage, but I won't protest too loudly.  Perhaps
the tension I feel is this: the bare statements in terms of assertions
and clarity of articulation are propositions-tough-enough-already, such
that the subtlety of 'MUST, SHOULD, MAY' seems like a 2nd/3rd layer fine
distinction -- to be tweaked after further review.  Just a suggestion.


# 3. General Requirements
#
# General Syntax
#
# Must Use the XML "Family" of Standards

>> Perhaps this could be clarified with additional language, e.g.,

OSIS specifications will make use of the XML family of W3C core
standards, including XML 1.0, XLink, XPointer, CSS, XSL, XSLT, SVG, etc.

#Expression of OSIS
#
# The vocabulary and structures of OSIS MUST be expressed in XML syntax
# but not limited by features peculiar to any expression language.

This statement seems to raise the problem of schema formalisms alongside
notation for document instances.  If the bare statement "The vocabulary
and structures of OSIS MUST be expressed in XML syntax..." is to
accommodate both XML 1.0 DTD notation and W3C XML Schema, then it's
not clear what "XML syntax" means, since it could not mean simply
"XML 1.0 instance syntax".  Maybe something like:

Conforming OSIS documents will be valid XML instances.  Formal notations
for expressing data types, hierarchical structure, and other constraints
will not make use of features peculiar to any particular schema or
constraint language.  [Or something....]

#Ontologies
#
# OSIS 1.0 MUST provide a mechanism for declaring XML
# vocabularies for ontologies.
# This requirement addresses the need to allow markup of biblical
# texts in the language most comfortable for a particular author
# and yet retain interoperable validation of the
# resulting files.

see Patrick's note 11-09

    >What I am trying to say is that we will require the construction
    >of a mapping from base name for elements and attributes that 
    >will allow us to use an exchange schema to produce a target
    >schema that is in a person's preferred language for them to
    >use in markup of a biblical text. We can then validate against
    >the target schema or convert back into English element and
    >attribute names if we wanted to use it in software that only
    >recognizes the English target schema. 
    >This is not original (for me) and was suggested by Jonathan
    >yesterday. 
    >It looks like a very good idea, since I can then allow
    >people to work in the language in which they are the
    >most comfortable and yet load the resulting text into
    >software that uses a different schema for validation. 

>> This seems a lot like Requirement "LTS.2 Users, readers and
   producers of texts in non-English languages MUST have the
   ability to use element names and attributes in their
   native languages."

----------------------------------------------------------------

So even with our earlier exchange, per my comments below, I am
not sure I understand this concern of yours/Jonathan's. I don't
think we should get into the matter of prescribing how
software should be required to work, in any case.

Thus...I think you're using "ontology" in a slightly different
sense than I normally understand it.

I would think of "ontology" as operating at a much higher level
of abstraction than markup.  I think the nominal requirement is that
any XML 'NAME' (element type name, attribute name, token name in
an enumerated-datatype default value, etc) which we use in the
official/canonical OSIS DTD should be susceptible to renaming
via one or more mechanisms so that the XML markup language in
actual use worldwide could employ names in various languages.

If you mean something other than this, I'm lost.  I assume
that for DTDs, the TEI-style indirection would allow
renaming, something like:

<!-- driver portion -->
<!ENTITY % p.name "p" >
<!ENTITY % em.name "em" >
<!ENTITY % q.name "q" >
<!-- ... elem/attr decls -->
<!ELEMENT %p.name; (#PCDATA | %em.name; | %q.name;)* >

I think of a vocabulary as a collection of names (elements,
attributes, etc) in a namespace.  I think of an ontology as
a specification in a different domain.  Trying to map
different ontologies via a single markup spec will not
work, as far as I know, because the semantic space
in the different ontologies will be factored differently,
or otherwise irreconcilably mismatched.
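
To make the renaming case concrete at the instance level, here is a
minimal Python sketch (an illustration only, not a proposal for how
OSIS should do it).  The Spanish names, the ELEMENT_NAMES and
ATTRIBUTE_NAMES tables, and the to_canonical function are all made up
for this example; none of them comes from the draft.  It shows only
that a pure name-for-name mapping can be undone mechanically, which is
exactly what cannot be assumed when two ontologies differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from localized (here Spanish) names back to
# canonical names. These names are illustrative, not from any OSIS draft.
ELEMENT_NAMES = {"libro": "book", "capitulo": "chapter", "versiculo": "verse"}
ATTRIBUTE_NAMES = {"numero": "n"}


def to_canonical(elem):
    """Return a copy of elem with element and attribute names mapped
    to their canonical form; unknown names are passed through as-is."""
    new = ET.Element(ELEMENT_NAMES.get(elem.tag, elem.tag))
    for name, value in elem.attrib.items():
        new.set(ATTRIBUTE_NAMES.get(name, name), value)
    new.text = elem.text
    new.tail = elem.tail
    for child in elem:
        new.append(to_canonical(child))
    return new


localized = ET.fromstring(
    '<libro><capitulo numero="1">'
    '<versiculo numero="1">En el principio</versiculo>'
    '</capitulo></libro>'
)
canonical_xml = ET.tostring(to_canonical(localized), encoding="unicode")
print(canonical_xml)
```

The content itself is untouched; only the labels change, so software
that knows the canonical names could consume the result.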

----------------------------------------------------------------


# 4. Metadata Requirements
#
# The division of requirements that follows is upon the experience
# of the editors...

>> perhaps "is based upon..."

I think it will be more clear if "that follows" is specified to
mean "Sections 4-8", viz.,

The listing of requirements in Sections 4-8 as "Metadata; Large Text
Structure; Notes and Annotation; Phrases; Reference/Linking" is based
upon...

# Metadata.1 OSIS 1.0 MUST include Dublin Core metadata.

>> This could be read to imply that we will use the prescribed
categories and labels in DCMI.  I thought we only agreed to something
like "OSIS will support metadata annotation covering at least the
items addressed by Dublin Core..."   The use of DC familiar to me
(qualifiers: http://dublincore.org/documents/dcmes-qualifiers/ and
profiles: http://dublincore.org/documents/library-application-profile/ )
suggests that use of the canonical fifteen is not so important.

# Metadata.3 Within OSIS 1.0 all metadata MUST be inherited by
#  elements in a document instance unless over-ridden at a
#  lower element level.
  
I'd vote for removing 'MUST' in any case, since I cannot foresee how
"inheritance" will work in some cases.  For example: what happens when
a property applicable (conceptually) to a parent does not apply
(conceptually) to the descendant?  We can't force inheritance of
a value for a property that is not in view.  Links and transclusions
also pose challenges...  Perhaps 'MAY'.
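
A small Python sketch of the inheritance problem, assuming
(hypothetically) that metadata is carried on attributes with an "md-"
prefix.  The prefix, the element names, and the effective_metadata
function are mine, not the draft's; this is only one way the rule
"inherited unless over-ridden at a lower level" might be read.

```python
import xml.etree.ElementTree as ET


def effective_metadata(elem, inherited=None, out=None):
    """Map each element id to its effective metadata: values inherited
    from ancestors, overridden by the element's own md-* attributes.
    The md- prefix is a hypothetical convention for this sketch."""
    effective = dict(inherited or {})
    if out is None:
        out = {}
    own = {k[3:]: v for k, v in elem.attrib.items() if k.startswith("md-")}
    effective.update(own)
    ident = elem.get("id")
    if ident:
        out[ident] = effective
    for child in elem:
        effective_metadata(child, effective, out)
    return out


doc = ET.fromstring(
    '<text id="t" md-language="en" md-translator="Smith">'
    '<div id="d1"><p id="p1" md-language="he">...</p></div>'
    '<figure id="f1"/>'
    '</text>'
)
meta = effective_metadata(doc)
print(meta["p1"])  # language overridden on the paragraph
print(meta["f1"])  # the figure inherits everything from the root
```

Note that the figure ends up with a "translator" value that arguably
does not apply to it, which is the kind of case that makes me
uncomfortable with 'MUST'.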


#Metadata.5 OSIS 1.0 MUST provide a mechanism for identification
# of the versification of a biblical text. 

I sorta think this one belongs in Section 6 or 8.

#Metadata.7 OSIS 1.0 MUST include metadata for elements that
#includes certainty and responsiblity as currently modeled
#in the TEI Guidelines.

>> for "as currently modeled in the TEI Guidelines" I might say
"similar to the certainty/responsibility features defined in
the TEI Guidelines..."

I would vote to move this req. to Section 6 since (I think) it will
be applicable most of the time within the framework of analysis
(who made this particular critical judgment, and with what level
of confidence...?)


#5. Large Text Structure Requirements
#
# LTS.2 Users, readers and producers of texts in non-English
# languages MUST have the ability to use element names and
# attributes in their native languages.

Including "readers" seems to imply a requirement for software
developers (?).  I think what we mean to say is that OSIS 1.0
will provide one or more recommended mechanisms for allowing
markup labels (typically 'names' of elements and attributes)
to be recast into other languages without implication for
conformance and interoperability.  Somesuch. 

#LTS.4 OSIS 1.0 MUST define a mechanism for elements are required
#to insert illustrations (children's bibles for instance) or
#other material that is not strictly speaking a part of the
#normal flow of elements.

>> Perhaps:

OSIS 1.0 MUST define markup constructs to support the
(by-reference) inclusion/transclusion of non-textual materials
into encoded texts (raster/vector graphics, video, audio, etc).
For example, this would allow the...


#6. Notes and Annotation Requirements

#Note.1 OSIS 1.0 MUST provide a mechanism for the
#imposition of an editorial apparatus upon a text.

>> I would not make this the first item, but move it down
in the list.  Alternately, replace "editorial apparatus"
with some phrase that more naturally connotes the idea
of a simple footnote in a commentary's preface.  The
phrase "editorial apparatus" may to some people imply
an app crit for textual variation.

#Note.4 OSIS 1.0 MUST declare a mechanism for alignment
#of parallel passages. 

>> Would this req. be better situated in Section 8?

#7. Phrase Level Requirements
#
#All markup can be reduced to a series of <seg> elements
#with appropriate attributes.

Hmmmm.... I suppose so.  The DTD could be:

<!DOCTYPE osis [
<!ELEMENT osis (seg*) >
<!ELEMENT seg (#PCDATA | seg)* >
<!ATTLIST seg type CDATA #IMPLIED
                   id ID #IMPLIED
                   refTarget IDREFS #IMPLIED >
]>

But... will this discussion with this opening statement
confuse the typical reader?
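
For what it's worth, the reduction can be done mechanically.  A rough
Python sketch, assuming the original element type name simply moves
into the 'type' attribute of the DTD above; the to_seg function and
the sample input are hypothetical, and attributes on the source
elements are dropped for brevity.

```python
import xml.etree.ElementTree as ET


def to_seg(elem):
    """Rewrite elem and its descendants as nested <seg> elements,
    recording each original element type name in a 'type' attribute.
    Source attributes are not carried over in this sketch."""
    seg = ET.Element("seg", {"type": elem.tag})
    seg.text = elem.text
    seg.tail = elem.tail
    for child in elem:
        seg.append(to_seg(child))
    return seg


source = ET.fromstring(
    '<p>See <abbr>cf.</abbr> the <q>quoted phrase</q> here.</p>'
)
seg_xml = ET.tostring(to_seg(source), encoding="unicode")
print(seg_xml)
```

Nothing is lost in the sample, but all of the structure has moved out
of the grammar and into attribute values, where a DTD cannot constrain
it.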

#Phrase.1 OSIS 1.0 MUST declare elements for phrase
#structures such as, abbr, blockquote and similar elements.

>> Perhaps omit 'blockquote' here, since in many/most
markup models, 'blockquote' is a block that may contain
many phrase level items as well as other blocks.

For a possibly useful distinction between "Large text
structures" and "Phrase level structures," the typical
HTML model of block vs inline may be useful, where (crib):

   "Block-level elements typically contain inline elements
   and other block-level elements. When rendered visually,
   block-level elements usually begin on a new line. Inline
   elements typically may only contain text and other
   inline elements. When rendered visually, inline
   elements do not usually begin on a new line."

If I had to choose in a two-level design, I'd put
DIVs and (HTML-style) blocks in one group, with
phrases/inline/atoms in the second... I think...

#8. Reference/Linking Requirements

#Reference.1 OSIS 1.0 MUST declare robust pointing and
#linking mechanisms for biblical references.

>> Maybe instead:

OSIS 1.0 MUST declare robust pointing and linking mechanisms
for intra- and inter-document referencing.  In particular, the
reference mechanisms should account for corpora of 'canonical'
historical literature in which documents have multiple
reference schemes.
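
As one concrete case of multiple reference schemes: the psalm most
English readers know as Psalm 23 (Hebrew numbering) is Psalm 22 in the
Greek (Septuagint) numbering.  A minimal Python sketch of a
scheme-qualified reference with a two-entry, chapter-level table; the
Ref type, the scheme labels, and the convert function are hypothetical
and not taken from any OSIS draft.

```python
from typing import NamedTuple


class Ref(NamedTuple):
    scheme: str   # hypothetical scheme label: "hebrew" or "greek"
    book: str
    chapter: int


# Illustrative and deliberately incomplete: (book, chapter) in the
# Hebrew numbering mapped to the Greek (Septuagint) numbering.
HEBREW_TO_GREEK = {("Ps", 23): ("Ps", 22), ("Ps", 51): ("Ps", 50)}
GREEK_TO_HEBREW = {greek: hebrew for hebrew, greek in HEBREW_TO_GREEK.items()}

TABLES = {
    ("hebrew", "greek"): HEBREW_TO_GREEK,
    ("greek", "hebrew"): GREEK_TO_HEBREW,
}


def convert(ref, target_scheme):
    """Re-express ref in target_scheme. Raises rather than guessing
    when the schemes or the passage are not covered by the tables."""
    if ref.scheme == target_scheme:
        return ref
    table = TABLES.get((ref.scheme, target_scheme))
    if table is None:
        raise ValueError(f"no mapping from {ref.scheme} to {target_scheme}")
    key = (ref.book, ref.chapter)
    if key not in table:
        raise LookupError(f"{key} is not in the illustrative table")
    book, chapter = table[key]
    return Ref(target_scheme, book, chapter)


ps23 = Ref("hebrew", "Ps", 23)
ps22_greek = convert(ps23, "greek")
print(ps22_greek)
```

The point of naming the scheme in every reference is that "Ps 22" by
itself is ambiguous; the pointer has to say which numbering it is in.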

#Reference.3 OSIS 1.0 MUST declare an OSIS namespace for
#all OSIS references.

>> ? A single namespace?  Forward versioning?  Do we mean
"namespace" here in the W3C Rec sense?

#Reference.6 OSIS 1.0 MUST declare a keyword syntax for
#use in construction of indexes and other finding aids
#(with associated metadata).

>> Perhaps downgrade to 'SHOULD' if these qualifiers are retained.
I would probably vote to move this requirement to Section 4 or 6.


#Reference.7 OSIS 1.0 MUST declare both basic and extended
#mechanims for bibliographic references.

>> This req nicely illustrates, IMO, the difficulty of being
forced to release this draft as a public review draft at this
time.  Everyone wants support for "bibliographic references,"
I feel certain.  But I feel sorry for the readers as reviewers,
since this level of description offers inadequate basis for
an evaluation and response.  A number of things might be meant
by "basic" and "extended", but I don't know what is meant here,
and I doubt that the average reader will know either...  this
is NOT (repeat NOT) a criticism of the writing style, but of the
notion that an "as-yet-too-cryptic" document be issued as
a review draft.

"Some things are ready when they are ready, and not sooner."

Sorry this review is not more thorough; hopefully, it's better
than nothing.

Keep me posted.

Robin