[osis-core] type on identifier and subject--deference to an established standard

Chris Little osis-core@bibletechnologieswg.org
Wed, 29 Oct 2003 18:39:11 -0600


I don't know why this hadn't occurred to any of us before, but I 
checked the recommended use of the identifier and subject elements 
according to Dublin Core.  I feel that, regardless of our other feelings 
on the subject and our desire to extend beyond what they have defined, 
we should minimally provide a system parallel to theirs, since ours is 
still essentially a derivative thereof.  (Of course, I may be biased by 
the fact that they recommend what I had previously stated as my preference.)

The best place to look for DCMI's take is the document at 
http://dublincore.org/documents/dcmi-terms/ .  Consulting their 
"identifier" entry, you will find it states:

"Recommended best practice is to identify the resource by means of a 
string or number conforming to a formal identification system. Example 
formal identification systems include the Uniform Resource Identifier 
(URI) (including the Uniform Resource Locator (URL)), the Digital Object 
Identifier (DOI) and the International Standard Book Number (ISBN)."

Similarly, the "subject" entry reads:

"Typically, a Subject will be expressed as keywords, key phrases or 
classification codes that describe a topic of the resource. Recommended 
best practice is to select a value from a controlled vocabulary or 
formal classification scheme."

DCMI never makes any mention of reformatting data to be XML Names or 
adding prefixes.  And they include the type attribute specifically for 
the purpose of including values to indicate formal 
identification/classification systems.
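
To make that concrete, here is a rough sketch of what following their practice 
might look like in a header. It assumes our type attribute would accept these 
values (ISBN, LCSH); the ISBN is a dummy placeholder and the subject heading 
is only illustrative, neither is taken from a real work.

```xml
<!-- Sketch only: assumes type accepts ISBN and LCSH; values are placeholders -->
<!-- The value is left in the native form of its scheme: no prefix, no XML Name reformatting -->
<identifier type="ISBN">0-000-00000-0</identifier>
<subject type="LCSH">Bible--Commentaries</subject>
```

The point is that the scheme is named once in type and the element content 
stays exactly as the scheme's own authority would write it.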

If you scroll down to section 4 of that page, you'll find a list of 
standard type values for certain elements, notably those intended 
for subject: LCSH, MESH, DDC, LCC, & UDC.  (We may wish to add whichever 
of these are absent from our osisSubjects enumeration, should we add 
it back into the schema.  MESH could conceivably be omitted since it is 
outside our domain.  LCC will be similar to the LCCall identifier value, 
but omits author/date information.)  Unfortunately, they don't list any 
canonical values for type on identifier, but I think the mention of ISBN 
in the quoted "identifier" entry indicates their leaning.

--Chris