[osis-core] osis_0109 Arrives! Very close to 1.0!!!

Sun, 14 Apr 2002 15:06:51 -0500

Patrick,

> Todd,
>
> Todd Tillinghast wrote:
>
> >>b. Can have multiple NMTOKENS, i.e., OSISID="Gen.17.17 Gen.17.18",
> >>validated against the regex for referenceType.
> >>
> >This seems BAD to me.  This still does not handle the case where there
> >truly is a different name explicitly assigned by the translator that has
> >a different meaning than traditional reference systems use.
> >
> Hmmm,
>
> Sure it does, can have "Gen.17.17 Gen.17.18 Pats.BadTrans.BigWhoop.35.26
> Todds.Perfect.System.23.23"  and for that particular encoding, you may
> even want to consider that an implied mapping?
>
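
A minimal sketch of how such a space-separated osisID could be checked token by token. The pattern below is a simplified stand-in (letters and digits separated by literal full stops), not the referenceType regex from the schema, and the function name is made up for illustration:

```python
import re

# Hypothetical, simplified stand-in for the referenceType pattern:
# one or more alphanumeric segments separated by literal full stops.
SEGMENTED = re.compile(r"[A-Za-z0-9]+(\.[A-Za-z0-9]+)*")

def invalid_tokens(osis_id: str) -> list[str]:
    """Return the whitespace-separated tokens that fail the pattern."""
    return [tok for tok in osis_id.split() if not SEGMENTED.fullmatch(tok)]

print(invalid_tokens("Gen.17.17 Gen.17.18 Pats.BadTrans.BigWhoop.35.26"))
print(invalid_tokens("Gen.17.17 Gen..18"))
```

The first call reports no failures; the second reports only the token with the empty segment. Each token is judged on its own, so one bad token does not hide the others.
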
> >
> >>3. OSISIDREF is now NMTOKEN (should only have a single ref back to a
> >>starting point, not IDREF but probably not used that much anyway.)
> >>
> >>4. Now have refWork attribute on <text>. Validates against referenceWork
> >>simpleType (which is redefined by osisScripture_0109.xsd; this gets you
> >>the default prefix for your other references).
> >>
> >>5. Note now has refWork attribute (so can point outside to a particular
> >>referenceWork).
> >>
> >Why just in note?  I think we should be able to use a non-default
> >reference anywhere we are using a referenceType.  At least this should
> >be possible in <reference>, <figure>, and possibly a few key other
> >elements.
> >
> Hmmm, could  the optional expression of refWork eliminate the need for
> the refWork attribute? In other words, all refs need not have the
> Bible.KJV.. prefix but where it does appear,  the reference has been
> qualified to appear in that system?

If you are talking about refWork in <note>, there is no need for it if the
reference system can be expressed as part of the reference, as in
"Bible.KJV..John.3.16".

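A rough sketch of that idea, assuming the work prefix and the reference proper are separated by the first ".." as in the example above. The function name and the default work value are hypothetical:

```python
def split_qualified(ref: str, default_work: str = "Bible.KJV") -> tuple[str, str]:
    """Split 'Work..Reference' into (work, reference).

    Assumes the first '..' separates the work prefix from the reference;
    an unqualified reference falls back to default_work.
    """
    work, sep, rest = ref.partition("..")
    if not sep:
        # No '..' present: the reference is unqualified.
        return default_work, ref
    return work, rest

print(split_qualified("Bible.KJV..John.3.16"))
print(split_qualified("John.3.16"))
```

With a split like this, a separate refWork attribute would only be needed where the reference itself carries no prefix.
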
>
> >
> >>6. Regex no longer has Bible.KJV, etc., handled by referenceWork
> >>
> >I like the old way better see #5 above.
> >
>
> See reply above.
>
>
> Patrick
>
> >
> >>7. Completely reformed regex expressions, recall from my earlier post:
> >>
> >This fixes the totally incorrect previous version!  It seems that we
> >should preclude some characters that are allowed by "\c" (".", "-", and
> >possibly a few others).  This will reserve the right to use them later.
> >Also, I'm not sure we should have a "default" regex at all in OSISCore,
> >since it gets ORed with the regexes in the redefined versions and negates
> >the opportunity for real validation.
> >
> >>>1. Regexs:
> >>>
> >>>Generally see: http://www.w3.org/TR/xmlschema-2/#regexs
> >>>
> >>>ReferenceType
> >>>
> >>>Now reads: ([^.]+)((.[^.]+){0,})?
> >>>
> >>>Note that "^" begins a negative character group.
> >>>
> >>>Note that the "." character in XML Schema is the equivalent of:
> >>>[^\n\r] : any character except newline
> >>>
> >>>So, [^.] means only newline (excludes all other characters)
> >>>
> >>>Or more formally from the standard:
> >>>
> >>>[Definition:] A negative character group is a positive character
> >>>group <http://www.w3.org/TR/xmlschema-2/#dt-poschargroup> preceded by
> >>>the ^ character. For all positive character groups P, ^P is a
> >>>valid negative character group, and C(^P) contains all XML
> >>>characters that are not in C(P).
> >>>
> >>>Negative Character Group
> >>>[15]   negCharGroup   ::=   '^' posCharGroup
> >>><http://www.w3.org/TR/xmlschema-2/#nt-posCharGroup>
> >>>
> >>>
> >>>I assume the intent of the expression is:
> >>>
> >>>1. Any legal namestart character, followed by,
> >>>2. Any legal name character, followed by,
> >>>3. literal "." character, followed by
> >>>4. one or more groups of legal name characters separated by a
> >>>literal "."
> >>>
> >>>If that is the case, I would suggest that we re-write ReferenceType to
> >>>read:
> >>>
> >>>([\i]([\c])*\.((\c)*\.)?
> >>>
> >>>Note that \i = any legal initial name character, \c = any legal name
> >>>character, \. = literal "." or full stop
> >>>
> >>>Additionally, since we have compScriptureReferenceType (I treat that
> >>>regex below) not sure what ReferenceType is getting us in terms of
> >>>validation? Structure of the references? Perhaps, would welcome some
> >>>discussion on this and WorkType (next).
> >>>
> >>>(BTW, schema regexs always match from the beginning of the line so no
> >>>need to anchor.)
> >>>
> >>>WorkType:
> >>>
> >>>Now reads: ([^.]+(.[^.]+)
> >>>
> >>>Same problems as above with "^" and invoking of literal full stop.
> >>>
> >>>Is the intent of this expression the same as ReferenceType?
> >>>
> >>>In other words to:
> >>>
> >>>1. Any legal namestart character, followed by,
> >>>2. Any legal name character, followed by,
> >>>3. literal "." character, followed by
> >>>4. one or more groups of legal name characters separated by a
> >>>literal "."
> >>>
> >>>if so, why would I want both of them? For that matter, the more I
> >>>think about it, I am not sure what function either one would serve, at
> >>>least in light of our not declaring a set of references to other works.
> >>>
> >>>Suggestion: Why not settle on an outside reference pointer that
> >>>subclasses xs:string the way we have for enumerated values on
> >>>attributes. You can at this point declare whatever other pointers you
> >>>like, but prepend "x-" to them? That would allow us later (probably
> >>>by the Fall release of translator and publisher modules) to declare
> >>>references like compScriptureReferenceType that provide validation of
> >>>at least part of the reference?
> >>>
> >>>compScriptureReferenceType:
> >>>
> >>>Now reads (in part) ((...All Book Names...))((.[^.]+){0,}))?
> >>>
> >>>Same problems as above with "^" and invoking of literal full stop.
> >>>
> >>>In other words to:
> >>>
> >>>1. Book Name, followed by
> >>>2. literal "." character, followed by
> >>>3. any digit or letter (one or more) (question, do we need letter for
> >>>some Bible references?), followed by
> >>>4. literal "." character, followed by
> >>>5. any digit or letter (one or more) (question, do we need letter for
> >>>some Bible references?), followed by (optional)
> >>>
> >>>If that is the case, would the following work?
> >>>
> >>>((...All Book Names...))\.[A-Za-z0-9]*(\.[A-Za-z0-9]*)?
> >>>
> >>>Note that this expression requires book name plus chapter, could
> >>>someone want to just refer to Matthew?
> >>>
> >>If I am completely off base on my reading of XML Schema regex
> >>expressions please point me to the correct information but it sounds
> >>like [^.]  which is in all the expressions I corrected, excludes all
> >>letters but newline? Fairly sure that is not what was intended.
> >>
> >>Recall that XML Schema regex expressions always, automatically, never do
> >>differently, bind at the beginning of the string, so "^" to match the
> >>beginning of a string is not required (Not to mention has a different
> >>meaning than the way it is used in Perl/sed/awk, etc.).
> >>
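
The anchoring point and the literal full stop can be tried out with Python's re module as a stand-in. Python's dialect is not the XML Schema one (it has no \i or \c, and it only matches the whole string when asked to via fullmatch), so this is an approximation, and the patterns are simplified examples rather than the schema's own:

```python
import re

UNESCAPED = r"[A-Za-z]+(.[0-9]+)*"   # "." matches any character except newline
ESCAPED = r"[A-Za-z]+(\.[0-9]+)*"    # "\." matches only a literal full stop

def whole_match(pattern: str, value: str) -> bool:
    """Approximate schema validation: the pattern must cover the whole value."""
    return re.fullmatch(pattern, value) is not None

print(whole_match(ESCAPED, "Gen.17.17"))      # True
print(whole_match(ESCAPED, "Gen-17"))         # False
print(whole_match(UNESCAPED, "Gen-17"))       # True
print(whole_match(ESCAPED, "Gen.17.17 xyz"))  # False
```

The third line shows why the unescaped "." is a problem: it lets any separator through, not just the full stop.
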
> >>Chime in now on these or any other issues because save for typos, I
> >>would like to consider this version a code freeze so tomorrow I can work
> >>on documentation to insert into the schema and Todd/Chris/Troy can start
> >>getting some sample texts together for a Monday release.
> >>
> >>I'm about to take a break for a couple of hours but will be back online
> >>later today and early this evening.
> >>
> >>Looks really good guys!
> >>
> >>Patrick
> >>
> >>
> >>
> >>
> >>
> >>--
> >>Patrick Durusau
> >>Director of Research and Development
> >>Society of Biblical Literature
> >>pdurusau@emory.edu
> >>
> >>
>
> --
> Patrick Durusau
> Director of Research and Development
> Society of Biblical Literature
> pdurusau@emory.edu
>
>
>
>