[osis-core] OSIS work regex
Harry Plantinga
osis-core@bibletechnologieswg.org
Wed, 14 Aug 2002 16:48:01 -0400
I was speaking hypothetically -- if we are going to try to
conform to XML name character usage, that is what we would have
to do.
But we've already decided not to conform -- we're allowing a digit
to start osisIDs. So I suggest we allow letters, digits, and _
as start characters.
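A rough sketch of that rule in Python, for illustration only. It is not the schema pattern: Python's \w (Unicode letters, digits, and underscore) stands in for the letter and number classes, the XML ID check is a simplified approximation of the real XML name rules, and the names WORK, XML_ID_LIKE, is_work, and is_xml_id_like are hypothetical.

```python
import re

# Hypothetical approximation of the proposed rule: a letter, digit, or "_"
# may start the name; later characters may also be ".".
WORK = re.compile(r"\w[\w.]*")

# Simplified stand-in for an XML ID: the first character must be a letter
# or "_" (never a digit); ":" "[" and "]" are not accepted anywhere.
XML_ID_LIKE = re.compile(r"[^\W\d][\w.\-]*")


def is_work(name: str) -> bool:
    """True if the whole string fits the relaxed start-character rule."""
    return WORK.fullmatch(name) is not None


def is_xml_id_like(name: str) -> bool:
    """True if the whole string fits the simplified XML ID approximation."""
    return XML_ID_LIKE.fullmatch(name) is not None
```

Under these assumptions a made-up name such as "1Esd" passes is_work but fails is_xml_id_like, which is exactly the non-conformance we have already accepted.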
Also, we seem to be ignoring the use of ideographs and accented
characters in names. It's OK with me, but I want to make sure it's
intentional.
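To make the difference concrete, here is a small Python sketch comparing an ASCII-only character class with a Unicode-aware one. Both patterns and the accepts helper are hypothetical illustrations; whether the schema's own classes cover ideographs and accented letters is the open question, and this does not answer it.

```python
import re

# ASCII-only reading: plain Latin letters, digits, "_" and ".".
ASCII_ONLY = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_.]*")

# Unicode-aware reading: Python's \w matches letters and digits from any
# script (including ideographs and precomposed accented letters) plus "_".
UNICODE_AWARE = re.compile(r"\w[\w.]*")


def accepts(pattern: re.Pattern, name: str) -> bool:
    """True if the pattern matches the entire name."""
    return pattern.fullmatch(name) is not None
```

With these definitions, a name containing a precomposed accented letter or an ideograph is rejected by ASCII_ONLY and accepted by UNICODE_AWARE, while a plain name such as "Bible.TEV" is accepted by both.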
-Harry
> -----Original Message-----
> From: owner-osis-core@bibletechnologieswg.org
> [mailto:owner-osis-core@bibletechnologieswg.org] On Behalf Of
> Todd Tillinghast
> Sent: Wednesday, August 14, 2002 4:35 PM
> To: osis-core@bibletechnologieswg.org
> Subject: RE: [osis-core] OSIS work regex
>
>
> The statement below does not make sense to me. It seems you
> are saying two conflicting things. In any case, it seems
> that you are saying that we should conform to the XML standard.
>
> I guess what I am suggesting is that we have references that
> can be XML IDs. I am not sure what all of the precluded and
> allowed characters are. I know that Patrick was much better
> versed in this when we talked several months ago on this very topic.
>
> The trouble with this whole line of discussion is that ":",
> "[", and "]" are not allowed in XML IDs!
>
> Also the test I did with an "_" leading was ok, it was the
> leading number that was the problem we had before.
>
> SORRY FOR THE BOGUS DETOUR RELATED TO "_"!
>
> The issue still remains related to OSIS references and
> identifiers as XML IDs. I think that is why I was using ".."
> rather than ":" long ago. If we trade the ":" for ".." and do
> away with the "[" and "]" then we would be back with a valid
> XML ID. (Of course ALLOW "_" and preclude a numeral as the
> leading character.)
>
> Todd
>
> > My "XML in a Nutshell" reference book says that XML name start
> > characters are letters, ideographs, and the underscore, _. If we want
> > to conform to XML usage, we should allow ideographs, underscore,
> > but no _ or digits in osisIDs, I guess.
> >
> > -Harry
> >
> > > -----Original Message-----
> > > From: owner-osis-core@bibletechnologieswg.org
> > > [mailto:owner-osis-core@bibletechnologieswg.org] On Behalf Of
> > > Todd Tillinghast
> > > Sent: Wednesday, August 14, 2002 3:34 PM
> > > To: osis-core@bibletechnologieswg.org
> > > Subject: RE: [osis-core] OSIS work regex
> > >
> > >
> > > I think I am clear now on the proposal.
> > >
> > > Although we don't intend to use our ids as XML IDs, by allowing a
> > > leading "_" we preclude others from using the same syntax/form and
> > > set of identifiers in other implementations. This weakens our
> > > standard.
> > >
> > > I hope that encoders other than those encoding OSIS documents would
> > > use identifiers that are of the same "currency" as our references
> > > and identifiers. By eliminating the option for those identifiers to
> > > be XML IDs we limit the possibility for wider adoption, influence and
> > > interoperability with OSIS documents.
> > >
> > > Todd
> > >
> > > >
> > > > Todd,
> > > >
> > > > I don't think Harry meant "_" as an extra delimiter (in the same
> > > > sense as "." is a delimiter in our syntax) but more as a name
> > > > character in writing customary citations of names. It is in a
> > > > sense a delimiter, but as part of the name to be matched as a
> > > > string and not a delimiter. (Does that make any sense at all?
> > > > Perhaps Harry can state what he meant more clearly. ;-)
> > > >
> > > > Patrick
> > > >
> > > > Todd Tillinghast wrote:
> > > >
> > > > >What extra value does the "_" give us?
> > > > >
> > > > >Are you proposing Bible_.TEV_ ?
> > > > >
> > > > >Or just that "_" would be an option as in
> > > > >Bible.Todd_New_And_Different_Reference_System ?
> > > > >
> > > > >I can see "_" as an allowable character as long as it is not the
> > > > >leading character but don't see any value in having it as an
> > > > >additional delimiter to ".".
> > > > >
> > > > >Todd
> > > > >
> > > > >>-----Original Message-----
> > > > >>From: owner-osis-core@bibletechnologieswg.org
> > > > >>[mailto:owner-osis-core@bibletechnologieswg.org] On Behalf Of
> > > > >>Harry Plantinga
> > > > >>Sent: Wednesday, August 14, 2002 7:26 AM
> > > > >>To: osis-core@bibletechnologieswg.org
> > > > >>Subject: RE: [osis-core] OSIS work regex
> > > > >>
> > > > >>If schema RegExps behave as they do in Perl, the ? is superfluous.
> > > > >>Perhaps
> > > > >>
> > > > >> [\L\N][\.\L\N]*
> > > > >>
> > > > >>The underscore character (_) is pretty commonly used in names and
> > > > >>may be present in documents converted to OSIS. I can't see that
> > > > >>it would do any harm. Could it be included? Perhaps
> > > > >>
> > > > >> [\L\N_][\.\L\N_]*
> > > > >>
> > > > >>-Harry
> > > > >>
> > > > >>----------------------------------
> > > > >>For the work portion:
> > > > >>
> > > > >><xs:pattern value = "([\L\N\.]([\L\N\.]*)?)" />
> > > > >>
> > > > >>By which I am trying to say, any letter or number combination,
> > > > >>followed by a period is compulsory, followed by any number of
> > > > >>optional letter/number combinations that also end in a period
> > > > >>(periods, hyphens, etc., being excluded from the work name).
> > > > >>
> > > > >
> > > >
> > > > --
> > > > Patrick Durusau
> > > > Director of Research and Development
> > > > Society of Biblical Literature
> > > > pdurusau@emory.edu
> > > >
> > > >
> > >
> > >
>
>