[osis-core] morph regex error
Chris Little
osis-core@bibletechnologieswg.org
Mon, 08 Dec 2003 02:50:35 -0600
Troy A. Griffitts wrote:
> I think you are sorely incorrect about historical facts, but in regard
> to the current schema:
>
> You may argue that it SHOULD conform to osisIDRegex, but NOW is not the
> time to argue that.
>
> The PROBLEM I have is not this:
>
> <xs:attribute name="morph" type="osisIDType" use="optional"/>
>
> it's NOT defined that way in the schema. If that's what you want, we
> can talk/debate about changing it to the above at our next meeting.
>
> The PROBLEM is that being defined correctly, like this:
>
> <xs:attribute name="morph" type="osisGenType" use="optional"/>
>
> (which is how it IS defined in the official schema)
> osisGenType (osisGenRegex) SHOULD NOT BE RESTRICTED TO THE SAME THING AS
> osisIDType (osisIDRegex) or we wouldn't have 2 types.
>
>
> Does that make sense?
>
> -Troy.
>
>
> PS. Even if changing it to osisIDType were being proposed (which I think
> you've done), I still believe that a serious flaw exists in this proposal:
Really, that's not at all what I've "proposed" or would propose. I
simply believe that the morph attribute should be a value that can serve
as an osisID (or osisRef, for that matter). That means it needs to be
equal to, or a subset of, the osisID regex. It is such a regex
currently, and I think it works well.
> There has to be a way programmatically to restore the encoding WITHOUT
> the software knowing anything about the morph scheme or else we've
> forced enumeration of the known morph schemes in software implementation.
>
> e.g. You can't change 'N-[G]@5' to 'N__G_5'. If we ever decide to force
> morph to conform to osisIDType, then we MUST provide a programmatic way
> to restore the original morph code, e.g. 'N%2D%5BG%5D%405', which I think
> still looks horrible and is not acceptable to me, but at least would
> allow me to remove the ambiguity and programmatically reconstruct the
> original code.
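
To make the quoted point concrete, here is a minimal Python sketch of the two encodings under discussion. It is illustrative only: the function names are hypothetical, it is not part of the OSIS schema or any OSIS tool, and it assumes that only ASCII letters and digits may pass through unescaped (the real osisIDRegex may allow more). Replacing every other character with an underscore lets distinct tags collide, while percent-encoding can be reversed without knowing the morph scheme.

```python
import re
from urllib.parse import unquote


def to_underscores(tag: str) -> str:
    """Lossy mapping: every character that is not an ASCII letter or digit becomes '_'."""
    return re.sub(r"[^A-Za-z0-9]", "_", tag)


def to_percent(tag: str) -> str:
    """Reversible mapping: percent-encode every character that is not an ASCII letter or digit.

    Assumption: letters and digits are the only characters left unescaped.
    """
    out = []
    for ch in tag:
        if ch.isascii() and ch.isalnum():
            out.append(ch)
        else:
            # Encode each UTF-8 byte of the character as %XX.
            out.extend(f"%{b:02X}" for b in ch.encode("utf-8"))
    return "".join(out)


def from_percent(encoded: str) -> str:
    """Restore the original tag from its percent-encoded form."""
    return unquote(encoded)


if __name__ == "__main__":
    a, b = "N-[G]@5", "N [G] 5"
    # Two different tags collapse to the same underscore form.
    print(to_underscores(a) == to_underscores(b))
    # The percent-encoded form round-trips to the original tag.
    print(to_percent(a), from_percent(to_percent(a)) == a)
```

Whether this loss of information matters depends on whether any real tagging scheme gives meaning to the punctuation, which is the question taken up in the reply below.
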
Does "N-[G]@5" actually exist as a morphological tag in some system or
are you just making up examples that would be difficult to encode? I
just looked through about 5 systems for morphological tagging in
BibleWorks (and know of 3 others used elsewhere for biblical languages)
and none of them require anything other than space or hyphen. In
linguistics, you might find a period used in a morphological tag (but in
those cases, you would never find a hyphen or a space--and finding a
period would itself be rare).
In no system that I'm aware of is any semantic content held by a
character other than letters & numbers. For that reason, I consider it
truly irrelevant how these are rendered. Internally, I believe they
should be represented by underscores. How they are rendered is a matter
of preference for stylesheet designers to determine.
--Chris