[osis-core] morph regex error

Chris Little osis-core@bibletechnologieswg.org
Wed, 03 Dec 2003 14:15:40 -0600


Okay, okay.  No need to shout.  Don't kill the messenger.  Etc. :)

The problem with changing the format is that we can no longer use morph, 
lemma, etc. values as osisRefs.  As it stands, any of these attributes 
could double as an osisRef/osisID.  So your lexicon, organized by lemma, 
could have divisions with osisIDs that are the same as their lemma 
values.  Likewise, if you organize the Robinson morphology scheme as a 
sort of lexicon, you can look up entries and tag them with osisIDs that 
are identical to your morph value.
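
For anyone who wants to check values before running them through MSV, 
here is a rough Python sketch of the check and of the '_' transcoding 
mentioned in the quoted thread below.  It is an approximation, not the 
schema itself: Python's \w stands in for (\p{L}|\p{N}|_), and the names 
MORPH_PATTERN, is_valid_morph and transcode_morph are made up for this 
example.

```python
import re

# Approximation of the schema pattern quoted in the MSV error below.
# Assumption: Python's \w (Unicode letters, digits, '_') is close enough
# to (\p{L}|\p{N}|_) for illustration purposes.
MORPH_PATTERN = re.compile(r"\w+(\.\w)*:\w+(\.\w+)*")


def is_valid_morph(value: str) -> bool:
    """Return True if the whole value matches the approximated pattern."""
    # XML Schema patterns are implicitly anchored, hence fullmatch.
    return MORPH_PATTERN.fullmatch(value) is not None


def transcode_morph(value: str) -> str:
    """Replace disallowed characters after the 'work:' prefix with '_'."""
    prefix, sep, body = value.partition(":")
    # Without a ':' the body is empty and the value comes back unchanged.
    return prefix + sep + re.sub(r"[^\w.]", "_", body)


if __name__ == "__main__":
    original = "robinsons:V-PAM-2P"
    fixed = transcode_morph(original)
    print(original, is_valid_morph(original))
    print(fixed, is_valid_morph(fixed))
```

Note that the transcoding is one-way: once a hyphen has become '_', you 
cannot tell it apart from an underscore that was there to begin with.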

--Chris

Troy A. Griffitts wrote:

> NO!
> 
> 
> Chris Little wrote:
> 
>> Troy A. Griffitts wrote:
>>
>>> Hey guys.  It seems we may have messed up the regex on the morph 
>>> attribute of <w>.
>>>
>>> Here's my line:
>>>
>>> <w xml:lang="grc" lemma="strongs:15" morph="robinsons:V-PAM-2P" 
>>> xlit="la:agaqopoieite">GREEK UTF8 TEXT HERE</w>
>>>
>>> Here's the MSV error output:
>>>
>>> Error at line:279, column:117 of 
>>> file:///space/home/scribe/msv/./lexcounts
>>>   attribute "morph" has a bad value: the value does not match the 
>>> regular expression 
>>> "((((\p{L}|\p{N}|_)+)(\.(\p{L}|\p{N}|_))*:)((((\p{L})|(\p{N})|_)+)(((\.(\p{L}|\p{N}|_)+)*))?))". 
>>
>> The value you give has never been valid.  Hyphens have never been 
>> allowed in morph or lemma attributes (nor have spaces and various 
>> other characters).  I think the decision we made before releasing 2.0 
>> was to force folks to transcode these as '_'.
>>
>> Does that work for you?
>>
>> --Chris