[osis-user] Re: [sword-devel] Yet another KJV markup question.

Mon Mar 6 13:54:18 MST 2006

DM Smith wrote:
> Troy will probably have to confirm, but it appears that the src 
> attribute is being used to indicate the position of the Greek word in a 
> particular module. If so, it would be good to be able to construct an 
> actual osisRef for it. While cp and s are defined as grain operators it 
> would be good to also have wp (i.e. word position), too.

Yeah, src on <w> in CrossWire's KJV refers to the nth word in the same 
verse within the Byz module. I don't think this is what this attribute 
was intended for, but I don't have a better suggestion of an attribute 
on which to hang such a value. Technically, you'd want a token 
attribute, which would require either extension or modification of OSIS.
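
To make the src convention concrete, here is a rough Python sketch of how 
a reader might resolve it. The token list, the function name resolve_src, 
the 1-based counting, and the comma/space separator handling are all 
assumptions for illustration; none of them comes from the SWORD or OSIS 
tooling.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for one verse of the Byz module, already split
# into words. Placeholder tokens only; this is not real Byz text.
byz_verse = ["tok1", "tok2", "tok3", "tok4", "tok5",
             "tok6", "tok7", "tok8", "tok9"]


def resolve_src(w_xml, verse_words):
    """Return the source-verse words that a <w> element's src points at."""
    w = ET.fromstring(w_xml)
    src = w.get("src", "")
    # Assumption: positions are 1-based and separated by commas or spaces.
    positions = [int(p) for p in src.replace(",", " ").split()]
    return [verse_words[p - 1] for p in positions]


resolved = resolve_src(
    '<w src="7,9" lemma="s:1 s:2" morph="r:9 r:10">word</w>', byz_verse
)
```

With the placeholder verse above, resolved holds the seventh and ninth 
tokens.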

We discussed a word grain, but decided against it. I recall the 
ambiguity of "word" as one of the issues. More useful would probably be 
a document that indicates osisIDs down to the word level (a word-level 
tagged Byz, in other words--which we may already have).

> The <w> element lets one indicate that two or more words are being 
> translated as in
> <w lemma="s:1 s:2">word</w>.
> Likewise, it is possible to have more than one morph as in
> <w morph="r:9 r:10">word</w>
> But in combining these,
> <w lemma="s:1 s:2" morph="r:9 r:10">word</w>
> it is not possible to know for certain, but only by convention that s:1 
> maps to r:9 and s:2 maps to r:10.
> Adding src to it makes the mapping even more difficult.
> <w src="7,9" lemma="s:1 s:2" morph="r:9 r:10">word</w>
> It appears that nesting is being used to indicate uniquely the origin.
> <w src="7" lemma="s:2" morph="r:9"><w src="9" lemma="s:1" 
> morph="r:10">word</w></w>
> 
> Is there any value in such a nesting construct? (Personally, I don't 
> think so.)

No. OSIS is already being used in CrossWire's KJV in ways other than 
what was intended. The lemma and morph attributes indicate the lemma and 
morphology of their content, but the CrossWire KJV uses them to indicate 
the lemma and morphology of words in a completely different document. 
(We have an English language document with lemmatization and morphology 
information for a Greek language document.)

If the English text does not distinguish two Greek language tokens, such 
that they must be translated using a single English token (or vice 
versa), then there is no reason to disambiguate which lemma is 
associated with which morphological information in the translation text. 
I think it is simply not relevant information, from the perspective of a 
KJV-reader.
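
For a tool that does want the pairing anyway, the by-convention mapping 
described in the quoted text (first lemma with first morph, and so on) 
could be sketched roughly as below. The function name pair_by_position is 
hypothetical, and the positional rule is only the convention mentioned 
above, not something OSIS guarantees.

```python
import xml.etree.ElementTree as ET


def pair_by_position(w_xml):
    """Pair lemma and morph values of a <w> element by list position."""
    w = ET.fromstring(w_xml)
    lemmas = w.get("lemma", "").split()
    morphs = w.get("morph", "").split()
    # The convention only makes sense when both lists have equal length.
    if len(lemmas) != len(morphs):
        raise ValueError("lemma and morph counts differ; cannot pair")
    return list(zip(lemmas, morphs))


pairs = pair_by_position('<w lemma="s:1 s:2" morph="r:9 r:10">word</w>')
```

Here pairs associates s:1 with r:9 and s:2 with r:10, which is exactly 
the assumption the markup itself cannot confirm.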

--Chris