[osis-user] Re: [sword-devel] Yet another KJV markup question.
Chris Little
chrislit at crosswire.org
Mon Mar 6 13:54:18 MST 2006
DM Smith wrote:
> Troy will probably have to confirm, but it appears that the src
> attribute is being used to indicate the position of the Greek word in a
> particular module. If so, it would be good to be able to construct an
> actual osisRef for it. While cp and s are defined as grain operators it
> would be good to also have wp (i.e. word position), too.
Yeah, src on <w> in CrossWire's KJV refers to the nth word in the same
verse within the Byz module. I don't think this is what this attribute
was intended for, but I don't have a better suggestion of an attribute
on which to hang such a value. Technically, you'd want a token
attribute, which would require either extension or modification of OSIS.
We discussed a word grain, but decided against it. I recall the
ambiguity of "word" as one of the issues. More useful would probably be
a document that indicates osisIDs down to the word level (a word-level
tagged Byz, in other words--which we may already have).
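To make the current src convention concrete, here is a rough Python sketch of how a front end might resolve it. It is not how any CrossWire software does it. The function name resolve_src, the 1-based counting, and the placeholder verse tokens are all assumptions for illustration; the comma-separated form is taken from the src="7,9" example quoted in this thread.

```python
import xml.etree.ElementTree as ET

def resolve_src(w_xml, source_words):
    """Return the source-verse words that a <w> element's src attribute points at.

    Assumes src holds comma-separated, 1-based word positions within the
    same verse of the source text (hypothetical reading of the convention).
    """
    elem = ET.fromstring(w_xml)
    src = elem.get("src", "")
    positions = [int(p) for p in src.split(",") if p.strip()]
    return [source_words[p - 1] for p in positions]

# Placeholder tokens standing in for one verse of the source text.
verse = ["w1", "w2", "w3", "w4", "w5", "w6", "w7", "w8", "w9"]
print(resolve_src('<w src="7,9" lemma="s:1 s:2" morph="r:9 r:10">word</w>', verse))
# prints ['w7', 'w9']
```

Note that the lookup depends entirely on how the source text was split into words, which is the same ambiguity of "word" that came up when a word grain was discussed.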
> The <w> element lets one indicate that two or more words are being
> translated as in
> <w lemma="s:1 s:2">word</w>.
> Likewise, it is possible to have more than one morph as in
> <w morph="r:9 r:10">word</w>
> But in combining these,
> <w lemma="s:1 s:2" morph="r:9 r:10">word</w>
> it is not possible to know for certain, but only by convention that s:1
> maps to r:9 and s:2 maps to r:10.
> Adding src to it makes the mapping even more difficult.
> <w src="7,9" lemma="s:1 s:2" morph="r:9 r:10">word</w>
> It appears that nesting is being used to indicate uniquely the origin.
> <w src="7" lemma="s:2" morph="r:9"><w src="9" lemma="s:1"
> morph="r:10">word</w></w>
>
> Is there any value in such a nesting construct? (Personally, I don't
> think so.)
No. OSIS is already being used in CrossWire's KJV in ways other than
what was intended. The lemma and morph attributes indicate the lemma and
morphology of their content, but the CrossWire KJV uses them to indicate
the lemma and morphology of words in a completely different document.
(We have an English language document with lemmatization and morphology
information for a Greek language document.)
If the English text does not distinguish two Greek language tokens, such
that they must be translated using a single English token (or vice
versa), then there is no reason to disambiguate which lemma is
associated with which morphological information in the translation text.
I think it is simply not relevant information, from the perspective of a
KJV-reader.
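For anyone who does want the pairing anyway, the by-position convention DM describes can be sketched in a few lines of Python. This is only an illustration of that convention, not something OSIS guarantees; pair_by_position is a made-up helper name.

```python
import xml.etree.ElementTree as ET
from itertools import zip_longest

def pair_by_position(w_xml):
    """Pair the nth lemma value with the nth morph value of a <w> element.

    This relies purely on ordering convention. If the two attributes carry
    different numbers of values, the missing side is filled with None.
    """
    elem = ET.fromstring(w_xml)
    lemmas = elem.get("lemma", "").split()
    morphs = elem.get("morph", "").split()
    return list(zip_longest(lemmas, morphs))

print(pair_by_position('<w lemma="s:1 s:2" morph="r:9 r:10">word</w>'))
# prints [('s:1', 'r:9'), ('s:2', 'r:10')]
```

A None in the result marks a case where the convention has nothing to pair, which is about as much as a consumer can detect without a word-level tagged source document.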
--Chris