[sword-devel] OSHB 2.1: valid content for the lemma attribute ?

pierre amadio amadio.pierre at gmail.com
Mon Dec 28 07:54:01 EST 2020


Hello.

I received the following feedback from Daniel Owens:

#############
When creating the OSHB, we ran in to the problem that Strong's numbers
did not have a place for a number of prefixed lemma, including the
inseparable prepositions and the vav conjunction. So we created some
additional Strong's "numbers" to be able to mark up such lemma. You
will notice that in the morph attribute, there are two parsings, "HR"
for "Hebrew Preposition" and "HNcfsa" for "Hebrew Noun common feminine
singular absolute". The preposition is the prefixed bet (בְּ). I hope
that answers your question.
#############

I understand the logic behind the choice, but it looks to me as if this
is not behaving as expected with the Sword engine.

Hebrew can express in a single word things that require several words
in English.
In my previous mail I mentioned Genesis 1:1, where bereshit (in a beginning)
is made out of 2 semantic units, be/reshit:

Excerpt from morphhb/oxlos-import/wlc.txt (which I think is the "raw"
text used to build the module) from
https://github.com/openscriptures/morphhb
Gen 1:1.1    7225    בְּ/רֵאשִׁ֖ית

Here we can see that / is used as a separator (or is it a reverse \  ? :-) )

Let's look at an example with 3 elements, from Genesis 12:1. In "Now
the Lord had said unto Abram, Get thee out of thy country",
the "out of your country" is a single word: from/earth-land/yours, me-artze-kha.
Excerpt from morphhb's wlc.txt:
Gen 12:1.7    776    מֵ/אַרְצְ/ךָ֥
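
To make the "/" convention concrete, here is a minimal sketch that splits one such line into its parts. It assumes the columns are whitespace-separated exactly as in the excerpt above (book, chapter:verse.position, Strong's number, segmented word); the function name parse_wlc_line is only illustrative and is not part of morphhb.

```python
def parse_wlc_line(line):
    """Split one wlc.txt-style line into its fields (assumed layout)."""
    # Assumption: four whitespace-separated tokens, as in the excerpt.
    book, ref, strong, word = line.split()
    # "12:1.7" is taken to mean chapter:verse, then word position in the verse.
    verse, position = ref.split(".")
    return {
        "book": book,
        "verse": verse,
        "position": int(position),
        "strong": strong,
        # "/" separates the semantic units inside a single written word.
        "segments": word.split("/"),
    }


entry = parse_wlc_line("Gen 12:1.7    776    מֵ/אַרְצְ/ךָ֥")
print(entry["strong"], len(entry["segments"]))  # one Strong's number, three segments
```

Note that the line carries a single Strong's number for three segments, so the prefix and the suffix have no number of their own in this file.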

If I look at this word with
diatheke  -b OSHB -o avlmn -f OSIS -k Genesis 12:1

OSHB 1.4
<w lemma="strong:H0776" morph="oshm:HR/Ncbsc/Sp2ms">מֵאַרְצְךָ</w>
OSHB 2.1
<w lemma="strong:Hm strong:H0776" morph="oshm:HR/Ncbsc/Sp2ms">מֵאַרְצְךָ</w>

I see several problems:

1) As with bereshit (in a beginning), the Strong's number for the prefix
is not a number and does not exist in the Strong's dictionary.
This will probably result in unexpected behaviour from frontends
trying to show Strong's number definitions.

2) With the Genesis 12:1 example, the lemma attribute of the resulting
XML node mentions only 2 elements (from/earth) מֵ/אַרְצְ and omits the "yours".
I would have expected 3 entries, מֵ/אַרְצְ/ךָ֥, in order to be consistent
with how things are displayed with bereshit.
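
Both points can be checked mechanically on the two nodes quoted above. The following is only a sketch: it assumes that a "regular" Strong's key in the lemma attribute looks like strong:H followed by digits, and that the morph attribute separates its parsings with "/" after the oshm: prefix. The helper name inspect_word is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: a numeric Hebrew Strong's key has the form "strong:H0776".
NUMERIC_STRONG = re.compile(r"^strong:H\d+$")


def inspect_word(w_xml):
    """Report lemma entries, morph parsings, and where they disagree."""
    w = ET.fromstring(w_xml)
    # Lemma entries are space-separated inside the attribute.
    lemmas = w.get("lemma", "").split()
    morph = w.get("morph", "")
    # Drop the "oshm:" prefix, then split the parsings on "/".
    codes = morph.split(":", 1)[-1].split("/") if morph else []
    return {
        "lemmas": lemmas,
        "morph_codes": codes,
        # Entries a Strong's dictionary lookup would presumably not find.
        "non_numeric": [l for l in lemmas if not NUMERIC_STRONG.match(l)],
        "counts_match": len(lemmas) == len(codes),
    }


oshb_14 = '<w lemma="strong:H0776" morph="oshm:HR/Ncbsc/Sp2ms">מֵאַרְצְךָ</w>'
oshb_21 = '<w lemma="strong:Hm strong:H0776" morph="oshm:HR/Ncbsc/Sp2ms">מֵאַרְצְךָ</w>'

for label, node in (("1.4", oshb_14), ("2.1", oshb_21)):
    info = inspect_word(node)
    print(label, info["non_numeric"], len(info["lemmas"]), len(info["morph_codes"]))
```

For the 2.1 node this reports strong:Hm as the only non-numeric entry, and 2 lemma entries against 3 morph parsings.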

I tried to see if this had an effect with diatheke, searching for
Strong's entry H0776:

OSHB 1.4
/usr/local/sword/bin/diatheke -b OSHB -s lucene -r Genesis -k "lemma:H0776"
252 matches
/usr/local/sword/bin/diatheke -b OSHB -s attribute -r Genesis -k "Word//Lemma./H0776/"
292 matches

OSHB 2.1
/usr/local/sword/bin/diatheke -b OSHB -s lucene -r Genesis -k "lemma:H0776"
252 matches
/usr/local/sword/bin/diatheke -b OSHB -s attribute -r Genesis -k "Word//Lemma./H0776/"
252 matches

It looks like diatheke still finds the entry. What I do not
understand is why the attribute search with version 1.4 finds 292
matches instead of 252 (which all the other searches seem to agree
on).
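
One way to narrow this down would be to compare the two result lists directly. The sketch below assumes the verse references returned by each search have already been collected into Python lists (parsing diatheke's output is not shown here), and it only helps if the extra matches correspond to distinct references. The function name and the sample references are placeholders, not real search results.

```python
def only_in_first(first_refs, second_refs):
    """Return the references present in the first result list only."""
    # Set difference ignores order and repeated references.
    return sorted(set(first_refs) - set(second_refs))


# Placeholder data, standing in for the attribute and lucene results.
attribute_refs = ["ref-a", "ref-b", "ref-c"]
lucene_refs = ["ref-a", "ref-c"]
print(only_in_first(attribute_refs, lucene_refs))
```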

