[sword-devel] OSHB 2.1: valid content for the lemma attribute ?

Daniel Owens dcowens76 at gmail.com
Tue Dec 29 04:20:16 EST 2020


Good question. Looking at their readme file, this bit is relevant:

*****
*TOTHT - Tyndale OT Hebrew Tagged text 
<https://github.com/tyndale/STEPBible-Data>*
The Leningrad codex based on Westminster via OpenScriptures, with full 
morphological and semantic tags for all words, prefixes and suffixes. 
Semantic tags use the extended Strongs linked to BDB by OS, is 
backwardly compatible with simple Strongs tags and includes all affixes 
(as defined in TBESH).

*****

If OS = Open Scriptures, then perhaps they got their Strongs linkings 
from us. That is at least plausible.

Daniel

On 12/28/20 8:15 PM, David Haslam wrote:
> David,
>
> How do the Open Scriptures extensions to Strong’s numbers relate (if 
> at all) to the augmented Strong’s numbers documented by Tyndale House, 
> Cambridge and implemented in STEP Bible ?
>
> https://github.com/tyndale/STEPBible-Data
>
> Best regards,
>
> David Haslam
>
> Sent from ProtonMail Mobile
>
>
> On Mon, Dec 28, 2020 at 12:54, pierre amadio <amadio.pierre at gmail.com> wrote:
>> Hello.
>>
>> I received the following feedback from Daniel Owens:
>>
>> #############
>> When creating the OSHB, we ran into the problem that Strong's numbers
>> did not have a place for a number of prefixed lemmas, including the
>> inseparable prepositions and the vav conjunction. So we created some
>> additional Strong's "numbers" to be able to mark up such lemmas. You
>> will notice that in the morph attribute, there are two parsings, "HR"
>> for "Hebrew Preposition" and "HNcfsa" for "Hebrew Noun common feminine
>> singular absolute". The preposition is the prefixed bet (בְּ). I hope
>> that answers your question.
>> #############
>>
>> I understand the logic behind the choice, but it looks to me like this
>> is not behaving as expected with the Sword engine.
>>
>> Hebrew can express in a single word things that require several words
>> in English.
>> In my previous mail mentioning Genesis 1:1, bereshit (in a beginning)
>> is made up of 2 semantic units, be/reshit:
>>
>> Excerpt from morphhb/oxlos-import/wlc.txt (which I think is the "raw"
>> text used to build the module) from
>> https://github.com/openscriptures/morphhb
>> Gen 1:1.1 7225 בְּ/רֵאשִׁ֖ית
>>
>> Here we can see that / is used as a separator (or is it a reverse \ ? 
>> :-) )
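A minimal sketch of how such a line could be split into its parts, assuming the layout seen in the excerpt above (reference, Strong's number, then the word with "/" between its units). The function name `parse_wlc_line` is hypothetical and is not part of morphhb or Sword.

```python
# Sketch only: the field layout is inferred from the single excerpt quoted
# above; the real wlc.txt may contain other line shapes.

def parse_wlc_line(line):
    """Split a wlc.txt-style line into reference, Strong's number and units."""
    parts = line.split()            # tolerate spaces or tabs between fields
    word = parts[-1]                # last field: the slash-separated word
    strong = parts[-2]              # second to last: the Strong's number
    ref = " ".join(parts[:-2])      # everything before: e.g. "Gen 1:1.1"
    return {"ref": ref, "strong": strong, "morphemes": word.split("/")}


entry = parse_wlc_line("Gen 1:1.1 7225 בְּ/רֵאשִׁ֖ית")
print(entry["ref"], entry["strong"], len(entry["morphemes"]))
```

Note that only one Strong's number is present on the line, even though the word has two units; the prefix carries no number of its own in this file.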
>>
>> Let's look at an example with 3 elements, from Genesis 12:1. In "Now
>> the Lord had said unto Abram, Get thee out of thy country",
>> the "out of your country" is a single word: from/earth-land/yours
>> (me-artze-kha).
>> Excerpt from morphhb's wlc.txt:
>> Gen 12:1.7 776 מֵ/אַרְצְ/ךָ֥
>>
>> If I look at this word with
>> diatheke -b OSHB -o avlmn -f OSIS -k Genesis 12:1
>>
>> OSHB 1.4
>> <w lemma="strong:H0776" morph="oshm:HR/Ncbsc/Sp2ms">מֵאַרְצְךָ</w>
>> OSHB 2.1
>> <w lemma="strong:Hm strong:H0776" morph="oshm:HR/Ncbsc/Sp2ms">מֵאַרְצְךָ</w>
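To make the difference between the two outputs easier to inspect, here is a hedged sketch that pulls the lemma keys and morph segments out of a `<w>` element using Python's standard `xml.etree.ElementTree`. The helper name `split_w` is made up for illustration; the splitting rules (space between lemma values, "/" between morph segments, a prefix before ":") are assumptions read off the two samples above.

```python
import xml.etree.ElementTree as ET

# The OSHB 2.1 output quoted above, as a standalone XML fragment.
OSHB_2_1 = (
    '<w lemma="strong:Hm strong:H0776" '
    'morph="oshm:HR/Ncbsc/Sp2ms">מֵאַרְצְךָ</w>'
)


def split_w(xml_fragment):
    """Return (lemma keys, morph segments, word text) for one <w> element."""
    w = ET.fromstring(xml_fragment)
    # Drop the "strong:" style prefix from each space-separated lemma value.
    lemmas = [v.split(":", 1)[-1] for v in w.get("lemma", "").split()]
    # Drop the "oshm:" prefix, then split the morphology code on "/".
    segments = w.get("morph", "").split(":", 1)[-1].split("/")
    return lemmas, segments, w.text


lemmas, segments, text = split_w(OSHB_2_1)
print(lemmas, segments)
```

For the 2.1 sample this gives two lemma keys against three morph segments.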
>>
>> I see several problems:
>>
>> 1) As with bereshit (in a beginning), the Strong's number for the prefix
>> is not a number and does not exist in the Strong's dictionary.
>> This will probably result in unexpected behaviour from frontends
>> trying to show Strong's number definitions.
>>
>> 2) With the Genesis 12:1 example, the resulting XML node mentions only 2
>> elements (from/earth) מֵ/אַרְצְ and omits the "yours".
>> I would have expected 3 entries, מֵ/אַרְצְ/ךָ֥, in order to be consistent
>> with how things are displayed with bereshit.
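Both points can be checked mechanically. The sketch below is not Sword code; `check_word` is a hypothetical helper, and it assumes a "plain" Strong's key is H or G followed by digits only. Under that assumption, any augmented key with a letter suffix would be reported as non-numeric too.

```python
import re

# Assumption: a plain Strong's key is "H" or "G" followed by digits only.
PLAIN_STRONGS = re.compile(r"[HG]\d+")


def check_word(lemma_attr, morph_attr):
    """Report non-numeric lemma keys and the lemma/morph segment counts."""
    keys = [v.split(":", 1)[-1] for v in lemma_attr.split()]
    segments = morph_attr.split(":", 1)[-1].split("/")
    return {
        "non_numeric_keys": [k for k in keys if not PLAIN_STRONGS.fullmatch(k)],
        "lemma_count": len(keys),
        "morph_count": len(segments),
    }


# Attribute values taken from the OSHB 2.1 output for Genesis 12:1.
report = check_word("strong:Hm strong:H0776", "oshm:HR/Ncbsc/Sp2ms")
print(report)
```

A frontend could use a check like the first one to skip dictionary lookups for keys such as "Hm", and the count comparison shows where the lemma attribute has fewer entries than the morph attribute.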
>>
>> I tried to see if this had an effect with diatheke, looking for Strong's
>> entry H0776:
>>
>> OSHB 1.4
>> /usr/local/sword/bin/diatheke -b OSHB -s lucene -r Genesis -k "lemma:H0776"
>> 252 matches
>> /usr/local/sword/bin/diatheke -b OSHB -s attribute -r Genesis -k "Word//Lemma./H0776/"
>> 292 matches
>>
>> OSHB 2.1
>> /usr/local/sword/bin/diatheke -b OSHB -s lucene -r Genesis -k "lemma:H0776"
>> 252 matches
>> /usr/local/sword/bin/diatheke -b OSHB -s attribute -r Genesis -k "Word//Lemma./H0776/"
>> 252 matches
>>
>> It looks like diatheke still finds the entry. What I do not
>> understand is why the attribute search with version 1.4 finds 292
>> matches instead of 252 (which all the other searches seem to agree
>> on).
>> _______________________________________________
>> sword-devel mailing list: sword-devel at crosswire.org
>> http://crosswire.org/mailman/listinfo/sword-devel
>> Instructions to unsubscribe/change your settings at above page

