<!DOCTYPE html>

<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <br>

    <br>

    <div class="moz-cite-prefix">On 02/10/2023 at 09:38, Timothy Allen
      wrote:<br>

    </div>

    <blockquote type="cite"

      cite="mid:b0203b48-00b6-46af-be8b-971810123181@gmail.com">

      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

      <p>Ah, thanks. I did look at that page when I started making my

        module, but I'd forgotten about it by the time I needed this

        more detailed advice. Thanks for reminding me! Using this to

        update the guesses from my original message:</p>

      <dl>

        <dt>gloss</dt>

        <dd>I *might* be able to try grabbing the first word from the

          BDB/Thayer gloss, but that seems error-prone and I probably

          won't bother unless somebody really wants it</dd>

        <dt>lemma</dt>

        <dd>This should be used for Strong's numbers, marked up as

          "strong:G123" or "strong:H123", but could also be used for

          storing the original source text as "lemma.BSB:בְּרֵאשִׁ֖ית"

          if we assume a hypothetical lexicon that indexes all the words

          in the BSB.</dd>

        <dt>morph</dt>

        <dd>This should be used for Robinson morphology codes, so I

          should not bother with this until I can figure out how to

          translate the BSB's codes to Robinson ones. The wiki page also

          has "strongMorph" codes in its examples, but I can't find any

          extra information on what system this might refer to.

          Apparently there aren't any Hebrew morphology lexicons

          available for SWORD; maybe someday I could make one?</dd>

      </dl>

    </blockquote>

    <br>

    For Hebrew, we have the OSHM module.<br>
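    <p>A minimal sketch of the lemma markup discussed above, assuming
      the "strong:G123" / "strong:H123" form and the space-delimited
      convention you guessed at from the KJVA module. The helper name
      and its arguments are hypothetical, not part of any SWORD tool,
      and treating Aramaic as "H" is an assumption:</p>
    <pre>
```python
# Hypothetical helper: builds the value of the OSIS "lemma" attribute
# from the spreadsheet's "language" and "strongs_number" columns.

# Assumption: Aramaic words share the Hebrew ("H") Strong's numbering.
PREFIX_BY_LANGUAGE = {"Hebrew": "H", "Aramaic": "H", "Greek": "G"}


def strongs_lemma(language, strongs_numbers):
    """Return e.g. "strong:H7225", space-delimited for several words."""
    prefix = PREFIX_BY_LANGUAGE[language]
    return " ".join(f"strong:{prefix}{int(n)}" for n in strongs_numbers)


print(strongs_lemma("Hebrew", [7225]))
print(strongs_lemma("Greek", [1722, 746]))
```
    </pre>
    <p>When several source words map to one English span, passing all
      of their numbers in one call gives a single space-delimited
      attribute value.</p>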

    <blockquote type="cite"

      cite="mid:b0203b48-00b6-46af-be8b-971810123181@gmail.com">

      <dl>

        <dt>POS</dt>

        <dd>Still unclear to me, it's not mentioned on the wiki page</dd>

        <dt>src</dt>

        <dd>Apparently this is for word order in the source language,

          but it's not at all clear where "word 1" is. The start of the

          &lt;w&gt; element? The start of the verse? The start of the

          chapter? The start of the book? The start of the Bible? Does

          it not matter, because front-ends are intended to just sort

          the words they have?</dd>

        <dt>xlit</dt>

        <dd>Still for the transliteration, simply enough.</dd>

      </dl>

      <p>According to the wiki page, there's also an "n" attribute not

        mentioned in the official OSIS docs, which is for "marking

        enumerated words". I don't know what this means, and the wiki

        page doesn't include any examples. I'm going to guess I don't

        need it.<br>

      </p>

      <p><br>

      </p>

      <p>Do I have all that right? Is there anything I've misunderstood?<br>

      </p>

      <p>Also, would it be better to have "lemma.BSB:בְּרֵאשִׁ֖ית" and

        use the same "BSB" lexicon for every word in the entire text, or

        would it be more appropriate to use "lemma.WLC:בְּרֵאשִׁ֖ית" and

        use different lexicons to indicate the different sources used

        for the translation (Nestle1904, TR, NA, SBL, etc.)?</p>

      <p><br>

      </p>

      <p>Timothy</p>

      <p><br>

      </p>

      <div class="moz-cite-prefix">On 30/9/23 20:00, David Haslam wrote:<br>

      </div>

      <blockquote type="cite"

cite="mid:24HvfbzJn_xaJ3bn8r6wcBqqeTO8l2DTbD4l2kINtY2w5Nc6vRy88IsBJRUxge98NNIv0ZIclSi0X7Ly6KckvIRAtzG74d_6VxOshq31RHw=@protonmail.com">

        <meta http-equiv="content-type"

          content="text/html; charset=UTF-8">

        <div dir="auto">Hi Timothy,</div>

        <div dir="auto"><br>

        </div>

        <div dir="auto">Please consult the developers’ wiki</div>

        <div dir="auto"><br>

        </div>

        <div dir="auto"><a class="moz-txt-link-freetext"

            href="https://wiki.crosswire.org/" moz-do-not-send="true">https://wiki.crosswire.org/</a></div>

        <div dir="auto"><br>

        </div>

        <div dir="auto">And consult the page about OSIS Bibles. </div>

        <div dir="auto"><br>

        </div>

        <div dir="auto">David</div>

        <div><br>

        </div>


        <div><br>

        </div>

        <div><br>

        </div>

        On Sat, Sep 30, 2023 at 10:54, Timothy Allen &lt;<a
          href="mailto:thristian@gmail.com" moz-do-not-send="true">thristian@gmail.com</a>&gt; wrote:

        <blockquote type="cite" class="protonmail_quote">

          <p>The Berean Standard Bible is available in two

            machine-readable formats: USFM, and "translation tables", a

            40MB Excel spreadsheet with a row for every Hebrew or Greek

            word in their chosen source texts with the English text it's

            translated to. I would like to make one module with the nice

            formatting of the USFM sources and the metadata from the

            spreadsheet, so I've spent the last few weeks writing a

            script that runs through them both in parallel and makes

            sure everything lines up, so I'm now confident that I have

            an accurate mapping between them.</p>

          <p>My question now is, how can I translate the data from the

            spreadsheet into OSIS?</p>

          <p>Here's the information the spreadsheet gives me:</p>

          <table width="100%" cellspacing="2" cellpadding="2" border="1">

            <tbody>

              <tr>

                <th valign="top">Column<br>

                </th>

                <th valign="top">Example<br>

                </th>

                <th valign="top">Notes<br>

                </th>

              </tr>

              <tr>

                <td valign="top">he_ordinal<br>

                </td>

                <td valign="top">1<br>

                </td>

                <td valign="top">"Hebrew Ordinal", increments for each

                  spreadsheet row in the Old Testament, set to 999999

                  for each row in the New Testament<br>

                </td>

              </tr>

              <tr>

                <td valign="top">el_ordinal<br>

                </td>

                <td valign="top">0<br>

                </td>

                <td valign="top">"Greek Ordinal", set to 0 for each row

                  in the Old Testament, increments for each row in the

                  New Testament, except for Mark 1:1 which has a word

                  with the number 18379.5 (presumably something needed

                  to be inserted and they didn't want to renumber

                  everything else)<br>

                </td>

              </tr>

              <tr>

                <td valign="top">en_ordinal<br>

                </td>

                <td valign="top">1<br>

                </td>

                <td valign="top">"English Ordinal", increments for each

                  spreadsheet row (except for that word in Mark 1:1)<br>

                </td>

              </tr>

              <tr>

                <td valign="top">language<br>

                </td>

                <td valign="top">Hebrew<br>

                </td>

                <td valign="top">"Hebrew", "Greek", or sometimes

                  "Aramaic"<br>

                </td>

              </tr>

              <tr>

                <td valign="top">verse_ordinal<br>

                </td>

                <td valign="top">1<br>

                </td>

                <td valign="top">Increments for each verse in the Bible,

                  so every word in Genesis 1:1 has "1", etc.<br>

                </td>

              </tr>

              <tr>

                <td valign="top">source_word<br>

                </td>

                <td valign="top">בְּרֵאשִׁ֖ית<br>

                </td>

                <td valign="top">The word in the original source text.

                  Sometimes includes fancy brackets to mark sources

                  other than WLC or Nestle 1904: {TR} ⧼RP⧽ (WH) 〈NE〉

                  [NA] ‹SBL› [[ECM]]<br>

                </td>

              </tr>

              <tr>

                <td valign="top">transliteration<br>

                </td>

                <td valign="top">bə·rê·šîṯ<br>

                </td>

                <td valign="top">A transliteration of the source word

                  into the Latin alphabet<br>

                </td>

              </tr>

              <tr>

                <td valign="top">grammar_code<br>

                </td>

                <td valign="top">Prep-b | N-fs<br>

                </td>

                <td valign="top">A code describing the grammatical form

                  of the word; these don't appear to be Robinson codes,

                  but their own custom thing for Hebrew (<a

                    href="https://biblehub.com/hebrewparse.htm"

                    class="moz-txt-link-freetext" moz-do-not-send="true">https://biblehub.com/hebrewparse.htm</a>)

                  and Greek (<a href="https://biblehub.com/abbrev.htm"

                    class="moz-txt-link-freetext" moz-do-not-send="true">https://biblehub.com/abbrev.htm</a>)<br>

                </td>

              </tr>

              <tr>

                <td valign="top">grammar_description<br>

                </td>

                <td valign="top">Preposition-b | Noun - feminine

                  singular<br>

                </td>

                <td valign="top">The grammar code, unabbreviated<br>

                </td>

              </tr>

              <tr>

                <td valign="top">strongs_number<br>

                </td>

                <td valign="top">7225<br>

                </td>

                <td valign="top">The Strong's number of the basic form of

                  this word<br>

                </td>

              </tr>

              <tr>

                <td valign="top">translation<br>

                </td>

                <td valign="top">In the beginning<br>

                </td>

                <td valign="top">The English text that appears in the

                  BSB<br>

                </td>

              </tr>

              <tr>

                <td valign="top">gloss<br>

                </td>

                <td valign="top">1) first, beginning, best, chief<br>

                  1a) beginning<br>

                  1b) first<br>

                  1c) chief<br>

                  1d) choice part<br>

                </td>

                <td valign="top">A definition from the

                  Brown-Driver-Briggs Hebrew Lexicon, or Thayer's Greek

                  Definitions, as appropriate<br>

                </td>

              </tr>

            </tbody>

          </table>

          <p>Looking at the OSIS 2.1.1 User's Manual (and sniffing

            around in the KJVA module), to represent this information in

            OSIS I should use the &lt;w&gt; element, which supports the

            following attributes (copy/pasted from the Manual):</p>

          <ul>

            <li><b>gloss</b> Record comments on a particular word or its

              usage.</li>

            <li><b>lemma</b> Use to record the base form of a word.</li>

            <li><b>morph</b> Use to record grammatical information for a

              word.</li>

            <li><b>POS</b> Use to record the function of a word

              according to a particular view of the language's syntax.</li>

            <li><b>src</b> Use to record origin of the word.</li>

            <li><b>xlit</b> Use to record a transliteration of a word.</li>

          </ul>

          <p>The first problem is that sometimes multiple source words

            are translated into a single English span, and it's not made

            clear how to express that in these attributes. From poking

            around in the KJVA module, I get the impression these are

            supposed to be space-delimited lists. Is that correct?</p>

          <p>Assuming that's the case, here are my guesses at how to fill
            out these attributes for each span:</p>

          <ul>

            <li><b>gloss</b> can't be done, because each gloss contains

              spaces which means the displaying app can't figure out

              which part of the gloss goes with which word</li>

            <li><b>lemma</b> is where Strong's numbers go; Greek Strong's

              numbers should be prefixed with "G" and Hebrew/Aramaic

              ones with "H0"</li>

            <li><b>morph</b> might be used for the "grammar code"

              content, but I would probably need to figure out how to

              translate them into Robinson codes first, since that seems

              to be the only morphological dictionary module in the

              Crosswire repositories</li>

            <li><b>POS</b> is unclear to me, I don't see how it differs

              from the "morph" attribute</li>

            <li><b>src</b> is also unclear: is this for the word order

              (he_ordinal or el_ordinal, possibly numbered from the

              beginning of the verse rather than the beginning of the

              entire Bible) or the actual choice of source text

              (Nestle1904, TR, NA, SBL, etc.)?</li>

            <li><b>xlit</b> clearly comes from the "transliteration"

              field</li>

          </ul>

          <p>One thing that's clearly missing is where to put the source

            word. How does that work?<br>

          </p>

          <p>Is there another way to represent information that doesn't
            fit into the &lt;w&gt; element? I'd like this module to be

            as useful as possible, so I'm hesitant to toss out any

            information that can be usefully represented.</p>

          <p>Is there anything else I've missed or misunderstood?</p>

          <p><br>

          </p>

          <p>Timothy.<br>

          </p>

        </blockquote>

        <br>

        <fieldset class="moz-mime-attachment-header"></fieldset>

        <pre class="moz-quote-pre" wrap="">_______________________________________________

sword-devel mailing list: <a

        class="moz-txt-link-abbreviated moz-txt-link-freetext"

        href="mailto:sword-devel@crosswire.org" moz-do-not-send="true">sword-devel@crosswire.org</a>

<a class="moz-txt-link-freetext"

        href="http://crosswire.org/mailman/listinfo/sword-devel"

        moz-do-not-send="true">http://crosswire.org/mailman/listinfo/sword-devel</a>

Instructions to unsubscribe/change your settings at above page

</pre>

      </blockquote>

      <br>


    </blockquote>

    <br>
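    <p>To tie the spreadsheet columns to the w element, here is a rough
      sketch using Python's standard xml.etree.ElementTree module. It
      only covers lemma and xlit; the row dictionary, the function name,
      and storing the transliteration unchanged in xlit are all
      assumptions for illustration, not something the OSIS manual or the
      wiki prescribes:</p>
    <pre>
```python
import xml.etree.ElementTree as ET

# Hypothetical row, using the column names and example values from the
# spreadsheet description in the original message.
row = {
    "language": "Hebrew",
    "strongs_number": 7225,
    "transliteration": "bə·rê·šîṯ",
    "translation": "In the beginning",
}


def row_to_w(row):
    """Build one OSIS w element from one spreadsheet row (a sketch)."""
    # Assumption: anything that is not Greek (Hebrew, Aramaic) uses "H".
    prefix = "G" if row["language"] == "Greek" else "H"
    w = ET.Element("w")
    w.set("lemma", f"strong:{prefix}{row['strongs_number']}")
    # Assumption: the transliteration is stored as-is in xlit.
    w.set("xlit", row["transliteration"])
    w.text = row["translation"]
    return w


# Serialise as a Unicode string so the transliteration stays readable.
print(ET.tostring(row_to_w(row), encoding="unicode"))
```
    </pre>
    <p>Building the element through ElementTree rather than by string
      concatenation leaves the escaping of quotes and special characters
      in attribute values to the library.</p>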

  </body>

</html>