[sword-devel] OSIS milestone markers

Greg Hellings greg.hellings at gmail.com
Sun Mar 15 13:20:36 EDT 2020


Here is the first example of a cQuote I can find in the NASB (the character
you indicated doesn't appear in the NASB output I can locate, but this one
does):

$ diatheke -b NASB -k Gen.3.4-Gen.3.5
Genesis 3:4: <w savlm="strong:H5175">The serpent</w> <w
savlm="strong:H559">said</w> <w savlm="strong:H802">to the woman</w>, “<w
savlm="strong:H4191">You surely</w> <w savlm="strong:H4191">will not
die</w>!
Genesis 3:5: <milestone marker="“" type="cQuote"/><w
savlm="strong:H430">For God</w> <w savlm="strong:H3045">knows</w> <w
savlm="strong:H3117">that in the day</w> <w savlm="strong:H398">you eat</w>
<w savlm="strong:H5869">from it your eyes</w> <w savlm="strong:H6491b">will
be opened</w>, and <w savlm="strong:H430">you will be like God</w>, <w
savlm="strong:H3045">knowing</w> <w savlm="strong:H2896b">good</w> <w
savlm="strong:H7451b">and evil</w>.”
(NASB)

Asking for the OSIS filter suppresses that milestone entirely:
$ diatheke -b NASB -f OSIS -k Gen.3.4-Gen.3.5
Genesis 3:4: <w>The serpent</w> <w>said</w> <w>to the woman</w>, “<w>You
surely</w> <w>will not die</w>!<milestone type="line"/>
Genesis 3:5: <w>For God</w> <w>knows</w> <w>that in the day</w> <w>you
eat</w> <w>from it your eyes</w> <w>will be opened</w>, and <w>you will be
like God</w>, <w>knowing</w> <w>good</w> <w>and evil</w>.”<milestone
type="line"/>
(NASB)

Enabling all available option filters (-o) still doesn't give back that raw
output:
$ diatheke -b NASB -f OSIS -o nfmhcvaplsrbwgeixtM -k Gen.3.4-Gen.3.5
Genesis 3:4: <note n="A" osisID="Gen.٣.٤.xref.A"
type="crossReference"></note><w lemma="strong:H٥١٧٥">The serpent</w> <w
lemma="strong:H٥٥٩">said</w> <w lemma="strong:H٨٠٢">to the woman</w>, “<w
lemma="strong:H٤١٩١">You surely</w> <w lemma="strong:H٤١٩١">will not
die</w>!<milestone type="line"/>
Genesis 3:5: <w lemma="strong:H٤٣٠">For God</w> <w
lemma="strong:H٣٠٤٥">knows</w> <w lemma="strong:H٣١١٧">that in the day</w>
<w lemma="strong:H٣٩٨">you eat</w> <w lemma="strong:H٥٨٦٩">from it your
eyes</w> <w lemma="strong:H٦٤٩١b">will be opened</w>, and <note n="A"
osisID="Gen.٣.٥.xref.A" type="crossReference"></note><w
lemma="strong:H٤٣٠">you will be like God</w>, <w
lemma="strong:H٣٠٤٥">knowing</w> <w lemma="strong:H٢٨٩٦b">good</w> <w
lemma="strong:H٧٤٥١b">and evil</w>.”<milestone type="line"/>
(NASB)

Switching to plain text we get a more interesting result: a line break
where the cQuote milestone sits in the OSIS input.
$ diatheke -b NASB -f plain -k Gen.3.4-Gen.3.5
Genesis 3:4: The serpent said to the woman, “You surely will not die!
Genesis 3:5:
For God knows that in the day you eat from it your eyes will be opened, and
you will be like God, knowing good and evil.”
(NASB)

The HTML filter simply passes the OSIS milestone element through untouched
(this seems like a bug to me, as I'm unaware of any <milestone> elements in
HTML):
$ diatheke -b NASB -f HTML -k Gen.3.4-Gen.3.5
<html><head><meta http-equiv="content-type" content="text/html"
charset="UTF-8" lang="en" xml:lang="en"/>
<style type="text/css"></style></head><body>Genesis 3:4: <span
style="font:Gentium;" ><w savlm="strong:H5175">The serpent</w> <w
savlm="strong:H559">said</w> <w savlm="strong:H802">to the woman</w>, “<w
savlm="strong:H4191">You surely</w> <w savlm="strong:H4191">will not
die</w>!</span><br />
Genesis 3:5: <span style="font:Gentium;" ><milestone marker="“"
type="cQuote"/><w savlm="strong:H430">For God</w> <w
savlm="strong:H3045">knows</w> <w savlm="strong:H3117">that in the day</w>
<w savlm="strong:H398">you eat</w> <w savlm="strong:H5869">from it your
eyes</w> <w savlm="strong:H6491b">will be opened</w>, and <w
savlm="strong:H430">you will be like God</w>, <w
savlm="strong:H3045">knowing</w> <w savlm="strong:H2896b">good</w> <w
savlm="strong:H7451b">and evil</w>.”</span><br />
(NASB)
</body></html>

The HTMLHREF filter, by contrast, converts the milestone into the visible
quotation mark:
$ diatheke -b NASB -f HTMLHREF -k Gen.3.4-Gen.3.5
<html><head><meta http-equiv="content-type" content="text/html"
charset="UTF-8" lang="en" xml:lang="en"/>
<style type="text/css"></style></head><body>Genesis 3:4: <span
style="font:Gentium;" >The serpent said to the woman, “You surely will not
die!</span><br />
Genesis 3:5: <span style="font:Gentium;" >“For God knows that in the day
you eat from it your eyes will be opened, and you will be like God, knowing
good and evil.”</span><br />
(NASB)
</body></html>

Going from memory for the next two points:
1) HTMLHREF is the most common filter for our frontends to use, even more
common than HTML, as it converts links to anchor tags the frontend can
capture.
2) The purpose of the cQuote is to mark the continuation of a quote. Thus,
the marker character should not appear in /every/ circumstance where the
milestone exists in the input document. It should only appear if the
portion of the document being displayed does not include the preceding text
where the actual opening quote lives.

Assuming my two memory points above are correct: when I ask Diatheke for
Genesis 3:4-5, I should NOT see the cQuote character at the start of the
text of verse 5, because verse 4 includes the opening quotation mark.
However, if I ask for just Genesis 3:5, then I should see the cQuote
character, because otherwise the reader has no way of knowing that the
text at the start of the verse is part of a quotation until they reach
the close-quote character at the end of the verse. Without it, the
displayed text has unbalanced quotation marks.
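
As a rough illustration of the balance argument, here is a minimal Python
sketch. It only counts curly double quotes in the displayed text; the
helper name quote_balance is made up for this example, and the verse
strings are the plain-text renderings from the transcripts above.

```python
OPEN, CLOSE = "“", "”"

def quote_balance(text):
    """Opening minus closing curly double quotes in the displayed text."""
    return text.count(OPEN) - text.count(CLOSE)

v4 = "The serpent said to the woman, “You surely will not die!"
v5 = ("For God knows that in the day you eat from it your eyes will be "
      "opened, and you will be like God, knowing good and evil.”")

print(quote_balance(v4 + " " + v5))         # Gen.3.4-5, cQuote hidden: 0
print(quote_balance(v4 + " " + OPEN + v5))  # Gen.3.4-5, cQuote shown: 1
print(quote_balance(v5))                    # Gen.3.5 alone, cQuote hidden: -1
print(quote_balance(OPEN + v5))             # Gen.3.5 alone, cQuote shown: 0
```

A balance of zero is what the reader should see in both requests, which
is only possible if the marker is shown for Gen.3.5 alone and hidden for
Gen.3.4-5.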

So what we have are actually three different buggy behaviors that are
intertwined.
1) The HTMLHREF filter SHOULD NOT be displaying the quotation mark when I'm
asking for Gen.3.4-5
2) All other filters SHOULD be displaying the quotation mark when I'm
asking for Gen.3.5, by itself
3) The plain filter SHOULD NOT be displaying a newline character in place
of the cQuote at all
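
The three points above could be turned into a test along these lines. This
is only a sketch in Python, not an existing SWORD test: check_verse_start
is a hypothetical helper, and the two sample bodies are the start of verse
5 from the plain and HTMLHREF transcripts above (shortened, with the
wrapping <span> removed).

```python
def check_verse_start(body, marker, expect_marker):
    """Return the problems found at the start of one rendered verse body.

    body is the rendered text that follows the "Genesis 3:5:" label.
    """
    problems = []
    stripped = body.lstrip()
    leading = body[: len(body) - len(stripped)]
    # Point 3: a line break where the cQuote milestone was.
    if "\n" in leading:
        problems.append("line break in place of cQuote")
    shown = stripped.startswith(marker)
    # Point 1: marker shown although the opening quote is in the request.
    if shown and not expect_marker:
        problems.append("cQuote shown although the opening quote is in range")
    # Point 2: marker missing although the opening quote is not displayed.
    if expect_marker and not shown:
        problems.append("cQuote missing although the opening quote is out of range")
    return problems

# Request was Gen.3.4-Gen.3.5, so the marker is NOT expected on verse 5.
plain_body = "\nFor God knows that in the day you eat from it"
htmlhref_body = "“For God knows that in the day you eat from it"

print(check_verse_start(plain_body, "“", False))
print(check_verse_start(htmlhref_body, "“", False))
```

For a Gen.3.5-only request the same helper would be called with
expect_marker=True.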

These are not trivial bugs to reason about, because fixing them would
require the filter to be aware of the context of the requested passage and
only render a cQuote if it appears in the first verse being processed. Our
filter is relatively stateless, though I don't think it is entirely so.
Hopefully this report can help formulate both a test case and a fix.
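
To make the "context aware" idea concrete, here is a minimal sketch in
Python. It is not the SWORD filter API: the class CQuoteState and its
render method are invented for illustration, and it makes the simplifying
assumption that a cQuote is wanted only in the first verse of a request.

```python
import re

# Matches <milestone .../> elements that carry both type="cQuote" and
# marker="...", in either attribute order. Other milestones are ignored.
CQUOTE = re.compile(
    r'<milestone\b(?=[^>]*\btype="cQuote")[^>]*\bmarker="([^"]*)"[^>]*/>'
)

class CQuoteState:
    """Hypothetical per-request state; create one per passage request."""

    def __init__(self):
        self.first_verse = True

    def render(self, osis):
        # Keep the marker character only in the first verse of the request;
        # in later verses the milestone is dropped without leaving a gap.
        keep = self.first_verse
        self.first_verse = False
        return CQUOTE.sub(lambda m: m.group(1) if keep else "", osis)

v4 = "The serpent said to the woman, “You surely will not die!"
v5 = '<milestone marker="“" type="cQuote"/>For God knows'

ranged = CQuoteState()      # request for Gen.3.4-Gen.3.5
ranged.render(v4)
print(ranged.render(v5))    # marker dropped

alone = CQuoteState()       # request for Gen.3.5 only
print(alone.render(v5))     # marker kept
```

The only state carried between verses is a single flag, so a fix along
these lines would not need the filter to know anything else about the
passage.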

--Greg

On Sun, Mar 15, 2020 at 10:15 AM David Haslam <dfhdfh at protonmail.com> wrote:

> It’s also apparent that the output of diatheke does not include these
> markers though front-ends such as PocketSword do display them.
>
> Puzzling!
>
> Should this become an issue in our tracker for MODTOOLS ?
>
> Best regards
>
> David
>
>
> On Sat, Mar 14, 2020 at 21:06, David Haslam <dfhdfh at protonmail.com> wrote:
>
> I have observed that the 3 modules in the Lockman repository make use of
> the following OSIS milestone element.
>
> <milestone marker="»" type="cQuote"/>
>
> The actual marker character varies between modules and even within a
> module.
>
> I assume that these mark the occurrence of a Continuation Quotation Mark
> of one form or another in the printed text.
>
> It seems a pity that in SWORD there is no corresponding
>
> GlobalOptionFilter=OSISMilestoneMarkers
>
> that front-ends could make use of in order to show/hide these characters.
>
>
> Best regards,
>
> David