[sword-devel] OSIS milestone markers

David Haslam dfhdfh at protonmail.com
Sun Mar 15 13:28:49 EDT 2020


Thanks, Greg,

Very comprehensive analysis.

Blessing!

David


On Sun, Mar 15, 2020 at 17:20, Greg Hellings <greg.hellings at gmail.com> wrote:

> Here is the first example of a cQuote I can find in the NASB (the character you indicated doesn't appear in the NASB output I can locate, but this one does):
>
> $ diatheke -b NASB -k Gen.3.4-Gen.3.5
> Genesis 3:4: <w savlm="strong:H5175">The serpent</w> <w savlm="strong:H559">said</w> <w savlm="strong:H802">to the woman</w>, “<w savlm="strong:H4191">You surely</w> <w savlm="strong:H4191">will not die</w>!
> Genesis 3:5: <milestone marker="“" type="cQuote"/><w savlm="strong:H430">For God</w> <w savlm="strong:H3045">knows</w> <w savlm="strong:H3117">that in the day</w> <w savlm="strong:H398">you eat</w> <w savlm="strong:H5869">from it your eyes</w> <w savlm="strong:H6491b">will be opened</w>, and <w savlm="strong:H430">you will be like God</w>, <w savlm="strong:H3045">knowing</w> <w savlm="strong:H2896b">good</w> <w savlm="strong:H7451b">and evil</w>.”
> (NASB)
>
> Asking for the OSIS filter suppresses that character:
> $ diatheke -b NASB -f OSIS -k Gen.3.4-Gen.3.5
> Genesis 3:4: <w>The serpent</w> <w>said</w> <w>to the woman</w>, “<w>You surely</w> <w>will not die</w>!<milestone type="line"/>
> Genesis 3:5: <w>For God</w> <w>knows</w> <w>that in the day</w> <w>you eat</w> <w>from it your eyes</w> <w>will be opened</w>, and <w>you will be like God</w>, <w>knowing</w> <w>good</w> <w>and evil</w>.”<milestone type="line"/>
> (NASB)
>
> Enabling all available option filters (-o) still doesn't give back that raw output:
> $ diatheke -b NASB -f OSIS -o nfmhcvaplsrbwgeixtM -k Gen.3.4-Gen.3.5
> Genesis 3:4: <note n="A" osisID="Gen.٣.٤.xref.A" type="crossReference"></note><w lemma="strong:H٥١٧٥">The serpent</w> <w lemma="strong:H٥٥٩">said</w> <w lemma="strong:H٨٠٢">to the woman</w>, “<w lemma="strong:H٤١٩١">You surely</w> <w lemma="strong:H٤١٩١">will not die</w>!<milestone type="line"/>
> Genesis 3:5: <w lemma="strong:H٤٣٠">For God</w> <w lemma="strong:H٣٠٤٥">knows</w> <w lemma="strong:H٣١١٧">that in the day</w> <w lemma="strong:H٣٩٨">you eat</w> <w lemma="strong:H٥٨٦٩">from it your eyes</w> <w lemma="strong:H٦٤٩١b">will be opened</w>, and <note n="A" osisID="Gen.٣.٥.xref.A" type="crossReference"></note><w lemma="strong:H٤٣٠">you will be like God</w>, <w lemma="strong:H٣٠٤٥">knowing</w> <w lemma="strong:H٢٨٩٦b">good</w> <w lemma="strong:H٧٤٥١b">and evil</w>.”<milestone type="line"/>
> (NASB)
>
> Switching to plain text, we get a more interesting result: a line break where the cQuote milestone sits in the OSIS input.
> $ diatheke -b NASB -f plain -k Gen.3.4-Gen.3.5
> Genesis 3:4: The serpent said to the woman, “You surely will not die!
> Genesis 3:5:
> For God knows that in the day you eat from it your eyes will be opened, and you will be like God, knowing good and evil.”
> (NASB)
>
> The HTML filter simply passes the OSIS markup, including the cQuote milestone, through untouched (this seems like a bug to me, as I'm unaware of any <milestone> element in HTML):
> $ diatheke -b NASB -f HTML -k Gen.3.4-Gen.3.5
> <html><head><meta http-equiv="content-type" content="text/html" charset="UTF-8" lang="en" xml:lang="en"/>
> <style type="text/css"></style></head><body>Genesis 3:4: <span style="font:Gentium;" ><w savlm="strong:H5175">The serpent</w> <w savlm="strong:H559">said</w> <w savlm="strong:H802">to the woman</w>, “<w savlm="strong:H4191">You surely</w> <w savlm="strong:H4191">will not die</w>!</span><br />
> Genesis 3:5: <span style="font:Gentium;" ><milestone marker="“" type="cQuote"/><w savlm="strong:H430">For God</w> <w savlm="strong:H3045">knows</w> <w savlm="strong:H3117">that in the day</w> <w savlm="strong:H398">you eat</w> <w savlm="strong:H5869">from it your eyes</w> <w savlm="strong:H6491b">will be opened</w>, and <w savlm="strong:H430">you will be like God</w>, <w savlm="strong:H3045">knowing</w> <w savlm="strong:H2896b">good</w> <w savlm="strong:H7451b">and evil</w>.”</span><br />
> (NASB)
> </body></html>
>
> The HTMLHREF filter, by contrast, strips the markup and renders the cQuote character as text:
> $ diatheke -b NASB -f HTMLHREF -k Gen.3.4-Gen.3.5
> <html><head><meta http-equiv="content-type" content="text/html" charset="UTF-8" lang="en" xml:lang="en"/>
> <style type="text/css"></style></head><body>Genesis 3:4: <span style="font:Gentium;" >The serpent said to the woman, “You surely will not die!</span><br />
> Genesis 3:5: <span style="font:Gentium;" >“For God knows that in the day you eat from it your eyes will be opened, and you will be like God, knowing good and evil.”</span><br />
> (NASB)
> </body></html>
>
> Going from memory for the next two points:
> 1) HTMLHREF is the most common filter for our frontends to use. It is even more common than HTML, as it converts links to anchor tags the frontend can capture.
> 2) The purpose of the cQuote is to mark a character that signals the continuation of a quote. Thus, it should not appear in /every/ circumstance that it exists in the input document. It should only appear if the portion of the document being displayed does not include the preceding text where the actual opening quote lives.
>
> Assuming my two memory points above are correct: when I ask Diatheke for Genesis 3:4-5, I should NOT see the cQuote character at the start of the text of verse 5, because verse 4 includes the opening quotation mark. However, if I ask for just Genesis 3:5, then I should see the cQuote character, because otherwise the reader has no way of knowing that the text at the start of the verse is part of a quotation until they reach the close-quote character at the end of the verse, and the quotation marks are left unbalanced.
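
The rule described in the paragraph above can be sketched in a few lines of Python. This is only an illustration and not SWORD code: render_cquotes is a hypothetical helper, the input is assumed to be a list of raw verse strings like the NASB output quoted earlier, and "first verse of the displayed range" is used as a stand-in for "the preceding text with the opening quote is not shown".

```python
import re

# Matches a cQuote milestone such as <milestone marker="“" type="cQuote"/>,
# with the marker and type attributes in either order.
CQUOTE = re.compile(
    r'<milestone\b(?=[^>]*\btype="cQuote")[^>]*\bmarker="([^"]*)"[^>]*/>'
)


def render_cquotes(verses):
    """Resolve cQuote milestones for verses that are displayed together.

    Hypothetical helper (not a SWORD API). Assumption: the marker is shown
    only in the first verse of the displayed range, because later verses
    follow text that already carries the opening quotation mark.
    """
    rendered = []
    for index, raw in enumerate(verses):
        if index == 0:
            # Nothing precedes this verse on screen: show the marker as text.
            rendered.append(CQUOTE.sub(lambda m: m.group(1), raw))
        else:
            # The preceding verse is on screen: drop the milestone entirely.
            rendered.append(CQUOTE.sub("", raw))
    return rendered
```

With this rule, a request for Gen.3.4-Gen.3.5 would drop the marker from verse 5, while a request for Gen.3.5 alone would show it.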
>
> So what we have are actually three different buggy behaviors that are intertwined.
> 1) The HTMLHREF filter SHOULD NOT be displaying the quotation mark when I'm asking for Gen.3.4-5
> 2) All other filters SHOULD be displaying the quotation mark when I'm asking for Gen.3.5, by itself
> 3) The plain filter SHOULD NOT be displaying a newline character in place of the cQuote at all
>
> These are not trivial bugs to conceptualize, because fixing them would require the filter to become aware of its scripture context and only process a cQuote if it appears in the first verse that is being processed. Our filter is relatively stateless, but I don't think it is entirely so. Hopefully, this report can help formulate both a test case and a fix.
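
As a starting point for the test case mentioned above, here is a minimal Python sketch that checks whether a verse visibly begins with an opening quote in captured diatheke output. The function name shows_opening_quote is hypothetical, and the check assumes the "Genesis 3:5:" label format shown in the output quoted earlier. Tags are stripped first, so a marker that only exists inside a <milestone> attribute does not count as displayed.

```python
import re


def shows_opening_quote(output, label, marker="“"):
    """Return True if the verse text after 'label:' visibly starts with marker.

    Hypothetical test helper (not part of SWORD or diatheke). 'output' is
    the captured text of a diatheke run; 'label' is a verse label such as
    "Genesis 3:5", in the format diatheke prints before each verse.
    """
    # Remove all markup so only reader-visible characters remain.
    visible = re.sub(r"<[^>]+>", "", output)
    # First non-whitespace character after the label, even across a line break.
    match = re.search(re.escape(label) + r":\s*(\S)", visible)
    return match is not None and match.group(1) == marker
```

Under the behavior argued for in this thread, the check for "Genesis 3:5" would be expected to return False for output of -k Gen.3.4-Gen.3.5 and True for output of -k Gen.3.5, whichever output filter is selected.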
>
> --Greg
>
> On Sun, Mar 15, 2020 at 10:15 AM David Haslam <dfhdfh at protonmail.com> wrote:
>
>> It’s also apparent that the output of diatheke does not include these markers, even though front-ends such as PocketSword do display them.
>>
>> Puzzling!
>>
>> Should this become an issue in our tracker for MODTOOLS?
>>
>> Best regards
>>
>> David
>>
>>
>> On Sat, Mar 14, 2020 at 21:06, David Haslam <dfhdfh at protonmail.com> wrote:
>>
>>> I have observed that the 3 modules in the Lockman repository make use of the following OSIS milestone element.
>>>
>>> <milestone marker="»" type="cQuote"/>
>>>
>>> The actual marker character varies between modules and even within a module.
>>>
>>> I assume that these mark the occurrence of a Continuation Quotation Mark of one form or another in the printed text.
>>>
>>> It seems a pity that in SWORD there is no corresponding
>>>
>>> GlobalOptionFilter=OSISMilestoneMarkers
>>>
>>> that front-ends could make use of in order to show/hide these characters.
>>>
>>> Best regards,
>>>
>>> David
>>>
>>