[sword-devel] GlobalOptionFilter=UTF8GreekAccents
David Haslam
dfhdfh at protonmail.com
Mon Mar 17 11:44:37 EDT 2025
Hi DM,
One impact is on the StatResGNT module, in which both single and double left/right quotation marks have been added by the project leader.
Hiding Greek Accents has the bad effect of losing the end quotation mark for all the level 2 quotations in the text.
NB. It was seeing this project that prompted me to revisit this topic.
It would be a real benefit to this module to make the change that I proposed.
Further to my initial thoughts late last week, I now agree that U+2019 is the right codepoint choice to mark an elision.
I was somewhat misled by the wrong answer given by Leo AI, which mistakenly told me that U+2019 was a way to represent the iota subscript.
It's only since quizzing Grok AI that my thoughts have become clear. I admit that I should've known better, but I'm not a classicist.
Yet the "category mistake" still exists - since an elision marker is not a diacritic. And by definition, a Greek Accent is a diacritic!
Making the proposed change to the filter should have a minimal effect upon all the other Ancient Greek Bible modules.
The number of words thus affected in a Greek NT module is not huge!
There's really no downside to still displaying the "typographical apostrophe".
To illustrate, these are the only 21 distinct word forms in TischMorph that end with U+2019, with the number of occurrences of each.
> Word Count
> Δι’ 2
> Κατ’ 1
> δ’ 22
> δι’ 142
> καθ’ 61
> κατ’ 82
> μεθ’ 43
> μετ’ 132
> μηδ’ 1
> οὐδ’ 8
> παρ’ 59
> τοῦτ’ 17
> ἀλλ’ 220
> ἀνθ’ 5
> ἀπ’ 119
> ἀφ’ 44
> Ἀλλ’ 1
> ἐπ’ 143
> ἐφ’ 82
> ὑπ’ 25
> ὑφ’ 9
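For anyone wanting to reproduce a tally like this on another module, here is a minimal sketch in Python. It assumes the module text has already been exported as plain UTF-8; the function name `count_elided_words` and the set of trailing punctuation it strips are my own illustrative choices, not part of any SWORD tool. Note that in a module which also uses U+2019 as a closing quotation mark, such tokens would be counted too, so the result would need checking by eye.

```python
from collections import Counter

ELISION = "\u2019"  # RIGHT SINGLE QUOTATION MARK, used as the elision mark

def count_elided_words(text):
    """Count whitespace-separated tokens that end with U+2019.

    Hypothetical helper for illustration only. Trailing punctuation
    (full stop, comma, semicolon, ano teleia) is stripped first so that
    a token like "δι’," is still counted as "δι’".
    """
    counts = Counter()
    for token in text.split():
        token = token.rstrip(".,;\u00b7\u0387")
        # Require at least one letter before the mark, so a lone
        # U+2019 is not treated as a word.
        if len(token) > 1 and token.endswith(ELISION):
            counts[token] += 1
    return counts

if __name__ == "__main__":
    sample = "ἀλλ\u2019 ἐγὼ δι\u2019 αὐτοῦ, ἀλλ\u2019 οὐ"
    # Sorting by codepoint gives the same ordering as the list above.
    for word, n in sorted(count_elided_words(sample).items()):
        print(word, n)
```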
It's now my considered view that even when the Greek accents are hidden by the filter, the elision marks ought to be retained.
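To make the proposal concrete, here is a rough sketch in Python of the behaviour I have in mind. This is not the SWORD filter's code, and the function name `hide_greek_accents` is hypothetical. It assumes that "hiding accents" can be approximated by decomposing the text and dropping every combining mark, which may well be broader than what the real filter removes. The point is only that U+2019 has general category Pf (final punctuation) and is not a combining mark, so a filter defined in terms of diacritics leaves it alone.

```python
import unicodedata

def hide_greek_accents(text):
    """Strip combining diacritics but keep everything else, including U+2019.

    Illustrative sketch only: decompose to NFD, drop every character with a
    non-zero canonical combining class, then recompose to NFC.
    """
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if unicodedata.combining(ch) == 0)
    return unicodedata.normalize("NFC", stripped)

if __name__ == "__main__":
    # U+2019 is punctuation (Pf), not a combining mark.
    print(unicodedata.category("\u2019"), unicodedata.combining("\u2019"))
    # The elision mark survives while the breathing and accent are removed.
    print(hide_greek_accents("ἀλλ\u2019 τοῦτ\u2019"))
```

With a filter shaped like this, the elision marks and any closing quotation marks encoded as U+2019 would both remain visible when the accents are hidden.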
Best regards,
David
Sent with [Proton Mail](https://pr.tn/ref/SWXT9A5YZ67G) secure email.
On Monday, March 17th, 2025 at 3:06 PM, DM Smith <dmsmith at crosswire.org> wrote:
> David, I read your Grok 3 analysis.
>
> What is the impact of not having this change? What is the impact of making the change? Is it merely presentation or is there an issue with searching too?
>
> I’ve also been reading https://corp.unicode.org/pipermail/unicode/2019-January/007563.html which was referenced in a prior recent thread on U+2019 in Ancient Greek. This is long and worth reading to understand how it might impact SWORD. The thread is initiated by James Tauber.
>
> TL;DR:
> U+2019 (and in older texts U+0027) in Ancient Greek was never used for quotations and is only used for elision. It is considered the recommended character for elisions.
> Under the Unicode TR29 rules (as of January 2019, when the thread was written), U+2019 is a word break when at the front or end of a word, but not within a word. It is not simply punctuation. These rules are not language aware.
> There is no zero width character in Unicode to join words.
> It is impossible for TR29 to distinguish between U+2019 used as a quotation mark and as an elision.
> There is no other character that is an appropriate replacement for U+2019.
>
> I haven’t yet looked at Unicode TR30 regarding folding rules as it pertains to this.
>
> In Him,
> DM
>
>> On Mar 17, 2025, at 8:46 AM, David Haslam <dfhdfh at protonmail.com> wrote:
>>
>> Dear SWORD developers,
>>
>> I asked about this topic several years ago, and I'm no longer convinced by what we were told back then.
>>
>> After doing further research, it's my understanding that U+2019 RIGHT SINGLE QUOTATION MARK ought not to be hidden by this SWORD filter.
>>
>> - This codepoint is not a diacritic that modifies the previous Greek letter. In other words, it's not a Greek accent.
>> - This codepoint has the Unicode properties of a punctuation mark.
>> - In Ancient Greek text, it's used to mark an elision, where the final vowel of a word is omitted when the next word begins with a vowel.
>>
>> To view my research, conducted with the help of Grok 3, please visit the following link.
>>
>> - https://grok.com/share/bGVnYWN5_43ff1922-3876-4d9a-9e42-6ae940007fd0
>>
>> I therefore recommend that SWORD developers revisit the specification for this filter, and update it so that U+2019 is never hidden.
>>
>> Best regards,
>>
>> David
>>
>> Sent with [Proton Mail](https://pr.tn/ref/SWXT9A5YZ67G) secure email.
>> _______________________________________________
>> sword-devel mailing list: sword-devel at crosswire.org
>> http://crosswire.org/mailman/listinfo/sword-devel
>> Instructions to unsubscribe/change your settings at above page