[sword-devel] GlobalOptionFilter=UTF8GreekAccents

DM Smith dmsmith at crosswire.org
Mon Mar 17 11:06:13 EDT 2025


David, I read your Grok 3 analysis.

What is the impact of not having this change? What is the impact of making the change? Is it merely presentation, or is there an issue with searching too?

I’ve also been reading https://corp.unicode.org/pipermail/unicode/2019-January/007563.html which was referenced in a prior recent thread on U+2019 in Ancient Greek. This is long and worth reading to understand how it might impact SWORD. The thread is initiated by James Tauber.

TL;DR:
U+2019 (and in older texts U+0027) in Ancient Greek was never used for quotations and is only used for elision. It is considered the recommended character for elisions.
Under the Unicode TR29 rules (as they stood when the thread was written in January 2019), U+2019 is a word break when it is at the front or end of a word, but not when it is within a word. It is not treated simply as punctuation. These rules are not language-aware.
There is no zero width character in Unicode to join words.
It is impossible for TR29 to distinguish between U+2019 used as a quotation mark and as an elision.
There is no other character that is an appropriate replacement for U+2019.
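
To make the points above concrete, here is a small Python sketch. It is only an illustration under stated assumptions: it is neither SWORD's UTF8GreekAccents filter nor a real TR29 implementation. The regular expression is a rough stand-in for the "no break between letters" behaviour, and the names describe and simple_words are hypothetical helpers made up for this example.

```python
import re
import unicodedata

# Rough approximation (an assumption, not the TR29 algorithm): an apostrophe
# stays inside a token only when there are letters on both sides of it.
WORD_RE = re.compile(r"[^\W\d_]+(?:[\u2019'][^\W\d_]+)*")


def describe(ch):
    """Return (general category, canonical combining class) for a character."""
    return unicodedata.category(ch), unicodedata.combining(ch)


def simple_words(text):
    """Split text into word tokens using the approximation above."""
    return WORD_RE.findall(unicodedata.normalize("NFC", text))


# "ἀλλ’ ἐγώ": an elided word followed by another word.
elided = "\u1f00\u03bb\u03bb\u2019 \u1f10\u03b3\u03ce"
# U+2019 between two letters, as in English "don’t".
internal = "don\u2019t"

print(describe("\u2019"))      # U+2019: punctuation, not a combining mark
print(describe("\u0301"))      # U+0301 COMBINING ACUTE ACCENT, for contrast
print(simple_words(elided))    # the elision mark falls outside both tokens
print(simple_words(internal))  # the word-internal mark stays in the token
```

With this approximation the elision mark at the end of a word drops out of the token, while the same character between two letters is kept. That is the ambiguity described above, and it is why the question about searching matters as well as the question about presentation.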

I haven’t yet looked at Unicode TR30 regarding folding rules as it pertains to this.

In Him,
	DM


> On Mar 17, 2025, at 8:46 AM, David Haslam <dfhdfh at protonmail.com> wrote:
> 
> Dear SWORD developers,
> 
> I asked about this topic several years ago, and I'm no longer convinced by what we were told back then.
> 
> After doing further research, it's my understanding that U+2019 RIGHT SINGLE QUOTATION MARK ought not to be hidden by this SWORD filter.
> 
> This codepoint is not a diacritic that modifies the previous Greek letter. In other words, it's not a Greek accent.
> This codepoint has the Unicode properties of a punctuation mark.
> In Ancient Greek text, it's used to mark an elision, where the final vowel of a word is omitted when the next word begins with a vowel.
> 
> To view my research, conducted with the help of Grok 3, please visit the following link.
> https://grok.com/share/bGVnYWN5_43ff1922-3876-4d9a-9e42-6ae940007fd0
> 
> I therefore recommend that SWORD developers revisit the specification for this filter, and update it so that U+2019 is never hidden.
> 
> Best regards,
> 
> David
> 
