<html><body><div dir="ltr">I think this is not difficult per se, but it should be properly encoded. </div><div dir="ltr"><br></div><div dir="ltr">&lt;w&gt; seems correct; using zero-width characters does not seem correct. </div><div dir="ltr"><br></div><div dir="ltr">Peter</div><div id="ms-outlook-mobile-body-separator-line" dir="ltr"><br></div><div id="mail-editor-reference-message-container" class="ms-outlook-mobile-reference-message"><hr style="display: inline-block; width: 98%;"><div id="divRplyFwdMsg" dir="ltr"><span style="font-family: Calibri, sans-serif;"><b>From:</b> sword-devel &lt;sword-devel-bounces@crosswire.org&gt; on behalf of David Haslam &lt;dfhdfh@protonmail.com&gt;<br><b>Sent:</b> Thursday, May 1, 2025 11:30 am<br><b>To:</b> sword-devel mailing list &lt;sword-devel@crosswire.org&gt;<br><b>Cc:</b> David Haslam &lt;df.haslam@btinternet.com&gt;<br><b>Subject:</b> [sword-devel] Proposal for a new SWORD filter to display word dividers</span><div style="font-family: Calibri, sans-serif;"> </div></div><div style="font-family: Arial, sans-serif; font-size: 14px;">I wish to propose that we design a new SWORD filter.<br><br>The conf key would be:</div><ul style="margin-top: 0px; margin-bottom: 0px;"><li style="font-family: Arial, sans-serif; font-size: 14px; list-style-type: disc;"><b>GlobalOptionFilter=ShowWordDividers</b></li></ul><div dir="ltr" style="font-family: Arial, sans-serif; font-size: 14px;"><br></div><div style="font-family: Arial, sans-serif; font-size: 14px;">In the writing systems for several languages of Southeast Asia (<b>Thai</b>, <b>Khmer</b>, <b>Lao</b>, <b>Myanmar</b>) there is [generally] <b>no space between words</b>.</div><div dir="ltr" style="font-family: Arial, sans-serif; font-size: 14px;"><br></div><div style="font-family: Arial, sans-serif; font-size: 14px;">In this respect, they are like many European languages before the start of <a 
href="https://www.amazon.com/Space-Between-Words-Origins-Medieval/dp/080474016X" title="silent reading">silent reading</a>. The descriptive term is <b><i>Scriptura Continua</i></b>.</div><div style="font-family: Arial, sans-serif; font-size: 14px;"><br>Some Bible translations for this region already use one of the zero-width characters to mark the divisions between lexical words invisibly.</div><div style="font-family: Arial, sans-serif; font-size: 14px;">Candidates include:</div><ul style="margin-top: 0px; margin-bottom: 0px;"><li style="font-family: Arial, sans-serif; font-size: 14px; list-style-type: disc;">U+200B ZERO WIDTH SPACE</li><li style="font-family: Arial, sans-serif; font-size: 14px; list-style-type: disc;">U+200C ZERO WIDTH NON-JOINER</li><li style="font-family: Arial, sans-serif; font-size: 14px; list-style-type: disc;">U+FEFF ZERO WIDTH NO-BREAK SPACE</li></ul><div style="font-family: Arial, sans-serif; font-size: 14px;">This excludes:</div><ul style="margin-top: 0px; margin-bottom: 0px;"><li style="font-family: Arial, sans-serif; font-size: 14px; list-style-type: disc;">U+200D ZERO WIDTH JOINER</li></ul><div style="font-family: Arial, sans-serif; font-size: 14px;">A further possibility, which does not require a full study Bible with Strong's numbers etc., is simply to wrap each lexical word in the OSIS <b>w</b> element.</div><div style="font-family: Arial, sans-serif; font-size: 14px;">A bare element without any OSIS attributes would suffice for this purpose. 
Likewise, for the <b>seg</b> element.</div><div dir="ltr" style="font-family: Arial, sans-serif; font-size: 14px;"><br></div><div style="font-family: Arial, sans-serif; font-size: 14px;">My proposal is that we design a feature to <b>show/hide word dividers</b> by displaying them using a suitable visible but non-intrusive character.</div><div style="font-family: Arial, sans-serif; font-size: 14px;">My suggestion is to use this Unicode character by default:</div><div dir="ltr" style="font-family: Arial, sans-serif; font-size: 14px;"><br></div><ul style="margin-top: 0px; margin-bottom: 0px;"><li style="font-family: Arial, sans-serif; font-size: 14px; list-style-type: disc;">U+00B7 MIDDLE DOT</li></ul><div dir="ltr" style="font-family: Arial, sans-serif; font-size: 14px;"><br></div><div class="protonmail_signature_block-user" style="font-family: Arial, sans-serif; font-size: 14px;">We could even allow the actual visible character to be specified in a second conf key, thus:</div><div dir="ltr" class="protonmail_signature_block-user" style="font-family: Arial, sans-serif; font-size: 14px;"><br></div><ul style="margin-top: 0px; margin-bottom: 0px;"><li style="font-family: Arial, sans-serif; font-size: 14px; list-style-type: disc;">VisibleWordDivider=U+00B7</li></ul><div dir="ltr" class="protonmail_signature_block-user" style="font-family: Arial, sans-serif; font-size: 14px;"><br></div><div class="protonmail_signature_block-user" style="font-family: Arial, sans-serif; font-size: 14px;">Benefits would include:</div><ul style="margin-top: 0px; margin-bottom: 0px;"><li style="font-family: Arial, sans-serif; font-size: 14px;">Helps with language learning to know where lexical words start and end</li><li style="font-family: Arial, sans-serif; font-size: 14px;">Helps with front-end search for whole words, exact phrase or all words</li><li style="font-family: Arial, sans-serif; font-size: 14px;">Helps with checking the accuracy of Bible translations by clearly displaying lexical 
word boundaries at the touch of a single key in the front-end</li><li style="font-family: Arial, sans-serif; font-size: 14px;">Paves the way for Study Bible with the addition of Strong's mark-up, etc.</li></ul><div dir="ltr" class="protonmail_signature_block-user" style="font-family: Arial, sans-serif; font-size: 14px;"><br></div><div class="protonmail_signature_block-user" style="font-family: Arial, sans-serif; font-size: 14px;">Here's a sample of Khmer verse text with the MIDDLE DOT as the visible word divider:</div><blockquote style="padding-left: 10px; border-left-width: 3px; border-left-style: solid; border-left-color: rgb(200, 200, 200);"><div class="protonmail_signature_block-user" style="font-family: Arial, sans-serif; font-size: 14px; color: rgb(102, 102, 102);"><b>Obad.1.1</b> </div><table class="protonmail_signature_block-user" style="width: 561pt; border-collapse: collapse; border-spacing: 0px; box-sizing: border-box;"><tbody><tr><td class="protonmail_signature_block-user" align="left" style="width: 561pt; height: 29.25pt; border-width: 0.5pt medium 0.5pt 0.5pt; border-style: solid none solid solid; border-color: white currentcolor white white; background-color: rgb(184, 204, 228); padding-top: 1px; padding-right: 1px; padding-left: 1px; vertical-align: top; color: black;"><div class="protonmail_signature_block-user" style="font-family: Calibri, sans-serif; font-size: 11pt;">នេះ·ជា·សុបិន·និមិត្ដ·របស់·លោក·អូបាឌា
ព្រះអម្ចាស់·ជា·ព្រះ·មាន·បន្ទូល·ពី·ក្រុង·អេដំម ។
យើង·បាន·ឮ·ដំណឹង·មក·ពី·ព្រះអម្ចាស់ គឺ·មាន·ទូត·ម្នាក់·បាន·បញ្ជូន·ឲ្យ·ទៅ
ក្នុង·ចំណោម·ជន·ជាតិ·ទាំង·ឡាយ·ដោយ·ពាក្យ·ថា "ចូរ·ក្រោក·ឡើង !
ចូរ·យើង·ក្រោក·ឡើង·ធ្វើ·ចម្បាំង·ទាស់·និង·គេ"</div></td></tr></tbody></table></blockquote><div dir="ltr" class="protonmail_signature_block-user" style="font-family: Arial, sans-serif; font-size: 14px;"><br></div><div class="protonmail_signature_block-user" style="font-family: Arial, sans-serif; font-size: 14px;">cf. Here's what it looks like with the ZWSP as the in<span style="background-color: rgb(255, 255, 255);">visible word </span>divider:</div><blockquote style="padding-left: 10px; border-left-width: 3px; border-left-style: solid; border-left-color: rgb(200, 200, 200);"><div class="protonmail_signature_block-user" style="font-family: Arial, sans-serif; font-size: 14px; color: rgb(102, 102, 102);"><b>Obad.1.1</b></div><table class="protonmail_signature_block-user" style="width: 561pt; border-collapse: collapse; border-spacing: 0px; box-sizing: border-box;"><tbody><tr><td class="protonmail_signature_block-user" align="left" style="width: 561pt; height: 29.25pt; border-width: 0.5pt medium 0.5pt 0.5pt; border-style: solid none solid solid; border-color: white currentcolor white white; background-color: rgb(184, 204, 228); padding-top: 1px; padding-right: 1px; padding-left: 1px; vertical-align: top; color: black;"><div class="protonmail_signature_block-user" style="font-family: Calibri, sans-serif; font-size: 11pt;">នេះជាសុបិននិមិត្ដរបស់លោកអូបាឌា
ព្រះអម្ចាស់ជាព្រះមានបន្ទូលពីក្រុងអេដំម ។
យើងបានឮដំណឹងមកពីព្រះអម្ចាស់ គឺមានទូតម្នាក់បានបញ្ជូនឲ្យទៅ
ក្នុងចំណោមជនជាតិទាំងឡាយដោយពាក្យថា "ចូរក្រោកឡើង !
ចូរយើងក្រោកឡើងធ្វើចម្បាំងទាស់និងគេ"</div></td></tr></tbody></table></blockquote><div dir="ltr" class="protonmail_signature_block-user" style="font-family: Arial, sans-serif; font-size: 14px;"><br></div><div class="protonmail_signature_block-user" style="font-family: Arial, sans-serif; font-size: 14px;">If SWORD developers agree that my proposal merits consideration, please would you start on the software development.</div><div dir="ltr" class="protonmail_signature_block-user" style="font-family: Arial, sans-serif; font-size: 14px;"><br></div><div dir="ltr" class="protonmail_signature_block-user" style="font-family: Arial, sans-serif; font-size: 14px;"><br></div><div class="protonmail_signature_block-user" style="font-family: Arial, sans-serif; font-size: 14px;">
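<div style="font-family: Arial, sans-serif; font-size: 14px;">To make the idea concrete, here is a minimal Python sketch of the substitution such a filter might perform. It is illustrative only and is not SWORD engine code: the function names, the set of divider characters, and the way a value such as U+00B7 is parsed are all assumptions made for this example.</div>
<pre style="font-family: monospace; font-size: 13px;">
```python
# Hypothetical sketch, not SWORD API: all names below are illustrative.

# Zero-width characters the proposal lists as possible invisible dividers.
# U+200D ZERO WIDTH JOINER is deliberately absent.
INVISIBLE_DIVIDERS = ("\u200b", "\u200c", "\ufeff")

# Default visible replacement: U+00B7 MIDDLE DOT.
DEFAULT_VISIBLE_DIVIDER = "\u00b7"


def parse_divider(conf_value):
    """Turn a conf value such as 'U+00B7' into the character it names."""
    value = conf_value.strip()
    if value.upper().startswith("U+"):
        return chr(int(value[2:], 16))
    # Assumption: anything else is taken as a literal character.
    return value


def show_word_dividers(text, visible=DEFAULT_VISIBLE_DIVIDER, enabled=True):
    """Replace each invisible divider with the visible one when enabled."""
    if not enabled:
        return text
    table = {ord(ch): visible for ch in INVISIBLE_DIVIDERS}
    return text.translate(table)


if __name__ == "__main__":
    divider = parse_divider("U+00B7")
    print(show_word_dividers("word1\u200bword2\u200bword3", divider))
```
</pre>
<div style="font-family: Arial, sans-serif; font-size: 14px;">With the option switched off the text passes through unchanged, so the stored module text is never altered. A real filter would presumably be written as part of the SWORD engine and toggled from the front-end.</div>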
Best regards,<br><br>David
</div><div dir="ltr" class="protonmail_signature_block" style="font-family: Arial, sans-serif; font-size: 14px;"><br></div><div class="protonmail_signature_block-proton" style="font-family: Arial, sans-serif; font-size: 14px;">
</div></div><div> </div></body></html>