<div>Hi DM,<br></div><div><br></div><div>Please clarify in more detail what was actually changed in SWORD SVN.<br><br>Did you also patch <b>mkfastmod</b> such that this now also has a <b>-N</b> option in its <u>command line</u> syntax?</div><div><br></div><div>Or is it just somewhere in the SWORD API that <b>mkfastmod</b> makes a call to?<br></div><div><br></div><div>cf. Many front-ends can create a Lucene search index <u>from within the app</u>, so the internal command would be hidden from the user.<br></div><div><br></div><div>If we provide better support for modules with <b>Normalization=Custom</b>, then front-ends may need to be enhanced to offer such an option when the index is generated.<br></div><div><br></div><div>For such front-ends, if the proposed key exists in the module conf file, this option could be chosen automatically, thus not requiring user knowledge or intervention.<br><br><i>Is my case now beginning to make more sense?</i><br></div><div><br></div><div class="protonmail_signature_block"><div class="protonmail_signature_block-user"><div>Best regards,<br></div><div><br></div><div>David<br></div></div><div><br></div><div class="protonmail_signature_block-proton">Sent with <a href="https://protonmail.com">ProtonMail</a> Secure Email.<br></div></div><div><br></div><blockquote class="protonmail_quote" type="cite"><div>-------- Original Message --------<br></div><div>Subject: Re: [sword-devel] Module .conf files, Unicode Normalization<br></div><div>Local Time: 7 January 2018 1:52 PM<br></div><div>UTC Time: 7 January 2018 13:52<br></div><div>From: dfhdfh@protonmail.com<br></div><div>To: sword-devel mailing list <sword-devel@crosswire.org><br></div><div><br></div><div>In other words, it tells the library something significant about the module that the engine needs to know about in order to perform a certain function correctly.<br></div><div><br></div><div>Aside: I didn't mean to take this offline. I simply forgot to edit the addresses. 
<br></div><div><br></div><div>David<br></div><div><br></div><div>Sent from ProtonMail Mobile<br></div><div><div><br></div><div><div><br></div><div>On Sun, Jan 7, 2018 at 13:46, DM Smith <<a class="" href="mailto:dmsmith@crosswire.org">dmsmith@crosswire.org</a>> wrote:<br></div></div><blockquote type="cite" class="protonmail_quote"><div>Library uses it to convert bytes to string. <br></div><div> <br></div><div> <br></div><div><div>— DM Smith <br></div><div>From my phone. Brief. Weird autocorrections. <br></div></div><div><div><br></div><div>On Jan 7, 2018, at 8:44 AM, David Haslam <<a href="mailto:dfhdfh@protonmail.com">dfhdfh@protonmail.com</a>> wrote: <br></div><div> <br></div></div><blockquote type="cite"><div><div>Then answer this question, please.<br></div><div><br></div><div>What value has the Encoding key?<br></div><div><br></div><div>David<br></div><div><br></div><div>Sent from ProtonMail Mobile<br></div><div><div><br></div><div><div><br></div><div>On Sun, Jan 7, 2018 at 12:52, DM Smith <<a class="" href="mailto:dmsmith@crosswire.org">dmsmith@crosswire.org</a>> wrote:<br></div></div><blockquote type="cite" class="protonmail_quote"><div>SWORD too. <br></div><div><br></div><div><div>I don’t yet see a value in the suggested conf entry. <br></div><div> <br></div><div> <br></div><div><div>— DM Smith <br></div><div>From my phone. Brief. Weird autocorrections. <br></div></div><div><div><br></div><div>On Jan 7, 2018, at 4:03 AM, David Haslam <<a href="mailto:dfhdfh@protonmail.com">dfhdfh@protonmail.com</a>> wrote: <br></div><div> <br></div></div><blockquote type="cite"><div><div>You mean in the JSword API?<br></div><div><br></div><div>If so, that's a start. Thanks, DM. 
:)<br></div><div><br></div><div>Does that mean you now support the proposed new config key being accepted and documented?<br></div><div><br></div><div>Best regards,<br></div><div><br></div><div>David<br></div><div><br></div><div>Sent from ProtonMail Mobile<br></div><div><div><br></div><div><div><br></div><div>On Sat, Jan 6, 2018 at 23:43, DM Smith <<a class="" href="mailto:dmsmith@crosswire.org">dmsmith@crosswire.org</a>> wrote:<br></div></div><blockquote type="cite" class="protonmail_quote"><div>I added -N. To make search work. <br></div><div> <br></div><div> <br></div><div><div>— DM Smith <br></div><div>From my phone. Brief. Weird autocorrections. <br></div></div><div><div><br></div><div>On Jan 6, 2018, at 4:41 PM, David Haslam <<a href="mailto:dfhdfh@protonmail.com">dfhdfh@protonmail.com</a>> wrote: <br></div><div> <br></div></div><blockquote type="cite"><div><div>Thanks DM.<br></div><div><br></div><div>Interesting observations.<br></div><div><br></div><div>It prompts the question whether either engine includes the capability to normalize the search index (assuming that it does normalize the search key).<br></div><div>And does it do this by default?<br></div><div>Or does indexing assume that all modules were made without using the -N option and are therefore already in NFC?<br></div><div>Yet it also remains the case that some front-ends provide non-indexed search options as well.<br></div><div><br></div><div>Moreover, it raises questions as to how the front-end actually displays the set of search results when all or part of the underlying module is not NFC.<br></div><div><br></div><div>It must be the case that the developers of osis2mod had a valid reason to provide the -N option.<br></div><div>Are those involved back then still with CrossWire?<br></div><div><br></div><div>Best regards,<br></div><div><br></div><div>David<br></div><div><br></div><div><br></div><div>Sent from ProtonMail Mobile<br></div><div><div><br></div><div><div><br></div><div>On Sat, Jan 
6, 2018 at 21:20, DM Smith <<a class="" href="mailto:dmsmith@crosswire.org">dmsmith@crosswire.org</a>> wrote:<br></div></div><blockquote type="cite" class="protonmail_quote"><div>The purpose of normalization was for the sake of search. Only when the search index and the search request are normalized to the same form can a result be found. <br></div><div class=""><br></div><div class="">It doesn’t matter if the normalized form is not readable. If SWORD (or JSword) normalizes both the same, then it doesn’t matter what Unicode Normalization or lack of it is used for displaying
the text. <br></div><div class=""><br></div><div class="">Assuming that SWORD (or JSword) handles search properly, the only advantage of canonical over decomposed in the module itself is space.<br></div><div class=""><br></div><div class="">In Him,<br></div><div class=""><span style="white-space:pre" class="Apple-tab-span"></span>DM<br></div><div class=""><div><div><br></div><blockquote class="" type="cite"><div class="">On Jan 6, 2018, at 2:26 PM, David Haslam <<a class="" href="mailto:dfhdfh@protonmail.com">dfhdfh@protonmail.com</a>> wrote:<br></div><div><br></div><div class=""><div class="">Good question, Tom.<br></div><div class=""><br></div><div class="">Assuming that the Latin script part of the source text actually required normalization to NFC,<br></div><div class="">and that at least some of the Biblical Hebrew should not be converted to NFC,<br></div><div class="">you'd build the module using the -N switch of osis2mod, after first applying a script <br></div><div class="">to the source text to ensure that both the requirements were implemented.<br></div><div class=""><br></div><div class="">It would be a very simple task for a bespoke TextPipe filter with a restrict filter <br></div><div class="">designed to limit the Convert to NFC subfilter to the text that was not Hebrew.<br></div><div class=""><br></div><div class="">Ignoring alphabetical presentation forms, all the Hebrew characters are in one Unicode block.<br></div><div class="">A PCRE to exclude the Hebrew would be very simple.<br></div><div class="">I could almost do it in my sleep after 17 years using TextPipe.<br></div><div class="">No doubt other programmers could do likewise with Perl or Python, etc.<br></div><div class=""><br></div><div class="">Best regards,<br></div><div class=""><br></div><div class="">David<br></div><div class=""><br></div><div class="">Sent from ProtonMail Mobile<br></div><div class=""><div><br></div><div class=""><div class=""><br></div><div>On Sat, Jan 6, 2018 at 
19:14, Tom Sullivan <<a class="" href="mailto:info@beforgiven.info">info@beforgiven.info</a>> wrote:<br></div></div><blockquote type="cite" class="protonmail_quote">Y'all: For text, such as in a commentary, which includes both Hebrew and English (or another modern Latin script using language), what do you put for the normalization? Tom
Tom Sullivan <a class="" href="mailto:info@BeForgiven.INFO">info@BeForgiven.INFO</a> FAX: 815-301-2835 --------------------- Great News! God created you, owns you and gave you commands
to obey. You have disobeyed God - as your conscience very well attests to you. God's holiness and justice compel Him to punish you in Hell. Jesus Christ became Man, was
crucified, buried and rose from the dead as a substitute for all who trust in Him, redeeming them from Hell. If you repent (turn from your sin) and believe (trust) in
Jesus Christ, you will go to Heaven. Otherwise you will go to Hell. Warning! Good works are a result, not cause, of saving trust. More info is at <a class="" href="http://www.esig.beforgiven.info">www.esig.beforgiven.info</a> Do you believe this? Copy this signature into your email program and use the Internet to
spread the Great News every time you email. On 01/06/2018 12:32 PM, David Haslam wrote: > Hi Greg, > > One area where it might turn out to be useful is for the
search features > of front-end apps. > It could be important to know that the underlying module text is _not_ > *NFC*. > > That's not to lay down a requirement
as to how search features should be > designed, > but at least to provide the information in case it does matter for some > types of search option. > >
Like other things in .conf files, a key can also be _educational_. > It may prompt developers and users to ask, /*Why did they do this?*/ > > cf. It was _almost
by accident_ that in 2014, I first came across this > aspect of using Unicode for Biblical Hebrew. > /It applies only to texts with _both_ vowel accents and cantillation./
> > Even though it's mentioned in our developers' wiki, it's all too easily > missed by other CrossWire volunteers. > > Best regards, > > David >
> Sent with ProtonMail
Secure Email. > >> -------- Original Message -------- >> Subject: Re: [sword-devel] Module .conf files, Unicode Normalization >> Local Time: 6 January
2018 5:19 PM >> UTC Time: 6 January 2018 17:19 >> From: <a class="" href="mailto:greg.hellings@gmail.com">greg.hellings@gmail.com</a> >> To: David Haslam
, SWORD Developers' >> Collaboration Forum
>> >> Why would the front end or engine need to know this information? Would >> it help the front end developers or users to know it? What do we gain >> by adding this? (I'm not implying it wouldn't be beneficial. But the >>
only thing I know about Unicode is how the different UTF encodings >> work, so I have no idea what use this information could be. I also >> think
changes to formats and information standards should be >> conservative instead of liberal) >> >> --Greg >> >> On Jan 6, 2018
11:01, "David Haslam" wrote: >> >> Dear all, >> >> We've known for quite a few years that there are aspects of >> *Biblical Hebrew* that mean
we should _avoid_ converting the >> Unicode source text to *NFC* when we build a module. >> >> This prompts me to suggest that we
ought to define a new *key* for >> .conf files. >> >> *Normalization=NFC* (this would be the default, and may be >> _omitted_
for the vast majority of modules) >> *Normalization=Custom* (we should include this in certain Biblical >> Hebrew modules) >> >>
This would make it clear to front-end developers and users alike >> that the source text was _not_ converted to NFC during module build. >>
i.e. *osis2mod* was used intentionally with the *-N* switch, in >> _accordance with the requirements of the source text provider_. >> >>
The Unicode source text may already be encoded in *UTF-8* ; this >> memo is /only /about normalization. >> >> In the rare eventuality
that there could arise a requirement for >> any of the other three normalization forms (*NFD*, *NFKC*, *NFKD*) >> defined by the Unicode
Consortium, >> these would also be permitted values for the conf file key. >> >> A further benefit arises when a module needs to
be updated. >> If the modules team sees that the .conf file includes the line >> *Normalization=Custom* >> they would be forewarned
against converting to NFC through >> /inadvertently/ omitting the *-N* switch during module build. >> >> _Aside_: Another language
with a need for non-standard >> normalization is *Tibetan*. We don't yet have a module in that script. >> >> Best regards, >>
>> David >> >> Sent with ProtonMail
Secure Email. >> >> >> _______________________________________________ >> sword-devel mailing list: <a class="" href="mailto:sword-devel@crosswire.org">sword-devel@crosswire.org</a> >>
>> <a class="" href="http://www.crosswire.org/mailman/listinfo/sword-devel">http://www.crosswire.org/mailman/listinfo/sword-devel</a> >>
>> Instructions to unsubscribe/change your settings at above page
> _______________________________________________ sword-devel mailing list: <a class="" href="mailto:sword-devel@crosswire.org">sword-devel@crosswire.org</a> <a class="" href="http://www.crosswire.org/mailman/listinfo/sword-devel">http://www.crosswire.org/mailman/listinfo/sword-devel</a>Instructions to unsubscribe/change your settings at above page<br></blockquote></div><div>_______________________________________________ <br></div><div>sword-devel mailing list: <a class="" href="mailto:sword-devel@crosswire.org">sword-devel@crosswire.org</a> <br></div><div><a class="" href="http://www.crosswire.org/mailman/listinfo/sword-devel">http://www.crosswire.org/mailman/listinfo/sword-devel</a> <br></div><div>Instructions to unsubscribe/change your settings at above page<br></div></div></blockquote></div></div></blockquote></div></div></blockquote><blockquote type="cite"><div><div><span>_______________________________________________</span> <br></div><div><span>sword-devel mailing list: <a href="mailto:sword-devel@crosswire.org">sword-devel@crosswire.org</a></span> <br></div><div><span><a href="http://www.crosswire.org/mailman/listinfo/sword-devel">http://www.crosswire.org/mailman/listinfo/sword-devel</a></span> <br></div><div><span>Instructions to unsubscribe/change your settings at above page</span><br></div></div></blockquote></blockquote></div></div></blockquote><blockquote type="cite"><div><div><span>_______________________________________________</span> <br></div><div><span>sword-devel mailing list: <a href="mailto:sword-devel@crosswire.org">sword-devel@crosswire.org</a></span> <br></div><div><span><a href="http://www.crosswire.org/mailman/listinfo/sword-devel">http://www.crosswire.org/mailman/listinfo/sword-devel</a></span> <br></div><div><span>Instructions to unsubscribe/change your settings at above page</span><br></div></div></blockquote></div></blockquote></div></div></blockquote></blockquote></div></blockquote><div><br></div>
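<div>The pre-processing step described earlier in the thread (convert everything to NFC except the Hebrew, then build with the -N switch of osis2mod) might be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function name <b>nfc_except_hebrew</b> is hypothetical, the Hebrew is taken to lie entirely in the Unicode block U+0590 to U+05FF (alphabetic presentation forms are ignored, as in the thread), and it is not the TextPipe filter mentioned above nor anything in the SWORD tools.<br></div>

```python
import re
import unicodedata

# Assumption: "Hebrew" means the single Unicode block U+0590..U+05FF.
# The capturing group makes re.split() keep the Hebrew runs in its output.
HEBREW_RUN = re.compile(r"([\u0590-\u05FF]+)")


def nfc_except_hebrew(text: str) -> str:
    """Normalize non-Hebrew runs to NFC and leave Hebrew runs untouched.

    Hypothetical helper for illustration only.
    """
    parts = HEBREW_RUN.split(text)
    # With one capturing group, odd indices hold the Hebrew runs.
    return "".join(
        part if i % 2 == 1 else unicodedata.normalize("NFC", part)
        for i, part in enumerate(parts)
    )


if __name__ == "__main__":
    # Latin "e" + combining acute, then bet + dagesh + patah.
    sample = "cafe\u0301 \u05D1\u05BC\u05B7"
    result = nfc_except_hebrew(sample)
    # Show the code points so that the mark order is visible.
    print([f"U+{ord(ch):04X}" for ch in result])
```

<div>Plain NFC would reorder the two Hebrew marks in the sample, because canonical ordering sorts combining marks by combining class; the sketch keeps the source order while still composing the Latin part. The output of such a step would then be built with the -N switch so that osis2mod does not normalize it again.<br></div>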