<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">No need to reinvent the wheel. The folks at Lucene have already done the inventing. I don’t know whether it has made its way into CLucene.<div class=""><br class=""></div><div class="">The issue is not what is seen or entered by a user but what is stored in the index. A typical search mechanism has to normalize the search request the same way as the text that was put into the search index. In Lucene-speak, handling such as ß =&gt; ss is called folding. See the comments in the Lucene issue: <a href="https://issues.apache.org/jira/browse/LUCENE-1343" class="">https://issues.apache.org/jira/browse/LUCENE-1343</a> Robert Muir is the one who has a strong grasp of it. You’ll also see me in the thread :).<div class=""><br class=""></div><div class="">In some languages the “marks” are vowels, and in others they are not, so folding has to be done per language.<br class=""><div class=""><br class=""></div><div class="">The issue I see is that folding might change the semantic meaning of a word: two different words may fold into the same form, which would produce false hits.</div><div class=""><br class=""></div><div class="">In Him,</div><div class=""><span class="Apple-tab-span" style="white-space:pre">        </span>DM<br class=""><div class=""><br class=""></div><div class=""><div class=""><br class=""></div><div class=""><div><blockquote type="cite" class=""><div class="">On Feb 1, 2016, at 6:24 AM, Peter von Kaehne &lt;<a href="mailto:refdoc@gmx.net" class="">refdoc@gmx.net</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class=""><div class="">No, we would call it an "sz", but spell it "ss" as the alternative to the letter ß. Talk about confusing the kids...<br class=""><br class="">Peter <br class=""><br class="">
<blockquote type="cite" class="">Sent: Monday, 01 February 2016 at 10:54<br class="">From: "David Haslam" &lt;<a href="mailto:dfhmch@googlemail.com" class="">dfhmch@googlemail.com</a>&gt;<br class="">To: <a href="mailto:sword-devel@crosswire.org" class="">sword-devel@crosswire.org</a><br class="">Subject: Re: [sword-devel] Latin diacritics<br class=""><br class="">Corrigendum:<br class=""><br class="">....with an Eszett "ß" by entering "sz" ? <br class=""><br class="">though of course, some words that used to have "ß" now have "ss", so it gets<br class="">even harder.<br class=""><br class="">David<br class=""><br class=""><br class=""><br class="">--<br class="">View this message in context: <a href="http://sword-dev.350566.n4.nabble.com/Latin-diacritics-tp4655927p4655937.html" class="">http://sword-dev.350566.n4.nabble.com/Latin-diacritics-tp4655927p4655937.html</a><br class="">Sent from the SWORD Dev mailing list archive at <a href="http://nabble.com" class="">Nabble.com</a>.<br class=""><br class="">_______________________________________________<br class="">sword-devel mailing list: <a href="mailto:sword-devel@crosswire.org" class="">sword-devel@crosswire.org</a><br class=""><a href="http://www.crosswire.org/mailman/listinfo/sword-devel" class="">http://www.crosswire.org/mailman/listinfo/sword-devel</a><br class="">Instructions to unsubscribe/change your settings at above page<br class=""></blockquote></div></div></blockquote></div><br class=""></div></div></div></div></div></body></html>