<div dir="ltr"><div>Unicode replacement characters typically indicate a font issue, and would not normally be represented as such within the internals of a program. Have you tried using one of the command line utilities or examples directly?<br><br></div>--Greg<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Apr 26, 2017 at 2:48 PM, David Haslam <span dir="ltr">&lt;<a href="mailto:dfhmch@googlemail.com" target="_blank">dfhmch@googlemail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">If you examine the result preview pane in the Xiphos Advanced Search dialog,<br>
the problem becomes apparent.<br>
<br>
Most Coptic Unicode characters are not displayed correctly.<br>
<br>
<br>
<br>
The remainder seem to have been converted to U+FFFD REPLACEMENT CHARACTER.<br>
<br>
That is, none of the following Coptic characters is handled aright by this part<br>
of the software:<br>
<br>
U+2C81 ⲁ COPTIC SMALL LETTER ALFA<br>
U+2C83 ⲃ COPTIC SMALL LETTER VIDA<br>
U+2C85 ⲅ COPTIC SMALL LETTER GAMMA<br>
U+2C87 ⲇ COPTIC SMALL LETTER DALDA<br>
U+2C89 ⲉ COPTIC SMALL LETTER EIE<br>
U+2C8B ⲋ COPTIC SMALL LETTER SOU<br>
U+2C8D ⲍ COPTIC SMALL LETTER ZATA<br>
U+2C8F ⲏ COPTIC SMALL LETTER HATE<br>
U+2C91 ⲑ COPTIC SMALL LETTER THETHE<br>
U+2C93 ⲓ COPTIC SMALL LETTER IAUDA<br>
U+2C95 ⲕ COPTIC SMALL LETTER KAPA<br>
U+2C97 ⲗ COPTIC SMALL LETTER LAULA<br>
U+2C99 ⲙ COPTIC SMALL LETTER MI<br>
U+2C9B ⲛ COPTIC SMALL LETTER NI<br>
U+2C9D ⲝ COPTIC SMALL LETTER KSI<br>
U+2C9F ⲟ COPTIC SMALL LETTER O<br>
U+2CA1 ⲡ COPTIC SMALL LETTER PI<br>
U+2CA3 ⲣ COPTIC SMALL LETTER RO<br>
U+2CA5 ⲥ COPTIC SMALL LETTER SIMA<br>
U+2CA7 ⲧ COPTIC SMALL LETTER TAU<br>
U+2CA9 ⲩ COPTIC SMALL LETTER UA<br>
U+2CAB ⲫ COPTIC SMALL LETTER FI<br>
U+2CAD ⲭ COPTIC SMALL LETTER KHI<br>
U+2CAF ⲯ COPTIC SMALL LETTER PSI<br>
U+2CB1 ⲱ COPTIC SMALL LETTER OOU<br>
U+2CC1 ⳁ COPTIC SMALL LETTER SAMPI<br>
U+2CE8 ⳨ COPTIC SYMBOL TAU RO<br>
<br>
Only the few Coptic letters in the range U+03E2 to U+03EF (part of the Greek<br>
and Coptic block) are displayed aright.<br>
<br>
It's no wonder that a search has so many spurious results if most of the<br>
search space has been squashed into Unicode replacement characters.<br>
<br>
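One way to tell whether the replacement characters are really present in the<br>
text, rather than introduced at display time, is to inspect the code points<br>
directly. The following is only a minimal sketch using Python's standard library;<br>
it does not use the SWORD API, and the helper names describe and<br>
replacement_ratio are made up for this illustration.<br>

```python
import unicodedata

# U+FFFD REPLACEMENT CHARACTER, as reported in the search preview pane.
REPLACEMENT_CHAR = "\ufffd"


def describe(text):
    """Return (code point, Unicode name, UTF-8 byte length) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "unnamed"), len(ch.encode("utf-8")))
        for ch in text
    ]


def replacement_ratio(text):
    """Fraction of characters in text that are U+FFFD (0.0 for empty text)."""
    if not text:
        return 0.0
    return text.count(REPLACEMENT_CHAR) / len(text)


if __name__ == "__main__":
    # Hypothetical sample: two letters from the Coptic block, one from the
    # U+03E2 to U+03EF range, and one replacement character.
    sample = "\u2c81\u2c83\u03e3\ufffd"
    for row in describe(sample):
        print(*row)
    print("replacement ratio:", replacement_ratio(sample))
```

Run against text copied from a search result, this shows how much of it consists of<br>
U+FFFD. It also shows that the letters in the range U+03E2 to U+03EF take two bytes<br>
in UTF-8 while those in the Coptic block take three, which may or may not be related<br>
to the problem.<br>
<br>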
I'm a Windows user, as most of you know already.<br>
Does the same thing happen in Xiphos under Linux?<br>
<br>
Is this an issue common to all SWORD based front-ends?<br>
The fact that we see similar results in PocketSword strongly suggests it is.<br>
<br>
Best regards,<br>
<br>
David<br>
<br>
<br>
<br>
<div class="HOEnZb"><div class="h5"></div></div></blockquote></div><br></div>