[sword-devel] Soft hyphens - a question about mkfastmod and Lucene search
David Haslam
dfhdfh at protonmail.com
Thu Jun 11 07:01:06 EDT 2020
If the text of a SWORD module has words that contain a soft hyphen (U+00AD) what happens to these when the Lucene search index is created?
Are such soft hyphens stripped by mkfastmod?
My understanding is that words containing an ordinary hyphen (U+2010) or a hyphen-minus (U+002D) are treated as multiple words, i.e. as if the hyphen were a space.
IMHO, the same procedure should not apply to soft hyphens, because a soft hyphen only marks a permitted line-break point inside a single word. At this stage, though, I'm mainly interested in learning what currently happens.
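To make the two possible outcomes concrete, here is a minimal Python sketch. It does not show what mkfastmod or Lucene actually do (that is exactly my question); the function names and the sample text are hypothetical and only illustrate the difference between treating a soft hyphen as a separator and stripping it before tokenizing.

```python
import re

SOFT_HYPHEN = "\u00ad"           # U+00AD SOFT HYPHEN
HARD_HYPHENS = "\u2010\u002d"    # U+2010 HYPHEN and U+002D HYPHEN-MINUS


def tokens_hyphen_as_space(text):
    """Hypothetical behaviour A: every hyphen, soft or hard, separates words."""
    separators = "[\\s" + re.escape(HARD_HYPHENS + SOFT_HYPHEN) + "]+"
    return [t for t in re.split(separators, text) if t]


def tokens_strip_soft_hyphen(text):
    """Hypothetical behaviour B: delete soft hyphens, then split as before."""
    cleaned = text.replace(SOFT_HYPHEN, "")
    separators = "[\\s" + re.escape(HARD_HYPHENS) + "]+"
    return [t for t in re.split(separators, cleaned) if t]


# Assumed sample text: one word with a soft hyphen, one with a hyphen-minus.
sample = "Beth\u00adlehem God-fearing"

print(tokens_hyphen_as_space(sample))    # ['Beth', 'lehem', 'God', 'fearing']
print(tokens_strip_soft_hyphen(sample))  # ['Bethlehem', 'God', 'fearing']
```

Under behaviour A, a search for "Bethlehem" would miss the verse, because the index would only hold the fragments "Beth" and "lehem". Under behaviour B the word is indexed whole, which is what I would expect.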
Best regards,
David