[sword-devel] Soft hyphens

David Haslam dfhmch at googlemail.com
Thu Nov 2 02:04:58 MST 2017


Update: results of my research into how SWORD search handles soft hyphens:

In Xiphos there is a problem with the exact search.

If the same word occurs in the text both with and without a soft hyphen:

- A search for the word with a soft hyphen finds only the instances that contain the soft hyphen.
- A search for the word without a soft hyphen finds only the instances that do not contain it.

To find all occurrences of the word, with or without the soft hyphen, the
user would need to enter the search key with the soft hyphen replaced by a ?
and use the regular expression search.

If the user is unaware that there's a superfluity of soft hyphens in the
module, they wouldn't have a clue that this would even be necessary.

If they somehow managed to discern that soft hyphens did exist, they'd need
to find out where in the word each one was located before they could use the
regular expression technique.
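
One way around having to know where the soft hyphen sits is to build a pattern that tolerates soft hyphens between any two letters. The following is only a minimal sketch in Python, using Python's own `re` module rather than SWORD's search engine; the function name `tolerant_pattern` and the sample string are made up for illustration.

```python
import re

SOFT_HYPHEN = "\u00ad"  # U+00AD SOFT HYPHEN, normally invisible when rendered


def tolerant_pattern(word: str) -> re.Pattern:
    """Compile a regex that matches `word` with zero or more soft hyphens
    between any two of its letters (hypothetical helper, not a SWORD API)."""
    # Drop any soft hyphens already present in the search key, escape the rest.
    letters = [re.escape(ch) for ch in word if ch != SOFT_HYPHEN]
    # Allow any number of soft hyphens at every letter boundary.
    return re.compile((SOFT_HYPHEN + "*").join(letters))


# Invented sample text: the same word with 0, 1 and 2 soft hyphens,
# plus one with the soft hyphen in a different position.
sample = "mpamba mpa\u00admba mpa\u00ad\u00admba m\u00adpamba"
hits = tolerant_pattern("mpamba").findall(sample)
print(len(hits))  # 4
```

With a pattern like this the user can type the word as it is displayed and still match every variant, at the cost of a longer and slower regular expression.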

Example:

- regexp search `mpa?mba` gives 505 hits, and they all display as mpamba.
- exact search for `mpamba` (without the hyphen) gives 411 hits
- exact search for `mpa­mba` (with a soft hyphen) gives 50 hits 
- exact search for `mpa­­mba` (with 2 soft hyphens) gives 3 hits
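
For anyone experimenting with such patterns outside SWORD, it may help to see how the `?` quantifier behaves in a general-purpose regex engine. The sketch below uses Python's `re` module, which may well differ from the engine behind SWORD's regular expression search, so it says nothing about the hit counts above. In Python, `?` makes the character immediately before it optional, so the soft hyphen itself has to precede the `?` for both spellings to match.

```python
import re

SOFT_HYPHEN = "\u00ad"  # U+00AD SOFT HYPHEN

plain = "mpamba"
hyphenated = "mpa" + SOFT_HYPHEN + "mba"

# Here `?` applies to the letter "a" just before it.
optional_a = re.compile("mpa?mba")
# Here `?` applies to the soft hyphen, making the soft hyphen optional.
optional_shy = re.compile("mpa" + SOFT_HYPHEN + "?mba")

results = {
    "optional_a": [bool(optional_a.fullmatch(w)) for w in (plain, hyphenated)],
    "optional_shy": [bool(optional_shy.fullmatch(w)) for w in (plain, hyphenated)],
}
print(results)
# {'optional_a': [True, False], 'optional_shy': [True, True]}
```

Replacing `?` with `*` after the soft hyphen would also cover words that contain two or more consecutive soft hyphens.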


NB. These results were based on the module Fr Cyrille made before the
multiple and useless soft hyphens were removed from the source text.
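
As an illustration of that kind of clean-up, here is a minimal Python sketch that either collapses runs of soft hyphens to a single one or strips them altogether. It is not the procedure actually used on the module source, which is not described here; the function names and the sample string are hypothetical.

```python
import re

SOFT_HYPHEN = "\u00ad"  # U+00AD SOFT HYPHEN


def collapse_soft_hyphens(text: str) -> str:
    """Replace every run of two or more soft hyphens with a single one."""
    return re.sub(SOFT_HYPHEN + "{2,}", SOFT_HYPHEN, text)


def strip_soft_hyphens(text: str) -> str:
    """Remove every soft hyphen, e.g. to normalise text before indexing."""
    return text.replace(SOFT_HYPHEN, "")


# Invented sample: one word with a single soft hyphen, one with a doubled one.
source = "mpa\u00admba mpa\u00ad\u00admba"
collapsed = collapse_soft_hyphens(source)
stripped = strip_soft_hyphens(source)
print(collapsed.count(SOFT_HYPHEN), stripped.count(SOFT_HYPHEN))  # 2 0
```

Collapsing keeps the line-breaking hints while removing the duplicates; stripping makes an exact search for the displayed word find every occurrence.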

Further research ideas: PocketSword has a fuzzy search option. Would this
help?

Aside: Why is there no fuzzy search option in Xiphos?

Best regards,

David


