[jsword-devel] [JIRA] Moved: (JS-160) Case-insensitive search

DM Smith (JIRA) jira at crosswire.org
Sun Feb 6 13:39:54 MST 2011


     [ http://www.crosswire.org/bugs/browse/JS-160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

DM Smith moved BD-153 to JS-160:
--------------------------------

       Project: JSword  (was: Bible Desktop)
           Key: JS-160  (was: BD-153)
    Issue Type: New Feature  (was: Bug)

> Case-insensitive search
> -----------------------
>
>                 Key: JS-160
>                 URL: http://www.crosswire.org/bugs/browse/JS-160
>             Project: JSword
>          Issue Type: New Feature
>         Environment: All platforms
>            Reporter: YINGJIE LAN
>            Assignee: DM Smith
>            Priority: Minor
>
> If you search for "Ram", you get "ram" as well. But "Ram" is a man's name, so the verses containing "ram" are not what I meant, and there are many more of them.
> The problem here is twofold:
> 1) Names are not specifically identified in any of our modules/books, so it is not possible to find only the references to the name Ram. It would be great to have a listing of names per verse so that we could mark them up somehow.
> 2) Our search, as you noticed, is case-insensitive. Typically, a case-insensitive search is what a user wants. Any word can be capitalized if it starts a sentence. Sentence identification is not hard for English Bibles, but English commentaries often have abbreviations ending with a period, which makes sentence identification problematic. In some languages, such as German, words in the middle of a sentence may be capitalized. Other languages do not have upper and lower case at all (e.g. Arabic, Hebrew, Chinese).
> The best we could do is index each word twice. Right now we normalize each word to lower case; we could also store the word as-is in a second field. A case-sensitive search would then have to query that special field.
> You might not have noticed, but searching for "Ram" will also find "rams" and "rammed", because we also apply stemming by default for those languages for which we have stemming rules.
> It might be nice for a user to control whether the search index uses stemming or not.
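
The double-indexing idea and the optional stemming described in the issue can be sketched roughly as follows. This is a minimal, self-contained illustration and not JSword's implementation: the class name DualCaseIndex, the field names "content" and "content_exact", the toy stemmer, and the sample sentences are all assumptions made up for the example. A real index would rely on a proper analyzer and stemmer.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: every token is indexed twice, once folded and once as-is. */
class DualCaseIndex {
    // Illustrative field names (assumptions, not JSword's actual field names).
    static final String FOLDED = "content";
    static final String EXACT = "content_exact";

    // field -> term -> sorted set of verse references
    private final Map<String, Map<String, Set<String>>> postings = new HashMap<>();
    private final boolean stem; // user-controlled: apply stemming to the folded field?

    DualCaseIndex(boolean stem) {
        this.stem = stem;
    }

    /** Toy English stemmer for the demo only: strips "ed" or "s", then undoubles a final letter. */
    static String toyStem(String word) {
        String s = word;
        if (s.length() > 4 && s.endsWith("ed")) {
            s = s.substring(0, s.length() - 2);
        } else if (s.length() > 3 && s.endsWith("s")) {
            s = s.substring(0, s.length() - 1);
        }
        int n = s.length();
        if (n > 3 && s.charAt(n - 1) == s.charAt(n - 2)) {
            s = s.substring(0, n - 1);
        }
        return s;
    }

    /** Lower-cases the token and, if enabled, stems it. */
    private String normalize(String token) {
        String lower = token.toLowerCase(Locale.ROOT);
        return stem ? toyStem(lower) : lower;
    }

    private void put(String field, String term, String ref) {
        postings.computeIfAbsent(field, f -> new HashMap<>())
                .computeIfAbsent(term, t -> new TreeSet<>())
                .add(ref);
    }

    /** Indexes one verse: each word goes into both the folded and the exact field. */
    void add(String ref, String text) {
        for (String token : text.split("[^\\p{L}]+")) {
            if (token.isEmpty()) {
                continue;
            }
            put(FOLDED, normalize(token), ref);
            put(EXACT, token, ref);
        }
    }

    /** Looks up a single word in the chosen field. */
    List<String> search(String field, String query) {
        String term = EXACT.equals(field) ? query : normalize(query);
        Map<String, Set<String>> terms = postings.get(field);
        Set<String> refs = (terms == null) ? null : terms.get(term);
        return (refs == null) ? Collections.<String>emptyList() : new ArrayList<>(refs);
    }

    public static void main(String[] args) {
        DualCaseIndex index = new DualCaseIndex(true);
        // Made-up sample sentences, not quotations from any module.
        index.add("v1", "Hezron was the father of Ram");
        index.add("v2", "Abraham offered a ram");
        index.add("v3", "He counted the rams");

        System.out.println(index.search(FOLDED, "Ram")); // [v1, v2, v3]
        System.out.println(index.search(EXACT, "Ram"));  // [v1]
    }
}
```

Note that the exact field only distinguishes capitalization. As the issue points out, any word can be capitalized at the start of a sentence, so a case-sensitive hit is not necessarily a name.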

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


