[jsword-devel] [JIRA] Created: (JS-195) Conflict between translations of iso639.properties and language names used in ConfigurableSnowballAnalyzer
Martin Denham (JIRA)
jira at crosswire.org
Wed Jun 1 13:13:50 MST 2011
Conflict between translations of iso639.properties and language names used in ConfigurableSnowballAnalyzer
----------------------------------------------------------------------------------------------------------
Key: JS-195
URL: http://www.crosswire.org/bugs/browse/JS-195
Project: JSword
Issue Type: Bug
Components: i18n - Translation, o.c.common.util, o.c.jsword.index
Affects Versions: 1.6.1
Environment: All
Reporter: Martin Denham
Assignee: DM Smith
After applying the fix for JS-192 (iso639full.properties was always used and iso639.properties always ignored),
I find that JS-189 (SnowballAnalyzer configured for unavailable stemmer Spanish (Español)) is occurring again.
Reason
The reason appears to be that iso639full.properties contains
es=Spanish
But iso639_en.properties contains
es=Spanish (Espa\u00F1ol)
Also iso639.properties contains
es=Espa\u00F1ol
(There are many other differences as well, e.g. French and German.)
ConfigurableSnowballAnalyzer contains a list of allowed stemmers whose names match only the language names in iso639full.properties, not those in any other iso639* file:
private static Pattern allowedStemmers = Pattern.compile("(Danish|Dutch|English|Finnish|French|German2|German|Italian|Kp|Lovins|Norwegian|Porter|Portuguese|Russian|Spanish|Swedish)");
So a translated name such as "Spanish (Espa\u00F1ol)" or "Espa\u00F1ol" is not recognised as an available stemmer.
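To illustrate the mismatch, here is a minimal sketch that runs the same pattern against the three values of "es" quoted above. It assumes the analyzer tests the whole language name with matches(), which is not shown in this report. The class and method names are made up for the example.

```java
import java.util.regex.Pattern;

final class StemmerPatternDemo {
    // Pattern copied from ConfigurableSnowballAnalyzer as quoted in this report.
    private static final Pattern ALLOWED_STEMMERS = Pattern.compile(
        "(Danish|Dutch|English|Finnish|French|German2|German|Italian|Kp|Lovins"
        + "|Norwegian|Porter|Portuguese|Russian|Spanish|Swedish)");

    // Assumption: the entire language name has to match, not just a prefix.
    static boolean isAllowed(String languageName) {
        return ALLOWED_STEMMERS.matcher(languageName).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("Spanish"));                // true  (iso639full.properties)
        System.out.println(isAllowed("Spanish (Espa\u00F1ol)")); // false (iso639_en.properties)
        System.out.println(isAllowed("Espa\u00F1ol"));           // false (iso639.properties)
    }
}
```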
The fix looks non-trivial; I tried using the language code instead of the name but got the error:
java.lang.ClassNotFoundException: org.tartarus.snowball.ext.esStemmer
I am going to roll back the fix for JS-192 until DM has a chance to look at this.
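One possible direction, sketched under assumptions rather than offered as the fix: the exception above suggests the stemmer class name is built as org.tartarus.snowball.ext.<name>Stemmer, so a bare code such as "es" cannot work, but a fixed table from ISO 639 code to English stemmer name would not depend on any translated properties file. StemmerNames, stemmerNameFor and stemmerClassNameFor are hypothetical names, not part of JSword. German2, Kp, Lovins and Porter are left out because no language code selects them uniquely.

```java
import java.util.HashMap;
import java.util.Map;

final class StemmerNames {
    // Hypothetical table: ISO 639-1 code -> stemmer name from the allowedStemmers pattern.
    private static final Map<String, String> BY_CODE = new HashMap<String, String>();
    static {
        BY_CODE.put("da", "Danish");
        BY_CODE.put("nl", "Dutch");
        BY_CODE.put("en", "English");
        BY_CODE.put("fi", "Finnish");
        BY_CODE.put("fr", "French");
        BY_CODE.put("de", "German");
        BY_CODE.put("it", "Italian");
        BY_CODE.put("no", "Norwegian");
        BY_CODE.put("pt", "Portuguese");
        BY_CODE.put("ru", "Russian");
        BY_CODE.put("es", "Spanish");
        BY_CODE.put("sv", "Swedish");
    }

    // Returns the stemmer name for a language code, or null if none is listed.
    static String stemmerNameFor(String languageCode) {
        return BY_CODE.get(languageCode);
    }

    // Assumed naming scheme, inferred from the ClassNotFoundException in this report.
    static String stemmerClassNameFor(String languageCode) {
        String name = stemmerNameFor(languageCode);
        return name == null ? null : "org.tartarus.snowball.ext." + name + "Stemmer";
    }

    public static void main(String[] args) {
        System.out.println(stemmerClassNameFor("es")); // org.tartarus.snowball.ext.SpanishStemmer
        System.out.println(stemmerClassNameFor("xx")); // null
    }
}
```

With such a table the lookup key is the language code, so the display name can be translated freely without affecting stemmer selection.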
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira