I tried switching to cn.ChineseAnalyzer. It has no out-of-memory problems, but it does not return any results; it returns almost instantly, as if it is doing nothing. <div><br></div><div>I regenerated the index, then searched for 'Mark' in Chinese, which I pasted in from BibleNames_zh.properties, and got no results after 0 seconds.<div>
<br></div><div>I might try the CJKAnalyzer tomorrow.</div><div><br></div><div>Best regards</div><div>Martin</div><div><br><br><div class="gmail_quote">On 11 November 2010 22:16, DM Smith <span dir="ltr"><<a href="mailto:dmsmith@crosswire.org" target="_blank">dmsmith@crosswire.org</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Martin,<br>
<br>
In the lucene-analyzers jar try either: (let org.apache.lucene.analysis be o.a.l.a)<br>
o.a.l.a.cn.ChineseAnalyzer or o.a.l.a.cjk.CJKAnalyzer<br>
The latter searches bigrams and thus has a bigger index size.<br>
<br>
Hope this helps.<br>
<div><br>
In Him,<br>
DM<br>
<br>
On Nov 11, 2010, at 3:54 PM, Martin Denham wrote:<br>
<br>
</div><div><div></div><div>> Does anybody know if there is a Chinese Lucene Analyzer that is more lightweight than smartcn or if it is possible to configure smartcn to use less memory?<br>
><br>
> Smart Chinese Analyzer will not run on Android because it attempts to load up a large dictionary in order to split phrases and runs out of memory. Here is a stack trace:<br>
><br>
> 11-11 20:38:28.296: ERROR/AndroidRuntime(8925): java.lang.ExceptionInInitializerError<br>
> 11-11 20:38:28.296: ERROR/AndroidRuntime(8925): at org.apache.lucene.analysis.cn.smart.hhmm.HHMMSegmenter.process(HHMMSegmenter.java:201)<br>
> 11-11 20:38:28.296: ERROR/AndroidRuntime(8925): at org.apache.lucene.analysis.cn.smart.WordSegmenter.segmentSentence(WordSegmenter.java:50)<br>
> 11-11 20:38:28.296: ERROR/AndroidRuntime(8925): at org.apache.lucene.analysis.cn.smart.WordTokenFilter.incrementToken(WordTokenFilter.java:69)<br>
> 11-11 20:38:28.296: ERROR/AndroidRuntime(8925): at org.apache.lucene.analysis.PorterStemFilter.incrementToken(PorterStemFilter.java:53)<br>
> 11-11 20:38:28.296: ERROR/AndroidRuntime(8925): at org.apache.lucene.analysis.StopFilter.incrementToken(StopFilter.java:225)<br>
> 11-11 20:38:28.296: ERROR/AndroidRuntime(8925): at org.apache.lucene.analysis.CachingTokenFilter.fillCache(CachingTokenFilter.java:87)<br>
> 11-11 20:38:28.296: ERROR/AndroidRuntime(8925): at org.apache.lucene.analysis.CachingTokenFilter.incrementToken(CachingTokenFilter.java:61)<br>
> 11-11 20:38:28.296: ERROR/AndroidRuntime(8925): at org.apache.lucene.queryParser.QueryParser.getFieldQuery(QueryParser.java:599)<br>
> 11-11 20:38:28.296: ERROR/AndroidRuntime(8925): at org.apache.lucene.queryParser.QueryParser.Term(QueryParser.java:1449)<br>
> 11-11 20:38:28.296: ERROR/AndroidRuntime(8925): at org.apache.lucene.queryParser.QueryParser.Clause(QueryParser.java:1337)<br>
> 11-11 20:38:28.296: ERROR/AndroidRuntime(8925): at org.apache.lucene.queryParser.QueryParser.Query(QueryParser.java:1265)<br>
> 11-11 20:38:28.296: ERROR/AndroidRuntime(8925): at org.apache.lucene.queryParser.QueryParser.TopLevelQuery(QueryParser.java:1254)<br>
> 11-11 20:38:28.296: ERROR/AndroidRuntime(8925): at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:200)<br>
> 11-11 20:38:28.296: ERROR/AndroidRuntime(8925): at org.crosswire.jsword.index.lucene.LuceneIndex.find(Unknown Source)<br>
> <deleted a bit of the stack trace here><br>
> 11-11 20:38:28.296: ERROR/AndroidRuntime(8925): Caused by: java.lang.OutOfMemoryError<br>
> 11-11 20:38:28.296: ERROR/AndroidRuntime(8925): at java.lang.reflect.Array.newInstance(Array.java:492)<br>
> 11-11 20:38:28.296: ERROR/AndroidRuntime(8925): at java.io.ObjectInputStream.readNewArray(ObjectInputStream.java:1637)<br>
> 11-11 20:38:28.296: ERROR/AndroidRuntime(8925): at java.io.ObjectInputStream.readNonPrimitiveContent(ObjectInputStream.java:927)<br>
> 11-11 20:38:28.296: ERROR/AndroidRuntime(8925): at java.io.ObjectInputStream.readObject(ObjectInputStream.java:2285)<br>
> 11-11 20:38:28.296: ERROR/AndroidRuntime(8925): at java.io.ObjectInputStream.readObject(ObjectInputStream.java:2240)<br>
> 11-11 20:38:28.296: ERROR/AndroidRuntime(8925): at org.apache.lucene.analysis.cn.smart.hhmm.BigramDictionary.loadFromInputStream(BigramDictionary.java:99)<br>
> 11-11 20:38:28.296: ERROR/AndroidRuntime(8925): at org.apache.lucene.analysis.cn.smart.hhmm.BigramDictionary.load(BigramDictionary.java:120)<br>
> 11-11 20:38:28.296: ERROR/AndroidRuntime(8925): at org.apache.lucene.analysis.cn.smart.hhmm.BigramDictionary.getInstance(BigramDictionary.java:71)<br>
> 11-11 20:38:28.296: ERROR/AndroidRuntime(8925): at org.apache.lucene.analysis.cn.smart.hhmm.BiSegGraph.<clinit>(BiSegGraph.java:46)<br>
> 11-11 20:38:28.296: ERROR/AndroidRuntime(8925): ... 35 more<br>
><br>
> For now I will have to disable searching in Chinese texts.<br>
><br>
> Kind regards<br>
> Martin<br>
><br>
><br>
</div></div><div><div></div><div>> _______________________________________________<br>
> jsword-devel mailing list<br>
> <a href="mailto:jsword-devel@crosswire.org" target="_blank">jsword-devel@crosswire.org</a><br>
> <a href="http://www.crosswire.org/mailman/listinfo/jsword-devel" target="_blank">http://www.crosswire.org/mailman/listinfo/jsword-devel</a><br>
</div></div></blockquote></div><br>
</div></div>
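To make the difference between the two suggested analyzers concrete: as far as I understand it, cn.ChineseAnalyzer emits one token per Chinese character, while cjk.CJKAnalyzer emits overlapping two-character tokens (bigrams). The sketch below is plain Java and does not call Lucene at all; the class name `NgramSketch`, the method `ngrams`, and the sample string 马可福音 ("Gospel of Mark", written with Unicode escapes) are all assumptions made up for illustration, not taken from JSword or from BibleNames_zh.properties.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: mimics the token shapes of unigram vs. bigram
// analysis of Chinese text. It is not Lucene's implementation.
public class NgramSketch {

    // Returns every run of n consecutive code points in text, overlapping.
    // n = 1 gives one token per character; n = 2 gives overlapping bigrams.
    static List<String> ngrams(String text, int n) {
        int[] codePoints = text.codePoints().toArray();
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i + n <= codePoints.length; i++) {
            tokens.add(new String(codePoints, i, n));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Assumed sample text, four Chinese characters.
        String sample = "\u9a6c\u53ef\u798f\u97f3";
        System.out.println("unigrams: " + ngrams(sample, 1));
        System.out.println("bigrams:  " + ngrams(sample, 2));
    }
}
```

Over a whole Bible text, the number of distinct two-character terms is far larger than the number of distinct single characters, which is one way to read DM's remark that the bigram approach gives a bigger index. In either scheme the query has to be split the same way as the indexed text for terms to match.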