[jsword-devel] [sword-devel] Replacement Lucene Analyzer for Japanese

DM Smith dmsmith at crosswire.org
Tue Feb 12 13:35:35 MST 2013


We'll try to get to Lucene 3.6.2 for the release if at all possible. Going to Lucene 4.x requires Java 6.

Regarding Java 6: The holdup is Mac OS X. Java 6 first became available to those with a 64-bit Intel processor running Leopard (10.5). It is not installed by default, and once installed it requires the user to set a preference before it can be used. (Not something Mac users are inclined to do.) So the first Mac OS where we can reasonably expect users to have Java 6 is Snow Leopard.

BTW, I have a Mac of this vintage that's running just fine. It originally had a 32-bit dual core, which I upgraded to a 64-bit processor just to get Java 6 under Leopard.

Anyway, after this release we can reconsider Java 6. We'll create a maintenance branch for the release to be able to do bug fixes.

-- DM



On Feb 12, 2013, at 1:01 PM, DM Smith <dmsmith at crosswire.org> wrote:

> It is a lot of work. The analyzers and filters that we have written would need to be re-written. The code no longer uses String but rather char[] (or equivalent).
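
To illustrate the String versus char[] point above, here is a minimal plain-Java sketch of what "working on a character buffer" looks like. It deliberately does not use the Lucene API; the class name TermBufferDemo and the method lowerCaseInPlace are hypothetical names chosen for this example. The idea it mimics is a filter that rewrites the term text in a reusable buffer instead of allocating a new String per token.

```java
public class TermBufferDemo {
    /**
     * Lowercases the first `length` chars of `buffer` in place.
     * No new String is allocated; chars beyond `length` are left untouched.
     * (Hypothetical helper for illustration, not a Lucene method.)
     */
    public static void lowerCaseInPlace(char[] buffer, int length) {
        for (int i = 0; i < length; i++) {
            buffer[i] = Character.toLowerCase(buffer[i]);
        }
    }

    public static void main(String[] args) {
        // A reusable buffer that is larger than the current term.
        char[] buffer = new char[16];
        String word = "Christ";
        word.getChars(0, word.length(), buffer, 0);

        lowerCaseInPlace(buffer, word.length());

        // Only the first word.length() chars are meaningful.
        System.out.println(new String(buffer, 0, word.length())); // prints "christ"
    }
}
```

A String-based filter would return a fresh String for each token; the buffer-based style trades that convenience for fewer allocations, which is why code written against the older style has to be rewritten rather than recompiled.
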
> 
> This happened well before 4.0. Typically w/ Lucene you don't want to directly upgrade from an early version of a prior release but only from the x.9 release. The difference between 3.9 and earlier is that lots of stuff is deprecated. The difference between 3.9 and 4.0 is that the deprecations are gone.
> 
> This has been very helpful in identifying how to go from one major release to the next.
> 
> We have custom language converters because theirs do too much. For example, they remove stop words. While this is generally nice, there are theological phrases in which stop words are significant, e.g. "in Christ".
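
The stop-word concern can be shown with a small plain-Java sketch. This is not JSword or Lucene code: the class StopWordDemo, its tokenize method, and the five-word stop list are all assumptions made up for this example. It only demonstrates why dropping stop words at analysis time makes a phrase such as "in Christ" indistinguishable from the single word "Christ".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StopWordDemo {
    // Hypothetical stop list for illustration only; not Lucene's actual list.
    static final Set<String> STOP_WORDS =
            new HashSet<String>(Arrays.asList("in", "the", "of", "a", "and"));

    /** Lowercases, splits on non-letter/non-digit runs, optionally drops stop words. */
    public static List<String> tokenize(String text, boolean removeStopWords) {
        List<String> tokens = new ArrayList<String>();
        for (String raw : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (raw.isEmpty()) {
                continue; // split() can yield a leading empty string
            }
            if (removeStopWords && STOP_WORDS.contains(raw)) {
                continue; // the step a stop-word filter would perform
            }
            tokens.add(raw);
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("in Christ", true));  // prints [christ]
        System.out.println(tokenize("in Christ", false)); // prints [in, christ]
    }
}
```

With stop-word removal on, the index never sees "in", so a phrase search for "in Christ" cannot be told apart from a search for "Christ" alone. Keeping the stop words costs some index size but preserves the phrase.
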
> 
> Also, most are built on StandardAnalyzer, which is slow and its features are not appropriate. We use a very simple analyzer from Lucene.
> 
> There are some new Filters and Analyzers that we should be using.
> 
> I'd like to do this before we release or shortly after.
> 
> BTW, I want to get back to a release-often practice.
> 
> In Him,
> 	DM
> 
> On Feb 12, 2013, at 10:15 AM, Chris Burrell <chris at burrell.me.uk> wrote:
> 
>> So on the JSword front, it would be good to move up to Lucene 4 at some stage. Are we saying this will need more work than just a simple upgrade?
>> 
>> Also, why do we have our custom language converters? Lucene seems to have most of the ones we're using, and we seem to simply wrap around the Filters in the library.
>> 
>> Chris
>> 
>> 
>> On 12 February 2013 15:12, DM Smith <dmsmith at crosswire.org> wrote:
>> Reposting to JSword-devel.
>> 
>> On Feb 12, 2013, at 6:47 AM, David Haslam  wrote:
>> 
>> > Some languages, like Japanese and Chinese, are configured in JSword to use
>> > the SmartCN Lucene Analyzer.
>> >
>> > SmartCN contains a massive dictionary which is too large for most mobiles.
>> >
>> > We don't package SmartCN with And Bible so somebody needs to do some work to
>> > find a replacement Lucene Analyzer for Japanese.
>> >
>> > cf. For Chinese we now use mmseg4j.
>> >
>> > David (on behalf of Martin)
>> >
>> > https://code.google.com/p/and-bible/issues/detail?id=160
>> >
>> >
>> >
>> > --
>> > View this message in context: http://sword-dev.350566.n4.nabble.com/Replacement-Lucene-Analyzer-for-Japanese-tp4651942.html
>> > Sent from the SWORD Dev mailing list archive at Nabble.com.
>> >
>> > _______________________________________________
>> > sword-devel mailing list: sword-devel at crosswire.org
>> > http://www.crosswire.org/mailman/listinfo/sword-devel
>> > Instructions to unsubscribe/change your settings at above page
>> 
>> 
>> _______________________________________________
>> jsword-devel mailing list
>> jsword-devel at crosswire.org
>> http://www.crosswire.org/mailman/listinfo/jsword-devel
>> 
