[jsword-devel] Lucene upgrade

Sijo Cherian sijo.cherian at gmail.com
Mon May 12 21:03:31 MST 2014


Thanks Chris.

From my tests, the existing Lucene index works with this
pull-request/branch, since we pass LUCENE_30 to the writer/readers (set in
IndexMetadata.LUCENE_IDXVERSION_FOR_INDEXING).
But the Lucene binary dependency is newer. After a transition phase or a
stable release of 2.0, I am hoping the index can be upgraded.
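
To make the later upgrade concrete, here is a rough sketch (not code from the
pull request) of how a front-end could decide which books need their index
rebuilt once the index schema version is bumped. The class IndexSchemaCheck,
the method booksToRebuild and the integer version numbers are all made up for
illustration; JSword's real metadata handling may look different.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of JSword: compares the schema version
// recorded for each installed book index with the version the app expects.
class IndexSchemaCheck {

    /**
     * Returns the books whose recorded index schema version is older than
     * the required one, in the iteration order of the input map.
     */
    static List<String> booksToRebuild(Map<String, Integer> installedVersions,
                                       int requiredVersion) {
        List<String> stale = new ArrayList<String>();
        for (Map.Entry<String, Integer> entry : installedVersions.entrySet()) {
            if (entry.getValue() < requiredVersion) {
                stale.add(entry.getKey());
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        // Assumed version numbers: 1 = current schema, 2 = schema after the switch.
        Map<String, Integer> installed = new LinkedHashMap<String, Integer>();
        installed.put("FreSegond", 1);
        installed.put("ChiNCVs", 1);

        // During the transition phase the required version stays at 1.
        System.out.println("Rebuild now: " + booksToRebuild(installed, 1));
        // After the switch, every index still on version 1 is flagged.
        System.out.println("Rebuild after switch: " + booksToRebuild(installed, 2));
    }
}
```

An app would only need to run this check for the one book being searched, so
the user is prompted for that book alone rather than for every installed index.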

When you get some breathing room, give it a try.
I just updated this pull-req to use Lucene 4.7.0, since 4.8.0 requires
Java 7.

sijo



On Sun, May 11, 2014 at 4:23 PM, Chris Burrell <christopher at burrell.me.uk> wrote:

> Hi Sijo
>
> Thanks for all the good work. I'm not going to be able to test this much
> until early June I think, as we're hoping to have a big release going out
> the door with STEP. Upgrading the Lucene index for STEP will be non-trivial
> as we use it internally for all our other data sources. Only had a very
> very quick glance, but looks good. Thanks!
>
>
> Chris
>
>
>
> On 11 May 2014 05:09, Sijo Cherian <sijo.cherian at gmail.com> wrote:
>
>> Using this codebase, I did some tests:
>> - With version = LUCENE_31: tested with the French FreSegond & Chinese
>> ChiNCVs bibles. Some differences were found against the existing index, so
>> the index won't be backward compatible.
>> - With version = LUCENE_48: QueryParser yielded the same output as LUCENE_31
>> for the French/Chinese tests.
>>
>> After a transition phase, I am thinking we will switch this version to
>> latest and give a new version to JSword-indexing-schema as well. Then the
>> downstream projects can prompt the user to rebuild/download the index of
>> the one book (that is being searched).
>>
>> /sijo
>>
>>
>>
>> On Sun, May 11, 2014 at 12:05 AM, Sijo Cherian <sijo.cherian at gmail.com> wrote:
>>
>>>
>>> This new pull request is intended as a new branch:
>>> https://github.com/crosswire/jsword/pull/82
>>>
>>> This pull-request includes code changes to use the new index/query API
>>> in Lucene 4.8.0.
>>> (I called jsword release.version = 2.0.1-luceneupgrade-alpha)
>>> It uses the LUCENE_30 version (set in
>>> IndexMetadata.LUCENE_IDXVERSION_FOR_INDEXING) in IndexWriter/QueryParser
>>> for compatibility.
>>> After a transition phase, I am thinking we will switch this version to
>>> latest (to use newer features and the lower RAM usage of the newer index
>>> format).
>>>
>>> As far as I could test, it seems to be backward compatible with the
>>> existing index when using LUCENE_30. We need folks to test this for
>>> European/Asian language bibles using an existing index (since English uses
>>> SimpleLuceneAnalyzer with no stemming etc., it is unlikely to have issues
>>> in the English bible index).
>>>
>>> All feedback and testing efforts are very much appreciated.
>>> /sijo
>>>
>>>
>>>
>>>
>>> On Thu, May 8, 2014 at 12:36 AM, Sijo Cherian <sijo.cherian at gmail.com> wrote:
>>>
>>>>
>>>> I started by changing JSword's usage of the Lucene API to the latest in
>>>> 4.x (I think this is the easier piece & I am about 75% done).
>>>>
>>>> The issue is with upgrading the existing index. Currently we're using
>>>> Lucene 3.0.3, but the index is built with 2.9. 4.x lets you read older
>>>> index versions back to 3.0. If I can figure out the difference between
>>>> the 2.9 and 3.0 versions, then we can decide if a hop to Lucene 3.6 is
>>>> necessary. I want to dig a little more into it before a full-blown
>>>> discussion inside this version universe.
>>>>
>>>> I am hoping to provide a transition phase (using a config or two plugin
>>>> options), so that the existing index is not forced to upgrade. If we keep
>>>> the content of the current fields unchanged, that removes one variable.
>>>>
>>>> Meanwhile, if you guys can test the indexversion upgrade pullreq (
>>>> https://github.com/crosswire/jsword/pull/79) & see if it meets the
>>>> requirements of at least your respective apps' index upgrades, that will
>>>> be great.
>>>> In AndBible's use case, it should help to prompt the user to
>>>> rebuild/download the index of the one book (that is being searched).
>>>>
>>>> /sijo
>>>>
>>>>
>>>> On Mon, May 5, 2014 at 3:17 PM, Chris Burrell <
>>>> christopher at burrell.me.uk> wrote:
>>>>
>>>>> Hi
>>>>>
>>>>> Which version are we going for?
>>>>>
>>>>> Not sure what you mean by 'best not to change the index structure'? Do
>>>>> you mean best to keep the current fields indexed with the current content?
>>>>> (if so, I agree, let's do this a step at a time).
>>>>>
>>>>> Chris
>>>>>
>>>>>
>>>>>
>>>>> On 3 May 2014 21:32, Sijo Cherian <sijo.cherian at gmail.com> wrote:
>>>>>
>>>>>> FYI
>>>>>> I just started working on upgrading the code to a newer Lucene version,
>>>>>> & keeping backward compatibility.
>>>>>>
>>>>>> More updates after some progress.
>>>>>> It is best not to change the index structure in the default plugged-in
>>>>>> version, for now.
>>>>>> /sijo
>>>>>>
>>>>>> _______________________________________________
>>>>>> jsword-devel mailing list
>>>>>> jsword-devel at crosswire.org
>>>>>> http://www.crosswire.org/mailman/listinfo/jsword-devel


-- 
Regards,
Sijo