[sword-devel] CLucene and Sword

DM Smith dmsmith555 at yahoo.com
Thu May 17 14:44:30 MST 2007


I'll give my 2 cents on index versioning.

We need to tackle the whole question of index versioning. It is 
something that I am going through right now with JSword, and I 
inadvertently overlooked it when I made a change to Sword.

As you noted, Java Lucene is ahead of CLucene. Right now it is ahead by 
2 releases, with Lucene at 2.1 and CLucene at 1.4.3. Lucene 2.1 is a 
significant improvement over 1.4.3: it is much faster and has many bug 
fixes. It can also read indexes built with 1.4, though I don't think it 
can write to them. However, 1.4 cannot read indexes built with 2.1. 
Looking at the CLucene project, it does not look like they are going to 
release 2.x anytime soon.

With JSword, I have recently added the ability to index Strong's 
numbers, notes, osisIDs, cross references and headings. To do this I 
needed to build a custom analyzer, which is compatible with current 
indexes. (In essence, each field has its own analyzer, with existing 
fields having the same analyzer as before.) Before I release this, I am 
going to give the user some control over which of these is included in 
the index.
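
To make the "each field has its own analyzer" idea concrete, here is a 
minimal plain-Java sketch of per-field dispatch. It is not JSword's code 
and it does not use Lucene's API (Java Lucene ships a 
PerFieldAnalyzerWrapper class for this kind of routing; whether JSword 
uses it is not stated here). The class name, the field names "content" 
and "strong", and both tokenizers are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration: routes text to a tokenizer chosen by field name.
final class PerFieldTokenizer {
    private final Function<String, List<String>> defaultAnalyzer;
    private final Map<String, Function<String, List<String>>> byField = new HashMap<>();

    PerFieldTokenizer(Function<String, List<String>> defaultAnalyzer) {
        this.defaultAnalyzer = defaultAnalyzer;
    }

    // New fields get their own analyzer; unregistered (existing) fields
    // keep using the default, so old behaviour is unchanged.
    void register(String field, Function<String, List<String>> analyzer) {
        byField.put(field, analyzer);
    }

    List<String> analyze(String field, String text) {
        return byField.getOrDefault(field, defaultAnalyzer).apply(text);
    }

    // Lower-cased word tokens, split on non-word characters.
    static List<String> words(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!t.isEmpty()) out.add(t);
        }
        return out;
    }

    // Whitespace-separated tokens kept verbatim (e.g. for Strong's numbers).
    static List<String> verbatim(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        PerFieldTokenizer analyzer = new PerFieldTokenizer(PerFieldTokenizer::words);
        analyzer.register("strong", PerFieldTokenizer::verbatim); // assumed field name
        System.out.println(analyzer.analyze("content", "In the Beginning"));
        System.out.println(analyzer.analyze("strong", "H7225 H1254"));
    }
}
```

Because only the new fields are registered, text in the existing fields 
is tokenized exactly as before, which is what keeps current indexes 
compatible.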

For the features that I plan to add that will take advantage of these 
new fields, I'll need to know whether an index has a field present or 
not. (I think there is a way to look at the index to see what fields are 
used.)

So my thought is to write a "property" file to the directory that holds 
the Lucene index. In essence, this file represents the version of the 
index for the module.
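
A rough sketch of what such a property file could look like, using only 
java.util.Properties. The file name, the key names, the version string 
and the field names are all made up for illustration; none of this is 
existing JSword or Sword API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Properties;
import java.util.Set;

// Hypothetical helper: stores index metadata next to the index files.
final class IndexVersionFile {
    static final String FILE_NAME = "index.properties"; // assumed file name
    static final String KEY_VERSION = "index.version";  // assumed key
    static final String KEY_FIELDS = "index.fields";    // assumed key

    // Record the index version and the fields that were indexed.
    static void write(Path indexDir, String version, Set<String> fields) throws IOException {
        Properties props = new Properties();
        props.setProperty(KEY_VERSION, version);
        props.setProperty(KEY_FIELDS, String.join(",", fields));
        try (OutputStream out = Files.newOutputStream(indexDir.resolve(FILE_NAME))) {
            props.store(out, "Search index metadata");
        }
    }

    // Load the metadata. A missing file yields empty properties, which a
    // caller can treat as "index built before versioning existed".
    static Properties read(Path indexDir) throws IOException {
        Properties props = new Properties();
        Path file = indexDir.resolve(FILE_NAME);
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                props.load(in);
            }
        }
        return props;
    }

    // True if the metadata lists the given field as present in the index.
    static boolean hasField(Properties props, String field) {
        String raw = props.getProperty(KEY_FIELDS, "");
        return Arrays.asList(raw.split(",")).contains(field);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("lucene-index");
        write(dir, "1.1", new LinkedHashSet<>(Arrays.asList("content", "strong", "heading")));
        Properties props = read(dir);
        System.out.println(props.getProperty(KEY_VERSION));
        System.out.println(hasField(props, "strong"));
        System.out.println(hasField(props, "note"));
    }
}
```

A feature that depends on one of the new fields could call hasField() 
first and either fall back or offer to rebuild the index when the field 
is absent.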

When I am done, I'd like to add it to the Sword API. The greatest thing 
about shared code is the ability to gain from each other's contributions.

Joachim Ansorg wrote:
> Hi,
>
>   
>> When, I don't know. But it's used exactly as in frontends: it's a search
>> engine for module texts.
>>
>>     
>>> Can it be used by client applications for searching?
>>>       
>> It's integrated into sword API, it's just one of the search engines.
>> I have tested it with modified diatheke. I built the indexes with
>> BibleTime, moved them into module directories, and it just worked.
>>     
>
> BibleTime is not using Sword's support for CLucene, it has its own 
> implementation of module indexing, searching, etc.
>
> It should be possible for you to use Sword's lucene support, but as soon as 
> you have to change things (e.g. the indexer's type, the types of documents in 
> the index) or you need more information about the internals (e.g. sword 
> doesn't support versioned indexes yet) you might have to change Sword or 
> implement your own solution.
> Also be aware that CLucene is always behind Java Lucene in development.
>
> Just my thoughts,
> Joachim
>   
