[sword-devel] CLucene and Sword
DM Smith
dmsmith555 at yahoo.com
Thu May 17 14:44:30 MST 2007
I'll give my 2 cents on index versioning.
We need to tackle the whole question of index versioning. It is
something that I am going through right now with JSword, and I
inadvertently overlooked it when I made a change to Sword.
As you noted, Java Lucene is ahead of CLucene. Right now it is by 2
releases, with Lucene at 2.1 and CLucene at 1.4.3. Lucene 2.1 is a
significant improvement over 1.4.3, being much faster and
having lots of bug fixes. Also, it can read indexes built with 1.4. I
don't think it can write to them. However, 1.4 cannot read indexes built
with 2.1. Looking at the CLucene project, it does not look like they are
going to release 2.x anytime soon.
With JSword, I have recently added the ability to index Strong's
numbers, notes, osisIDs, cross references and headings. To do this I
needed to build a custom analyzer, which is compatible with current
indexes. (In essence, each field has its own analyzer, with existing
fields having the same analyzer as before.) Before I release this, I am
going to give the user some control over which of these is included in
the index.
For the features that I plan to add that will take advantage of these
new fields, I'll need to know whether an index has a field present or
not. (I think there is a way to look at the index to see what fields are
used.)
So my thought is to write a "property" file to the directory that holds
the Lucene index. In essence, this represents the version of the index
for the module.
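
As a rough illustration of the property-file idea, here is a minimal
sketch in plain Java using only java.util.Properties. Everything named
in it is an assumption made for illustration: the file name
"index.properties", the keys "index.version" and "index.fields", and the
helper class IndexProperties are hypothetical, not an agreed Sword or
JSword format. A missing file is treated as an index built before the
file existed, so every optional field reads as absent.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Properties;
import java.util.Set;

/** Hypothetical helper: describes an index in a properties file stored beside it. */
final class IndexProperties {
    // File name and keys are illustrative assumptions, not an existing format.
    static final String FILE_NAME = "index.properties";
    static final String KEY_VERSION = "index.version";
    static final String KEY_FIELDS = "index.fields";

    private IndexProperties() {}

    /** Writes the index version and the names of the indexed fields into indexDir. */
    static void write(Path indexDir, String version, Set<String> fields) throws IOException {
        Properties props = new Properties();
        props.setProperty(KEY_VERSION, version);
        // Field names are stored comma-separated; this assumes no name contains a comma.
        props.setProperty(KEY_FIELDS, String.join(",", fields));
        try (OutputStream out = Files.newOutputStream(indexDir.resolve(FILE_NAME))) {
            props.store(out, "Describes the search index in this directory");
        }
    }

    /** Reads the file; returns empty Properties if it is absent (an older index). */
    static Properties read(Path indexDir) throws IOException {
        Properties props = new Properties();
        Path file = indexDir.resolve(FILE_NAME);
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                props.load(in);
            }
        }
        return props;
    }

    /** Returns true if the recorded field list contains the given field name. */
    static boolean hasField(Properties props, String field) {
        String value = props.getProperty(KEY_FIELDS, "");
        return Arrays.asList(value.split(",")).contains(field);
    }
}
```

A front end could then call IndexProperties.read() on the module's index
directory and enable, say, a Strong's number search only when
hasField() reports that field as present.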
When I am done, I'd like to add it to the Sword API. The greatest thing
about shared code is the ability to gain from each other's contributions.
Joachim Ansorg wrote:
> Hi,
>
>
>> When, I don't know. But it's used exactly as in frontends: it's a search
>> engine for module texts.
>>
>>
>>> Can it be used by client applications for searching?
>>>
>> It's integrated into sword API, it's just one of the search engines.
>> I have tested it with modified diatheke. I built the indexes with
>> BibleTime, moved them into module directories, and it just worked.
>>
>
> BibleTime is not using Sword's support for CLucene, it has its own
> implementation of module indexing, searching, etc.
>
> It should be possible for you to use Sword's lucene support, but as soon as
> you have to change things (e.g. the indexer's type, the types of documents in
> the index) or you need more information about the internals (e.g. sword
> doesn't support versioned indexes yet) you might have to change Sword or
> implement your own solution.
> Also be aware that CLucene is always behind Java Lucene in development.
>
> Just my thoughts,
> Joachim
>