[sword-devel] CLucene and Sword

Troy A. Griffitts scribe at crosswire.org
Fri May 18 10:56:13 MST 2007


Manfred,
    I believe Will's reason for not using CLucene in SWORD was that 
he couldn't easily get CLucene compiled on the Mac.  Using SWORD's 
CLucene implementation has many advantages, and I'm not sure there are 
any real-world disadvantages.  But, of course, I'm biased.

o   You get to share indexes between frontends
o   You get the implementation for free
o   Your features continue to improve for free when others contribute
o   You get to benefit others if you add features

Currently, to my knowledge, SWORD's implementation of CLucene supports 
MORE features than any frontend exposes (with the possible exception of 
DM's latest JSword work):

o    Full SWORD VerseKey Range parsing support (e.g., search only in 
Paul's Epistles with "Rom-Phile", or only in John's writings with 
"Jo;1jo-3jo;rev")
o    Choose verse or chapter granularity for a hit (e.g., Find all these 
words within the same [verse | chapter])
o    Search in any SWORD module type (Bibles, General Books, 
Commentaries, Lexica, Devotionals, etc.)
o   Advantage of using SWORD's filter facility to massage data before 
indexing:
       - Ignore accents and diacritics in Greek and Hebrew
       - Ignore critical markup in transcriptions.
o   Currently supported doc fields:
       - key: The SWORD Key (e.g., in a lexicon "Adam", in a Bible, the 
osisID)
       - content: The body of the entry
       - lemma: Strong's numbers or other lemma data included in the module
o   Seamless integration with other SWORD search mechanisms:
       - Ability to search WITHOUT creating indexes.  Its absence is 
frustrating for me in the newest version of BibleTime.  There are 
often times when I don't want to create a Lucene index on a module.  I 
seldom search most modules, and an average wait of 5 seconds for an 
unindexed search is perfectly acceptable to me on these modules.  I 
want neither the disk overhead nor the initial index creation time.
       - Regular Expression searching
       - Searching in ANY EntryAttribute which existing filters, or your 
custom filters, might decide to add.  Some of these currently include: 
footnotes, headings, lemma, morph, AVPhrase (Greek lexicon, Authorized 
Version translation choices for a Greek entry), src (interlinear data 
which links a translation to the original), refList (cross-reference 
verses in footnotes), morpheme (WLC Hebrew morpheme breakdown).  (DM: 
This seems a logical place to add the ability to create new CLucene doc 
fields based on these modular filters.)
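
To make two of the items above concrete (massaging data before 
indexing, and verse vs. chapter granularity for a hit), here is a toy 
sketch in Python.  It is NOT SWORD's or CLucene's code: the names fold, 
build_index and search, and the in-memory index, are made up purely for 
illustration, and the real filters do considerably more than this.

```python
import re
import unicodedata
from collections import defaultdict


def fold(text):
    """Strip combining marks (accents, breathings, points) and lowercase."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


def build_index(entries):
    """Map each folded word to the set of (book, chapter, verse) keys using it."""
    index = defaultdict(set)
    for key, body in entries.items():
        for word in re.findall(r"\w+", fold(body)):
            index[word].add(key)
    return index


def search(index, words, granularity="verse"):
    """Return the verses (or chapters) in which ALL of the query words occur."""
    def unit(key):
        # A chapter hit only needs book and chapter; a verse hit needs all three.
        return key if granularity == "verse" else key[:2]

    hit_sets = [{unit(k) for k in index.get(fold(w), set())} for w in words]
    return sorted(set.intersection(*hit_sets)) if hit_sets else []


# Hypothetical sample data: three accented Greek verses keyed by reference.
entries = {
    ("John", 1, 1): "Ἐν ἀρχῇ ἦν ὁ λόγος",
    ("John", 1, 2): "οὗτος ἦν ἐν ἀρχῇ πρὸς τὸν θεόν",
    ("John", 1, 4): "ἐν αὐτῷ ζωὴ ἦν",
}
index = build_index(entries)

# Unaccented query words still match, because both sides are folded.
print(search(index, ["λογος", "θεον"], granularity="verse"))
print(search(index, ["λογος", "θεον"], granularity="chapter"))
```

The two query words never share a verse in this sample, so the verse 
search finds nothing, while the chapter search reports John 1.  Folding 
the query with the same function used at index time is what lets an 
unaccented query match accented text.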

In conclusion, it seems to me that utilizing and extending the current 
search support in SWORD benefits everyone and leverages an already 
existing solid set of features.

    -Troy.


Manfred Bergmann wrote:
> Hi.
>
> Since when has CLucene been integrated in Sword, and what exactly is  
> it used for?
> Can it be used by client applications for searching?
>
> I'm not really satisfied with using Java Lucene in Objective-C in  
> MacSword.
> It is possible to use Java classes in Objective-C, but it is not very  
> straightforward and is difficult to debug.
> So I'm wondering if we could get rid of Lucene and use the Sword  
> integrated CLucene.
>
>
>
> Regards,
> Manfred
>
>
> _______________________________________________
> sword-devel mailing list: sword-devel at crosswire.org
> http://www.crosswire.org/mailman/listinfo/sword-devel
> Instructions to unsubscribe/change your settings at above page
>   



