[sword-devel] CLucene and Sword
Troy A. Griffitts
scribe at crosswire.org
Fri May 18 10:56:13 MST 2007
Manfred,
I believe Will's reason for not using CLucene in SWORD was that
he couldn't easily get CLucene compiled on the Mac. Using SWORD's
CLucene implementation has many advantages, and I'm not sure there are
any real-world disadvantages. But, of course, I'm biased.
o You get to share indexes between frontends
o You get the implementation for free
o Your features continue to improve for free when others contribute
o You get to benefit others if you add features
Currently, to my knowledge, SWORD's implementation of CLucene supports
MORE features than any frontend exposes (with the possible exception of
DM's latest JSword work):
o Full SWORD VerseKey Range parsing support (e.g., Search only in
Paul's Epistles, "Rom-Phile", or "Jo;1jo-3jo;rev")
o Choose verse or chapter granularity for a hit (e.g., Find all these
words within the same [verse | chapter])
o Search in any SWORD module type (Bibles, General Books,
Commentaries, Lexica, Devotionals, etc.)
o Advantage of using SWORD's filter facility to massage data before
indexing:
- Ignore accents and diacritics in Greek and Hebrew
- Ignore critical markup in transcriptions.
o Currently supported doc fields:
- key: The SWORD Key (e.g., in a lexicon "Adam", in a Bible, the
osisID)
- content: The body of the entry
- lemma: Strong's numbers or other lemma data included in the module
o Seamless integration with other SWORD search mechanisms:
- Ability to search WITHOUT creating indexes. Its absence is
frustrating for me with the newest version of BibleTime. There are
often times when I don't want to create a Lucene index on a module. I
seldom search most modules, and an average 5-second wait for an
unindexed search is perfectly acceptable to me on these modules. I want
neither the disk overhead nor the initial index creation time.
- Regular Expression searching
- Searching in ANY EntryAttribute which existing filters, or your
custom filters, might decide to add. Some of these currently include:
footnotes, headings, lemma, morph, AVPhrase (Greek lexicon, Authorized
Version translation choices for a Greek entry), src (interlinear data
which links a translation to the original), refList (cross-reference
verses in footnotes), morpheme (WLC Hebrew morpheme breakdown). (DM:
This seems a logical place to add the ability to create new CLucene doc
fields based on these modular filters)
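
To make two of the points above concrete (massaging text before indexing, and verse vs. chapter granularity for a hit), here is a minimal sketch in Python. It is a toy model only: it is not SWORD's code and does not use the SWORD or CLucene APIs. The function names, the osisID-style keys, and the sample entries are all assumptions made up for illustration.

```python
import unicodedata
from collections import defaultdict


def strip_diacritics(text):
    """Drop combining marks so accented and unaccented forms index alike."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def tokenize(text):
    """Strip diacritics, lowercase, and split on non-alphanumeric characters."""
    cleaned = strip_diacritics(text).lower()
    return "".join(c if c.isalnum() else " " for c in cleaned).split()


def unit_of(osis_id, granularity):
    """Map a key such as 'John.1.2' to its verse or chapter unit."""
    if granularity == "verse":
        return osis_id
    book, chapter, _verse = osis_id.split(".")
    return f"{book}.{chapter}"


def build_index(entries, granularity="verse"):
    """Build a map of word -> set of units (verses or chapters) containing it."""
    index = defaultdict(set)
    for osis_id, content in entries.items():
        unit = unit_of(osis_id, granularity)
        for word in tokenize(content):
            index[word].add(unit)
    return index


def search_all(index, query):
    """Return the units that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return hits


# Hypothetical sample data, not taken from any SWORD module.
ENTRIES = {
    "John.1.1": "In the beginning was the Word",
    "John.1.2": "The same was in the beginning with God",
    "John.1.4": "In him was life",
}

verse_index = build_index(ENTRIES, "verse")
chapter_index = build_index(ENTRIES, "chapter")
```

With this sample data, searching for "word god" finds nothing at verse granularity, because no single verse holds both words, but it does match the chapter unit "John.1". Because the same filter step runs on both the indexed text and the query, an accented query term matches its unaccented form.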
In conclusion, it seems to me that utilizing and extending the current
search support in SWORD benefits everyone and leverages an already
existing solid set of features.
-Troy.
Manfred Bergmann wrote:
> Hi.
>
> Since when is CLucene integrated in Sword and for what exactly is it
> used?
> Can it be used by client applications for searching?
>
> I'm not really satisfied with using Java Lucene in Objective-C in
> MacSword.
> It is possible to use Java classes in Objective-C, but it is not very
> straightforward and it is difficult to debug.
> So I'm wondering if we could get rid of Lucene and use the Sword
> integrated CLucene.
>
>
>
> Regards,
> Manfred