[sword-devel] CLucene and Sword

Troy A. Griffitts scribe at crosswire.org
Fri May 25 11:00:33 MST 2007


Manfred,
	Have a look at the source for sword/utilities/mkfastmod.cpp and 
sword/examples/cmdline/search.cpp

	Checking whether or not the indices have been created is the most 
confusing part.  Originally, the plan was to let the SWModule::search 
method report whether the requested search type was supported.  So, if 
you called SWModule::search requesting the CLucene type and passed a 
bool * as justCheckIfSupported, it would set your bool to true if the 
indices were created, and false otherwise.  This would allow search 
engine plugins to create different indices depending on the search 
string features passed in and such.  There are routines to see whether 
a driver is even compiled with code which CAN create a fast index, and 
also whether it HAS created the index.
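
	A rough sketch of that out-parameter check, for illustration only. 
FakeModule is a stand-in written for this example, not the real 
SWModule, and its search() takes fewer arguments than the real method. 
The value -4 for the CLucene search type is an assumption here; check 
the headers of your SWORD version.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Assumed value for the CLucene search type; verify against your headers.
const int SEARCHTYPE_CLUCENE = -4;

// Stand-in for SWModule, written only for this sketch.
class FakeModule {
public:
    explicit FakeModule(bool indexCreated) : indexCreated(indexCreated) {}

    // If justCheckIfSupported is non-null, no search runs: the bool is set
    // and an empty hit list is returned.
    std::vector<std::string> search(const std::string &query, int searchType,
                                    bool *justCheckIfSupported = nullptr) const {
        if (justCheckIfSupported) {
            // In this stub only the CLucene type depends on an index.
            *justCheckIfSupported =
                (searchType == SEARCHTYPE_CLUCENE) ? indexCreated : true;
            return {};
        }
        // A real module would search its text here; the stub returns one hit.
        return { "hit for: " + query };
    }

private:
    bool indexCreated;
};

// Ask the module whether the CLucene indices exist, without searching.
bool cluceneIndexReady(const FakeModule *target) {
    assert(target != nullptr);
    bool supported = false;
    target->search("", SEARCHTYPE_CLUCENE, &supported);
    return supported;
}
```

	With this stub, cluceneIndexReady() returns true only for a module 
constructed with indexCreated set to true.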

	Anyway, it's all too complicated and impractical.  Hopefully we will 
change it to something much more straightforward, like: bool 
hasIndex(int searchType), when we do the 2.0 refactoring soon.
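
	One way the simpler call could sit on top of the current scheme, as 
a sketch only.  LegacySearchable, IndexAware and ToyModule are 
hypothetical names made up for this example, and -4 as the CLucene 
search type is an assumption; none of this is the planned 2.0 code.

```cpp
#include <cassert>

// Assumed value for the CLucene search type; verify against your headers.
const int SEARCHTYPE_CLUCENE = -4;

// Hypothetical stand-in for the current interface: support is reported
// through an out-parameter on search().
class LegacySearchable {
public:
    virtual ~LegacySearchable() {}
    virtual void search(const char *query, int searchType,
                        bool *justCheckIfSupported) = 0;
};

// Hypothetical wrapper exposing the proposed bool hasIndex(int searchType).
class IndexAware {
public:
    explicit IndexAware(LegacySearchable *module) : module(module) {
        assert(module != nullptr);
    }

    bool hasIndex(int searchType) {
        bool supported = false;
        // Only the check is requested, so the query string is irrelevant.
        module->search("", searchType, &supported);
        return supported;
    }

private:
    LegacySearchable *module;
};

// Toy module for trying the wrapper: only the CLucene type has an index.
class ToyModule : public LegacySearchable {
public:
    explicit ToyModule(bool indexCreated) : indexCreated(indexCreated) {}

    void search(const char *, int searchType,
                bool *justCheckIfSupported) override {
        if (justCheckIfSupported)
            *justCheckIfSupported =
                (searchType == SEARCHTYPE_CLUCENE) && indexCreated;
    }

private:
    bool indexCreated;
};
```

	A caller would then write IndexAware(&module).hasIndex(type) and 
never touch the bool * directly.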

	The place to look for the current interface is 
sword/include/swsearchable.h.  Someone else, who didn't understand how 
things worked, wrote the comments in there.  I can't blame them, as I 
hadn't written ANY comments, so they at least tried.  I've updated the 
comments slightly and committed just now.

	Currently, the best way to 'make it work' is to use the search dialog 
from BibleCS as an example.  It shows the [Create Index] button to the 
user if the indices have not been created; if they have, it hides the 
button and adds the "Optimized Search" option to the user choices.
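
	The dialog behaviour described above boils down to something like 
the following.  This is not the BibleCS code; SearchDialog and 
updateIndexControls are hypothetical names used only to show the 
logic, with plain data standing in for the GUI widgets.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, GUI-free model of the search dialog's index controls.
struct SearchDialog {
    bool createIndexButtonVisible = true;
    std::vector<std::string> searchTypeChoices;
};

// Show [Create Index] when no index exists; otherwise hide it and offer
// "Optimized Search" among the user's choices.
void updateIndexControls(SearchDialog &dialog, bool indexExists) {
    const std::string optimized = "Optimized Search";
    std::vector<std::string> &choices = dialog.searchTypeChoices;

    // Drop any earlier entry so repeated calls never add a duplicate.
    choices.erase(std::remove(choices.begin(), choices.end(), optimized),
                  choices.end());
    if (indexExists)
        choices.push_back(optimized);
    dialog.createIndexButtonVisible = !indexExists;

    // Exactly one of the two is offered: the button or the option.
    const bool offered =
        std::find(choices.begin(), choices.end(), optimized) != choices.end();
    assert(offered != dialog.createIndexButtonVisible);
}
```

	You would call this once when the dialog opens and again after the 
user builds or deletes the index, passing in the result of whatever 
index check you use.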

	Here's a direct link to the file in svn.  In your browser, search for 
all occurrences of: toggleIndex
That should get you into all the blocks of code you need to lift.

http://crosswire.org/svn/biblecs/trunk/searchfrm.cpp

('target' is any SWModule *)


	Hope this helps,

		-Troy.



Manfred Bergmann wrote:
> Troy,
> 
> that's great.
> I finally compiled sword with clucene support for the Mac.  
> Unfortunately currently only for PPC platform because cross-compiling  
> clucene for Intel didn't work. Maybe I need someone with an Intel Mac  
> for this.
> 
> However, there are some questions about using the sword clucene  
> implementation.
> 
> - where are the index files stored?
> - are there some API examples of how this works, or is it  
> straightforward from looking at the API docs?
> 
> 
> Regards,
> Manfred
> 
> 
> 
> Am 18.05.2007 um 19:56 schrieb Troy A. Griffitts:
> 
>> Manfred,
>>     I believe Will's reason for not using CLucene in SWORD was because
>> he couldn't easily get CLucene compiled on the Mac.  Using SWORD's
>> CLucene implementation has many advantages, and I'm not sure there are
>> any real-world disadvantages.  But, of course, I'm biased.
>>
>> o   You get to share indexes between frontends
>> o   You get the implementation for free
>> o   Your features continue to improve for free when others contribute
>> o   You get to benefit others if you add features
>>
>> Currently, to my knowledge, SWORD's implementation of CLucene supports
>> MORE features than any frontend exposes (with the possible exception of
>> DM's latest JSword work):
>>
>> o    Full SWORD VerseKey Range parsing support (e.g., Search only in
>> Paul's Epistles, "Rom-Phile", or "Jo;1jo-3jo;rev")
>> o    Choose verse or chapter granularity for a hit (e.g., Find all these
>> words within the same [verse | chapter])
>> o    Search in any SWORD module type (Bibles, General Books,
>> Commentaries, Lexica, Devotionals, etc.)
>> o   Advantage of using SWORD's filter facility to massage data before
>> indexing:
>>        - Ignore accents and diacritics in Greek and Hebrew
>>        - Ignore critical markup in transcriptions.
>> o   Currently supported doc fields:
>>        - key: The SWORD Key (e.g., in a lexicon "Adam", in a Bible, the
>> osisID)
>>        - content: The body of the entry
>>        - lemma: Strong's numbers or other lemma data included in the
>> module
>> o   Seamless integration with other SWORD search mechanisms:
>>        - ability to search WITHOUT creating indexes.  This is
>> frustrating for me with the newest version of Bibletime.  There are
>> often times when I don't want to create a lucene index on a module.  I
>> seldom search most modules, and an average 5-second wait for an
>> unindexed search is perfectly acceptable to me on these modules.  I
>> want neither the disk overhead nor the initial index creation time.
>>        - Regular Expression searching
>>        - Searching in ANY EntryAttribute which existing filters, or your
>> custom filters, might decide to add.  Some of these currently include:
>> footnotes, headings, lemma, morph, AVPhrase (Greek lexicon, Authorized
>> Version translation choices for a Greek entry), src (interlinear data
>> which links a translation to original), refList (footnotes
>> crossreference verses), morpheme (WLC Hebrew morpheme breakdown).  (DM:
>> This seems a logical place to add the ability to create new CLucene doc
>> fields based on these modular filters)
>>
>> In conclusion, it seems to me that utilizing and extending the current
>> search support in SWORD benefits everyone and leverages an already
>> existing solid set of features.
>>
>>     -Troy.
>>
>>
>> Manfred Bergmann wrote:
>>> Hi.
>>>
>>> Since when is CLucene integrated in Sword and for what exactly is it
>>> used?
>>> Can it be used by client applications for searching?
>>>
>>> I'm not really satisfied with using Java Lucene in Objective-C in
>>> MacSword.
>>> It is possible to use Java classes in Objective-C but it is not very
>>> straightforward and it is difficult to debug.
>>> So I'm wondering if we could get rid of Lucene and use the Sword
>>> integrated CLucene.
>>>
>>>
>>>
>>> Regards,
>>> Manfred
>>>
>>>
>>> _______________________________________________
>>> sword-devel mailing list: sword-devel at crosswire.org
>>> http://www.crosswire.org/mailman/listinfo/sword-devel
>>> Instructions to unsubscribe/change your settings at above page
>>>
>>
> 
> 



