[sword-devel] Search up to 5.8 times faster now :)

Chris Little chrislit at crosswire.org
Wed Jun 2 18:38:32 MST 2004

Does this only affect the multi-word search (not the phrase or regex 
searches)?  It seems like we could achieve a similar gain in performance 
for the phrase searches by splitting phrases into individual words, 
applying your algorithm (search raw, then strip, then search again) to 
limit the pool to those verses that include all of the words (regardless 
of order), and then performing the current phrase search algorithm 
(strip filters, then search) on that pool.  Just a thought.  There might 
be some flawed logic that hasn't occurred to me.
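
To make the idea concrete, here is a rough sketch in Python rather than the actual SWORD code. The names `strip_text` and `phrase_search` and the toy markup are hypothetical stand-ins. It also assumes markup never falls inside a word; if it can, the raw-text check would wrongly reject a verse, which may be exactly the kind of flaw I am worried about.

```python
import re

def strip_text(raw):
    """Hypothetical stand-in for the strip filters: drops <...> markup."""
    return re.sub(r"<[^>]*>", "", raw)

def phrase_search(verses, phrase):
    """Find keys whose stripped text contains the phrase.

    Keys are first narrowed down by requiring every word of the phrase
    to occur in the raw text, in any order; only the survivors are
    stripped and checked for the phrase itself.
    """
    words = phrase.lower().split()
    needle = " ".join(words)
    hits = []
    for key, raw in verses.items():
        raw_lower = raw.lower()
        if not all(w in raw_lower for w in words):
            continue  # a word is missing, so the phrase cannot match
        # Normalise whitespace so the phrase comparison is not thrown
        # off by line breaks or doubled spaces left after stripping.
        stripped = " ".join(strip_text(raw).lower().split())
        if needle in stripped:
            hits.append(key)
    return hits

# Toy data, not real module content.
verses = {
    "A": "In <w>the</w> <w>beginning</w> God created",
    "B": "the beginning of wisdom",
    "C": "beginning with the first",  # both words, wrong order
    "D": "And God said",              # rejected without stripping
}
phrase_hits = phrase_search(verses, "the beginning")
print(phrase_hits)
```

Verse "C" passes the word check but fails the phrase check, so it still costs one strip; verse "D" is dropped before any stripping happens.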


Joachim Ansorg wrote:
> Hi,
> the standard search function is now up to 5 times faster than before.
> Let me explain.
> A search in a module did the following:
> 	1. Get the text of a key by running it through all the strip filters
> 	2. Search for the search words in the stripped-down text
> 	3. If they were found, add the key to the result
> We assume a module with about 30000 keys and 6 strip filters.
> This means the expensive StripText() function got called 30000*6=180000 times.
> Now we check for the words in the raw text first, and only keys which had a 
> match in the raw text are checked again in the stripped-down text.
> If we assume a normal query returns 100 results, the StripText() function gets 
> called 100*6=600 times, which saves a lot of time.
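
The raw-first filtering described above can be sketched roughly as follows. This is only an illustration in Python, not the code that went into CVS: `strip_text`, `search`, the `prefilter` flag, and the toy markup are all hypothetical, and it assumes that any word found in the stripped text also appears literally in the raw text.

```python
import re

strip_calls = 0  # counts how often the (expensive) strip step runs

def strip_text(raw):
    """Hypothetical stand-in for the strip filters: drops <...> markup."""
    global strip_calls
    strip_calls += 1
    return re.sub(r"<[^>]*>", "", raw)

def contains_all(text, words):
    """Case-insensitive check that every word occurs somewhere in text."""
    lowered = text.lower()
    return all(w.lower() in lowered for w in words)

def search(verses, words, prefilter=True):
    """Return keys whose stripped text contains all words.

    With prefilter=True the raw text is checked first, and the strip
    step only runs for keys that already matched in the raw text.
    """
    hits = []
    for key, raw in verses.items():
        if prefilter and not contains_all(raw, words):
            continue  # cheap rejection, no stripping needed
        if contains_all(strip_text(raw), words):
            hits.append(key)
    return hits

# Toy data, not real module content.
verses = {
    "Gen 1:1": 'In the beginning <w lemma="H430">God</w> created the heaven',
    "Gen 1:2": "And the earth was without form, and void",
    "John 1:1": 'In the beginning was the <w lemma="G3056">Word</w>',
    "Ps 23:1": 'The <w gloss="God">LORD</w> is my shepherd',  # raw-only match
}

strip_calls = 0
old = search(verses, ["God"], prefilter=False)
old_calls = strip_calls

strip_calls = 0
new = search(verses, ["God"], prefilter=True)
new_calls = strip_calls

print(old == new, old_calls, new_calls)
```

The "Ps 23:1" entry matches only inside the markup, so it is still stripped and then rejected; raw-text hits like that are why a query with many raw matches gains less.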
> Old/new comparison:
> 	time ./old/examples/cmdline/search KJV Revelation
> 		real    0m18.912s
> 		user    0m18.090s
> 		sys     0m0.780s
> 	time ./new/examples/cmdline/search KJV Revelation
> 		real    0m3.396s
> 		user    0m2.540s
> 		sys     0m0.830s
> Which is an improvement factor of 5.6 :)
> 	./new/examples/cmdline/search WEB God
> only takes 2.1 secs now.
> Another example:
> 	time ./old/examples/cmdline/search KJV God
> 		real    0m20.371s
> 		user    0m18.130s
> 		sys     0m0.950s
> 	time ./new/examples/cmdline/search KJV God
> 		real    0m5.566s
> 		user    0m4.730s
> 		sys     0m0.810s	
> This is "only" 3.7 times faster, because searching in the raw text gives more 
> hits, which means more calls to StripText(). I tested it with a search for " ", 
> which matches all verses, and it's as slow as the old one. But the usual 
> search queries are a lot faster than before.
> The fix is in CVS now.
> Joachim
> - -- 
> <>< Re: deemed!
