[sword-devel] Search up to 5.8 times faster now :)
Troy A. Griffitts
scribe at crosswire.org
Wed Jun 2 14:41:45 MST 2004
Joachim,
Great job! I haven't looked too closely at the code, but enough to get
the idea. Chris, I think Joachim added some logic for phrase search, as
well, though I didn't follow it when I read it in the patch briefly.
Excited to post 1.5.8 someday. Starting a new job has really been
draining.
-Troy.
Chris Little wrote:
> Does this only affect the multi-word search (not the phrase or regex
> searches)? It seems like we could achieve a similar gain in performance
> for the phrase searches by splitting phrases into individual words,
> applying your algorithm (search raw, then strip, then search again) to
> limit the pool to those verses that include all of the words (regardless
> of order), and then performing the current phrase search algorithm
> (strip filters, then search) on that pool. Just a thought. There might
> be some flawed logic that hasn't occurred to me.
>
> --Chris
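
A rough sketch of the phrase-search idea described above, not the actual SWORD code. It splits the phrase into words, uses a cheap check on the raw text to rule out entries that cannot match, and only then strips and runs the phrase search. All names here (stripMarkup, phraseMatches, stripCalls) are hypothetical, and stripMarkup is only a toy stand-in for the real strip filters. It assumes stripping only removes markup and never splits or joins words.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Counts how often the (supposedly expensive) strip step runs.
static int stripCalls = 0;

// Hypothetical stand-in for the strip filters: drops everything between '<' and '>'.
std::string stripMarkup(const std::string &raw) {
    ++stripCalls;
    std::string out;
    bool inTag = false;
    for (char c : raw) {
        if (c == '<') inTag = true;
        else if (c == '>') inTag = false;
        else if (!inTag) out += c;
    }
    return out;
}

// Split a phrase into whitespace-separated words.
std::vector<std::string> splitWords(const std::string &phrase) {
    std::istringstream in(phrase);
    std::vector<std::string> words;
    std::string w;
    while (in >> w) words.push_back(w);
    return words;
}

// True if every word occurs somewhere in text, in any order.
bool containsAllWords(const std::string &text, const std::vector<std::string> &words) {
    for (const std::string &w : words)
        if (text.find(w) == std::string::npos) return false;
    return true;
}

// Phrase search with a word-level prefilter on the raw text.
bool phraseMatches(const std::string &raw, const std::string &phrase) {
    // Cheap step: skip entries whose raw text lacks one of the words.
    if (!containsAllWords(raw, splitWords(phrase))) return false;
    // Expensive step: strip once, then do the real phrase search.
    const std::string stripped = stripMarkup(raw);
    return stripped.find(phrase) != std::string::npos;
}
```

An entry that lacks one of the words never reaches stripMarkup, so the saving depends on how many entries contain all the words somewhere in their raw text.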
>
> Joachim Ansorg wrote:
>
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> Hi,
>> the standard search function is now up to 5 times faster than before.
>>
>> Let me explain.
>> A search in a module did the following:
>> 1. Get the text of a key by calling all the strip filters (StripText())
>> 2. Search for the search words in the stripped down text
>> 3. If they were found, add the key to the result
>> We assume a module with 6 strip filters and about 30000 keys.
>> This means the expensive StripText() function got called
>> 30000*6=180000 times.
>>
>> Now we first check for the words in the raw text. Only keys which
>> had a match in the raw text are then checked again in the stripped
>> down text.
>> If we assume a normal query returns 100 results, the StripText()
>> function gets called 100*6=600 times, which saves a lot of time.
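
A minimal sketch of the old and new search loops described above, not the actual SWORD implementation. Entry, stripMarkup, searchOld, searchNew and strip_calls are hypothetical names, and stripMarkup is a toy stand-in for the real strip filters. The raw-text prefilter is only safe under the assumption that any word found in the stripped text also appears unbroken in the raw text.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a module entry: a key plus its raw, marked-up text.
struct Entry {
    std::string key;
    std::string raw;
};

// Counts how often the expensive strip step runs.
static std::size_t strip_calls = 0;

// Hypothetical stand-in for the strip filters: drops everything between '<' and '>'.
std::string stripMarkup(const std::string &raw) {
    ++strip_calls;
    std::string out;
    bool inTag = false;
    for (char c : raw) {
        if (c == '<') inTag = true;
        else if (c == '>') inTag = false;
        else if (!inTag) out += c;
    }
    return out;
}

// True if every search word occurs somewhere in text.
bool containsAll(const std::string &text, const std::vector<std::string> &words) {
    for (const std::string &w : words)
        if (text.find(w) == std::string::npos) return false;
    return true;
}

// Old approach: strip every entry, then search the stripped text.
std::vector<std::string> searchOld(const std::vector<Entry> &entries,
                                   const std::vector<std::string> &words) {
    std::vector<std::string> hits;
    for (const Entry &e : entries)
        if (containsAll(stripMarkup(e.raw), words)) hits.push_back(e.key);
    return hits;
}

// New approach: search the raw text first and strip only the candidates.
std::vector<std::string> searchNew(const std::vector<Entry> &entries,
                                   const std::vector<std::string> &words) {
    std::vector<std::string> hits;
    for (const Entry &e : entries) {
        if (!containsAll(e.raw, words)) continue;  // cheap rejection, no stripping
        if (containsAll(stripMarkup(e.raw), words)) hits.push_back(e.key);
    }
    return hits;
}
```

Both functions return the same keys under the stated assumption. The second check on the stripped text is still needed because a word may occur only inside markup, for example in an attribute value, and such an entry must not count as a hit.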
>>
>> Old/new comparison:
>> time ./old/examples/cmdline/search KJV Revelation
>> real 0m18.912s
>> user 0m18.090s
>> sys 0m0.780s
>>
>> time ./new/examples/cmdline/search KJV Revelation
>> real 0m3.396s
>> user 0m2.540s
>> sys 0m0.830s
>> Which is an improvement factor of 5.6 :)
>>
>> ./new/examples/cmdline/search WEB God
>> only takes 2.1 secs now.
>>
>> Another example:
>> time ./old/examples/cmdline/search KJV God
>> real 0m20.371s
>> user 0m18.130s
>> sys 0m0.950s
>>
>> time ./new/examples/cmdline/search KJV God
>> real 0m5.566s
>> user 0m4.730s
>> sys 0m0.810s
>> This is "only" 3.7 times faster, because searching in the raw text
>> gives more hits, which means more calls to StripText(). I tested it
>> with a search for " ", which matches all verses, and it's as slow as
>> the old one. But usual search queries are a lot faster than before.
>>
>> The fix is in CVS now.
>>
>> Joachim
>> - -- <>< Re: deemed!
>> -----BEGIN PGP SIGNATURE-----
>> Version: GnuPG v1.2.4 (GNU/Linux)
>>
>> iD8DBQFAvlP4EyRIb2AZBB0RAqF0AKC+VgR5O3Ex9kmgtP8U6vlOgD82GwCfTapO
>> yCdN4G7E22dFk6oz09wAXXY=
>> =gqKO
>> -----END PGP SIGNATURE-----
>> _______________________________________________
>> sword-devel mailing list
>> sword-devel at crosswire.org
>> http://www.crosswire.org/mailman/listinfo/sword-devel