[sword-devel] Optimizing index time Was: Re: module modtime -vs- CLucene index out-of-date-ness

DM Smith dmsmith555 at yahoo.com
Wed May 2 19:38:32 MST 2007


Karl fixed the bugs in my patch and I am attaching a new patch.
His statistics under cygwin on Windows XP against all modules:
Before:	old mkfastmod: 344.577u 129.499s 9:08.59 86.4%
After:	new mkfastmod: 328.452u 29.749s 6:20.30 94.1%
(The four values are: user, system, wall-clock time and CPU percentage)
So wall-clock time dropped by roughly 30%.
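
For anyone who wants to check the arithmetic, here is a small Python sketch that recomputes the percentages from the figures above. It assumes the wall-clock field is in csh-style M:SS.ss form; the helper names parse_wall and summarize are made up for this note.

```python
def parse_wall(clock: str) -> float:
    """Convert an 'M:SS.ss' wall-clock string to seconds."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + float(seconds)


def summarize(user: float, system: float, wall: str) -> dict:
    """Return wall-clock seconds and CPU utilisation for one run."""
    wall_s = parse_wall(wall)
    return {
        "user": user,
        "system": system,
        "wall_s": wall_s,
        # CPU percentage is (user + system) divided by wall-clock time.
        "cpu_pct": 100.0 * (user + system) / wall_s,
    }


# Figures copied from the mkfastmod runs quoted above.
before = summarize(344.577, 129.499, "9:08.59")
after = summarize(328.452, 29.749, "6:20.30")

wall_reduction = 100.0 * (before["wall_s"] - after["wall_s"]) / before["wall_s"]
system_reduction = 100.0 * (before["system"] - after["system"]) / before["system"]

print(f"wall-clock reduction:  {wall_reduction:.1f}%")
print(f"system-time reduction: {system_reduction:.1f}%")
print(f"CPU utilisation: {before['cpu_pct']:.1f}% -> {after['cpu_pct']:.1f}%")
```

Most of the saving comes from system time, which is what one would expect if the old code spent its time in disk I/O calls.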


Chris has volunteered to benchmark under Windows.
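
For anyone benchmarking: the idea behind the patch, as described in the quoted messages below, is to build the whole index in memory and only then write it to disk. The following is just a toy Python sketch of that pattern using the standard library. It does not use the Lucene or CLucene API, and the names build_index_in_memory, flush_to_disk and toy.idx are made up for illustration.

```python
import io
import os
import tempfile


def build_index_in_memory(verses):
    """Build a toy inverted index entirely in RAM and return it as bytes."""
    postings = {}
    for key, text in verses:
        for word in text.lower().split():
            keys = postings.setdefault(word, [])
            # Record each verse key once per word, even if the word repeats.
            if not keys or keys[-1] != key:
                keys.append(key)
    buf = io.BytesIO()  # in-memory stand-in for a RAM-backed directory
    for word in sorted(postings):
        line = word + "\t" + ",".join(postings[word]) + "\n"
        buf.write(line.encode("utf-8"))
    return buf.getvalue()


def flush_to_disk(data, path):
    """Write the finished index to disk in a single pass."""
    with open(path, "wb") as fh:
        fh.write(data)


verses = [
    ("Gen.1.1", "In the beginning God created"),
    ("John.1.1", "In the beginning was the Word"),
]
data = build_index_in_memory(verses)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy.idx")
    flush_to_disk(data, path)
    size_on_disk = os.path.getsize(path)

print(f"index size: {len(data)} bytes in RAM, {size_on_disk} bytes on disk")
```

The disk is touched once at the end instead of on every small write, which is why a virus scanner should matter much less. The cost is that peak memory grows with the size of the finished index.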


On May 2, 2007, at 7:52 PM, DM Smith wrote:

> Attached is a patch that uses the RAMDirectory. It parallels the  
> JSword code and it compiles, but other than that I have not tested it.
>
> Would any of you mind testing it, especially on Windows with virus
> scanning both on and off? There should be negligible difference
> between the two. Also, measure RAM usage when indexing a Bible with
> Strong's numbers, like the KJV.
>
> <patch.zip>
>
> In His Service,
> 	DM
>
> On May 2, 2007, at 4:54 PM, DM Smith wrote:
>
>> Chris Little wrote:
>>> Unfortunately, that's impractical. With a virus scanner on, the
>>> compression takes 5 minutes for a single Bible (OT+NT) on my Win32
>>> system (2GHz Pent-M, 7200RPM drive), due to the constant disk
>>> access. We would either have to tell users to disable virus
>>> protection or deal with people complaining that their systems
>>> freeze every time they add/update a module.
>>> --Chris
>>>
>>
>> Actually, Lucene has an implementation of a RAMDirectory to which the
>> index can be written. Once completed, it can be copied to the local
>> file system. We've done it in JSword and the results were phenomenal.
>> I presume that the CLucene implementation is sufficiently similar to
>> Lucene to have it. It is less than 10 lines of additional code in
>> Java.
>>
>> The only problem is that it eats RAM proportional to the size of the
>> final index. I have not measured it to see how big it is, but since
>> Win98SE with all the updates on an old Pentium laptop is hardly
>> usable with less than 64M RAM, I think that most machines have enough
>> RAM. After upgrading my old laptop to 128M RAM, JSword can index in
>> about 4 minutes, whereas I never had the patience to let it complete
>> before.
>>
>> That aside, it shifts from being disk bound to CPU bound and the
>> machine is still practically unresponsive. So I think that it will
>> still be impractical.
>>
>>>
>>> Kahunapule Michael Johnson wrote:
>>>
>>>> What about updating the Sword engine to index each module as it is
>>>> installed, if the indexing can be used? That way, you get small
>>>> downloads for everyone, faster searches for those who can use
>>>> indexes, and a little more module installation time.
>>>>
>>>> Just a thought...
>>>>
>>>> Michael
>>>>
>>>>
>>>> _______________________________________________
>>>> sword-devel mailing list: sword-devel at crosswire.org
>>>> http://www.crosswire.org/mailman/listinfo/sword-devel
>>>> Instructions to unsubscribe/change your settings at above page
>>>>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch.zip
Type: application/zip
Size: 1195 bytes
Desc: not available
Url : http://www.crosswire.org/pipermail/sword-devel/attachments/20070502/935b9bb6/attachment-0001.zip 