[sword-devel] Optimizing index time Was: Re: module modtime -vs- CLucene index out-of-date-ness
Chris Little
chrislit at crosswire.org
Wed May 2 20:42:27 MST 2007
My benchmark system is a 2.0GHz Pentium-M with a 7200RPM drive and
1.25GB RAM. These are times for compressing the KJV (which takes
significantly longer than most other Bibles).
Old mkfastmod (McAfee protection & Google indexing on):
5m33.007s
Old mkfastmod (McAfee protection & Google indexing off):
4m16.322s
New mkfastmod (virus protection/search indexing made no significant
difference):
1m46.252s
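
For anyone who wants the ratios, here is a small sketch that turns the three timings above into speedup factors. It is only arithmetic on the figures quoted; the helper name `to_seconds` is made up for this illustration and is not part of SWORD or mkfastmod.

```python
import re

def to_seconds(stamp):
    """Convert a `time`-style string such as '5m33.007s' to seconds."""
    minutes, seconds = re.fullmatch(r"(\d+)m([\d.]+)s", stamp).groups()
    return int(minutes) * 60 + float(seconds)

# Timings copied from the benchmark above (KJV on the 2.0GHz Pentium-M).
old_scan_on = to_seconds("5m33.007s")
old_scan_off = to_seconds("4m16.322s")
new = to_seconds("1m46.252s")

print(f"new vs. old with scanning on:  {old_scan_on / new:.2f}x faster")
print(f"new vs. old with scanning off: {old_scan_off / new:.2f}x faster")
```

That works out to roughly 3.1x against the old build with scanning on and roughly 2.4x with scanning off.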
--Chris
DM Smith wrote:
> Karl fixed the bugs in my patch and I am attaching a new patch.
> His statistics under cygwin on Windows XP against all modules:
> Before: old mkfastmod: 344.577u 129.499s 9:08.59 86.4%
> After: new mkfastmod: 328.452u 29.749s 6:20.30 94.1%
> (The four values are: user, system, wall-clock time and CPU percentage)
> So there was a gain of roughly 30% in wall-clock time.
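
The two lines of statistics look like the csh/tcsh `time` layout (user seconds, system seconds, elapsed minutes:seconds, CPU percentage); that reading is an assumption based on the format alone. Under that assumption, a short sketch to check the percentages and see where the saving comes from (`elapsed_seconds` and the `runs` table are names invented for this example):

```python
def elapsed_seconds(stamp):
    """Convert an elapsed time such as '9:08.59' (minutes:seconds) to seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + float(seconds)

# Figures copied from the quoted cygwin run against all modules.
runs = {
    "old": {"user": 344.577, "system": 129.499, "wall": elapsed_seconds("9:08.59")},
    "new": {"user": 328.452, "system": 29.749, "wall": elapsed_seconds("6:20.30")},
}

for name, r in runs.items():
    # CPU percentage is (user + system) time divided by wall-clock time.
    cpu_share = (r["user"] + r["system"]) / r["wall"]
    print(f"{name}: {r['wall']:.2f}s wall, CPU busy {cpu_share:.1%}")

wall_saved = 1 - runs["new"]["wall"] / runs["old"]["wall"]
system_saved = 1 - runs["new"]["system"] / runs["old"]["system"]
print(f"wall-clock time saved: {wall_saved:.1%}")
print(f"system time saved:     {system_saved:.1%}")
```

The wall-clock saving comes to a little over 30%, and nearly all of it is system time (about 77% less), while user time barely changes. That is what one would expect if the patch mainly removes disk traffic.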
>
> Chris has volunteered to benchmark under Windows.
>
>
> On May 2, 2007, at 7:52 PM, DM Smith wrote:
>
>> Attached is a patch that uses the RAMDirectory. It parallels the
>> JSword code and it compiles, but other than that I have not tested it.
>>
>> Would any of you mind testing it, especially on Windows with virus
>> scanning both on and off? There should be negligible difference
>> between the two. Also, please measure RAM usage when indexing a Bible
>> with Strong's numbers, like the KJV.
>>
>> <patch.zip>
>>
>> In His Service,
>> DM
>>
>> On May 2, 2007, at 4:54 PM, DM Smith wrote:
>>
>>> Chris Little wrote:
>>>> Unfortunately, that's impractical. With a virus scanner on, the
>>>> compression takes 5 minutes for a single Bible (OT+NT) on my Win32
>>>> system (2GHz Pent-M, 7200RPM drive), due to the constant disk access. We
>>>> would either have to tell users to disable virus protection or deal with
>>>> people complaining that their systems freeze every time they add/update
>>>> a module.
>>>> --Chris
>>>>
>>>
>>> Actually, Lucene has a RAMDirectory implementation to which the index
>>> can be written. Once the index is complete, it can be copied to the local
>>> file system. We've done it in JSword and the results were phenomenal. I
>>> presume that the CLucene implementation is sufficiently similar to
>>> Lucene to have it. It is less than 10 lines of additional code in Java.
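
To make the idea concrete without depending on Lucene or CLucene, here is a language-neutral sketch of the same pattern: do all the small writes against a memory buffer, then copy the finished result to disk in one pass. It does not use the Lucene API at all; the function names, the record format and the file name `index.bin` are assumptions made up for the illustration.

```python
import io
import os
import tempfile

def build_index_in_ram(records):
    """Accumulate many small writes in a memory buffer instead of on disk."""
    buffer = io.BytesIO()
    for key, text in records:
        # Done against a real file, each record would be a separate small
        # disk write (and a separate chance for a virus scanner to step in).
        buffer.write(f"{key}\t{text}\n".encode("utf-8"))
    return buffer

def copy_to_disk(buffer, path):
    """Write the finished buffer to the file system in a single pass."""
    with open(path, "wb") as out:
        out.write(buffer.getvalue())
    return os.path.getsize(path)

# Hypothetical stand-in data; a real index would hold far more entries.
records = [
    ("Gen.1.1", "In the beginning"),
    ("John.1.1", "In the beginning was the Word"),
]

staged = build_index_in_ram(records)
ram_bytes = len(staged.getvalue())

with tempfile.TemporaryDirectory() as workdir:
    disk_bytes = copy_to_disk(staged, os.path.join(workdir, "index.bin"))

print(f"held in RAM: {ram_bytes} bytes, written to disk: {disk_bytes} bytes")
```

The memory held just before the copy equals the size of the finished file, which is the cost of this approach: disk traffic is traded for RAM.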
>>>
>>> The only problem is that it eats RAM proportional to the size of the
>>> final index. I have not measured how big that is, but since
>>> Win98SE with all the updates on an old Pentium laptop is hardly usable
>>> with less than 64M RAM, I think that most machines have enough RAM.
>>> After upgrading my old laptop to 128M RAM, JSword can index in about 4
>>> minutes, whereas I never had the patience to let it complete before.
>>>
>>> That aside, indexing shifts from being disk-bound to CPU-bound and the
>>> machine is still practically unresponsive. So I think that it will still
>>> be impractical.
>>>
>>>>
>>>> Kahunapule Michael Johnson wrote:
>>>>
>>>>> What about updating the Sword engine to index each module as it is
>>>>> installed, if the indexing can be used? That way, you get small
>>>>> downloads for everyone, faster searches for those who can use indexes,
>>>>> and a little more module installation time.
>>>>>
>>>>> Just a thought...
>>>>>
>>>>> Michael
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> sword-devel mailing list: sword-devel at crosswire.org
>>>>> <mailto:sword-devel at crosswire.org>
>>>>> http://www.crosswire.org/mailman/listinfo/sword-devel
>>>>> Instructions to unsubscribe/change your settings at above page
>>>>>