[jsword-devel] Out of Memory Issues Loading repo module lists

DM Smith dmsmith at crosswire.org
Mon Jan 11 12:28:07 MST 2016


I have been thinking about this a bit more. I knew there was a need to prevent stale confs. The time performance is something that I’m not able to test. My machine has an SSD, a fast 4 core CPU and gobs of RAM. So I need you to keep me in line. ;)

The easiest way to keep it pristine is to unpack mods.d.tar.gz into a temporary folder, rename the old folder, rename the new folder into its place, and finally delete the old folder. Doing it in this order minimizes the time that mods.d is unavailable, which is important for multi-threaded apps and for multiple apps that share the same machine simultaneously.
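The swap order described above can be sketched with java.nio.file as follows. This is only an illustration, not JSword code: the class name ModsDirSwap, the ".old" suffix and the assumption that the staged folder is already unpacked on the same filesystem as mods.d (so that each move is a plain rename) are all mine.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of JSword.
class ModsDirSwap {
    /** Replaces liveDir with the already-unpacked stagedDir using two renames. */
    static void swap(Path liveDir, Path stagedDir) throws IOException {
        Path oldDir = liveDir.resolveSibling(liveDir.getFileName() + ".old");
        deleteTree(oldDir);                // clear leftovers from an interrupted earlier run
        if (Files.exists(liveDir)) {
            Files.move(liveDir, oldDir);   // rename the old folder out of the way
        }
        Files.move(stagedDir, liveDir);    // rename the new folder into place
        deleteTree(oldDir);                // the slow part happens after the swap
    }

    /** Deletes a directory tree if it exists. */
    static void deleteTree(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order puts children before their parent directories.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }
}
```

Between the two renames there is still a brief moment when mods.d does not exist, so readers in other threads or processes would need to tolerate that or retry.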

Right now the SwordBookMetaData remembers the File for the conf of installed modules and is able to re-read it easily. But it does not store anything about a conf’s location when it is from mods.d.tar.gz. I suppose I could have it remember the location of mods.d.tar.gz and the name of the conf entry and create a method to extract that conf out of the compressed archive. This would need to be done for each module that the user requests info for. Doing this is quite expensive, as it means inflating the file and then iterating over the contents until the desired conf is found.
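To make the cost concrete, here is a rough sketch of pulling one named entry out of a gzipped tar using only the JDK. It is not how JSword reads the archive. The class ConfFromTarGz is hypothetical, and the tiny tar reader is an assumption-laden simplification: it only understands plain headers with names of up to 100 bytes and ignores the ustar prefix field, long-name extensions and checksums. A real implementation would more likely use a tar library.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

// Hypothetical helper, not part of JSword.
class ConfFromTarGz {
    /** Returns the bytes of the entry named entryName, or null if it is absent. */
    static byte[] extract(InputStream tarGz, String entryName) throws IOException {
        try (DataInputStream in = new DataInputStream(new GZIPInputStream(tarGz))) {
            byte[] header = new byte[512];
            while (true) {
                try {
                    in.readFully(header);
                } catch (EOFException e) {
                    return null;                       // ran off the end without a match
                }
                if (header[0] == 0) {
                    return null;                       // zero block marks end of archive
                }
                String name = field(header, 0, 100);   // entry name
                String octal = field(header, 124, 12).trim();
                long size = octal.isEmpty() ? 0 : Long.parseLong(octal, 8);
                if (name.equals(entryName)) {
                    byte[] data = new byte[(int) size];
                    in.readFully(data);
                    return data;
                }
                // Entry data is padded to a multiple of 512 bytes.
                skipFully(in, (size + 511) / 512 * 512);
            }
        }
    }

    /** Reads a NUL-terminated ASCII field from a tar header. */
    private static String field(byte[] b, int off, int len) {
        int end = off;
        while (end < off + len && b[end] != 0) {
            end++;
        }
        return new String(b, off, end - off, StandardCharsets.US_ASCII);
    }

    private static void skipFully(InputStream in, long n) throws IOException {
        byte[] buf = new byte[8192];
        while (n > 0) {
            int r = in.read(buf, 0, (int) Math.min(buf.length, n));
            if (r < 0) {
                throw new EOFException("truncated tar entry");
            }
            n -= r;
        }
    }
}
```

Because gzip offers no random access, every lookup inflates the stream from the start and walks the headers one by one, so a conf near the end of the archive costs nearly a full decompression.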

I think that it would be better to see how much time it adds to extract the files and store them on disk. The fluffing of them would only be when the user wants to browse a description of the module.

I’d like to modify sbmd.toOSIS to check if the sbmd is partial or full and, if not full, re-read the conf fully and then continue as before. I think that is how JSword is designed to retrieve the conf for presentation to the end user. Does AndBible use that or some other mechanism to get what it wants for presentation?

I think I’ll add a “fluff” method to BookMetaData that will do this. This could be called to get it to fluff at another time.
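A minimal sketch of the partial/full idea, purely for illustration. LazyConf, isFull, describe and the key filter are hypothetical names, not the actual BookMetaData or SwordBookMetaData API. The parser assumes a UTF-8 conf with simple key=value lines and ignores continuation lines and repeated keys, which real confs can have.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical stand-in for a lazily loaded conf; not JSword's class.
class LazyConf {
    private final Path confFile;
    private final Map<String, String> values = new LinkedHashMap<>();
    private boolean full;

    /** Partial load: only keys accepted by keyFilter are kept in memory. */
    LazyConf(Path confFile, Predicate<String> keyFilter) throws IOException {
        this.confFile = confFile;
        load(keyFilter);
    }

    boolean isFull() {
        return full;
    }

    /** Re-reads the whole conf from disk if only a partial load was done. */
    void fluff() throws IOException {
        if (full) {
            return;
        }
        load(key -> true);
        full = true;
    }

    String getProperty(String key) {
        return values.get(key);
    }

    /** Presentation path: fluff first, then render, as toOSIS might. */
    String describe() throws IOException {
        fluff();
        return values.toString();
    }

    private void load(Predicate<String> keep) throws IOException {
        values.clear();
        for (String line : Files.readAllLines(confFile, StandardCharsets.UTF_8)) {
            int eq = line.indexOf('=');
            if (eq <= 0 || line.startsWith("#")) {
                continue;   // skips the [Section] header, comments and blank lines
            }
            String key = line.substring(0, eq).trim();
            if (keep.test(key)) {
                values.put(key, line.substring(eq + 1).trim());
            }
        }
    }
}
```

With this shape, code that only lists modules never pays for About and similar fields, while a caller that wants them can call fluff() itself ahead of time.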

DM

> On Jan 11, 2016, at 1:00 PM, Martin Denham <mjdenham at gmail.com> wrote:
> 
> My rough estimates have the total size of conf files in all repos at about 5 MB, which is not too different to the size of a module like ESV, so the impact should not be significant and it should not be a problem if this is required.
> 
> Other things to consider that come to mind i) would need to remove conf files no longer in mods.d.tar.gz or delete and re-extract everything after a refresh ii) Time taken to save files - loading the list is already slow.
> 
> I can't think of any major reason not to do as you describe.
> 
> However, would an easier approach be to find files in the zip a bit like this <http://stackoverflow.com/questions/11123528/finding-a-file-in-zipentry-java>.  Speed would not be an issue because it would only be done once or twice after fetching the list e.g. to view About or to actually download.  The mod.conf file name/path could be saved in SBMD if required.
> 
> Martin
> 
> 
> On 11 January 2016 at 01:39, DM Smith <dmsmith at crosswire.org> wrote:
> I’m trying to figure out how to reload a conf from a remote source (to go from a partial load to a full load).  The problem is that the AbstractSwordInstaller sits over top of mods.d.tar.gz, which it does not unpack. Instead, it iterates over all the entries in that binary file and handles each entry (i.e. a conf) in core. It doesn’t hit the disk. I’m wondering whether it would be alright to unpack the file in the same folder? That would allow a SwordBookMetaData to reload the file. It would also mean that SwordBookMetaData would only need one means of reading a conf as it’d be a file and not a byte array.
> 
> It isn’t a problem with desktop or server apps, but it might be for AndBible.
> 
> — DM
> 
> 
> 
>> On Jan 10, 2016, at 3:31 PM, DM Smith <dmsmith at crosswire.org> wrote:
>> 
>> The problem you encountered was 2 bugs:
>> 1. When the module is not UTF-8 the remote repository’s conf is re-read, but the filter wasn’t passed.
>> 2. Not intended, but IniSection required a filter, rather than saying a null filter meant everything passed.
>> 
>> I’ve checked in that fix. Still trying to make the memory less….
>> 
>> — DM
>> 
>>> On Jan 10, 2016, at 1:18 PM, DM Smith <dmsmith at crosswire.org> wrote:
>>> 
>>> The “Partial load of conf file.” commit was to load all of the things in a conf that the JSword engine needs to work with a module. I don’t know why the CrossWire repo is working for me but not for you. I’ll keep working on it today. The problem with the previous commit was fixed with the last commit. I wasn’t “adjusting” the module after loading to fill in things like BookDriver and BookCategory.
>>> 
>>> I’m wondering whether getting the list of Books from the installer creates a deep rather than a shallow copy of them.
>>> 
>>> Today I hope to make SwordBookMetaData even more lazy. It has a BookDriver and validates its storage when the repo is loaded. I plan to break one of my modules by renaming one of the files and see the impact. Chris and I have noticed that the FileState objects are not fully released. This actually is part of the design.
>>> 
>>> Anyway, I think it is going in the right direction. Reducing the memory 4x is a good thing. The data structures within the IniSection may be too heavy. I may relax the requirement that it maintains the SWORD conf’s order. The idea was to be able to modify the provided conf, retaining its order. However, now we never modify that conf.
>>> 
>>> configAll was a deep clone of configSword. configAll adds in the contents of configJSword and then configFrontend. These last two are created even if not needed. We could make them lazy as well.
>>> 
>>> DM
>>> 
>>>> On Jan 10, 2016, at 11:07 AM, Martin Denham <mjdenham at gmail.com> wrote:
>>>> 
>>>> Thanks for the quick response.  I have had a brief look at the new commits.
>>>> 
>>>> A lot of the attributes aren't being returned now so it is tricky to test and there are various errors but running the current tip 'Partial load of conf file. <https://github.com/crosswire/jsword/commit/80020f51c6a762d458ce8ae70007b78eadee1fb3>' the SBMD for eBible is now only a quarter of the original size at 10Mb which is fine but I still don't understand why it is so large for the minimal attribute set now being returned.
>>>> 
>>>> I get a lot of errors like:
>>>> SwordBookMetaData(492): Book not supported: malformed conf file for [BBE] no ModDrv found.
>>>> SwordBookMetaData(492): Malformed conf file: missing [BBE]Description=. Using BBE
>>>> 
>>>> and peculiarly the eBible repo seems to be the only repo I can use because all the others error.
>>>> 
>>>> I also tried the previous commit Cut the memory requirements of a SwordBookMetaData in half. <https://github.com/crosswire/jsword/commit/cc32ba8f1bb245932a747390d03874b2be70e9a1> but it did not work because basic attributes like language were not being returned.
>>>> 
>>>> I still don't understand why removing configSword should reduce memory by half because it should just be removing references to data that is also referenced from configAll, so it would reduce memory slightly but not much.
>>>> 
>>>> Martin
>>>> 
>>>> 
>>>> 
>>>> On 10 January 2016 at 04:14, DM Smith <dmsmith at crosswire.org> wrote:
>>>> OK. That’s done. Also accidentally introduced a bug with the last commit. It is noticeably fast.
>>>> 
>>>> Next up, allow for *a* SwordBookMetaData to be reloaded fully. This is needed to bring in all the other elements which are information only, such as About, in order to display info to the end user. Since the user will only look at one module’s info at a time, it will load that one. You may need to change your code (hope not) to force that one to reload.
>>>> 
>>>> Give the code a try to see if it solves your out of memory error.
>>>> 
>>>> DM
>>>> 
>>>> 
>>>>> On Jan 9, 2016, at 9:06 PM, DM Smith <dmsmith at crosswire.org> wrote:
>>>>> 
>>>>> I’ll be adding a filter to IniSection. Something like:
>>>>> if (filter.test(key)) {
>>>>> 	// use the key
>>>>> } else {
>>>>> 	// do nothing
>>>>> }
>>>>> 
>>>>> SwordBookMetaData will be responsible for building the filter. At least for a first go around. A single object should do.
>>>>> 
>>>>> DM
>>>>> 
>>>>>> On Jan 9, 2016, at 6:29 PM, DM Smith <dmsmith at crosswire.org> wrote:
>>>>>> 
>>>>>>> 
>>>>>>> Yes, like you I have thought of streamlining conf loading for repo lists.  One idea I had was to enable specification of a filter to SwordBookMetaData to limit the conf values that are stored.
>>>>>> 
>>>>>> I was thinking of something similar. My ideas aren’t good enough to be put into practice, but some kind of flag indicating empty, partially or fully loaded. Empty would mean that it hasn’t gone to disk to get the conf. Partial means that it read everything, but threw away most as not interesting (since the conf does not have order you have to read and parse it all). Full would mean that nothing was pitched. SwordBookMetaData.getProperty would need to be changed to determine whether the key is in memory or might be on disk and do the right thing. Or we could keep getProperty as it is and if you want one of the fields that is not stored (e.g. About) you have to call reload().
>>>>>> 
>>>>>> Maybe we could also cache that info into a separate file(s)? When mods.d.tar.gz is updated then the cache would be recomputed. In doing the computation, each conf would be read then pitched. Basically, the storage would be o.c.c.utils.Ini, if one file or IniSection, if many files.
>>>>>> 
>>>>>> What do you think?
>>>>>> 
>>>>> 
>>>>> _______________________________________________
>>>>> jsword-devel mailing list
>>>>> jsword-devel at crosswire.org
>>>>> http://www.crosswire.org/mailman/listinfo/jsword-devel
