[jsword-devel] Out of Memory Issues Loading repo module lists

Martin Denham mjdenham at gmail.com
Wed Jan 13 15:30:23 MST 2016


The latest code seems to be running quite smoothly.
eBible: 680 modules, 10MB RAM.

I did not notice any new major pauses.
sbmd.reload works well and AB can show the About dialog.  I can see a
slight but insignificant pause as the full attributes are loaded.
I have installed various modules from different repositories.
The above tests were done on a fairly low-spec Android 2.2 AVD with a 64MB
heap.

The only issue I have noticed is that non-Bibles are appearing in the list
of Bibles so I think there may be an issue with Category.  I can see some
commentaries and GenBooks in a list that should just contain Bibles.  If
the problem is not obvious I can investigate further later.

Thanks
Martin

On 12 January 2016 at 13:58, DM Smith <dmsmith at crosswire.org> wrote:

> I’m working on transforming the tar.gz to a zip. A zip gives much faster
> access to the files in it: roughly the same amount of time per file. A
> tar.gz is a gzipped Tape ARchive that has to be read sequentially, so it is
> fast to get the first file and slow to get the last.
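
A minimal Java sketch of the access pattern described above, assuming a Java 9+ runtime. `java.util.zip.ZipFile` looks an entry up by name through the zip's central directory, so reading the last conf should cost about the same as reading the first. The class name `ZipConfLookup`, the helper `writeDemoZip`, and the `mods.d/modN.conf` entry names are hypothetical and exist only for this illustration; this is not JSword code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class ZipConfLookup {
    /** Reads one named entry from a zip; returns null if the entry is absent. */
    public static String readEntry(Path zip, String entryName) throws IOException {
        try (ZipFile zf = new ZipFile(zip.toFile())) {
            // Lookup by name: no need to walk the preceding entries.
            ZipEntry entry = zf.getEntry(entryName);
            if (entry == null) {
                return null;
            }
            try (InputStream in = zf.getInputStream(entry)) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        }
    }

    /** Writes a throwaway zip holding `count` small, made-up conf files. */
    public static Path writeDemoZip(int count) throws IOException {
        Path zip = Files.createTempFile("mods.d", ".zip");
        try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(zip))) {
            for (int i = 0; i < count; i++) {
                out.putNextEntry(new ZipEntry("mods.d/mod" + i + ".conf"));
                String body = "[MOD" + i + "]\nDescription=Demo " + i + "\n";
                out.write(body.getBytes(StandardCharsets.UTF_8));
                out.closeEntry();
            }
        }
        return zip;
    }

    public static void main(String[] args) throws IOException {
        Path zip = writeDemoZip(680);
        // Fetch the last entry directly by name.
        System.out.println(readEntry(zip, "mods.d/mod679.conf"));
        Files.delete(zip);
    }
}
```

The same lookup over a tar.gz would have to inflate and scan the stream up to the wanted entry, which is the cost the message is trying to avoid.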
>
> I just did some computations and unpacking the tar.gz is not good for your
> app. But you told me that…. :)
>
> In Him,
> DM
>
> On Jan 12, 2016, at 8:16 AM, Martin Denham <mjdenham at gmail.com> wrote:
>
> I like this idea: "A file in a jar has an URL that is something like
> …/fred.jar!file"
>
> Martin
>
> On 11 January 2016 at 23:29, DM Smith <dmsmith at crosswire.org> wrote:
>
>>
>> On Jan 11, 2016, at 6:07 PM, Martin Denham <mjdenham at gmail.com> wrote:
>>
>> My estimate of file size might be too low because I forgot to take into
>> account block size.  A quick experiment on my android suggests it adds
>> about 40%, making it at least 7MB for the conf files.
>>
>>
>> Understand.
>>
>>
>> By 'fluff' do you mean extract all the files from mods.d.tar.gz and write
>> them all to disk?  I am a little concerned about writing and deleting
>> hundreds of small files to the sd card repeatedly.  SD cards are not as
>> good at high r/w as normal disks or flash drives.  That is the reason I do
>> not store the AB database on the SD card.
>>
>>
>> By fluff, I meant that the conf file would be re-read without a filter,
>> thus getting everything.
>>
>>
>> The process described would also make viewing a description (in AB
>> right-click About) an unexpectedly expensive operation involving writing
>> hundreds of files to the sd card.
>>
>>
>> It would involve re-reading the one file without a filter. It should
>> happen fast.
>>
>>
>> I did not know about sbmd.toOSIS() and have not used it.  AB just pops up
>> a little dialog with a few fields like About, copyright, licence, version,
>> versification.
>>
>>
>> Ok. Then you’ll need to call “fluff” before retrieving those fields. The
>> code for fluff (or whatever we call it) would be something like:
>> public void fluff() {
>>     if (partiallyLoaded) {
>>         // re-read and process the conf without a filter
>>         partiallyLoaded = false;
>>     }
>> }
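
The fluff idea above could be fleshed out roughly as follows. This is only a sketch under stated assumptions: `LazyConf`, its `Supplier<String>` source (standing in for re-reading the conf file) and its simple `key=value` parsing are all hypothetical, and the real SwordBookMetaData may be organised quite differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;
import java.util.function.Supplier;

public class LazyConf {
    private final Supplier<String> source; // hypothetical: re-reads the conf text on demand
    private final Map<String, String> values = new LinkedHashMap<>();
    private boolean partiallyLoaded;

    /** Loads only the keys accepted by the filter. */
    public LazyConf(Supplier<String> source, Predicate<String> filter) {
        this.source = source;
        load(filter);
        this.partiallyLoaded = true;
    }

    private void load(Predicate<String> filter) {
        values.clear();
        for (String line : source.get().split("\\R")) {
            int eq = line.indexOf('=');
            if (eq <= 0) {
                continue; // section headers such as [KJV] and blank lines
            }
            String key = line.substring(0, eq).trim();
            if (filter.test(key)) {
                values.put(key, line.substring(eq + 1).trim());
            }
        }
    }

    /** Re-reads the conf without a filter, once. */
    public void fluff() {
        if (partiallyLoaded) {
            load(key -> true);
            partiallyLoaded = false;
        }
    }

    public String get(String key) {
        return values.get(key);
    }

    public boolean isPartiallyLoaded() {
        return partiallyLoaded;
    }

    public static void main(String[] args) {
        String conf = "[KJV]\nModDrv=zText\nAbout=King James Version\n";
        LazyConf kjv = new LazyConf(() -> conf, key -> key.equals("ModDrv"));
        System.out.println(kjv.get("About")); // not loaded yet
        kjv.fluff();
        System.out.println(kjv.get("About")); // available after fluff()
    }
}
```

A front end would call `fluff()` just before showing its About dialog, so only the one module being viewed pays for the full read.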
>>
>>
>> For the 2 reasons above my preference would be to avoid writing hundreds
>> of files to the SD card but I can't think of a perfect solution.  While
>> grappling with this last week I was just trying to get the original code to
>> work more efficiently (but failed).  I am not very experienced in Memory
>> Analysis but suspected the memory use was higher than it might have been.
>>
>> By design, which files do you write to SD card? If they are only written
>> when the mods.d.tar.gz is downloaded, would that help?
>>
>>
>> If the tar.gz was searched each time for the conf, it would be more
>> expensive because the tar.gz would be processed every time a Description
>> is requested, but the first time it would be quicker than writing hundreds
>> of .conf files. To be honest, I think a lot of people do not know about
>> the long-press menu in AB, so probably just the initial list of modules
>> would be used most of the time.
>>
>>
>> I don’t know about the long-press menu. In BibleDesktop, it is easy to
>> navigate from one available to the next and each time it shows the full
>> conf.
>>
>>
>> Coincidentally my android slowed to a crawl when I tried to copy all of
>> eBible's .conf files to it just now - initially fast then 1 file per 3
>> secs, after 10 minutes I unplugged it, although that probably is not a
>> realistic test and there is probably an explanation for the issue.
>>
>>
>> I’ve nearly got the code written to unpack the conf. Let me zip up the
>> files that have changed and send them to you.
>>
>> Basically, if you delete mods.d.tar.gz, it will fetch a new one (current
>> behavior). If you delete mods.d/ it will unpack mods.d.tar.gz into it. If
>> you fetch mods.d.tar.gz it will unpack it into mods.d. All of this takes
>> place in the folder where mods.d.tar.gz is present.
>>
>> I tried adding new confs to mods.d that weren’t in mods.d.tar.gz to
>> simulate a takedown and that works as well.
>>
>> If this code is no good for you, I’ve another thought. A file in a jar
>> has an URL that is something like …/fred.jar!file. Maybe we can transform
>> the mods.d.tar.gz into mods.d.tar and use that addressing mechanism to
>> fetch the file? I’ll take a look at how the JRE does that. Maybe, I’ll roll
>> the same for JSword over a tar.gz file.
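
For reference, the JRE's jar addressing takes the form `jar:<archive-url>!/<entry>` and works on zip and jar archives. The sketch below, assuming a Java 9+ runtime, reads one entry that way. The class `JarUrlRead`, the helper `demoArchive`, and the entry name `mods.d/kjv.conf` are hypothetical. The standard library has no tar reader, so doing the same over a tar.gz would need custom code, as the message suggests.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class JarUrlRead {
    /** Reads one entry through a jar: URL of the form jar:<archive-url>!/<entry>. */
    public static String read(Path archive, String entryName) throws IOException {
        URL url = URI.create("jar:" + archive.toUri() + "!/" + entryName).toURL();
        URLConnection conn = url.openConnection();
        conn.setUseCaches(false); // do not keep the archive open after reading
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    /** Builds a tiny demo archive holding one made-up conf entry. */
    public static Path demoArchive() throws IOException {
        Path zip = Files.createTempFile("mods.d", ".zip");
        try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(zip))) {
            out.putNextEntry(new ZipEntry("mods.d/kjv.conf"));
            out.write("[KJV]\nModDrv=zText\n".getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        }
        return zip;
    }

    public static void main(String[] args) throws IOException {
        Path zip = demoArchive();
        System.out.print(read(zip, "mods.d/kjv.conf"));
        Files.delete(zip);
    }
}
```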
>>
>> DM
>>
>>
>> Martin
>>
>> On 11 January 2016 at 19:28, DM Smith <dmsmith at crosswire.org> wrote:
>>
>>> I have been thinking about this a bit more. I knew there was a need
>>> to prevent stale confs. The time performance is something that I’m not able
>>> to test. My machine has an SSD, a fast 4 core CPU and gobs of RAM. So I
>>> need you to keep me in line. ;)
>>>
>>> The easiest way to keep it pristine is to unpack it into a temporary
>>> folder, rename the old folder out of the way, rename the new folder into
>>> place, and finally delete the old folder. Doing it in this order minimizes
>>> the time that mods.d is unavailable, which is important for multi-threaded
>>> apps and for multiple apps that share the same machine simultaneously.
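
The rename sequence above can be sketched in Java as follows, assuming a runtime that provides `java.nio.file` and that both folders sit on the same filesystem so the moves are plain renames. The class `SwapDir`, the method names, and the `.old` suffix are hypothetical choices for this illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class SwapDir {
    /** Replaces `live` with the freshly unpacked `fresh` folder. */
    public static void swapIn(Path live, Path fresh) throws IOException {
        Path old = live.resolveSibling(live.getFileName() + ".old");
        deleteTree(old);               // clear leftovers from an interrupted run
        if (Files.exists(live)) {
            Files.move(live, old);     // step 1: rename the old folder away
        }
        Files.move(fresh, live);       // step 2: rename the new folder into place
        deleteTree(old);               // step 3: delete the old folder last
    }

    /** Deletes a folder and everything under it; does nothing if it is absent. */
    static void deleteTree(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(root)) {
            // Reverse order puts children before their parent directories.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }
}
```

Between steps 1 and 2 the live folder is briefly missing, so readers still need to tolerate a short gap; the ordering only keeps that gap to two renames rather than the whole unpack.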
>>>
>>> Right now the SwordBookMetaData remembers the File for the conf of
>>> installed modules and is able to re-read it easily. But it does not store
>>> anything about a conf’s location when it is from mods.d.tar.gz. I suppose I
>>> could have it remember the location of mods.d.tar.gz and the name of the
>>> conf entry and create a method to extract that conf out of the compressed
>>> archive. This would need to be done for each module for which the user
>>> requests info. To do this is quite expensive as it means inflating the
>>> file then iterating over the contents until the desired conf is found.
>>>
>>> I think that it would be better to see how much time it adds to extract
>>> the files and store them on disk. The fluffing of them would only be when
>>> the user wants to browse a description of the module.
>>>
>>> I’d like to modify sbmd.toOSIS to check if the sbmd is partial or full
>>> and if not full re-read the conf fully and then continue as before. I think
>>> that is how JSword is designed to retrieve the conf for presentation to the
>>> end user. Does AndBible use that or some other mechanism to get what it
>>> wants for presentation?
>>>
>>> I think I’ll add a “fluff” method to BookMetaData that will do this.
>>> This could be called to get it to fluff at another time.
>>>
>>> DM
>>>
>>> On Jan 11, 2016, at 1:00 PM, Martin Denham <mjdenham at gmail.com> wrote:
>>>
>>> My rough estimates put the total size of conf files in all repos at
>>> about 5MB, which is not too different from the size of a module like ESV,
>>> so the impact should not be significant and it should not be a problem if
>>> this is required.
>>>
>>> Other things to consider that come to mind i) would need to remove conf
>>> files no longer in mods.d.tar.gz or delete and re-extract everything after
>>> a refresh ii) Time taken to save files - loading the list is already slow.
>>>
>>> I can't think of any major reason not to do as you describe.
>>>
>>> However, would an easier approach be to find files in the zip a bit like
>>> this
>>> <http://stackoverflow.com/questions/11123528/finding-a-file-in-zipentry-java>.
>>> Speed would not be an issue because it would only be done once or twice
>>> after fetching the list e.g. to view About or to actually download.  The
>>> mod.conf file name/path could be saved in SBMD if required.
>>>
>>> Martin
>>>
>>>
>>> On 11 January 2016 at 01:39, DM Smith <dmsmith at crosswire.org> wrote:
>>>
>>>> I’m trying to figure out how to reload a conf from a remote source (to
>>>> go from a partial load to a full load).  The problem is that the
>>>> AbstractSwordInstaller sits over top of mods.d.tar.gz, which it does not
>>>> unpack. Instead, it iterates over all the entries in that binary file and
>>>> handles each entry (i.e. a conf) in core. It doesn’t hit the disk. I’m
>>>> wondering whether it would be alright to unpack the file in the same
>>>> folder? That would allow a SwordBookMetaData to reload the file. It would
>>>> also mean that SwordBookMetaData would only need one means of reading a
>>>> conf as it’d be a file and not a byte array.
>>>>
>>>> It isn’t a problem with desktop or server apps, but it might be for
>>>> AndBible.
>>>>
>>>> — DM
>>>>
>>>>
>>>>
>>>> On Jan 10, 2016, at 3:31 PM, DM Smith <dmsmith at crosswire.org> wrote:
>>>>
>>>> The problem you encountered was 2 bugs:
>>>> 1. When the module is not UTF-8 the remote repository’s conf is re-read,
>>>>    but the filter wasn’t passed.
>>>> 2. Not intended, but IniSection required a filter, rather than saying a
>>>>    null filter meant everything passed.
>>>>
>>>> I’ve checked in that fix. Still trying to make the memory less….
>>>>
>>>> — DM
>>>>
>>>> On Jan 10, 2016, at 1:18 PM, DM Smith <dmsmith at crosswire.org> wrote:
>>>>
>>>> The “Partial load of conf file.” was to load all of the things in a
>>>> conf that the JSword engine needs to work with a module. I don’t know why
>>>> the CrossWire repo is working for me but not for you. I’ll keep working on
>>>> it today. The problem with the previous commit was fixed with the last
>>>> commit. I wasn’t “adjusting” the module after loading to fill in things
>>>> like BookDriver and BookCategory.
>>>>
>>>> I’m wondering whether getting the list of Books from the installer
>>>> creates a deep rather than a shallow copy of them.
>>>>
>>>> Today I hope to make SwordBookMetaData even more lazy. It has a
>>>> BookDriver and validates its storage when the repo is loaded. I plan to
>>>> break one of my modules by renaming one of the files and see the impact.
>>>> Chris and I have noticed that the FileState objects are not fully released.
>>>> This actually is part of the design.
>>>>
>>>> Anyway, I think it is going in the right direction. Reducing the memory
>>>> 4x is a good thing. The data structures within the IniSection may be too
>>>> heavy. I may relax the requirement that it maintains the SWORD conf’s order.
>>>> The idea was to be able to modify the provided conf, retaining its order.
>>>> However, now we never modify that conf.
>>>>
>>>> configAll was a deep clone of configSword. configAll adds in the
>>>> contents of configJSword and then configFrontend. These last two are
>>>> created even if not needed. We could make them lazy as well.
>>>>
>>>> DM
>>>>
>>>> On Jan 10, 2016, at 11:07 AM, Martin Denham <mjdenham at gmail.com> wrote:
>>>>
>>>> Thanks for the quick response.  I have had a brief look at the new
>>>> commits.
>>>>
>>>> A lot of the attributes aren't being returned now so it is tricky to
>>>> test and there are various errors but running the current tip 'Partial
>>>> load of conf file.
>>>> <https://github.com/crosswire/jsword/commit/80020f51c6a762d458ce8ae70007b78eadee1fb3>'
>>>> the SBMD for eBible is now only a quarter of the original size at 10MB
>>>> which is fine but I still don't understand why it is so large for the
>>>> minimal attribute set now being returned.
>>>>
>>>> I get a lot of errors like:
>>>> SwordBookMetaData(492): Book not supported: malformed conf file for
>>>> [BBE] no ModDrv found.
>>>> SwordBookMetaData(492): Malformed conf file: missing [BBE]Description=.
>>>> Using BBE
>>>>
>>>> and peculiarly the eBible repo seems to be the only repo I can use
>>>> because all the others error.
>>>>
>>>> I also tried the previous commit 'Cut the memory requirements of a
>>>> SwordBookMetaData in half.
>>>> <https://github.com/crosswire/jsword/commit/cc32ba8f1bb245932a747390d03874b2be70e9a1>'
>>>> but it did not work because basic attributes like language were not
>>>> being returned.
>>>>
>>>> I still don't understand why removing configSword should reduce memory
>>>> by half because it should just be removing references to data that is also
>>>> referenced from configAll, so it would reduce memory slightly but not much.
>>>>
>>>> Martin
>>>>
>>>>
>>>>
>>>> On 10 January 2016 at 04:14, DM Smith <dmsmith at crosswire.org> wrote:
>>>>
>>>>> OK. That’s done. Also accidentally introduced a bug with the last
>>>>> commit. It is noticeably fast.
>>>>>
>>>>> Next up, allow for *a* SwordBookMetaData to be reloaded fully. This is
>>>>> needed to bring in all the other elements which are information only, such
>>>>> as About, in order to display info to the end user. Since the user will
>>>>> only look at one module’s info at a time, it will load that one. You may
>>>>> need to change your code (hope not) to force that one to reload.
>>>>>
>>>>> Give the code a try to see if it solves your out of memory error.
>>>>>
>>>>> DM
>>>>>
>>>>>
>>>>> On Jan 9, 2016, at 9:06 PM, DM Smith <dmsmith at crosswire.org> wrote:
>>>>>
>>>>> I’ll be adding a filter to IniSection. Something like:
>>>>> if (filter.test(key)) {
>>>>>     // use the key
>>>>> } else {
>>>>>     // do nothing
>>>>> }
>>>>>
>>>>> SwordBookMetaData will be responsible for building the filter. At
>>>>> least for a first go around. A single object should do.
>>>>>
>>>>> DM
>>>>>
>>>>> On Jan 9, 2016, at 6:29 PM, DM Smith <dmsmith at crosswire.org> wrote:
>>>>>
>>>>>
>>>>> Yes, like you I have thought of streamlining conf loading for repo
>>>>> lists.  One idea I had was to enable specification of a filter to
>>>>> SwordBookMetaData to limit the conf values that are stored.
>>>>>
>>>>>
>>>>> I was thinking of something similar. My ideas aren’t good enough to be
>>>>> put into practice, but some kind of flag indicating empty, partially or
>>>>> fully loaded. Empty would mean that it hasn’t gone to disk to get the conf.
>>>>> Partial means that it read everything, but threw away most as not
>>>>> interesting (since the conf does not have order you have to read and parse
>>>>> it all). Full would mean that nothing was pitched.
>>>>> SwordBookMetaData.getProperty would need to be changed to determine whether
>>>>> the key is in memory or might be on disk and do the right thing. Or we
>>>>> could keep getProperty as it is and if you want one of the fields that is
>>>>> not stored (e.g. About) you have to call reload().
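
The first alternative above, where getProperty itself decides whether to go back to disk, might look something like this. It is a sketch only, assuming a Java 9+ runtime: `TieredMetaData`, the `LoadState` enum, and the `Supplier` that stands in for reading and parsing the conf are hypothetical names, not the JSword API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

public class TieredMetaData {
    public enum LoadState { EMPTY, PARTIAL, FULL }

    private final Supplier<Map<String, String>> reader; // hypothetical: parses the whole conf
    private final Set<String> coreKeys;                 // keys kept after a partial load
    private final Map<String, String> values = new HashMap<>();
    private LoadState state = LoadState.EMPTY;

    public TieredMetaData(Supplier<Map<String, String>> reader, Set<String> coreKeys) {
        this.reader = reader;
        this.coreKeys = coreKeys;
    }

    /** Returns the value, escalating EMPTY -> PARTIAL -> FULL only as needed. */
    public String getProperty(String key) {
        if (state == LoadState.EMPTY) {
            loadPartial();
        }
        if (!values.containsKey(key) && state != LoadState.FULL) {
            values.putAll(reader.get()); // key may be on disk: keep everything
            state = LoadState.FULL;
        }
        return values.get(key);
    }

    /** Reads everything but keeps only the core keys. */
    private void loadPartial() {
        for (Map.Entry<String, String> e : reader.get().entrySet()) {
            if (coreKeys.contains(e.getKey())) {
                values.put(e.getKey(), e.getValue());
            }
        }
        state = LoadState.PARTIAL;
    }

    public LoadState state() {
        return state;
    }

    public static void main(String[] args) {
        TieredMetaData md = new TieredMetaData(
                () -> Map.of("ModDrv", "zText", "About", "Demo text"),
                Set.of("ModDrv"));
        System.out.println(md.getProperty("ModDrv") + " " + md.state());
        System.out.println(md.getProperty("About") + " " + md.state());
    }
}
```

The trade-off against an explicit reload() is that a lookup of a key that does not exist anywhere also triggers one full read, so callers that probe for optional keys would fluff every module.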
>>>>>
>>>>> Maybe we could also cache that info into a separate file(s)? When
>>>>> mods.d.tar.gz is updated then the cache would be recomputed. In doing the
>>>>> computation, each conf would be read then pitched. Basically, the storage
>>>>> would be o.c.c.utils.Ini, if one file or IniSection, if many files.
>>>>>
>>>>> What do you think?
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> jsword-devel mailing list
>>>>> jsword-devel at crosswire.org
>>>>> http://www.crosswire.org/mailman/listinfo/jsword-devel