[jsword-devel] Out of Memory Issues Loading repo module lists

Martin Denham mjdenham at gmail.com
Mon Jan 11 16:07:50 MST 2016


My estimate of file size might be too low because I forgot to take block
size into account.  A quick check on my Android device suggests that adds
about 40%, making it at least 7 MB for the conf files.
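
To illustrate the block-size effect, here is a rough sketch of the
arithmetic.  The 4096-byte block size and the file lengths in main are
made-up values for illustration, not measurements from any repo, and the
class and method names are hypothetical.

```java
public final class BlockSizeEstimate {
    /**
     * Bytes allocated for a file of the given length, rounded up to whole
     * blocks. Assumes an empty file takes no data blocks.
     */
    static long allocatedBytes(long fileLength, long blockSize) {
        if (fileLength == 0) {
            return 0;
        }
        long blocks = (fileLength + blockSize - 1) / blockSize; // ceiling division
        return blocks * blockSize;
    }

    /** Sum of the allocated sizes of many files on the same file system. */
    static long totalAllocated(long[] fileLengths, long blockSize) {
        long total = 0;
        for (long length : fileLengths) {
            total += allocatedBytes(length, blockSize);
        }
        return total;
    }

    public static void main(String[] args) {
        long[] confSizes = {700, 1500, 5000}; // hypothetical conf sizes in bytes
        long raw = 0;
        for (long length : confSizes) {
            raw += length;
        }
        long onDisk = totalAllocated(confSizes, 4096); // 4096 is an assumed block size
        System.out.println("raw=" + raw + " bytes, on disk=" + onDisk + " bytes");
    }
}
```

The smaller the files are relative to the block size, the larger the
overhead, which is why hundreds of small conf files cost more than their
summed lengths suggest.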

By 'fluff' do you mean extract all the files from mods.d.tar.gz and write
them all to disk?  I am a little concerned about repeatedly writing and
deleting hundreds of small files on the SD card.  SD cards do not handle
heavy read/write loads as well as normal disks or flash drives.  That is
the reason I do not store the AB database on the SD card.

The process described would also make viewing a description (in AB
right-click About) an unexpectedly expensive operation involving writing
hundreds of files to the SD card.

I did not know about sbmd.toOSIS() and have not used it.  AB just pops up a
little dialog with a few fields like About, copyright, licence, version,
versification.

For the 2 reasons above my preference would be to avoid writing hundreds of
files to the SD card, but I can't think of a perfect solution.  While
grappling with this last week I was just trying to get the original code to
work more efficiently (but failed).  I am not very experienced in memory
analysis but suspected the memory use was higher than it needed to be.

If the tar.gz were searched each time for the conf, every Description
request would mean processing the tar.gz again, which is more expensive
per request.  But the first load would be quicker than writing hundreds of
.conf files, and to be honest I think a lot of people do not know about the
long-press menu in AB, so most of the time only the initial list of modules
would be used.
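
As a rough sketch of what searching the tar.gz for one conf could look
like, the following scans the archive and returns the bytes of a single
named entry without writing anything to disk.  This is not JSword code:
TarGzEntryFinder and findEntry are hypothetical names, and the hand-rolled
header parsing is only there to keep the example dependency-free.  It
reads just the basic 100-byte name field and the octal size field, so long
names and other tar extensions are not handled; a real implementation
would more likely use an existing tar library.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

public final class TarGzEntryFinder {
    private static final int BLOCK = 512; // tar headers and data padding use 512-byte blocks

    /** Returns the bytes of the named entry, or null if it is not in the archive. */
    static byte[] findEntry(InputStream tarGz, String wantedName) throws IOException {
        DataInputStream in = new DataInputStream(new GZIPInputStream(tarGz));
        byte[] header = new byte[BLOCK];
        while (true) {
            try {
                in.readFully(header);
            } catch (EOFException e) {
                return null; // stream ended without the entry
            }
            if (header[0] == 0) {
                return null; // an empty name is treated as the end-of-archive marker
            }
            String name = field(header, 0, 100);
            long size = Long.parseLong(field(header, 124, 12).trim(), 8); // size is octal text
            if (name.equals(wantedName)) {
                byte[] data = new byte[(int) size];
                in.readFully(data);
                return data;
            }
            // Not the entry we want: skip its data, which is padded to a whole block.
            skipFully(in, (size + BLOCK - 1) / BLOCK * BLOCK);
        }
    }

    /** Reads a NUL-terminated ASCII field from a header block. */
    private static String field(byte[] block, int offset, int length) {
        int end = offset;
        while (end < offset + length && block[end] != 0) {
            end++;
        }
        return new String(block, offset, end - offset, StandardCharsets.US_ASCII);
    }

    /** Skips exactly n bytes, since InputStream.skip may skip fewer than asked. */
    private static void skipFully(InputStream in, long n) throws IOException {
        while (n > 0) {
            long skipped = in.skip(n);
            if (skipped <= 0) {
                if (in.read() < 0) {
                    throw new EOFException();
                }
                skipped = 1;
            }
            n -= skipped;
        }
    }
}
```

The cost is that every lookup inflates the stream from the start up to the
wanted entry, so it only makes sense if lookups are rare.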

Coincidentally, my Android device slowed to a crawl when I tried to copy
all of eBible's .conf files to it just now: initially fast, then 1 file
per 3 secs.  After 10 minutes I unplugged it.  That is probably not a
realistic test, though, and there is probably an explanation for the issue.

Martin

On 11 January 2016 at 19:28, DM Smith <dmsmith at crosswire.org> wrote:

> I have been thinking about this a bit more. I knew there was a need to
> prevent stale confs. The time performance is something that I’m not able to
> test. My machine has an SSD, a fast 4 core CPU and gobs of RAM. So I need
> you to keep me in line. ;)
>
> The easiest way to keep it pristine is to unpack it into a temporary
> folder, rename the old folder, rename the new folder into place, and
> finally delete the old folder. Doing it in this order minimizes the time
> that mods.d is unavailable, which is important for multi-threaded apps and
> for multiple apps that share the same machine simultaneously.
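
The rename sequence described above could be sketched roughly as follows.
ModsDirSwap, swapIn and the ".old" suffix are hypothetical names chosen for
illustration, not part of JSword.  The sketch uses java.nio.file, which may
not be available on older Android versions (java.io.File.renameTo would be
the rough equivalent there), and it assumes both folders are on the same
file system so that each move is a plain rename.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

public final class ModsDirSwap {
    /**
     * Puts freshDir in place of liveDir: live -> old, fresh -> live, delete old.
     * The live folder is only missing between the two renames.
     */
    static void swapIn(Path freshDir, Path liveDir) throws IOException {
        Path oldDir = liveDir.resolveSibling(liveDir.getFileName() + ".old");
        deleteTree(oldDir); // clear leftovers from an interrupted earlier run
        if (Files.exists(liveDir)) {
            Files.move(liveDir, oldDir);
        }
        Files.move(freshDir, liveDir);
        deleteTree(oldDir); // the slow part happens after the new folder is live
    }

    /** Deletes a folder and everything under it; does nothing if it is absent. */
    static void deleteTree(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc)
                    throws IOException {
                if (exc != null) {
                    throw exc;
                }
                Files.delete(dir); // children are gone by now, so the folder is empty
                return FileVisitResult.CONTINUE;
            }
        });
    }
}
```

Note that this still writes and deletes every conf on each refresh, so it
does not address the SD card concern; it only shortens the window in which
mods.d is missing.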
>
> Right now the SwordBookMetaData remembers the File for the conf of
> installed modules and is able to re-read it easily. But it does not store
> anything about a conf’s location when it is from mods.d.tar.gz. I suppose I
> could have it remember the location of mods.d.tar.gz and the name of the
> conf entry and create a method to extract that conf out of the compressed
> archive. This would need to be done for each module that the user requests
> info for. Doing this is quite expensive as it means inflating the file then
> iterating over the contents until the desired conf is found.
>
> I think that it would be better to see how much time it adds to extract
> the files and store them on disk. The fluffing of them would only be when
> the user wants to browse a description of the module.
>
> I’d like to modify sbmd.toOSIS to check if the sbmd is partial or full and,
> if not full, re-read the conf fully and then continue as before. I think
> that is how JSword is designed to retrieve the conf for presentation to the
> end user. Does AndBible use that or some other mechanism to get what it
> wants for presentation?
>
> I think I’ll add a “fluff” method to BookMetaData that will do this. This
> could be called to get it to fluff at another time.
>
> DM
>
> On Jan 11, 2016, at 1:00 PM, Martin Denham <mjdenham at gmail.com> wrote:
>
> My rough estimates have the total size of conf files in all repos at about
> 5 MB, which is not too different from the size of a module like ESV, so the
> impact should not be significant and it should not be a problem if this is
> required.
>
> Other things to consider that come to mind i) would need to remove conf
> files no longer in mods.d.tar.gz or delete and re-extract everything after
> a refresh ii) Time taken to save files - loading the list is already slow.
>
> I can't think of any major reason not to do as you describe.
>
> However, would an easier approach be to find files in the zip a bit like
> this
> <http://stackoverflow.com/questions/11123528/finding-a-file-in-zipentry-java>.
> Speed would not be an issue because it would only be done once or twice
> after fetching the list e.g. to view About or to actually download.  The
> mod.conf file name/path could be saved in SBMD if required.
>
> Martin
>
>
> On 11 January 2016 at 01:39, DM Smith <dmsmith at crosswire.org> wrote:
>
>> I’m trying to figure out how to reload a conf from a remote source (to go
>> from a partial load to a full load).  The problem is that the
>> AbstractSwordInstaller sits over top of mods.d.tar.gz, which it does not
>> unpack. Instead, it iterates over all the entries in that binary file and
>> handles each entry (i.e. a conf) in core. It doesn’t hit the disk. I’m
>> wondering whether it would be alright to unpack the file in the same
>> folder? That would allow a SwordBookMetaData to reload the file. It would
>> also mean that SwordBookMetaData would only need one means of reading a
>> conf as it’d be a file and not a byte array.
>>
>> It isn’t a problem with desktop or server apps, but it might be for
>> AndBible.
>>
>> — DM
>>
>>
>>
>> On Jan 10, 2016, at 3:31 PM, DM Smith <dmsmith at crosswire.org> wrote:
>>
>> The problem you encountered was 2 bugs:
>> 1. When the module is not UTF-8 the remote repository’s conf is re-read, but
>>    the filter wasn’t passed.
>> 2. Unintentionally, IniSection required a filter, rather than treating a
>>    null filter as meaning everything passes.
>>
>> I’ve checked in that fix. Still trying to make the memory less….
>>
>> — DM
>>
>> On Jan 10, 2016, at 1:18 PM, DM Smith <dmsmith at crosswire.org> wrote:
>>
>> The “Partial load of conf file.” was to load all of the things in a conf
>> that the JSword engine needs to work with a module. I don’t know why the
>> CrossWire repo is working for me but not for you. I’ll keep working on it
>> today. The problem with the previous commit was fixed with the last commit.
>> I wasn’t “adjusting” the module after loading to fill in things like
>> BookDriver and BookCategory.
>>
>> I’m wondering whether getting the list of Books from the installer
>> creates a deep rather than a shallow copy of them.
>>
>> Today I hope to make SwordBookMetaData even more lazy. It has a
>> BookDriver and validates its storage when the repo is loaded. I plan to
>> break one of my modules by renaming one of the files and see the impact.
>> Chris and I have noticed that the FileState objects are not fully released.
>> This actually is part of the design.
>>
>> Anyway, I think it is going in the right direction. Reducing the memory
>> 4x is a good thing. The data structures within the IniSection may be too
>> heavy. I may relax the requirement that it maintains the SWORD conf’s order.
>> The idea was to be able to modify the provided conf, retaining its order.
>> However, now we never modify that conf.
>>
>> configAll was a deep clone of configSword. configAll adds in the contents
>> of configJSword and then configFrontend. These last two are created even if
>> not needed. We could make them lazy as well.
>>
>> DM
>>
>> On Jan 10, 2016, at 11:07 AM, Martin Denham <mjdenham at gmail.com> wrote:
>>
>> Thanks for the quick response.  I have had a brief look at the new
>> commits.
>>
>> A lot of the attributes aren't being returned now so it is tricky to test
>> and there are various errors but running the current tip 'Partial load
>> of conf file.
>> <https://github.com/crosswire/jsword/commit/80020f51c6a762d458ce8ae70007b78eadee1fb3>'
>> the SBMD for eBible is now only a quarter of the original size at 10Mb
>> which is fine but I still don't understand why it is so large for the
>> minimal attribute set now being returned.
>>
>> I get a lot of errors like:
>> SwordBookMetaData(492): Book not supported: malformed conf file for [BBE]
>> no ModDrv found.
>> SwordBookMetaData(492): Malformed conf file: missing [BBE]Description=.
>> Using BBE
>>
>> and peculiarly the eBible repo seems to be the only repo I can use
>> because all the others error.
>>
>> I also tried the previous commit Cut the memory requirements of a
>> SwordBookMetaData in half.
>> <https://github.com/crosswire/jsword/commit/cc32ba8f1bb245932a747390d03874b2be70e9a1> but
>> it did not work because basic attributes like language were not being
>> returned.
>>
>> I still don't understand why removing configSword should reduce memory by
>> half because it should just be removing references to data that is also
>> referenced from configAll, so it would reduce memory slightly but not much.
>>
>> Martin
>>
>>
>>
>> On 10 January 2016 at 04:14, DM Smith <dmsmith at crosswire.org> wrote:
>>
>>> OK. That’s done. Also accidentally introduced a bug with the last
>>> commit. It is noticeably fast.
>>>
>>> Next up, allow for *a* SwordBookMetaData to be reloaded fully. This is
>>> needed to bring in all the other elements which are information only, such
>>> as About, in order to display info to the end user. Since the user will
>>> only look at one module’s info at a time, it will load that one. You may
>>> need to change your code (hope not) to force that one to reload.
>>>
>>> Give the code a try to see if it solves your out of memory error.
>>>
>>> DM
>>>
>>>
>>> On Jan 9, 2016, at 9:06 PM, DM Smith <dmsmith at crosswire.org> wrote:
>>>
>>> I’ll be adding a filter to IniSection. Something like:
>>> if (filter.test(key)) {
>>>     // use the key
>>> } else {
>>>     // do nothing
>>> }
>>>
>>> SwordBookMetaData will be responsible for building the filter. At least
>>> for a first go around. A single object should do.
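
A minimal sketch of how such a key filter might be applied while reading a
conf is below.  FilteredIniReader, KeyFilter and parse are hypothetical
names, not the actual IniSection API.  The parsing is deliberately
simplistic (one key=value per line, no continuation lines), and a null
filter is taken to mean that every key passes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public final class FilteredIniReader {
    /** Hypothetical filter type; decides which conf keys are worth keeping. */
    interface KeyFilter {
        boolean test(String key);
    }

    /**
     * Parses simple key=value lines, keeping only keys accepted by the filter.
     * A null filter keeps every key. Section headers such as [KJV] and lines
     * without a key are skipped.
     */
    static Map<String, String> parse(String confText, KeyFilter filter) {
        // LinkedHashMap keeps the keys in the order they appear in the conf.
        Map<String, String> kept = new LinkedHashMap<String, String>();
        for (String line : confText.split("\r?\n")) {
            int eq = line.indexOf('=');
            if (eq <= 0 || line.startsWith("[")) {
                continue;
            }
            String key = line.substring(0, eq).trim();
            if (filter == null || filter.test(key)) {
                kept.put(key, line.substring(eq + 1).trim());
            }
            // Otherwise do nothing: the value is never stored.
        }
        return kept;
    }
}
```

Every line is still read and split, so the filter saves memory rather
than parse time.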
>>>
>>> DM
>>>
>>> On Jan 9, 2016, at 6:29 PM, DM Smith <dmsmith at crosswire.org> wrote:
>>>
>>>
>>> Yes, like you I have thought of streamlining conf loading for repo
>>> lists.  One idea I had was to enable specification of a filter to
>>> SwordBookMetaData to limit the conf values that are stored.
>>>
>>>
>>> I was thinking of something similar. My ideas aren’t good enough to be
>>> put into practice, but some kind of flag indicating empty, partially or
>>> fully loaded. Empty would mean that it hasn’t gone to disk to get the conf.
>>> Partial means that it read everything, but threw away most as not
>>> interesting (since the conf does not have order you have to read and parse
>>> it all). Full would mean that nothing was pitched.
>>> SwordBookMetaData.getProperty would need to be changed to determine whether
>>> the key is in memory or might be on disk and do the right thing. Or we
>>> could keep getProperty as it is and if you want one of the fields that is
>>> not stored (e.g. About) you have to call reload().
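
The empty/partial/full idea could look roughly like the sketch below, which
takes the first option: getProperty itself decides whether a missing key
might still be on disk.  LazyConf, LoadState and ConfSource are
hypothetical names for illustration and do not reflect the real
SwordBookMetaData.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public final class LazyConf {
    enum LoadState { EMPTY, PARTIAL, FULL }

    /** Hypothetical stand-in for reading and parsing the whole conf from disk. */
    interface ConfSource {
        Map<String, String> readAll();
    }

    private final ConfSource source;
    private final Set<String> essentialKeys;
    private final Map<String, String> values = new HashMap<String, String>();
    private LoadState state = LoadState.EMPTY;

    LazyConf(ConfSource source, Set<String> essentialKeys) {
        this.source = source;
        this.essentialKeys = essentialKeys;
    }

    LoadState getState() {
        return state;
    }

    /** Reads the whole conf but keeps only the essential keys. */
    void loadPartial() {
        load(false);
    }

    /** Reads the whole conf and keeps everything. */
    void reload() {
        load(true);
    }

    private void load(boolean keepAll) {
        values.clear();
        for (Map.Entry<String, String> entry : source.readAll().entrySet()) {
            if (keepAll || essentialKeys.contains(entry.getKey())) {
                values.put(entry.getKey(), entry.getValue());
            }
        }
        state = keepAll ? LoadState.FULL : LoadState.PARTIAL;
    }

    /** Returns the value, going back to the source if the key may have been pitched. */
    String getProperty(String key) {
        if (state == LoadState.EMPTY) {
            loadPartial();
        }
        if (!values.containsKey(key) && state != LoadState.FULL) {
            reload();
        }
        return values.get(key);
    }
}
```

One consequence of this variant is that asking for a key that does not
exist at all also triggers a full reload.  Keeping getProperty unchanged
and requiring an explicit reload() call avoids that surprise.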
>>>
>>> Maybe we could also cache that info into a separate file(s)? When
>>> mods.d.tar.gz is updated then the cache would be recomputed. In doing the
>>> computation, each conf would be read then pitched. Basically, the storage
>>> would be o.c.c.utils.Ini, if one file or IniSection, if many files.
>>>
>>> What do you think?
>>>
>>>
>>> _______________________________________________
>>> jsword-devel mailing list
>>> jsword-devel at crosswire.org
>>> http://www.crosswire.org/mailman/listinfo/jsword-devel