[jsword-devel] New versification mapping - preload mappings to prevent pauses

Chris Burrell chris at burrell.me.uk
Sat Jan 18 14:05:38 MST 2014


I think to encourage the use of the Passages directly we need to make those
constructors public. That's the main reason I have stayed away from using
it.

Obviously a fast way to construct the versification mappings would be to
turn them into verse numbers. But that means losing readability. The
obvious thought here is to generate an intermediary file as part of the
build from what we have already code (custom maven plugin/ant task). But
that seems rather hacky, but would line up with the way the versifications
are defined in code.

So basically what we want is:
* an easy way of reading in a verse range/passage from text form into
Verse/VerseRanges

Basically, the way the code works is to iterate through the properties file
and map and reverse-map every verse present to the maximum size available
(i.e. 1 verse to a Passage). The idea behind this is that we store less
that way.

So if a mapping specifies as follows: Gen.1.1=Gen.1.1-2, then we only need
to store
left to right: Gen.1.1=>Gen.1.1-2
right to left: Gen.1=>Gen.1.1, Gen.1.2=>Gen.1.1

In that way, we've saved a mapping, and therefore memory (assuming we
represent those ranges correctly).
Chris



On 18 January 2014 20:48, DM Smith <dmsmith at crosswire.org> wrote:

>
> On Jan 18, 2014, at 3:19 PM, Chris Burrell <chris at burrell.me.uk> wrote:
>
> Great stuff - helps really explain things. So I think for the
> versification mappings, seen as they don't change much as they are mostly
> contiguous ranges a RangePassage might be the best bet.
>
>
> A RangedPassage is really good once it is constructed. Building it takes a
> lot of time. Certainly not good for canonical ordered search results.
>
>
>
> I think the other thing that's quite expensive
> is: PassageKeyFactory.instance().getKey(targetVersification,
> qualifiedKey.getKey().getOsisRef())
>
>
> I think this is expensive because of how it is done.
> From memory: getOsisRef() returns a string, which needs to be parsed into
> what getKey() already is. The parser does not handle osisRefs. It throws an
> exception and the caller then mungs the reference such that the parser can
> handle it. JSword needs an osisRef parser, which would be much simpler and
> faster than the current parser that tries to understand what an end user is
> trying to provide.
>
> If the reference is a string (e.g. from a user, from a module, these
> methods are appropriate), these are what should be used. If not a string,
> then stay away. It is expensive to convert a Passage to a String and it is
> expensive to parse a String and to build the Passage.
>
> It'd be better to get an empty key list and add qualifiedKey.getKey().
>
>
>
> I'm not sure what's the best way of dealing with this but I suspect that
> we may want to have specific cases for Verses as I believe the above is
> quite heavy as well.
>
> I think I also often
> use: PassageKeyFactory.instance().createEmptyKeyList(versification) which
> probably should use a RangePassage as well
>
>
> It should use the PassageType that the program finds best. A server side
> application has very different requirements from a desktop and that from a
> mobile app.
>
> If you need a specific implementation, don't use the factory. Use what is
> best for the particular location.
>
> Just note that if two different implementations are used (like adding one
> passage to another) the result will probably be the program default.
>
>
> Would that make sense?
> Chris
>
>
>
> On 18 January 2014 20:11, DM Smith <dmsmith at crosswire.org> wrote:
>
>> There are several different implementations for Passage. They have
>> trade-offs.
>> BitwisePassage is strictly a BitMap, with one bit for each verse in the
>> v11n. It is very space heavy but fast for most operations. Especially
>> adding or removing Verses. Iteration is fast as it uses NextSetBit (or
>> whatever it is called to find the next set bit).
>> RocketPassage is a very heavy implementation. Not appropriate for mobile.
>> It occupies lots more space than a BitwisePassage
>> PassageTally is basically a weighted set of verses such that you can
>> order by weights. This is used to return weighted search results. It is
>> probably not much use otherwise. It is used to return a ranked Lucene
>> search.
>> RangedPassage keeps ranges (start verse and length) ordered in a TreeSet.
>> It probably is a good candidate for mobile. Maybe the best regarding
>> storage. Works well if the RangedPassage is not modified much. If you add a
>> verse (not in the RangedPassage) it may cause two VerseRanges to be merged
>> into one, otherwise it'll extend one or create a new one. That computation
>> takes time. Removing a verse is also complex.
>> DistinctPassage keeps verses ordered in a TreeSet. This can be a good
>> candidate for mobile, if there are not many verses in the Passage. If there
>> are lots it gets bigger than BitwisePassage.
>>
>> The PassageKeyFactory is used throughout JSword to create the preferred
>> type of Passage. You can set it to one of the above via PassageType
>> typically at program startup.
>>
>> I wrote BitwisePassage; Joe wrote the others.
>>
>> I've been thinking that there might be another variation of
>> BitwisePassage that might make sense. One with testament or book bitmaps.
>>
>> Hope this helps.
>>
>> In Him,
>> DM
>>
>>
>> On Jan 18, 2014, at 2:42 PM, Chris Burrell <chris at burrell.me.uk> wrote:
>>
>> The reason I went for passages is for less memory consumption. I also
>> thought it would provide better parsing and we're parsing less.
>>
>> But I think the way we store passages is to represent every verse by a
>> bit. means that iterating through ranges, calculating cardinality or
>> initialising them means is expensive.
>>
>> I wonder if we could have a passage that just stores boundaries rather
>> than all its content. That would be very fast to parse and very quick for
>> operations on it.
>>
>> Chris
>>  On 18 Jan 2014 19:17, "Martin Denham" <mjdenham at gmail.com> wrote:
>>
>>> I think I have found a workaround for this problem (using a background
>>> thread makes it run very quickly) but it seems to point toward a possible
>>> point of improvement in the new JSword mapping data structure.
>>>
>>> In the old AB v11n mapping code I stored maps of Verses.  However the
>>> new JSword code seems to store maps of RocketPassage/BitwisePassage.  Can
>>> you verify this is the best data structure for what will normally be a
>>> single verse.  I don't know an awful lot about this so I am just asking,
>>> but maybe a Passage will always consume more memory than a Verse.
>>>
>>> Thanks
>>> Martin
>>>
>>>
>>>
>>>
>>> On 18 January 2014 13:30, Martin Denham <mjdenham at gmail.com> wrote:
>>>
>>>> I tried your patch and it did noticeably improve things, so it is a
>>>> worthwhile patch, but did not solve the actual problem.  Still viewing the
>>>> RusSynodal straight after download takes well over a minute, whereas to
>>>> boot straight into it takes about 2 secs.
>>>>
>>>> One or 2 things I have noticed:
>>>>
>>>>    - a huge amount of memory allocation and GC occurs when trying to
>>>>    view RusSynodal straight after download
>>>>    - I am not great with the profiler but here are my tentative
>>>>    observations
>>>>       - Seems to be spending about half the time in
>>>>       java.util.BitSet.cardinality()
>>>>          - called by
>>>>          BitwisePassage.countVerses()<-RocketPassage.countVerses()<-AbstractPassage.getCardinality()<-VersificationToKJVMapper.add...
>>>>       - I tried caching AbstractPassage.getCardinality but that had no
>>>>       affect which would be the case if there were lots of different keys
>>>>    - Could it be that a module is in an unusual state after download?
>>>>     I have noticed that a module cannot be deleted directly after download,
>>>>    until the app is restarted due to:
>>>>       - sbmd.getConfigFile()==null when called by
>>>>       SwordBookDriver.isDeletable directly after module download
>>>>
>>>> Any suggestions are welcome.
>>>>
>>>> Cheers
>>>> Martin
>>>>
>>>>
>>>> On 16 January 2014 23:13, Martin Denham <mjdenham at gmail.com> wrote:
>>>>
>>>>> I think I will use the old AB v11n mapping for the next release of AB
>>>>> and come back to this after the release which has quite a lot of changes in
>>>>> it already anyway.
>>>>>
>>>>> The problem could be AB, Android, or JSword related and it is not
>>>>> trivial to reproduce involving repeated re-installs of RusSynodal.  I had a
>>>>> similar problem a year or two ago, it took a long time to investigate and
>>>>> ended up being a flaw in the Android VM which required a change to JSword.
>>>>>
>>>>> I keep trying to think what could cause these symptoms and drawing a
>>>>> blank.
>>>>>
>>>>> Cheers
>>>>> Martin
>>>>>
>>>>>
>>>>> On 16 January 2014 21:35, Chris Burrell <chris at burrell.me.uk> wrote:
>>>>>
>>>>>> I'll try profiling it! Nothing comes to mind though.
>>>>>>  On 16 Jan 2014 20:04, "Martin Denham" <mjdenham at gmail.com> wrote:
>>>>>>
>>>>>>> There is something unusual happening with the v11n mapping creation
>>>>>>> straight after downloading a file requiring mapping e.g. It takes over 2
>>>>>>> mins to load the mapping (a single call to mapVerse) for RusSynodal
>>>>>>> straight after download but a few seconds to load the file normally.
>>>>>>>
>>>>>>> 1. View ESV and anything else in split screen
>>>>>>> 2. Download RusSynodal
>>>>>>> 3. Exit download screen after download
>>>>>>> 4. select RusSynodal for display (with ESV) in 1 half of split screen
>>>>>>> 5. Mapping code now takes over 2 minutes to load mapping
>>>>>>> 6. But there are no errors and then everything seems fine
>>>>>>>
>>>>>>> 7. Exit and kill And Bible
>>>>>>> 8. Restart with ESV/RST in split screens
>>>>>>> 9. This time everything is initialised in a few seconds
>>>>>>>
>>>>>>> Any ideas?
>>>>>>>
>>>>>>> Martin
>>>>>>>
>>>>>>>
>>>>>>> On 16 January 2014 19:06, Chris Burrell <chris at burrell.me.uk> wrote:
>>>>>>>
>>>>>>>> Hi Martin
>>>>>>>>
>>>>>>>> Fine by me. Think it should be opt in if possible.
>>>>>>>>
>>>>>>>> A couple of thoughts on that note. Do you know of it s the io or
>>>>>>>> the cpu time that's the issue?
>>>>>>>>
>>>>>>>> We probably want to opt in with a list as well as all as there's no
>>>>>>>> point in loading the v11n if not required.
>>>>>>>>
>>>>>>>> I guess especially if we end up with mappings per module
>>>>>>>> eventually.
>>>>>>>>
>>>>>>>> On a separate note, I also found the books.installed call is very
>>>>>>>> expensive. Thinking it may be worth partially loading these. With around
>>>>>>>> 200 modules we spend almost 15 seconds loading them.
>>>>>>>>
>>>>>>>> Finally you can apply for a open source license of jprofiler which
>>>>>>>> helps massively to work out what's going on. Have got a couple of
>>>>>>>> uncommitted fixes for books. Installed find with that.
>>>>>>>>
>>>>>>>> Chris
>>>>>>>> On 16 Jan 2014 18:10, "Martin Denham" <mjdenham at gmail.com> wrote:
>>>>>>>>
>>>>>>>>>  Hi,
>>>>>>>>>
>>>>>>>>> I have integrated the new v11n mapping code and find I am getting
>>>>>>>>> a pause when doing the initial mapping between any 2 v11ns.
>>>>>>>>>
>>>>>>>>> I originally had the same problem with the AB mapping code so
>>>>>>>>> pre-loaded all mapping required for the installed set of documents at the
>>>>>>>>> start in a background thread to prevent delays.  The code I used is in
>>>>>>>>> VersificationMappingFactory.initialiseRequiredMappings()<https://github.com/mjdenham/and-bible/blob/master/AndBible/src/net/bible/android/control/versification/mapping/VersificationMappingFactory.java>
>>>>>>>>> .
>>>>>>>>>
>>>>>>>>> I am trying to think of the best way to do this in the new code.
>>>>>>>>>  Either i) you could add a method to do it which could be called or ii) you
>>>>>>>>> could proeload all required mappings automatically or iii) add a new method
>>>>>>>>> to allow AB (or other frontend) to trigger preload of all required mappings
>>>>>>>>> by adding a public method a bit like the new
>>>>>>>>> VersificationsMapper.ensure(v11ntopreload).
>>>>>>>>>
>>>>>>>>> This is quite an issue for mobile users.  I have a fast mobile and
>>>>>>>>> loading a mapping causes a noticeable delay the first time a verse changes,
>>>>>>>>> but the preload fix is fairly simple and worked well.
>>>>>>>>>
>>>>>>>>> Cheers
>>>>>>>>> Martin
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> jsword-devel mailing list
>>>>>>>>> jsword-devel at crosswire.org
>>>>>>>>> http://www.crosswire.org/mailman/listinfo/jsword-devel
>>>>>>>>>
>>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> jsword-devel mailing list
>>>>>>>> jsword-devel at crosswire.org
>>>>>>>> http://www.crosswire.org/mailman/listinfo/jsword-devel
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>  _______________________________________________
>> jsword-devel mailing list
>> jsword-devel at crosswire.org
>> http://www.crosswire.org/mailman/listinfo/jsword-devel
>>
>>
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.crosswire.org/pipermail/jsword-devel/attachments/20140118/35a028b8/attachment-0001.html>


More information about the jsword-devel mailing list