[jsword-devel] Getting the global key list for a book

DM Smith dmsmith at crosswire.org
Fri Feb 15 15:08:41 MST 2013


contains(xxx) is correct, so you don't need to be concerned about the indirection.

There are three parts to a compressed testament: one index points to a second index, which points to the data.
-- DM
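
To make the "index points to a second index which points to the data" idea concrete, here is a minimal in-memory sketch. It is not the JSword implementation: the class and method names are hypothetical, the entry layouts are assumptions (only the 2-byte little-endian size at offset 8 of a 10-byte entry appears in the code quoted below), and decompression of the block is left out entirely.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of a two-level lookup: verse index -> block index -> data.
class DoubleIndirectionSketch {
    // Assumed little-endian layouts (illustrative, not taken from the module format spec).
    static final int VERSE_ENTRY_SIZE = 10; // block number (4), start within block (4), size (2)
    static final int BLOCK_ENTRY_SIZE = 12; // data offset (4), stored size (4), raw size (4)

    // Builds one verse-index entry.
    static byte[] verseEntry(int blockNum, int start, int size) {
        return ByteBuffer.allocate(VERSE_ENTRY_SIZE).order(ByteOrder.LITTLE_ENDIAN)
                .putInt(blockNum).putInt(start).putShort((short) size).array();
    }

    // Builds one block-index entry.
    static byte[] blockEntry(int offset, int storedSize, int rawSize) {
        return ByteBuffer.allocate(BLOCK_ENTRY_SIZE).order(ByteOrder.LITTLE_ENDIAN)
                .putInt(offset).putInt(storedSize).putInt(rawSize).array();
    }

    // Follows both indexes to fetch the text for one entry; returns "" if the size is zero.
    static String lookup(byte[] verseIndex, byte[] blockIndex, byte[] data, int ordinal) {
        ByteBuffer v = ByteBuffer.wrap(verseIndex).order(ByteOrder.LITTLE_ENDIAN);
        int base = ordinal * VERSE_ENTRY_SIZE;
        int blockNum = v.getInt(base);
        int verseStart = v.getInt(base + 4);
        int verseSize = v.getShort(base + 8) & 0xFFFF;
        if (verseSize == 0) {
            return ""; // a zero size means the verse is absent; no second lookup needed
        }
        ByteBuffer b = ByteBuffer.wrap(blockIndex).order(ByteOrder.LITTLE_ENDIAN);
        int blockOffset = b.getInt(blockNum * BLOCK_ENTRY_SIZE);
        // A real compressed module would decompress the block here; this sketch stores it raw.
        return new String(data, blockOffset + verseStart, verseSize, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] data = "alphabeta".getBytes(StandardCharsets.UTF_8);
        byte[] blockIndex = blockEntry(0, 9, 9);
        ByteBuffer verseIndex = ByteBuffer.allocate(3 * VERSE_ENTRY_SIZE);
        verseIndex.put(verseEntry(0, 0, 5)); // "alpha"
        verseIndex.put(verseEntry(0, 0, 0)); // absent verse: size 0
        verseIndex.put(verseEntry(0, 5, 4)); // "beta"
        System.out.println(lookup(verseIndex.array(), blockIndex, data, 2));
    }
}
```

The point of the sketch is that presence can be decided from the first index alone (the size field), which is why a contains-style check does not need to follow the second hop.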

On Feb 15, 2013, at 5:05 PM, Chris Burrell <chris at burrell.me.uk> wrote:

> I'm confused about the double indirection, in that the contains() method, which is what I based my bit on, doesn't use it.
> 
> It simply checks the size for a given ordinal. So I'm doing the opposite: getting all the data first, then iterating through it looking for non-zero sizes.
> 
> I'm pretty sure it works (tested a little bit). It's shaved heaps off the time (from several minutes to a few seconds for all 250 modules I have installed), but obviously that's no good if it isn't doing the right thing!
> Chris
> 
> 
> 
> On 15 February 2013 21:59, DM Smith <dmsmith at crosswire.org> wrote:
> I like the direction this is heading. I'm not sure if it properly handles the double indirection of the compressed module. (I didn't look.) So, assuming it does...
> 
> Just create and use a bitwise passage directly. Don't get it from the factory.
> 
> The purpose of the factory was to be able to switch between implementations.
> 
> I added the two bitwise passage classes and it is pretty much all we use now.
> 
> I'm wondering whether it'd be cheaper to build a list of those that aren't present. (Just wondering).
> 
> I've also been musing on whether it'd be good to modify the bitwise passage to hold two testament bit maps, each null if not needed. Or maybe one per book.
> 
> Anyway, by writing to an api we can swap out the implementation. (We should add the new method as an overload by name to Passage and abstract passage. That way you don't need to know what kind of passage it is.)
> 
> It might be good to add a method to SwordUtil that takes the same params as the decode method and determines whether the array has content (i.e. whether the bytes are non-zero). This is in a tight loop, so it would need to be fast.
> 
> 
> On Feb 15, 2013, at 3:26 PM, Chris Burrell <chris at burrell.me.uk> wrote:
> 
>> Hi DM
>> 
>> I'm about to test the following method for at least all Z backends:
>> 
>> Maybe you can tell me what you think of it:
>> 
>> getFastGlobalKeyList
>> 
>> public Key getFastGlobalKeyList() throws BookException {
>>         ZVerseBackendState rafBook = null;
>>         try {
>>             rafBook = initState();
>> 
>>             String v11nName = getBookMetaData().getProperty(ConfigEntryType.VERSIFICATION).toString();
>>             Versification v11n = Versifications.instance().getVersification(v11nName);
>> 
>>             Testament[] testaments = new Testament[] {
>>                     Testament.OLD, Testament.NEW
>>             };
>>             
>>             Key globalList = PassageKeyFactory.instance().createEmptyKeyList(v11n);
>>             Passage passage = KeyUtil.getPassage(globalList, v11n);
>>             BitwisePassage bitwisePassage = null;
>>             if(passage instanceof BitwisePassage) {
>>                 bitwisePassage = (BitwisePassage) passage;
>>                 bitwisePassage.raiseEventSuppresion();
>>                 bitwisePassage.raiseNormalizeProtection();
>>             }
>>             
>>             
>>             for (Testament currentTestament : testaments) {
>>                 RandomAccessFile compRaf = currentTestament == Testament.NEW ? rafBook.getNtCompRaf() : rafBook.getOtCompRaf();
>> 
>>                 // If the Bible does not contain this testament, skip it:
>>                 if (compRaf == null) {
>>                     // no keys in this testament
>>                     continue;
>>                 }
>> 
>>                 int maxIndex = v11n.getCount(currentTestament);
>> 
>>                 // Read in the whole index, a few hundred Kb at most.
>>                 byte[] temp = SwordUtil.readRAF(compRaf, 0, COMP_ENTRY_SIZE * maxIndex);
>> 
>>                 // For each 10-byte entry, only the last 2 bytes (the verse size) matter.
>>                 for (int ii = 0; ii < temp.length; ii += 10) {
>>                     // Can this be simplified to temp[ii + 8] == 0 && temp[ii + 9] == 0?
>>                     int verseSize = SwordUtil.decodeLittleEndian16(temp, ii + 8);
>>                     
>>                     // Can this be optimized even further? I.e. why decodeOrdinal, when add() could simply pass in and store an ordinal?
>>                     if (verseSize > 0) {
>>                         // Entry number within this testament's index (ii % 10 would always be 0).
>>                         // Assumption: for the NT it may still need offsetting to a whole-versification ordinal.
>>                         int entry = ii / 10;
>>                         if(bitwisePassage != null) {
>>                             bitwisePassage.addVersifiedOrdinal(entry);
>>                         } else {
>>                             globalList.addAll(v11n.decodeOrdinal(entry));
>>                         }
>>                     }
>>                 }
>>             }
>>             
>>             if(bitwisePassage != null) {
>>                 bitwisePassage.lowerNormalizeProtection();
>>                 bitwisePassage.lowerEventSuppressionAndTest();
>>             }
>> 
>>             return globalList;
>>         } catch (IOException e) {
>>             throw new BookException(JSMsg.gettext("Unable to read key list from book."));
>>         } finally {
>>             IOUtil.close(rafBook);
>>         }
>>     }
>> 
>> 
>> addVersifiedOrdinal is just a simplification of add, so that we can give the ordinals directly, rather than converting to a verse and back again:
>> /**
>>      * A shortcut to adding a key, by ordinal. The ordinal needs to be taken from the same versification as the passage being created.
>>      *
>>      * @param ordinal the ordinal
>>      */
>>     public void addVersifiedOrdinal(int ordinal) {
>>         Versification v11n = getVersification();
>>         optimizeWrites();
>> 
>>         store.set(ordinal);
>> 
>>         // we do an extra check here because the cost of calculating the
>>         // params is non-zero and may be wasted
>>         if (suppressEvents == 0) {
>>             Verse verse = v11n.decodeOrdinal(ordinal);
>>             fireIntervalAdded(this, verse, verse);
>>         }
>>     }
>> 
>> 
>> 
>> On 15 February 2013 20:22, DM Smith <dmsmith at crosswire.org> wrote:
>> That's what http://www.crosswire.org/tracker/browse/JS-246 is all about.
>> 
>> There are a good number of places that we get all the keys for a module. But the construction of the list is dumb.
>> 
>> The question is what do you want? Do you want all the possible keys in a versification? Or do you want the keys for the verses that are actually in a module?
>> 
>> If you only want chapters that exist, consider working off of Versification to get the book names and the chapters. For each chapter, get the number of verses and grab one (maybe verse 1 or one halfway into the chapter) on the assumption that if one exists then the rest do.
>> 
>> 
>> Calling book.contains(key) is what you want to do. It goes against the backend to determine whether the verse is in the module.
>> 
>> Don't call book.getGlobalKeyList(). We probably should deprecate the method: it is too easy to call, and it is very expensive for what it does.
>> 
>> I have a 3 day weekend and hope to finish the av11n work. It impacts this. Will probably add iterators for a versification.
>> 
>> In Him,
>>         DM
>> 
>> 
>> 
>> On Feb 15, 2013, at 2:14 PM, Chris Burrell <chris at burrell.me.uk> wrote:
>> 
>> > Hi all
>> >
>> > getGlobalKeyList() seems to do a lot of work against the file system.
>> >
>> > I'm attempting to create a sitemap, which includes a URL to every chapter. Presumably getGlobalKeyList is what I want to ensure that I'm only displaying those chapters that exist. I was intending to do:
>> >
>> > globalKeyList = getGlobalKeyList();
>> > for(each book in versification) {
>> >     myBookKey = book.getValidKey(book.getShortName())
>> >     myBookKey.retainAll(globalKeyList);
>> >
>> >     if(myBookKey.getCardinality() > 0) {
>> >         outputSiteMapChapter();
>> >     }
>> > }
>> >
>> > So for each version I call getGlobalKeyList(), but that in turn seems to make individual reads (through the contains() method) for every potential key in the versification.
>> >
>> > public final Key getGlobalKeyList() {
>> >         if (global == null) {
>> >             Versification v11n = Versifications.instance().getVersification(versification);
>> >             global = keyf.createEmptyKeyList(v11n);
>> >             Key all = keyf.getGlobalKeyList(v11n);
>> >             for (Key key : all) {
>> >                 if (contains(key)) {
>> >                     global.addAll(key);
>> >                 }
>> >             }
>> >         }
>> >         return global;
>> > }
>> >
>> > The contains() method above makes a backend call every time.
>> >
>> > Isn't there a way to obtain the list of keys from the index? I'm not quite sure where the versification comes in here, or is it there simply to map the names of the keys to their positions in the index file? If so, can we simply read all keys in OT and NT, and check
>> >
>> > Doing this in getGlobalKeyList() would also speed up index creation.
>> >
>> > Finally, it seems contains() only works on a single key, or takes the first key/verse if it's a passage. Presumably that is correct?
>> >
>> > Chris
>> >
>> > _______________________________________________
>> > jsword-devel mailing list
>> > jsword-devel at crosswire.org
>> > http://www.crosswire.org/mailman/listinfo/jsword-devel
>> 
>> 
> 
> 

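The index scan discussed in this thread, together with the suggested helper that tests the size bytes without decoding them, might be sketched roughly as follows. This is a standalone illustration using java.util.BitSet rather than JSword's passage classes; the class name, method names, and the ordinalOffset parameter (a stand-in for whatever offset a testament needs within the whole versification) are all hypothetical.

```java
import java.util.BitSet;

// Hypothetical sketch: find which entries of a verse index have a non-zero size.
class VerseIndexScanSketch {
    static final int ENTRY_SIZE = 10; // bytes per index entry, as in the quoted code
    static final int SIZE_OFFSET = 8; // the 2-byte size field sits at the end of each entry

    // True if the 2-byte size field of the entry starting at entryStart is non-zero.
    // Byte order does not matter for a zero test, so no decoding is needed.
    static boolean hasContent(byte[] index, int entryStart) {
        return index[entryStart + SIZE_OFFSET] != 0
                || index[entryStart + SIZE_OFFSET + 1] != 0;
    }

    // Sets one bit per present entry. The entry number is pos / ENTRY_SIZE,
    // shifted by ordinalOffset so that several testaments can share one numbering.
    static BitSet presentEntries(byte[] index, int ordinalOffset) {
        BitSet present = new BitSet();
        for (int pos = 0; pos + ENTRY_SIZE <= index.length; pos += ENTRY_SIZE) {
            if (hasContent(index, pos)) {
                present.set(ordinalOffset + pos / ENTRY_SIZE);
            }
        }
        return present;
    }

    public static void main(String[] args) {
        byte[] index = new byte[3 * ENTRY_SIZE];
        index[1 * ENTRY_SIZE + SIZE_OFFSET] = 42; // only entry 1 has a non-zero size
        System.out.println(presentEntries(index, 0)); // prints {1}
    }
}
```

Reading the whole index once and testing two bytes per entry replaces one backend read per key, which is consistent with the speed-up reported above.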