[jsword-devel] [sword-devel] Method to find if BibleBook is contained in a Book
Chris Burrell
christopher at burrell.me.uk
Tue Apr 15 11:19:39 MST 2014
Thanks Martin - now I see what you mean about IBT!
DM, Martin's code simply checks existence of verse 1 & 2 (my option 2
above), using the code I wrote to work out if a verse is present. The IBT
stuff seems like a dirty hack for a poor module structure.
I'd be happy to integrate that into JSword. I'm presuming the option
suggested doesn't really add much to this?
If we integrate it, we would possibly want to make it part of
AbstractPassageBook and have it lazily initialised. Do we need to retain a
list? Or would a HashSet be better, or even a LinkedHashSet?
Most of my use cases rely on asking whether a book is contained within the
Book, as opposed to obtaining a list of books.
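
A minimal sketch of the lazy-init idea, not JSword code. `BookCoverage`, `OsisBook` and the `hasContent` probe are hypothetical stand-ins for AbstractPassageBook, BibleBook and a "is Bk.1.1 present?" check. It assumes book identifiers are an enum, in which case an EnumSet gives a cheap membership test and still iterates in canonical order, so it may cover what a LinkedHashSet would offer.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical stand-in for a book identifier; only a few books are listed. */
enum OsisBook { GEN, EXOD, LEV, MATT, REV }

/** Lazily computes, then caches, the set of books that have content. */
class BookCoverage {
    private final Predicate<OsisBook> hasContent; // assumed probe, e.g. "is Bk.1.1 present?"
    private Set<OsisBook> present;                // null until first use

    BookCoverage(Predicate<OsisBook> hasContent) {
        this.hasContent = hasContent;
    }

    /** Builds the set on the first call; later calls reuse it. */
    private synchronized Set<OsisBook> books() {
        if (present == null) {
            EnumSet<OsisBook> found = EnumSet.noneOf(OsisBook.class);
            for (OsisBook b : OsisBook.values()) {
                if (hasContent.test(b)) {
                    found.add(b);
                }
            }
            present = Collections.unmodifiableSet(found);
        }
        return present;
    }

    /** The common use case: is this book in the module? */
    boolean containsBook(OsisBook b) {
        return books().contains(b);
    }

    /** The less common use case: all books, in canonical order. */
    Set<OsisBook> getBooks() {
        return books();
    }
}

class BookCoverageDemo {
    public static void main(String[] args) {
        // Pretend module: NT only.
        BookCoverage nt = new BookCoverage(b -> b.ordinal() >= OsisBook.MATT.ordinal());
        System.out.println(nt.containsBook(OsisBook.GEN));
        System.out.println(nt.getBooks());
    }
}
```

The probe runs once per book on the first query only, so the cost of the lookups is paid a single time per Book instance.
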
Chris
On 15 April 2014 09:43, Martin Denham <mjdenham at gmail.com> wrote:
> I took a stab at this here<https://github.com/mjdenham/and-bible/blob/development/AndBible/src/net/bible/android/control/navigation/DocumentBibleBooks.java>.
> It was elegant until I catered for IBT module anomalies.
>
> My initial experiments seem to show it works really well in being fast and
> giving a quick 'heads-up' regarding which Bible books are in a module which
> is useful not only for partial dc support which seems the norm, but also
> for partial Bibles and commentaries e.g. NT only or developing modules.
>
> I have integrated this into the Passage selector and also page prev/next.
>
> Cheers
> Martin
>
>
> On 14 April 2014 23:50, DM Smith <dmsmith at crosswire.org> wrote:
>
>> It still is manual. I think there's a fairly optimal way to compute this,
>> but it is not perfect.
>>
>> The problem is that a module does not have to be laid down in order.
>> Osis2mod has an "append" flag that allows for additional material to be
>> appended to a module. This is useful for doing a book at a time. It is also
>> useful to fix a verse and append the fix to the module. Both the old and
>> the new will be in the module but only the new will be in the index.
>>
>> Also, if the module has books, chapters or verses out of order, these
>> will be reassembled into the right order (it is the nature of the index
>> file), but the data files will have the content in the order that is in the
>> module.
>>
>> The following is true about the index and data files:
>> Each verse in the data file is laid down in the order that it is read
>> from the input file.
>> The index contains the start of each verse in the data file.
>> There are separate index files for the OT and the NT. DC when present is
>> in one or the other.
>>
>> If the data is laid down in the proper order then we can use that
>> knowledge to figure out if the book or chapter has content.
>> The difference between the starts of the books (or chapters) can be used
>> to guess what is present. For example, if Genesis has a start of 10 and an
>> end of 4000, Exodus has a start and end of 0, and Lev has a start of 4000
>> and end of 10000, then we can guess that Genesis and Lev exist but Exodus
>> does not.
>>
>> Alternatively other sample points could be used. E.g. middle of the
>> chapters.
>>
>> This is only a heuristic.
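
The Genesis/Exodus/Leviticus example can be sketched as follows. This is only an illustration of the guess, assuming the per-book start and end offsets have already been read from the index; `IndexHeuristic` and `guessPresent` are hypothetical names, and the result is only as reliable as the assumption that the data was laid down in order.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class IndexHeuristic {
    /**
     * Guesses which books have content from data-file offsets.
     * Each value is {start, end}; an empty span is taken to mean "absent".
     */
    static List<String> guessPresent(Map<String, long[]> spans) {
        List<String> present = new ArrayList<>();
        for (Map.Entry<String, long[]> e : spans.entrySet()) {
            long start = e.getValue()[0];
            long end = e.getValue()[1];
            if (end > start) {
                present.add(e.getKey());
            }
        }
        return present;
    }

    public static void main(String[] args) {
        // Offsets taken from the example in the text.
        Map<String, long[]> spans = new LinkedHashMap<>();
        spans.put("Gen", new long[] {10, 4000});
        spans.put("Exod", new long[] {0, 0});
        spans.put("Lev", new long[] {4000, 10000});
        System.out.println(guessPresent(spans)); // [Gen, Lev]
    }
}
```
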
>>
>> We can also note that if the OT files don't exist or the data file has 0
>> size, then the module is NT alone. Or the other way around.
>>
>> I do think we need to make the module's conf be "immutable" as
>> downloaded, but have a "sidecar" conf file with settings we want to have. I
>> think once computed, it should be stored there. Maybe it can be computed on
>> the server and stored there for download.
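
A rough sketch of the compute-once-and-store idea, assuming the sidecar can be treated as a simple key=value file. The class name, the `Scope` key and the use of java.util.Properties are all assumptions made for illustration; the real sidecar format is not specified in this thread.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;
import java.util.function.Supplier;

class SidecarScope {
    static final String KEY = "Scope"; // hypothetical key name

    /** Returns the stored value if the sidecar has one; otherwise computes and stores it. */
    static String loadOrCompute(Path sidecar, Supplier<String> compute) {
        Properties props = new Properties();
        try {
            if (Files.exists(sidecar)) {
                try (InputStream in = Files.newInputStream(sidecar)) {
                    props.load(in);
                }
            }
            String scope = props.getProperty(KEY);
            if (scope == null) {
                scope = compute.get(); // the expensive scan happens only here
                props.setProperty(KEY, scope);
                try (OutputStream out = Files.newOutputStream(sidecar)) {
                    props.store(out, "computed locally; the downloaded conf is left untouched");
                }
            }
            return scope;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path sidecar = Paths.get(System.getProperty("java.io.tmpdir"), "kylsc-sidecar-demo.conf");
        sidecar.toFile().delete(); // start clean for the demo
        System.out.println(loadOrCompute(sidecar, () -> "Matt-Rev")); // computed and stored
        System.out.println(loadOrCompute(sidecar, () -> "not used")); // read back from the sidecar
        sidecar.toFile().delete();
    }
}
```
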
>>
>> -- dm
>>
>>
>> On Apr 14, 2014, at 4:42 PM, Chris Burrell <christopher at burrell.me.uk>
>> wrote:
>>
>> Hi
>>
>> What's the latest on this? At the moment, STEP looks up auto-suggestions
>> based on versifications but this is annoying for Greek texts that do offer
>> the OT, but the OSMHB (OSHB) or WLC don't.
>>
>> What I'm really looking for is to query a book for its BibleBooks,
>> rather than have to rely on the Versification. The versification is not
>> great from that point of view. It tells the frontend what might be in the
>> book, rather than what is in the book.
>>
>> If there's nothing there at the moment, I could settle for:
>> 1. calculate once and store scope (as an OSIS, or read it from conf
>> file). Then read the key and do some kind of parsing to get all books.
>> 2. check for all Bk.1.1 on start-up/first call and check for that
>> 3. Do a combination of both, i.e. calculate once and store on install (or
>> store if not stored before), then use that to check for all Bk.1.1 first
>> time round.
>> 4. Store a number of flags such as Gen.1.1=true, Ex.1.1=true, etc.
>>
>>
>> Bar 4, none of these options are efficient however. All of them require
>> at least 66 lookups for a standard module. And on small devices, this may
>> be an issue.
>>
>> Chris
>>
>>
>>
>> On 28 March 2014 20:50, DM Smith <dmsmith at crosswire.org> wrote:
>>
>>> It will be performant with Bibles.
>>>
>>> JSword is stable at the tip. I've just checked in the bug fix that Chris
>>> supplied.
>>>
>>> This change will be stable.
>>>
>>> -- DM Smith
>>>
>>> On Mar 28, 2014, at 4:34 PM, Martin Denham <mjdenham at gmail.com> wrote:
>>>
>>> I was only thinking of using it with SwordBook/AbstractPassageBook but
>>> if it is not performant then maybe it is not worth continuing and we should
>>> look at Scope. I thought that it was already being calculated in
>>> ZVerseBackend.contains() using the idxRaf.
>>>
>>> btw is it safe to get the tip of JSword yet?
>>>
>>> Martin
>>>
>>>
>>> On 28 March 2014 20:19, DM Smith <dmsmith at crosswire.org> wrote:
>>>
>>>> I think it would be good to support Scope formally, even if it never
>>>> makes it into SWORD. As a different issue, we'll be changing JSword to keep
>>>> a module's conf pristine and the things that we write to it, will be put
>>>> into a side-car conf. This will be the perfect place for us to compute the
>>>> value once for all time per module.
>>>>
>>>> The getRawTextLength is not as easy as I'd like. It's mostly done. A
>>>> bit more to do. For a couple of module types, both compressed, it is not
>>>> performant. It merely calls getRawText and then length. The problem is that
>>>> one has to uncompress the text to see how long it is.
>>>>
>>>> -- DM
>>>>
>>>> On Mar 28, 2014, at 3:31 PM, Martin Denham <mjdenham at gmail.com> wrote:
>>>>
>>>> An alternative method might be to use the Scope value which IBT have
>>>> placed in the .conf file, but I can't seem to get access to it via JSword.
>>>>
>>>> This is printed:
>>>> WARNING: Extra entry in kaz of Scope
>>>>
>>>> And in ConfigEntryTable:
>>>> log.warn("Extra entry in {} of {}", internal,
>>>> configEntry.getName());
>>>> extra.put(key, configEntry);
>>>>
>>>> But I can't see any way to get the value from the extra map? Is it
>>>> possible - I am a bit confused by the initialisation and retrieval of
>>>> metadata and properties in JSword.
>>>>
>>>> Example scopes from IBT modules:
>>>>
>>>> Scope for kaz:
>>>> Scope=Gen-Josh.24.33 Judg-2Chr Ezra-Neh Esth-Ps.150 Prov.0-Prov.4.27
>>>> Prov.5-Prov.13.25 Prov.14-Prov.18.24 Prov.19-Song Isa-Lam Ezek-Dan.3.33
>>>> Dan.4-Dan.12 Hos-Mal Matt-Rev
>>>>
>>>> Scope for kylsc:
>>>> Scope=Matt-Rev
>>>>
>>>> I don't know if the strings used are compatible with PassageKeyFactory
>>>> but if we only look at the start and end of the scope we may be able to
>>>> deduce all that is required because I think IBT are the only people who use
>>>> scope.
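
One way to look only at the start and end of each range is sketched below. It does not use PassageKeyFactory; it is plain string handling, assuming each range is `start-end` with an optional `.chapter.verse` suffix and that the caller supplies the book order of the versification. `ScopeBooks` and the abbreviated book list in the demo are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class ScopeBooks {
    /** Strips any ".chapter.verse" suffix, e.g. "Josh.24.33" becomes "Josh". */
    static String bookOf(String ref) {
        int dot = ref.indexOf('.');
        return dot < 0 ? ref : ref.substring(0, dot);
    }

    /** Collects every book from the first to the last book of each range. */
    static Set<String> booksInScope(String scope, List<String> canon) {
        Set<String> books = new LinkedHashSet<>();
        for (String range : scope.trim().split("\\s+")) {
            if (range.isEmpty()) {
                continue;
            }
            String[] ends = range.split("-", 2);
            int first = canon.indexOf(bookOf(ends[0]));
            int last = canon.indexOf(bookOf(ends[ends.length - 1]));
            if (first < 0 || last < first) {
                throw new IllegalArgumentException("Unrecognised range: " + range);
            }
            books.addAll(canon.subList(first, last + 1));
        }
        return books;
    }

    public static void main(String[] args) {
        // Abbreviated book order for the demo only; a real v11n has many more books.
        List<String> canon = Arrays.asList("Gen", "Exod", "Lev", "Matt", "Mark", "Luke", "John", "Rev");
        System.out.println(booksInScope("Matt-Rev", canon));
        System.out.println(booksInScope("Gen-Exod.40.38 Matt-John", canon));
    }
}
```

Note that a partial range such as Prov.0-Prov.4.27 still counts Prov as present, which is probably what a book pick list wants.
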
>>>>
>>>> Martin
>>>>
>>>>
>>>>
>>>> On 28 March 2014 14:12, DM Smith <dmsmith at crosswire.org> wrote:
>>>>
>>>>> I'll add the method SwordBook.getRawTextLength(Key key), or something
>>>>> like it. -- DM
>>>>>
>>>>> On Mar 26, 2014, at 6:47 PM, Martin Denham <mjdenham at gmail.com> wrote:
>>>>>
>>>>> Given the above explanations and that many users have already
>>>>> downloaded such modules I have experimented with a work-around by adding
>>>>> some extra logic to And Bible to specifically cater for the IBT Synodal
>>>>> modules. I did this by making the assumption that all the empty verses
>>>>> start with: "<chapter eID=" which appears true and unique. It is a
>>>>> bit of a hack but it almost worked.
>>>>>
>>>>> The only problem is that after adding the extra getRawText checks it
>>>>> takes too long, even on my Nexus 4, to load the book list for IBT modules.
>>>>> However, a simpler way to avoid the getRawText calls would be to add a
>>>>> public int SwordBook.getRawTextLength(Key key)
>>>>> which would be identical code to contains(Key key)
>>>>> (->ZVerseBackend.contains) but return verse length instead of a boolean
>>>>> (contains() calculates verse length to determine if a verse exists). What
>>>>> do you think? This would help because IBT empty verse stubs are very short
>>>>> and so normally the getRawText would not be required as part of the
>>>>> elaborated contains() check in And Bible.
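
The length-first check described above might look like the sketch below. It is not the And Bible code: `StubCheck`, the 64-character threshold and the `Supplier` standing in for a getRawText call are assumptions, and the stub prefix is the one quoted in this message.

```java
import java.util.function.Supplier;

class StubCheck {
    static final String STUB_PREFIX = "<chapter eID=";
    static final int MAX_STUB_LENGTH = 64; // assumed threshold, not taken from And Bible

    /**
     * Decides whether a verse has real content.
     * The raw text is fetched only when the length alone cannot settle it.
     */
    static boolean hasRealContent(int rawLength, Supplier<String> rawText) {
        if (rawLength == 0) {
            return false; // nothing stored for this verse
        }
        if (rawLength > MAX_STUB_LENGTH) {
            return true;  // too long to be a stub, so no text lookup is needed
        }
        return !rawText.get().startsWith(STUB_PREFIX);
    }

    public static void main(String[] args) {
        String stub = "<chapter eID=\"gen7\" osisID=\"1Esd.1\"/>";
        System.out.println(hasRealContent(stub.length(), () -> stub));
        System.out.println(hasRealContent(12, () -> "Jesus wept."));
    }
}
```
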
>>>>>
>>>>> Note:
>>>>> I have discovered that this problem does not just affect
>>>>> deuterocanonical books in IBT Synodal modules, it also affects OT books in
>>>>> IBT NT-only modules e.g. KYLSC, which return text like "<chapter eID="gen4"
>>>>> osisID="Gen.1"/>".
>>>>>
>>>>> Martin
>>>>>
>>>>>
>>>>> On 26 March 2014 14:49, DM Smith <dmsmith at crosswire.org> wrote:
>>>>>
>>>>>> John,
>>>>>>
>>>>>> Putting this up on sword-devel, since that is a more appropriate
>>>>>> location for the discussion to continue. This is really not about JSword,
>>>>>> but rather about module making.
>>>>>>
>>>>>> The nature of osis2mod is to retain all markup except <verse> and
>>>>>> </verse> (or their equivalent milestoned version.) This means that the
>>>>>> markup for a chapter is put in the module's storage for that chapter and
>>>>>> noted in the index. In the case of the chapter that is given below, it is
>>>>>> split into 2 parts, Verse 0 and Verse 1.
>>>>>> Verse 0 will get the preamble of the chapter:
>>>>>> <chapter osisID="EpJer.1">
>>>>>> Verse 1 will get:
>>>>>> </chapter>
>>>>>> (These will have been transformed into their milestoned versions.)
>>>>>>
>>>>>> Also, verse 2 to 72 will be "linked" to verse 1, meaning that in the
>>>>>> index they are given the same location as verse 1.
>>>>>>
>>>>>> So, verse 0 has chapter start content and verse 1 to 72 have chapter
>>>>>> end content.
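
A toy model of the layout just described, to show why an index-only test reports every verse as present. This is not the real SWORD index format: `LinkedVerseIndex` and `Entry` are hypothetical, string length stands in for byte length, and the sID/eID values are placeholders.

```java
class LinkedVerseIndex {
    /** One index record: offset into the data file and length of the entry. */
    static final class Entry {
        final long offset;
        final int length;

        Entry(long offset, int length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /**
     * Toy index for a chapter whose OSIS source has no verse text.
     * Slot 0 holds the chapter start; slots 1..verseCount all share the
     * record for the chapter end, which is how "linked" verses look.
     */
    static Entry[] indexForEmptyChapter(String chapterStart, String chapterEnd, int verseCount) {
        Entry[] index = new Entry[verseCount + 1];
        index[0] = new Entry(0, chapterStart.length());
        Entry verse1 = new Entry(chapterStart.length(), chapterEnd.length());
        for (int v = 1; v <= verseCount; v++) {
            index[v] = verse1;
        }
        return index;
    }

    /** Index-only presence test: any non-zero length counts as present. */
    static boolean contains(Entry[] index, int verse) {
        return index[verse].length > 0;
    }

    public static void main(String[] args) {
        Entry[] epJer1 = indexForEmptyChapter(
                "<chapter sID=\"x1\" osisID=\"EpJer.1\"/>",
                "<chapter eID=\"x1\" osisID=\"EpJer.1\"/>",
                72);
        System.out.println(contains(epJer1, 1));
        System.out.println(contains(epJer1, 72));
        System.out.println(epJer1[1].offset == epJer1[72].offset);
    }
}
```
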
>>>>>>
>>>>>> Also, osis2mod does not complain if a verse is missing. Never has,
>>>>>> never will. It does "complain" of a verse being present that is not in the
>>>>>> versification. Always has, always will.
>>>>>>
>>>>>> That emptyvss indicates that all verses are present means exactly
>>>>>> that: All verses are present. This is not good if the module is in fact
>>>>>> incomplete.
>>>>>>
>>>>>> That JSword indicates that these "empty" verses are present means
>>>>>> that they have non-zero length in the module.
>>>>>>
>>>>>> JSword is graceful in handling this. It determines that the module
>>>>>> has content for the verse by examining the index. What Martin is trying to
>>>>>> do is find out which books, chapters and verses should be displayed to
>>>>>> users in pick lists. The only way this can be done at this time, by either
>>>>>> SWORD or JSword with the module in question, is to render each verse and
>>>>>> determine that it renders nothing. This is far too expensive an operation
>>>>>> to consider.
>>>>>>
>>>>>> The only way to efficiently determine scope is to examine the index
>>>>>> for each verse and see if the length is 0. The Scope entry in the conf has
>>>>>> been ruled out. It would have been computed using the reverse logic of
>>>>>> emptyvss. Go through the v11n from first verse to last and rather than
>>>>>> noting what is missing, note what is present.
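
The "note what is present" walk can be sketched like this. It assumes the v11n is available as an ordered list of references and that presence is a per-verse test (for example, non-zero length in the index); `ScopeBuilder` and its methods are hypothetical, and the output only imitates the range style of the Scope examples in this thread.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;
import java.util.function.Predicate;

class ScopeBuilder {
    /** Walks the v11n in order and joins runs of present verses into ranges. */
    static String compute(List<String> v11n, Predicate<String> present) {
        StringJoiner scope = new StringJoiner(" ");
        String runStart = null;
        String runEnd = null;
        for (String ref : v11n) {
            if (present.test(ref)) {
                if (runStart == null) {
                    runStart = ref;
                }
                runEnd = ref;
            } else if (runStart != null) {
                scope.add(format(runStart, runEnd)); // a gap closes the current run
                runStart = null;
            }
        }
        if (runStart != null) {
            scope.add(format(runStart, runEnd));
        }
        return scope.toString();
    }

    static String format(String start, String end) {
        return start.equals(end) ? start : start + "-" + end;
    }

    public static void main(String[] args) {
        List<String> v11n = Arrays.asList("Gen.1.1", "Gen.1.2", "Exod.1.1", "Lev.1.1", "Lev.1.2");
        System.out.println(compute(v11n, ref -> !ref.startsWith("Exod")));
    }
}
```
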
>>>>>>
>>>>>> Today, most of our frontends display pick lists based on the v11n not
>>>>>> on the module content. It has long been confusing to end users of modules
>>>>>> that don't contain verses in the v11n.
>>>>>>
>>>>>> In my view, this is a module problem. It is far easier and faster to
>>>>>> rebuild and redistribute a module. We can tell a user to upgrade to the
>>>>>> most recent version of a module far more easily than we can make and release
>>>>>> a code change and have them get a new version of the program. When the change is
>>>>>> a work-around for something that shouldn't be in a module, I think we should
>>>>>> avoid that. For example, the NET Bible has some bugs that should be fixed.
>>>>>> But instead we have some special code that is essentially: if module is NET
>>>>>> then fix such-and-so when it occurs.
>>>>>>
>>>>>> Together in His Service,
>>>>>> DM Smith
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mar 25, 2014, at 11:43 PM, John Austin <
>>>>>> gpl.programs.info at gmail.com> wrote:
>>>>>>
>>>>>> There has been a lot of discussion about how missing material in a
>>>>>> v11n should be treated (the discussion of the meaning and use of Scope was
>>>>>> part of that). Tools such as osis2mod generated warnings whenever OSIS
>>>>>> files lacked any part of the chosen v11n. The Scope conf param was, for a
>>>>>> time at least, the recommended method of describing what part of a v11n was
>>>>>> covered by a module. For these reasons, many existing modules (IBT alone
>>>>>> has at least 26 such modules) are currently encoded so as to encompass the
>>>>>> entire v11n, returning empty-string verse content for all verses in the
>>>>>> v11n that are not included in the module, and using the .conf Scope param
>>>>>> to define exactly what is present in the module.
>>>>>>
>>>>>> So even though current module making best practice may be different,
>>>>>> it would be good for JSword to be graceful with modules that are encoded
>>>>>> somewhat differently if at all possible, at least for a time. There are
>>>>>> many modules out there, old and new, which don't contain the complete v11n,
>>>>>> so determining book coverage is important.
>>>>>>
>>>>>> -John
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 03/25/2014 08:19 PM, DM Smith wrote:
>>>>>>
>>>>>> Those verses exist since they are defined in the OSIS input file to
>>>>>> osis2mod. Osis2mod retains everything in its input. This is a well
>>>>>> documented behavior of osis2mod.
>>>>>>
>>>>>> The end chapter markup will be put in the last verse that is in the
>>>>>> chapter, which might be verse 0.
>>>>>>
>>>>>> They should use xslt to strip empty verses, chapters and books out of
>>>>>> their file into an intermediate file and give that as input to
>>>>>> osis2mod.
>>>>>>
>>>>>> Alternatively they can use <!-- ... --> to comment out huge swaths of
>>>>>> the input file.
>>>>>>
>>>>>>
>>>>>> -- DM
>>>>>>
>>>>>> On Mar 25, 2014, at 7:48 AM, Martin Denham <mjdenham at gmail.com> wrote:
>>>>>>
>>>>>> IBT have just passed me more information regarding their handling of
>>>>>> empty verses to help clarify if this is an IBT module issue or not.
>>>>>> The following is an extract from IBT's e-mail:
>>>>>>
>>>>>> Here are examples of how IBT's OSIS source defines empty verses in
>>>>>> the markup:
>>>>>>
>>>>>> Empty book (Epistle of Jeremiah):
>>>>>> <div type="x-Synodal-non-canonical"><div type="book"
>>>>>> osisID="EpJer"><chapter osisID="EpJer.1"><verse sID="EpJer.1.1-72"
>>>>>> osisID="EpJer.1.1 EpJer.1.2 EpJer.1.3 EpJer.1.4 EpJer.1.5
>>>>>> EpJer.1.6 EpJer.1.7 EpJer.1.8 EpJer.1.9 EpJer.1.10 EpJer.1.11
>>>>>> EpJer.1.12 EpJer.1.13 EpJer.1.14 EpJer.1.15 EpJer.1.16 EpJer.1.17
>>>>>> EpJer.1.18 EpJer.1.19 EpJer.1.20 EpJer.1.21 EpJer.1.22 EpJer.1.23
>>>>>> EpJer.1.24 EpJer.1.25 EpJer.1.26 EpJer.1.27 EpJer.1.28 EpJer.1.29
>>>>>> EpJer.1.30 EpJer.1.31 EpJer.1.32 EpJer.1.33 EpJer.1.34 EpJer.1.35
>>>>>> EpJer.1.36 EpJer.1.37 EpJer.1.38 EpJer.1.39 EpJer.1.40 EpJer.1.41
>>>>>> EpJer.1.42 EpJer.1.43 EpJer.1.44 EpJer.1.45 EpJer.1.46 EpJer.1.47
>>>>>> EpJer.1.48 EpJer.1.49 EpJer.1.50 EpJer.1.51 EpJer.1.52 EpJer.1.53
>>>>>> EpJer.1.54 EpJer.1.55 EpJer.1.56 EpJer.1.57 EpJer.1.58 EpJer.1.59
>>>>>> EpJer.1.60 EpJer.1.61 EpJer.1.62 EpJer.1.63 EpJer.1.64 EpJer.1.65
>>>>>> EpJer.1.66 EpJer.1.67 EpJer.1.68 EpJer.1.69 EpJer.1.70 EpJer.1.71
>>>>>> EpJer.1.72"/><verse eID="EpJer.1.1-72"/></chapter></div></div>
>>>>>>
>>>>>> I'm not sure how osis2mod handles all this when importing to the
>>>>>> module, but it works perfectly without warnings or errors. Also,
>>>>>> when the resulting module is passed to the "emptyvss" tool, it
>>>>>> passes this test without warnings.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 25 March 2014 11:38, Martin Denham <mjdenham at gmail.com> wrote:
>>>>>>
>>>>>> I am having problems getting a list of BibleBooks contained in
>>>>>> some AV modules which we know do not contain certain books. I
>>>>>> can't work out if the problem is with JSword, the modules, or
>>>>>> osis2mod.
>>>>>>
>>>>>> There are 2 related problems I can see:
>>>>>>
>>>>>> 1. book.contains(nonExistingVerse) returns TRUE
>>>>>> 2. book.getRawText(nonExistingVerse) returns <chapter end tag>
>>>>>>
>>>>>> Here is a simple test to show the problem using KAZ which has
>>>>>> Synodal v11n but does not contain any deuterocanonical books:
>>>>>>
>>>>>> SwordBook kaz = (SwordBook) Books.installed().getBook("KAZ");
>>>>>> Verse esd11Verse = new Verse(kaz.getVersification(), BibleBook.ESD1, 1, 1);
>>>>>> System.out.println(kaz.contains(esd11Verse));   // prints: true
>>>>>> System.out.println(kaz.getRawText(esd11Verse)); // prints: <chapter eID="gen7" osisID="1Esd.1"/>
>>>>>> Verse esd12Verse = new Verse(kaz.getVersification(), BibleBook.ESD1, 1, 2);
>>>>>> System.out.println(kaz.contains(esd12Verse));   // prints: true
>>>>>> System.out.println(kaz.getRawText(esd12Verse)); // prints: <chapter eID="gen7" osisID="1Esd.1"/>
>>>>>>
>>>>>> So how does "<chapter eID="gen7" osisID="1Esd.1"/>" get into verse
>>>>>> content unexpectedly?
>>>>>>
>>>>>> It seems to me like it could be either:
>>>>>>
>>>>>> 1. a module problem; but IBT say they do not add empty verse slots
>>>>>> 2. Sword osis2mod issue
>>>>>> 3. JSword issue: why is JSword returning a chapter end tag
>>>>>> instead of verse content
>>>>>>
>>>>>> Any ideas what might cause this problem?
>>>>>>
>>>>>> Thanks
>>>>>> Martin
>>>>>>
>>>>>>
>>>>>> On 11 March 2014 12:15, DM Smith <dmsmith at crosswire.org> wrote:
>>>>>>
>>>>>> We haven't pushed this down into JSword. So far it is the
>>>>>> responsibility of the front-end. Chris B has made it efficient
>>>>>> to ask a Book whether it contains a Verse.
>>>>>>
>>>>>> Essentially, when it comes to asking a module if it has
>>>>>> meaningful content, you want containsAny(Key verses, boolean
>>>>>> includeIntros) and containsAny(Key verses) { return
>>>>>> containsAny(verses, false); }
>>>>>>
>>>>>> I think it should ignore verse 0 by default. If it doesn't
>>>>>> have verse content, then does the content really mean
>>>>>> something?
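
The two overloads could be shaped roughly as below. This is a sketch only: `VerseRef` and the `hasContent` predicate are hypothetical stand-ins for JSword's Key/Verse types and a per-verse contains() check, and the only behaviour taken from the text is that verse 0 is ignored by default.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class ContainsAny {
    /** Hypothetical minimal verse reference; verse 0 is the chapter intro. */
    static final class VerseRef {
        final String book;
        final int chapter;
        final int verse;

        VerseRef(String book, int chapter, int verse) {
            this.book = book;
            this.chapter = chapter;
            this.verse = verse;
        }
    }

    /** True if at least one of the verses has content, optionally counting intros. */
    static boolean containsAny(List<VerseRef> verses, boolean includeIntros,
                               Predicate<VerseRef> hasContent) {
        for (VerseRef v : verses) {
            if (!includeIntros && v.verse == 0) {
                continue; // skip verse 0 unless intros were asked for
            }
            if (hasContent.test(v)) {
                return true;
            }
        }
        return false;
    }

    /** Default form: intros (verse 0) are ignored. */
    static boolean containsAny(List<VerseRef> verses, Predicate<VerseRef> hasContent) {
        return containsAny(verses, false, hasContent);
    }

    public static void main(String[] args) {
        List<VerseRef> chapter = Arrays.asList(
                new VerseRef("EpJer", 1, 0), new VerseRef("EpJer", 1, 1));
        Predicate<VerseRef> onlyIntroHasText = v -> v.verse == 0;
        System.out.println(containsAny(chapter, onlyIntroHasText));
        System.out.println(containsAny(chapter, true, onlyIntroHasText));
    }
}
```
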
>>>>>>
>>>>>> As you have noted contains(Key) is confusing. There are a few
>>>>>> places where it means containsAny. Usually it means
>>>>>> containsAll. The name, contains, was chosen early as we derived
>>>>>> from a container class where the argument was an element of
>>>>>> the container. That is, contains is supposed to mean
>>>>>> isMemberOf. Later we changed the inheritance as it wasn't an
>>>>>> "is a" relationship.
>>>>>>
>>>>>> But we need to be careful of not introducing more confusion.
>>>>>>
>>>>>> By the way, the list server was holding mail for a few days.
>>>>>>
>>>>>> In Him,
>>>>>> DM
>>>>>>
>>>>>> On Mar 8, 2014, at 5:26 PM, Martin Denham <mjdenham at gmail.com> wrote:
>>>>>>
>>>>>> > Is there an efficient way to find if a BibleBook is
>>>>>> contained in a Book (Bible or commentary) using JSword?
>>>>>> >
>>>>>> > I recall this subject being discussed but can't recall the
>>>>>> outcome.
>>>>>> >
>>>>>> > Thanks
>>>>>> > Martin
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> sword-devel mailing list: sword-devel at crosswire.org
>>>>>> http://www.crosswire.org/mailman/listinfo/sword-devel
>>>>>> Instructions to unsubscribe/change your settings at above page
>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> jsword-devel mailing list
>>>>> jsword-devel at crosswire.org
>>>>> http://www.crosswire.org/mailman/listinfo/jsword-devel