[jsword-devel] [sword-devel] Method to find if BibleBook is contained in a Book

Martin Denham mjdenham at gmail.com
Fri Mar 28 13:34:43 MST 2014


I was only thinking of using it with SwordBook/AbstractPassageBook but if
it is not performant then maybe it is not worth continuing and we should
look at Scope.  I thought that it was already being calculated in
ZVerseBackend.contains() using the idxRaf.

btw is it safe to get the tip of JSword yet?

Martin


On 28 March 2014 20:19, DM Smith <dmsmith at crosswire.org> wrote:

> I think it would be good to support Scope formally, even if it never makes
> it into SWORD. As a different issue, we'll be changing JSword to keep a
> module's conf pristine; the things that we write to it will be put into
> a side-car conf. This will be the perfect place for us to compute the value
> once for all time per module.
>
> The getRawTextLength is not as easy as I'd like. It's mostly done. A bit
> more to do. For a couple of module types, both compressed, it is not
> performant. It merely calls getRawText and then length. The problem is that
> one has to uncompress the text to see how long it is.
>
> -- DM
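
To illustrate the point above about compressed modules, here is a minimal sketch using java.util.zip. It is not JSword code: the class and method names are hypothetical, and the real backends have their own block and index handling. It only shows that the stored (compressed) size says nothing about the text length, so the bytes have to be inflated to find it.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical illustration only; this is not JSword code.
class CompressedLengthSketch {

    // Deflate a verse the way a compressed backend might store it.
    static byte[] compress(String text) {
        Deflater deflater = new Deflater();
        deflater.setInput(text.getBytes(StandardCharsets.UTF_8));
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[256];
        while (!deflater.finished()) {
            out.write(buffer, 0, deflater.deflate(buffer));
        }
        deflater.end();
        return out.toByteArray();
    }

    // The stored size is not the text size, so the bytes have to be
    // inflated and counted to learn the raw text length (in bytes).
    static int rawTextLength(byte[] stored) {
        Inflater inflater = new Inflater();
        inflater.setInput(stored);
        byte[] buffer = new byte[256];
        int total = 0;
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && !inflater.finished()
                        && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("truncated or unsupported data");
                }
                total += n;
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt compressed data", e);
        } finally {
            inflater.end();
        }
        return total;
    }

    public static void main(String[] args) {
        String stub = "<chapter eID=\"gen7\" osisID=\"1Esd.1\"/>";
        byte[] stored = compress(stub);
        System.out.println("stored bytes: " + stored.length);
        System.out.println("raw text length: " + rawTextLength(stored));
    }
}
```

The cost of rawTextLength here is the same as the cost of reading the text, which is the performance concern raised for the compressed module types.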
>
> On Mar 28, 2014, at 3:31 PM, Martin Denham <mjdenham at gmail.com> wrote:
>
> An alternative method might be to use the Scope value which IBT have
> placed in the .conf file, but I can't seem to get access to it via JSword.
>
> This is printed:
> WARNING: Extra entry in kaz of Scope
>
> And in ConfigEntryTable:
>     log.warn("Extra entry in {} of {}", internal, configEntry.getName());
>     extra.put(key, configEntry);
>
> But I can't see any way to get the value from the extra map?  Is it
> possible - I am a bit confused by the initialisation and retrieval of
> metadata and properties in JSword.
>
> *Example scopes from IBT modules*
>
> Scope for kaz:
> Scope=Gen-Josh.24.33 Judg-2Chr Ezra-Neh Esth-Ps.150 Prov.0-Prov.4.27
> Prov.5-Prov.13.25 Prov.14-Prov.18.24 Prov.19-Song Isa-Lam Ezek-Dan.3.33
> Dan.4-Dan.12 Hos-Mal Matt-Rev
>
> Scope for kylsc:
> Scope=Matt-Rev
>
> I don't know if the strings used are compatible with PassageKeyFactory but
> if we only look at the start and end of the scope we may be able to deduce
> all that is required because I think IBT are the only people who use scope.
>
> Martin
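
A rough sketch of the "only look at the start and end of the scope" idea, written without PassageKeyFactory. Everything here is an assumption for illustration: the class name, the helper methods, and the abbreviated book order (a real check would take the order from the module's versification). It works at book level only, so a partial range such as Prov.0-Prov.4.27 counts as containing Prov.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is not JSword code.
class ScopeSketch {

    // Abbreviated, assumed book order. A real check would take the
    // order from the module's versification instead.
    static final List<String> BOOK_ORDER = Arrays.asList(
            "Gen", "Exod", "Josh", "Judg", "2Chr", "PrMan", "Ezra", "Neh",
            "1Esd", "Tob", "Esth", "Ps", "Hos", "Mal", "Matt", "Rev");

    // Strip chapter and verse: "Josh.24.33" becomes "Josh".
    static String bookOf(String osisRef) {
        int dot = osisRef.indexOf('.');
        return dot < 0 ? osisRef : osisRef.substring(0, dot);
    }

    // True if the book lies between the first and last book of any
    // whitespace-separated range in the Scope value.
    static boolean scopeContainsBook(String scope, String book) {
        int target = BOOK_ORDER.indexOf(book);
        if (target < 0) {
            return false;
        }
        for (String range : scope.trim().split("\\s+")) {
            if (range.isEmpty()) {
                continue;
            }
            String[] ends = range.split("-", 2);
            int start = BOOK_ORDER.indexOf(bookOf(ends[0]));
            int end = BOOK_ORDER.indexOf(bookOf(ends[ends.length - 1]));
            if (start >= 0 && end >= 0 && start <= target && target <= end) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Shortened form of the kaz Scope quoted above.
        String kaz = "Gen-Josh.24.33 Judg-2Chr Ezra-Neh Esth-Ps.150 Hos-Mal Matt-Rev";
        System.out.println(scopeContainsBook(kaz, "Ezra")); // prints: true
        System.out.println(scopeContainsBook(kaz, "1Esd")); // prints: false
    }
}
```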
>
>
>
> On 28 March 2014 14:12, DM Smith <dmsmith at crosswire.org> wrote:
>
>> I'll add the method SwordBook.getRawTextLength(Key key), or something
>> like it. -- DM
>>
>> On Mar 26, 2014, at 6:47 PM, Martin Denham <mjdenham at gmail.com> wrote:
>>
>> Given the above explanations and that many users have already downloaded
>> such modules I have experimented with a work-around by adding some extra
>> logic to And Bible to specifically cater for the IBT Synodal modules.  I
>> did this by making the assumption that all the empty verses start with
>> "<chapter eID=", which appears true and unique.  It is a bit of a hack but
>> it almost worked.
>>
>> The only problem is that after adding the extra getRawText checks it
>> takes too long, even on my Nexus 4, to load the book list for IBT modules.
>> However, a simpler way to avoid the getRawText calls would be to add a
>>     public int SwordBook.getRawTextLength(Key key)
>> which would be identical code to contains(Key key)
>> (->ZVerseBackend.contains) but return verse length instead of a boolean
>> (contains() calculates verse length to determine if a verse exists).  What
>> do you think?  This would help because IBT empty verse stubs are very short
>> and so normally the getRawText would not be required as part of the
>> elaborated contains() check in And Bible.
>>
>> *Note:*
>> I have discovered that this problem does not just affect deuterocanonical
>> books in IBT Synodal modules, it also affects OT books in IBT NT-only
>> modules e.g. KYLSC, which return text like "<chapter eID="gen4"
>> osisID="Gen.1"/>".
>>
>> Martin
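
The elaborated contains() check described above could look roughly like the sketch below. This is not the And Bible or JSword code: VerseSource, MapSource and the 60-character threshold are assumptions made up for the example. The idea is that a length lookup settles most cases, and the raw text is only fetched for short entries that might be chapter-end stubs.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not And Bible or JSword code.
class StubAwareContains {

    static final String STUB_PREFIX = "<chapter eID=";
    // Assumed upper bound on the length of a chapter-end stub.
    static final int MAX_STUB_LENGTH = 60;

    // Stand-in for a book that can report length without fetching text.
    interface VerseSource {
        int rawTextLength(String osisId);
        String rawText(String osisId);
    }

    // In-memory source that counts how often the raw text is fetched.
    static final class MapSource implements VerseSource {
        private final Map<String, String> verses = new HashMap<>();
        int rawTextCalls = 0;

        MapSource put(String osisId, String text) {
            verses.put(osisId, text);
            return this;
        }

        public int rawTextLength(String osisId) {
            return verses.getOrDefault(osisId, "").length();
        }

        public String rawText(String osisId) {
            rawTextCalls++;
            return verses.getOrDefault(osisId, "");
        }
    }

    static boolean hasRealContent(VerseSource source, String osisId) {
        int length = source.rawTextLength(osisId);
        if (length == 0) {
            return false; // nothing stored for this verse
        }
        if (length > MAX_STUB_LENGTH) {
            return true; // too long to be a stub, no text fetch needed
        }
        // Short entry: fetch the text and check for the stub marker.
        return !source.rawText(osisId).trim().startsWith(STUB_PREFIX);
    }

    public static void main(String[] args) {
        MapSource source = new MapSource()
                .put("1Esd.1.1", "<chapter eID=\"gen7\" osisID=\"1Esd.1\"/>")
                .put("John.11.35", "Jesus wept.");
        System.out.println(hasRealContent(source, "1Esd.1.1"));   // prints: false
        System.out.println(hasRealContent(source, "John.11.35")); // prints: true
    }
}
```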
>>
>>
>> On 26 March 2014 14:49, DM Smith <dmsmith at crosswire.org> wrote:
>>
>>> John,
>>>
>>> Putting this up on sword-devel, since that is a more appropriate
>>> location for the discussion to continue. This is really not about JSword,
>>> but rather about module making.
>>>
>>> The nature of osis2mod is to retain all markup except <verse> and
>>> </verse> (or their equivalent milestoned version.) This means that the
>>> markup for a chapter is put in the module's storage for that chapter and
>>> noted in the index. In the case of the chapter that is given below, it is
>>> split into 2 parts, Verse 0 and Verse 1.
>>> Verse 0 will get the preamble of the chapter:
>>> <chapter osisID="EpJer.1">
>>> Verse 1 will get:
>>> </chapter>
>>> (These will have been transformed into their milestoned versions.)
>>>
>>> Also, verse 2 to 72 will be "linked" to verse 1, meaning that in the
>>> index they are given the same location as verse 1.
>>>
>>> So, verse 0 has chapter start content and verse 1 to 72 have chapter end
>>> content.
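
A simplified model of the index layout just described, to show why every verse of the empty chapter reports a non-zero length. The IndexEntry type, the method name and the byte sizes are invented for illustration, and the real index record format is different.

```java
// Hypothetical illustration only; the real index format differs.
class LinkedVerseIndexSketch {

    // One index record: where the entry starts and how long it is.
    static final class IndexEntry {
        final long offset;
        final int size;

        IndexEntry(long offset, int size) {
            this.offset = offset;
            this.size = size;
        }
    }

    // Index for a chapter with no verse text (lastVerse must be >= 1):
    // verse 0 holds the chapter start tag, verse 1 holds the chapter
    // end tag, and verses 2..lastVerse are linked to verse 1.
    static IndexEntry[] emptyChapterIndex(int lastVerse, int startTagSize, int endTagSize) {
        IndexEntry[] index = new IndexEntry[lastVerse + 1];
        index[0] = new IndexEntry(0, startTagSize);
        IndexEntry verse1 = new IndexEntry(startTagSize, endTagSize);
        for (int v = 1; v <= lastVerse; v++) {
            index[v] = verse1; // same location as verse 1
        }
        return index;
    }

    public static void main(String[] args) {
        IndexEntry[] epJer = emptyChapterIndex(72, 30, 37);
        // Verse 72 points at the same bytes as verse 1, and its size
        // is non-zero, so a length-based contains() check says "present".
        System.out.println(epJer[72].offset == epJer[1].offset); // prints: true
        System.out.println(epJer[72].size > 0);                  // prints: true
    }
}
```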
>>>
>>> Also, osis2mod does not complain if a verse is missing. Never has, never
>>> will. It does "complain" of a verse being present that is not in the
>>> versification. Always has, always will.
>>>
>>> That emptyvss indicates that all verses are present means exactly that:
>>> All verses are present. This is not good if the module is in fact
>>> incomplete.
>>>
>>> That JSword indicates that these "empty" verses are present means that
>>> they have non-zero length in the module.
>>>
>>> JSword is graceful in handling this. It determines that the module has
>>> content for the verse by examining the index. What Martin is trying to do
>>> is find out which books, chapters and verses should be displayed to users
>>> in pick lists. The only way this can be done at this time, by either SWORD
>>> or JSword with the module in question, is to render each verse and
>>> determine that it renders nothing. This is far too expensive an operation
>>> to consider.
>>>
>>> The only way to efficiently determine scope is to examine the index for
>>> each verse and see if the length is 0. The Scope entry in the conf has been
>>> ruled out. It would have been computed using the reverse logic of emptyvss.
>>> Go through the v11n from first verse to last and rather than noting what is
>>> missing, note what is present.
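
The "reverse logic of emptyvss" walk could be sketched as follows. The names are hypothetical and the index is reduced to an array of sizes in v11n order. Note that a module which stores chapter-end stubs would still show those verses as present under this test, since their length is non-zero.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not SWORD or JSword code.
class IndexScopeSketch {

    // verseIds is in v11n order; sizes[i] is the length recorded in
    // the index for verseIds[i]. Returns the contiguous runs of
    // verses whose length is non-zero.
    static List<String> presentRanges(String[] verseIds, int[] sizes) {
        List<String> ranges = new ArrayList<>();
        int start = -1;
        for (int i = 0; i <= verseIds.length; i++) {
            boolean present = i < verseIds.length && sizes[i] > 0;
            if (present && start < 0) {
                start = i; // a new run begins
            } else if (!present && start >= 0) {
                int end = i - 1; // the run ended on the previous verse
                ranges.add(start == end
                        ? verseIds[start]
                        : verseIds[start] + "-" + verseIds[end]);
                start = -1;
            }
        }
        return ranges;
    }

    public static void main(String[] args) {
        String[] ids = {"Gen.1.1", "Gen.1.2", "Gen.1.3", "Exod.1.1", "Exod.1.2"};
        int[] sizes = {10, 12, 0, 0, 7};
        System.out.println(presentRanges(ids, sizes)); // prints: [Gen.1.1-Gen.1.2, Exod.1.2]
    }
}
```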
>>>
>>> Today, most of our frontends display pick lists based on the v11n not on
>>> the module content. It has long been confusing to end users of modules that
>>> don't contain verses in the v11n.
>>>
>>> In my view, this is a module problem. It is far easier and faster to
>>> rebuild and redistribute a module. We can tell a user to upgrade to the
>>> most recent version of a module far easier than making and releasing a code
>>> change and having them get a new version of the program. When the change is
>>> a work-around for something that shouldn't be in a module, I think we should
>>> avoid that. For example, the NET Bible has some bugs that should be fixed.
>>> But instead we have some special code that is essentially: if module is NET
>>> then fix such-and-so when it occurs.
>>>
>>> Together in His Service,
>>> DM Smith
>>>
>>>
>>>
>>> On Mar 25, 2014, at 11:43 PM, John Austin <gpl.programs.info at gmail.com>
>>> wrote:
>>>
>>> There has been a lot of discussion about how missing material in a v11n
>>> should be treated (the discussion of the meaning and use of Scope was part
>>> of that). Tools such as osis2mod generated warnings whenever OSIS files
>>> lacked any part of the chosen v11n. The Scope conf param was, for a time at
>>> least, the recommended method of describing what part of a v11n was covered
>>> by a module. For these reasons, many existing modules (IBT alone has at
>>> least 26 such modules) are currently encoded so as to encompass the entire
>>> v11n, returning empty-string verse content for all verses in the v11n that
>>> are not included in the module, and using the .conf Scope param to define
>>> exactly what is present in the module.
>>>
>>> So even though current module making best practice may be different, it
>>> would be good for JSword to be graceful with modules that are encoded
>>> somewhat differently if at all possible, at least for a time. There are
>>> many modules out there, old and new, which don't contain the complete v11n,
>>> so determining book coverage is important.
>>>
>>> -John
>>>
>>>
>>>
>>> On 03/25/2014 08:19 PM, DM Smith wrote:
>>>
>>> Those verses exist since they are defined in the OSIS input file to
>>> osis2mod. Osis2mod retains everything in its input. This is a well
>>> documented behavior of osis2mod.
>>>
>>> The end chapter markup will be put in the last verse that is in the
>>> chapter, which might be verse 0.
>>>
>>> They should use xslt to strip empty verses, chapters and books out of
>>> their file into an intermediate file and give that as input to osis2mod.
>>>
>>> Alternatively they can use <!-- ... --> to comment out huge swaths of
>>> the input file.
>>>
>>>
>>> -- DM
>>>
>>> On Mar 25, 2014, at 7:48 AM, Martin Denham <mjdenham at gmail.com> wrote:
>>>
>>> IBT have just passed me more information regarding their handling of
>>> empty verses to help clarify if this is an IBT module issue or not.
>>> The following is an extract from IBT's e-mail:
>>>
>>>    Here are examples of how IBT's OSIS source defines empty verses in
>>>    the markup:
>>>
>>>    Empty book (Epistle of Jeremiah):
>>>    <div type="x-Synodal-non-canonical"><div type="book"
>>>    osisID="EpJer"><chapter osisID="EpJer.1"><verse sID="EpJer.1.1-72"
>>>    osisID="EpJer.1.1 EpJer.1.2 EpJer.1.3 EpJer.1.4 EpJer.1.5
>>>    EpJer.1.6 EpJer.1.7 EpJer.1.8 EpJer.1.9 EpJer.1.10 EpJer.1.11
>>>    EpJer.1.12 EpJer.1.13 EpJer.1.14 EpJer.1.15 EpJer.1.16 EpJer.1.17
>>>    EpJer.1.18 EpJer.1.19 EpJer.1.20 EpJer.1.21 EpJer.1.22 EpJer.1.23
>>>    EpJer.1.24 EpJer.1.25 EpJer.1.26 EpJer.1.27 EpJer.1.28 EpJer.1.29
>>>    EpJer.1.30 EpJer.1.31 EpJer.1.32 EpJer.1.33 EpJer.1.34 EpJer.1.35
>>>    EpJer.1.36 EpJer.1.37 EpJer.1.38 EpJer.1.39 EpJer.1.40 EpJer.1.41
>>>    EpJer.1.42 EpJer.1.43 EpJer.1.44 EpJer.1.45 EpJer.1.46 EpJer.1.47
>>>    EpJer.1.48 EpJer.1.49 EpJer.1.50 EpJer.1.51 EpJer.1.52 EpJer.1.53
>>>    EpJer.1.54 EpJer.1.55 EpJer.1.56 EpJer.1.57 EpJer.1.58 EpJer.1.59
>>>    EpJer.1.60 EpJer.1.61 EpJer.1.62 EpJer.1.63 EpJer.1.64 EpJer.1.65
>>>    EpJer.1.66 EpJer.1.67 EpJer.1.68 EpJer.1.69 EpJer.1.70 EpJer.1.71
>>>    EpJer.1.72"/><verse eID="EpJer.1.1-72"/></chapter></div></div>
>>>
>>>    I'm not sure how osis2mod handles all this when importing to the
>>>    module, but it works perfectly without warnings or errors. Also,
>>>    when the resulting module is passed to the "emptyvss" tool, it
>>>    passes this test without warnings.
>>>
>>>
>>>
>>> On 25 March 2014 11:38, Martin Denham <mjdenham at gmail.com> wrote:
>>>
>>>    I am having problems getting a list of BibleBooks contained in
>>>    some AV modules which we know do not contain certain books.  I
>>>    can't work out if the problem is with JSword, the modules, or
>>>    osis2mod.
>>>
>>>    There are 2 related problems I can see:
>>>
>>>     1. book.contains(nonExistingVerse) returns TRUE
>>>     2. book.getRawText(nonExistingVerse) returns <chapter end tag>
>>>
>>>    Here is a simple test to show the problem using KAZ which has
>>>    Synodal v11n but does not contain any deuterocanonical books:
>>>
>>>    SwordBook kaz = (SwordBook) Books.installed().getBook("KAZ");
>>>    Verse esd11Verse = new Verse(kaz.getVersification(), BibleBook.ESD1, 1, 1);
>>>    System.out.println(kaz.contains(esd11Verse));   // prints: true
>>>    // prints: <chapter eID="gen7" osisID="1Esd.1"/>
>>>    System.out.println(kaz.getRawText(esd11Verse));
>>>    Verse esd12Verse = new Verse(kaz.getVersification(), BibleBook.ESD1, 1, 2);
>>>    System.out.println(kaz.contains(esd12Verse));   // prints: true
>>>    // prints: <chapter eID="gen7" osisID="1Esd.1"/>
>>>    System.out.println(kaz.getRawText(esd12Verse));
>>>
>>>    So how does "<chapter eID="gen7" osisID="1Esd.1"/>" get into verse
>>>    content unexpectedly?
>>>
>>>    It seems to me like it could be either:
>>>
>>>     1. a module problem; but IBT say they do not add empty verse slots
>>>     2. Sword osis2mod issue
>>>     3. JSword issue: why is JSword returning a chapter end tag
>>>        instead of verse content
>>>
>>>    Any ideas what might cause this problem?
>>>
>>>    Thanks
>>>    Martin
>>>
>>>
>>>    On 11 March 2014 12:15, DM Smith <dmsmith at crosswire.org> wrote:
>>>
>>>        We haven't pushed this down into JSword. So far it is the
>>>        responsibility of the front-end. Chris B has made it efficient
>>>        to ask a Book whether it contains a Verse.
>>>
>>>        Essentially, when it comes to asking a module if it has
>>>        meaningful content, you want containsAny(Key verses, boolean
>>>        includeIntros) and containsAny(Key verses) { return
>>>        containsAny(verses, false); }
>>>
>>>        I think it should ignore verse 0 by default. If it doesn't
>>>        have verse content, then does the content really mean something?
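
A minimal sketch of the two containsAny overloads suggested above, with verse 0 ignored by default. VerseRef and VerseStore are hypothetical stand-ins for JSword's key and book types, and the store is passed as a parameter only so that the example is self-contained.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the JSword API.
class ContainsAnySketch {

    // Stand-in for a verse key.
    static final class VerseRef {
        final String book;
        final int chapter;
        final int verse;

        VerseRef(String book, int chapter, int verse) {
            this.book = book;
            this.chapter = chapter;
            this.verse = verse;
        }
    }

    // Stand-in for a book that can answer per-verse contains().
    interface VerseStore {
        boolean contains(VerseRef verse);
    }

    // True if any of the verses has content. Verse 0 (chapter or book
    // introductions) only counts when includeIntros is true.
    static boolean containsAny(VerseStore store, List<VerseRef> verses, boolean includeIntros) {
        for (VerseRef v : verses) {
            if (!includeIntros && v.verse == 0) {
                continue;
            }
            if (store.contains(v)) {
                return true;
            }
        }
        return false;
    }

    // Default form: ignore introductions.
    static boolean containsAny(VerseStore store, List<VerseRef> verses) {
        return containsAny(store, verses, false);
    }

    public static void main(String[] args) {
        List<VerseRef> chapter = Arrays.asList(
                new VerseRef("1Esd", 1, 0), new VerseRef("1Esd", 1, 1));
        VerseStore introOnly = v -> v.verse == 0; // only verse 0 has content
        System.out.println(containsAny(introOnly, chapter));       // prints: false
        System.out.println(containsAny(introOnly, chapter, true)); // prints: true
    }
}
```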
>>>
>>>        As you have noted contains(Key) is confusing. There are a few
>>>        places where it means containsAny. Usually it means
>>>        containsAll. The name, contains, was chosen early as we derived
>>>        from a container class where the argument was an element of
>>>        the container.  That is, contains is supposed to mean
>>>        isMemberOf. Later we changed the inheritance as it wasn't an
>>>        "is a" relationship.
>>>
>>>        But we need to be careful of not introducing more confusion.
>>>
>>>        By the way, the list serve was holding mail for a few days.
>>>
>>>        In Him,
>>>                DM
>>>
>>>        On Mar 8, 2014, at 5:26 PM, Martin Denham <mjdenham at gmail.com> wrote:
>>>
>>>        > Is there an efficient way to find if a BibleBook is
>>>        > contained in a Book (Bible or commentary) using JSword?
>>>        >
>>>        > I recall this subject being discussed but can't recall the
>>>        > outcome.
>>>        >
>>>        > Thanks
>>>        > Martin
>>>
>>>
>>> _______________________________________________
>>> sword-devel mailing list: sword-devel at crosswire.org
>>> http://www.crosswire.org/mailman/listinfo/sword-devel
>>> Instructions to unsubscribe/change your settings at above page
>>>
>>
>> _______________________________________________
>> jsword-devel mailing list
>> jsword-devel at crosswire.org
>> http://www.crosswire.org/mailman/listinfo/jsword-devel