[jsword-devel] Nave's Topical Bible Reference linking

trent.jsword at trentonadams.ca trent.jsword at trentonadams.ca
Wed Mar 3 13:48:50 MST 2010

Hi DM,

Sorry this is a bit long.  I hope I got things working right this time.  If you have any issues with some of the code, because it doesn't fit the jsword model in some way, or whatever, let me know.

I don't have a problem with the new beta Nave's.  My new XSL works fine with it.
  <xsl:template match="ref">
    <xsl:variable name="target" select="translate(@target, 'abcdefghijklmnopqrstuvwxyz', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ')"/>
    <xsl:choose>
      <!-- Some strong modules use target attribute, so we check them -->
      <xsl:when test="starts-with($target, 'STRONG:') or
                      starts-with($target, 'STRONGSHEBREW:') or
                      starts-with($target, 'STRONGSGREEK:')">
        <xsl:variable name="sref">
          <xsl:call-template name="strong-protocol">
            <xsl:with-param name="text" select="$target"/>
          </xsl:call-template>
        </xsl:variable>
        <a>
          <xsl:attribute name="href">
            <xsl:value-of select="$sref"/>
          </xsl:attribute>
          <xsl:value-of select="."/>
        </a>
      </xsl:when>
      <!-- Bible references carry an osisRef such as Bible:John.3.16 -->
      <xsl:when test="contains(@osisRef, 'Bible:')">
        <a href="{@osisRef}"><xsl:value-of select="."/></a>
      </xsl:when>
    </xsl:choose>
  </xsl:template>

  <xsl:template name="strong-protocol">
    <xsl:param name="text"/>
    <xsl:choose>
      <xsl:when test="starts-with($text, 'STRONG:H')">
        <xsl:value-of select="$hebrew.def.protocol"/>H<xsl:value-of select="substring-after($text, 'STRONG:H')"/>
      </xsl:when>
      <xsl:when test="starts-with($text, 'STRONGSHEBREW:')">
        <xsl:value-of select="$hebrew.def.protocol"/>H<xsl:value-of select="substring-after($text, 'STRONGSHEBREW:')"/>
      </xsl:when>
      <xsl:when test="starts-with($text, 'STRONG:G')">
        <xsl:value-of select="$greek.def.protocol"/>G<xsl:value-of select="substring-after($text, 'STRONG:G')"/>
      </xsl:when>
      <xsl:when test="starts-with($text, 'STRONGSGREEK:')">
        <xsl:value-of select="$greek.def.protocol"/>G<xsl:value-of select="substring-after($text, 'STRONGSGREEK:')"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$greek.def.protocol"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

I tried both the StrongsHebrew and StrongsGreek modules in beta, as well as the "Strong" module in beta, and they all work with the above XSL.  Unfortunately, they are marked up differently, hence the more complex XSL fragment.

I highly recommend that we change all PROTOCOL handling to case-insensitive matching, in line with the HTTP specification (section 3.2.3, http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html), which compares URI schemes case-insensitively.  If we do, an osisRef of Bible:John.3.16 will just work.
            if (BIBLE_PROTOCOL.equalsIgnoreCase(protocol)) {

If at least that change is not made, Easton's Bible Dictionary, and I'm sure many others, will not work with the XSL.  The alternative is to strip "Bible:" using substring-after from within the XSL.  But I think it's reasonable to make all protocols case-insensitive, unless I'm missing something!  Yes/no?
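
To make the case-insensitive idea concrete, here is a rough sketch.  ProtocolMatch and its methods are made-up names for illustration, not existing JSword API, and the "bible" value of the constant is an assumption.

```java
/** Hypothetical helper for illustration only; not part of JSword. */
final class ProtocolMatch {
    // Assumed value; the real BIBLE_PROTOCOL constant may be defined differently.
    static final String BIBLE_PROTOCOL = "bible";

    /** Returns the text before the first ':' or null if there is no ':'. */
    static String protocolOf(String href) {
        int colon = href.indexOf(':');
        return colon < 0 ? null : href.substring(0, colon);
    }

    /** Returns the text after "protocol:", dropping an optional leading "//". */
    static String dataOf(String href) {
        String rest = href.substring(href.indexOf(':') + 1);
        return rest.startsWith("//") ? rest.substring(2) : rest;
    }

    /** True for "bible:", "Bible:", "BIBLE:" and so on. */
    static boolean isBibleLink(String href) {
        String protocol = protocolOf(href);
        return protocol != null && BIBLE_PROTOCOL.equalsIgnoreCase(protocol);
    }
}
```

With something like this, both bible://Gen.1.1 and an osisRef of Bible:John.3.16 would be treated the same way, and the XSL would not need a substring-after workaround.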

At least the following modules have text linked to the Bible after putting in the above XSL...
- Easton's Bible Dictionary (production)
- Nave's Topical Reference (beta)

However, the production Nave's required a bit of hacking, because it's plain text (not TEI) and because a reference list sometimes spans multiple lines between # and |, which it really shouldn't.  That makes it difficult to replace "<br></br>" and the newlines with nothing, because you don't want to do that through the whole file.  I tried a few regular expressions, but couldn't get one to work.  An example of the problem text is below.

<br></br> -Inspiration of
<br></br>    #Ex 12:1; Le 10:8; 11:1; 13:1; 15:1; Nu 2:1; 4:1,17; 18:1;
<br></br>    19:1; 20:12|

The regexes I tried, to remove the br tags and newlines, were variations of this one...
text = text.replaceAll("(?m)(?s)(<br></br>    #.*?)$(?:(?:^<br></br>    )(.*?)$)*(\|)", "$1$2$3");

But that would be very difficult to debug if we ever found a problem anyhow.
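
For anyone wondering why a simple one-line "#...|" replacement misses the wrapped references in the first place: in java.util.regex, "." does not match line terminators unless DOTALL ((?s)) is on.  A tiny sketch (DotAllDemo is just a made-up class name, and the sample is a shortened version of the text above):

```java
import java.util.regex.Pattern;

/** Hypothetical demo class; shows default vs. DOTALL matching across a line break. */
final class DotAllDemo {
    // A reference list that wraps onto a second line between # and |.
    static final String SAMPLE =
        "<br></br>    #Ex 12:1; Le 10:8;\n<br></br>    19:1; 20:12|";

    /** Default mode: '.' stops at the newline, so the closing '|' is never reached. */
    static boolean matchesDefault() {
        return Pattern.compile("#(.+?)\\|").matcher(SAMPLE).find();
    }

    /** DOTALL mode: '.' also matches the newline, so the whole run is captured. */
    static boolean matchesDotAll() {
        return Pattern.compile("(?s)#(.+?)\\|").matcher(SAMPLE).find();
    }
}
```

Even with (?s) the captured group still contains the newline and the "<br></br>" of the second line, which is why the text has to be cleaned up before it can be used as a link.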

So, I drafted some code to process this using a BufferedReader, by marking where references begin (<br></br>    #) and end (|).  I then remove "<br></br>    " from the beginning of the line, if scripture references were already started by #.  The caveat is that if "<br></br>" is ever removed from the text, this will break.  However, it will fail safe to no link processing, just leaving the raw text with the "<br></br>" in place.  It may be worthwhile throwing this in a method.

            if ("Nave".equals(bmd.getInitials()) &&
                "1.1".equals(bmd.getProperties().get("Version"))) {
                final BufferedReader reader = new BufferedReader(new StringReader(text));
                try
                {
                    String curLine = null;
                    final StringBuffer sBuffer = new StringBuffer();
                    final Pattern refStart = Pattern.compile("<br></br>    #");
                    final Pattern refEnd = Pattern.compile("\\|");
                    boolean inReferences = false;
                    boolean endRef = false;
                    boolean processed = false;
                    do
                    {
                        curLine = reader.readLine();
                        if (curLine != null)
                        {
                            if (!inReferences)
                            {   // # indicates start of references, check for it
                                final Matcher matcher = refStart.matcher(curLine);
                                inReferences = matcher.find();
                                if (inReferences)
                                {   // start of link, needs new line
                                    sBuffer.append("<br></br>    ");
                                }
                            }
                            if (inReferences)
                            {   // | indicates end of references, check for it
                                final Matcher matcher = refEnd.matcher(curLine);
                                endRef = matcher.find();
                            }

                            if (inReferences)
                            {   // delete "<br></br>    " inside of reference
                                sBuffer.append(
                                    curLine.replaceAll("^.*?<br></br> *", ""));
                            }
                            else
                            {   // assumed: lines outside references are copied unchanged
                                sBuffer.append(curLine);
                            }

                            if (endRef || ! inReferences)
                            {   // assumed: end the line unless a reference continues
                                sBuffer.append("\n");
                            }

                            if (endRef)
                            {
                                inReferences = false;
                                endRef = false;
                                processed = true; // processed one or more
                            }
                        }
                    }
                    while (curLine != null);
                    text = sBuffer.toString();
                    if (processed)
                    {   // only add links if processing went okay
                        text = text.replaceAll("(?s)(?m)(<br></br> +)#(.+?)\\|",
                            "$1<a href=\"bible://{$2}\">$2</a>");
                    }
                }
                catch (IOException e)
                {
                    e.printStackTrace();  // should we ignore this, it's a string, not a real IO stream????
                }
            }

The above solution is much easier to debug, seeing you can step through it.
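
If it does go into a method, it might look roughly like the sketch below.  NaveLinker and linkReferences are made-up names, joining continuation lines with a single space is my assumption (so "1Ch" and "21:12" don't run together), and the replacement string is copied from the code above.  It is not the exact code from textPaneBookDataDisplay, just the same idea in a self-contained form.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

/** Hypothetical helper for illustration only; not part of JSword. */
final class NaveLinker {
    /** Joins each "#...|" reference run onto one line, then wraps it in a link. */
    static String linkReferences(String text) {
        StringBuilder out = new StringBuilder();
        boolean inReferences = false;
        boolean processed = false;
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            for (String line; (line = reader.readLine()) != null; ) {
                if (!inReferences && line.contains("<br></br>    #")) {
                    inReferences = true;
                    out.append(line);                  // first reference line, kept as-is
                } else if (inReferences) {
                    // continuation line: drop the leading "<br></br>    " and join
                    out.append(' ').append(line.replaceFirst("^<br></br> *", ""));
                } else {
                    out.append(line);                  // ordinary line, untouched
                }
                if (inReferences && line.indexOf('|') >= 0) {
                    inReferences = false;              // '|' closes the reference run
                    processed = true;
                }
                if (!inReferences) {
                    out.append('\n');
                }
            }
        } catch (IOException e) {
            return text;                               // fail safe: leave the raw text
        }
        if (!processed) {
            return text;                               // nothing to link
        }
        // Every run now sits on one line, so no (?s) or (?m) flags are needed.
        return out.toString().replaceAll("(<br></br> +)#(.+?)\\|",
            "$1<a href=\"bible://{$2}\">$2</a>");
    }
}
```

The caller would then shrink to the version check plus text = NaveLinker.linkReferences(text), and the method can be unit tested against entries like the ones in this mail.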

Every entry I've tried works with the above code.  Even this long one from "FAMINE"...
<br></br> -Sent as a judgment
<br></br>    #Le 26:19-29; De 28:23,24,38-42; 1Ki 17:1; 2Ki 8:1; 1Ch
<br></br>    21:12; Ps 105:16; 107:33,34; Isa 3:1-8; 14:30; Jer 19:9;
<br></br>    14:15-22; 29:17,19; La 5:4,5,10; Eze 4:16,17; 5:16,17;
<br></br>    14:13; Joe 1:15,16; Am 4:6-9; 5:16,17; Hag 1:10,11; Mt 24:7;
<br></br>    Lu 21:11; Re 6:5-8|

The code is kind of ugly, but so is the text that is being processed.  On my machine the processing takes 0ms-1ms, so it's extremely fast.  Is this worth including, or should we direct people to just use the TEI beta version?  Either way works for me; I just thought that the average person might not know to try the beta repo.  Besides, one thing I REALLY like about the Java code is that it loads all the verses in one swoop when you click, whereas you can't do that with the TEI, because every single verse/verse-range is a <ref...> followed by <p>, etc, etc.  So, without a bunch of extra XSL work, applying templates in certain cases, and only with certain books, etc, etc, it would take more time to do.

Ah well, it was fun either way; but then I get cheap thrills too, lol.


----- "trent jsword" <trent.jsword at trentonadams.ca> wrote:

> From: "trent jsword" <trent.jsword at trentonadams.ca>
> To: "DM Smith" <dmsmith at crosswire.org>
> Cc: "J-Sword Developers Mailing List" <jsword-devel at crosswire.org>
> Sent: Monday, March 1, 2010 11:09:43 PM GMT -06:00 US/Canada Central
> Subject: Re: [jsword-devel] Nave's Topical Bible Reference linking
> Sorry for the delay in replying, I was busy with the wife and kids all
> day.  We went bowling and what not, lot's of fun.
> Nope, wasn't aware of that new beta, I'll check for that next time I
> work on changes related to an existing module.
> I'll have to fix up my XSL that I gave you previously.  The
> "otherwise" may need to just be removed, and replaced with something
> else, like checking for non-existent attributes, depending, as it
> conflicts with the TEI stuff in the new Nave's.
> But, the replace I gave you is almost right, I just have to make it
> work with multi-line mode.  I'll make the replacement check for
> version 1.0 as well, and that way the new Nave's in beta will be
> completely ignored.  Does that sound like a good way of doing this?
> Should the code somehow be marked for removal at a later date?  Or
> should we keep it around for quite awhile after the new Nave's is
> moved to production?  I'd imagine, so we don't affect users too much,
> we may just want to leave it, eh?
> Anyhow, I'm going to work on some of the XSL, and test a bit.  I'm
> seeing some issues with the new Nave's, cause it doesn't appear to be
> marked up properly in some cases, in which case I should probably
> report it.
> Thanks.
> ----- "DM Smith" <dmsmith at crosswire.org> wrote:
> > From: "DM Smith" <dmsmith at crosswire.org>
> > To: "Trenton D. Adams" <trent.jsword at trentonadams.ca>, "J-Sword
> Developers Mailing List" <jsword-devel at crosswire.org>
> > Sent: Monday, March 1, 2010 6:39:23 AM GMT -06:00 US/Canada Central
> > Subject: Re: [jsword-devel] Nave's Topical Bible Reference linking
> >
> > I think there's a beta for it.
> > Are you using that?
> > -- DM
> > 
> > On Mar 1, 2010, at 2:40 AM, Trenton D. Adams wrote:
> > 
> > > textPaneBookDataDisplay.refresh()
> > > 
> > > Under the following line...
> > >            String text = XMLUtil.writeToString(htmlsep);
> > > 
> > > put....
> > >            if ("Nave".equals(bmd.getInitials())) {
> > >                text = text.replaceAll("(#)(.+\\|)", "$1<a
> > href=\"bible://{$2}\">$2</a>");
> > >            }
> > > 
> > > 1. The Nave's is pretty much raw text, and the biblical
> references
> > are not marked up as far as I can see.  So, I figured a regex
> > replacement with a link was the best way of doing it.
> > > 2. Used the convenience method replaceAll() as it's a single
> call,
> > unless you want to cache a pre-compiled pattern matcher in some
> way?
> > > 
> > > Any thoughts?  Can we add this?
> > > 
> > > Thanks.
> > > 
> > > _______________________________________________
> > > jsword-devel mailing list
> > > jsword-devel at crosswire.org
> > > http://www.crosswire.org/mailman/listinfo/jsword-devel
