[jsword-devel] XSLT and enrichment of OSIS Text...

Chris Burrell chris at burrell.me.uk
Mon Nov 22 02:41:29 MST 2010


Anyone know how to look ahead in an XSLT?

I would like to wrap punctuation of an OSIS xml module into the previous xml
element. Let me explain:

we may have something looking like:

<verse>
  <w>he</w>
  <w>laughs</w>
  ,
  <w>he</w>
  <w>sings</w>
</verse>

I've tried using following::*[1] and following-sibling::*[1] but that gives
me the next element, i.e. the next "w" element. But I need the text node in
between. So for example, when I'm transforming "<w>laughs</w>" I would like
to put the text "laughs," into my target document.

Any ideas? I tried looking on the internet, but haven't found much so far.
Just wondering if you've experienced this before? and how you got around it?
Chris
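
A minimal sketch of one way to do the look-ahead, assuming the punctuation is a
text node sitting directly after the <w> element as in the sample above. The
"*" node test only matches elements, which is why following-sibling::*[1]
skips the comma; node() also matches text nodes. The class name, the <span>
output and the use of normalize-space() are illustrative choices, not anything
taken from JSword or BibleDesktop.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class PunctuationLookahead {

    // XSLT 1.0: each <w> also emits the node that immediately follows it,
    // but only if that node is a text node (e.g. the "," after "laughs").
    public static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      // Only the <w> children are processed, so the loose text is not copied twice.
      + "<xsl:template match='verse'>"
      + "<verse><xsl:apply-templates select='w'/></verse>"
      + "</xsl:template>"
      + "<xsl:template match='w'>"
      + "<span class='w'>"
      + "<xsl:value-of select='.'/>"
      + "<xsl:value-of select='normalize-space(following-sibling::node()[1][self::text()])'/>"
      + "</span>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Runs the stylesheet above over the given XML and returns the result as a string. */
    public static String transform(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<verse>\n  <w>he</w>\n  <w>laughs</w>\n  ,\n  <w>he</w>\n  <w>sings</w>\n</verse>";
        System.out.println(transform(xml));
    }
}
```

Note that normalize-space() also throws away the whitespace between words, so
a real stylesheet would still have to decide how to put spacing back.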




On 21 November 2010 21:36, Chris Burrell <chris at burrell.me.uk> wrote:

> Created js-125 and attached some sample code...
>
> Chris
>
>
> On 21 November 2010 13:36, Chris Burrell <chris at burrell.me.uk> wrote:
>
>> Thanks for the answers, will load up the samples sometime this week. As
>> far as Crucible goes, it is a fully blown product rather than a JIRA plugin.
>> But it integrates with JIRA so we can set that up when it's installed.
>> Licensing-wise it's the same model as the other Atlassian products. It's
>> free for open-source development.
>>
>> Chris
>>
>>
>>
>> On 21 November 2010 13:19, DM Smith <dmsmith at crosswire.org> wrote:
>>
>>>
>>> On Nov 21, 2010, at 5:47 AM, Chris Burrell wrote:
>>>
>>> Thanks Steven. However, I believe some Books only have Strong's numbers in
>>> one testament? (perhaps I'm wrong here).
>>>
>>>
>>> Sort of right. The Greek NTs only have them in the NT, if at all. All
>>> others either have it in both or in none at all. At least for the CrossWire
>>> modules. Can't vouch for other modules.
>>>
>>> And therefore, a user interface would only be able to provide an
>>> interlinear on the New Testament or the Old Testament. Same for morphology.
>>>
>>> Ditto. But regarding morphology, we don't have an OT morphology (like we
>>> have Robinson's for the NT), so from a practical perspective, we don't have
>>> it for any OT.
>>>
>>> I believe a number of Books have the information in the New Testament,
>>> but not in the Old Testament?
>>>
>>> I take it from your comments before, DM, on another thread, that it's
>>> probably unlikely to be able to determine this very easily? unless one
>>> assumes that if a word is tagged with morphological information in the OT,
>>> then the whole of the OT is tagged. I assume that's not an assumption I can
>>> make...
>>>
>>> DM, should I create an issue in JIRA and attach some of the interlinear
>>> work I've been doing - it's not quite finished yet, but should be shortly.
>>>
>>>
>>> Sure. It's fair to create a JIRA issue at any time. BTW, I've now got all
>>> JIRA for JSword and BibleDesktop going here.
>>>
>>> I'm not particularly happy with my storing/indexing mechanism, but can't
>>> think of a more efficient way at the moment. All it would be really, is an
>>> XSLT, with a few java objects that do the lookups.
>>>
>>>
>>> That's OK. JIRA is a good place to iterate patches. Don't bother deleting
>>> the old one, just upload the new one with the same name. A convention on
>>> Lucene, to which I have contributed, is to suffix the file with .patch. They
>>> also name the files with the JIRA issue, e.g. JS-104.patch.
>>>
>>> I was wondering, it would be nice to have something like Crucible. I
>>> would be very happy to set it up for CrossWire. I've used it at work, and it
>>> is really very good. http://www.atlassian.com/software/crucible/tour/ I
>>> think that would make viewing patches, and so on, particularly good :)
>>>
>>>
>>> As long as it is free or that we have the license freely. I'm the JIRA
>>> admin, so I probably need to do the install. Perhaps, you can walk me
>>> through it, if needed. I can't get to it today. Not sure when I'll next have
>>> some free time. So bug me again later, if I don't get to it.
>>>
>>>
>>>
>>> Chris
>>>
>>> On 15 November 2010 13:16, Mullins, Steven (DMME) <
>>> Steven.Mullins at dmme.virginia.gov> wrote:
>>>
>>>>  Chris,
>>>>
>>>> Some have lexical forms instead of Strong's numbers.  That is, they give a
>>>> "dictionary" form for the inflected word in the text.  These usually also
>>>> give morphology information too.
>>>>
>>>> Steve
>>>>
>>>>  ------------------------------
>>>> *From:* Chris Burrell [mailto:chris at burrell.me.uk]
>>>> *Sent:* Saturday, November 13, 2010 4:14 AM
>>>>
>>>> *To:* J-Sword Developers Mailing List
>>>> *Subject:* Re: [jsword-devel] XSLT and enrichment of OSIS Text...
>>>>
>>>> Quick question: does anyone know of a way at the moment to determine
>>>> whether Strong's/morphology information is available in a "Book", but on a
>>>> testament level? Am I correct in thinking some books will have Strong's only
>>>> in the New Testament? Or others might have morphology information only in
>>>> the New Testament?
>>>>
>>>> That would be useful for telling the user what the interlinear options
>>>> are for any given passage he is looking at...
>>>>
>>>> Chris
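
A rough sketch of how a single passage could be probed for tagging, assuming
its OSIS XML is already in hand as a string and that Strong's numbers and
morphology appear as lemma="strong:..." and morph="..." attributes on <w>
elements (the form visible in the broken HTML quoted further down). The class
and method names are made up for illustration, and how the OSIS is fetched from
JSword is not shown. As noted elsewhere in this thread, one tagged word does
not prove that a whole testament is tagged, so this only answers the question
for the fragment it is given.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class TaggingProbe {

    /** Evaluates a boolean XPath expression against an XML fragment. */
    private static boolean test(String osisXml, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (Boolean) xpath.evaluate(
            expression, new InputSource(new StringReader(osisXml)), XPathConstants.BOOLEAN);
    }

    /** True if any w element in the fragment carries a Strong's lemma. */
    public static boolean hasStrongs(String osisXml) throws Exception {
        // local-name() is used so the check works with or without the OSIS namespace.
        return test(osisXml, "boolean(//*[local-name()='w'][contains(@lemma, 'strong:')])");
    }

    /** True if any w element in the fragment carries a morph attribute. */
    public static boolean hasMorphology(String osisXml) throws Exception {
        return test(osisXml, "boolean(//*[local-name()='w'][@morph])");
    }

    public static void main(String[] args) throws Exception {
        String tagged = "<verse><w lemma=\"strong:G1909\" morph=\"robinson:PREP\">upon</w></verse>";
        String plain = "<verse><w>upon</w> every soul</verse>";
        System.out.println(hasStrongs(tagged) + " " + hasMorphology(tagged));
        System.out.println(hasStrongs(plain) + " " + hasMorphology(plain));
    }
}
```
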
>>>>
>>>>
>>>> On 9 November 2010 22:29, Chris Burrell <chris at burrell.me.uk> wrote:
>>>>
>>>>> I think I fixed the verse issue by having empty spans to make up for
>>>>> the height (bit of a hack). Another way might be to set the height of the
>>>>> verse to be 1em x the number of displayed lines. I think that helped to an
>>>>> extent in the past.
>>>>>
>>>>> One interesting thing, with KJV at least, is the punctuation and
>>>>> whitespace. (I've now updated this with some approximation of what I mean:
>>>>> http://crosswire.org/~chrisburrell - click the config button if you
>>>>> want to show in stacks, and you'll have to select kjv and the passage
>>>>> again!) You can see straight away there's a slight issue, and that is
>>>>> because the punctuation isn't included in the W elements. It's a text()
>>>>> node whose direct parent is the verse, which makes it quite hard to handle
>>>>> properly in XSLT. Ideally, I reckon, we should try to fold it into the
>>>>> previous child. How, I'm not quite sure yet...
>>>>>
>>>>> Also look at Romans 1:4: you can see at least 4 spaces. That's
>>>>> because there are 4 Strong's numbers that have been tagged but don't have
>>>>> any associated words. The question, really, is whether to show them at all.
>>>>> They don't tell us much, and almost suggest that these words didn't make
>>>>> it into the translation... So I'd be intrigued to see if you can amend your
>>>>> interlinear and still have the punctuation and spaces display correctly AND
>>>>> have the Ws spaced correctly when there is no punctuation.
>>>>>
>>>>> As for the bug, this can be discussed on the other thread. This is the
>>>>> singleton issue that I believe is in the driver reading the passage, which
>>>>> means passages get mixed up. For me, it's definitely a concurrency issue,
>>>>> since my left and right panes, when set to the same book, loaded up and
>>>>> mixed up the passages between themselves.
>>>>>
>>>>> As for the parameter, I mean one that gets set in the Transformer
>>>>> before the xslt is processed. Let me share some real code.
>>>>>
>>>>> Top of the stylesheet before anything is declared:
>>>>>
>>>>> <xsl:param name="InterlinearProvider" /> <!-- passed in as a java
>>>>> object from code -->
>>>>> <xsl:variable name="interlinearProviderService"
>>>>> select="jsword:com.tyndalehouse.step.core.xsl.IPSample.new()" /> <!--
>>>>> creating instance in stylesheet to show the problem -->
>>>>>
>>>>> <!-- this works and I can thereby confirm that InterlinearProvider is
>>>>> an object of the right type -->
>>>>> <xsl:variable name="interlinearWord"
>>>>> select="jsword:getWord($interlinearProviderService,
>>>>> $InterlinearProvider)"/>
>>>>> <xsl:value-of select="$interlinearWord"/>
>>>>>
>>>>> <!-- this doesn't work -->
>>>>> <xsl:variable name="interlinearWord"
>>>>> select="jsword:getWord($InterlinearProvider, $InterlinearProvider)"/>
>>>>> <xsl:value-of select="$interlinearWord"/>
>>>>>
>>>>> The function signature for the purpose of the test was getWord(Object
>>>>> o).
>>>>>
>>>>> But I've also tried with getWord(): calling getWord($InterlinearProvider)
>>>>> doesn't work, calling getWord($interlinearProviderService) does work. I've
>>>>> also tried getWord(String,String), etc.
>>>>>
>>>>> So it seems it works for things that are initialised in the XSLT
>>>>> (interlinearProviderService), but not for those passed in from outside,
>>>>> even though I can see that the stylesheet treats $InterlinearProvider as a
>>>>> java object, because I can pass it in and have a look at its properties in
>>>>> debug mode.
>>>>>
>>>>> I've also tried copying the xsl:param to an xsl:variable first and calling
>>>>> getWord($copiedVariable) but that didn't work either.
>>>>>
>>>>> Does that make more sense?
>>>>> Chris
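
For reference, a minimal sketch of "a parameter that gets set in the
Transformer before the xslt is processed", using only the standard
javax.xml.transform API. It deliberately passes a plain string, the case
suggested below as the safe one, and does not reproduce the failing extension
call on a Java object. The parameter name interlinearVersion and the class
name are invented for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class StylesheetParam {

    // The top-level xsl:param is declared before any template, as in the snippet above.
    public static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:param name='interlinearVersion'/>"
      + "<xsl:template match='/'>"
      + "<xsl:value-of select='$interlinearVersion'/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Sets the stylesheet parameter from Java, runs the transform, returns the text output. */
    public static String run(String version) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        // Must be called before transform(); the name matches the xsl:param declaration.
        transformer.setParameter("interlinearVersion", version);
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader("<verse/>")), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run("KJV"));
    }
}
```
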
>>>>>
>>>>>
>>>>>   On 9 November 2010 20:56, DM Smith <dmsmith at crosswire.org> wrote:
>>>>>
>>>>>>   On 11/09/2010 11:19 AM, Chris Burrell wrote:
>>>>>>
>>>>>> Trent/All
>>>>>>
>>>>>> I have successfully managed to call instance methods on variables set
>>>>>> and initialised in the XSL. However, as soon as I change to a parameter that
>>>>>> is passed in (using the same format jsword:getWord($interlinearObj, arg1,
>>>>>> arg2) ) my stylesheet refuses to compile at runtime. I've also tried copying
>>>>>> the "xsl:param" into an xsl:variable, but that doesn't work either.
>>>>>>
>>>>>>
>>>>>> I'm not sure if you mean a parameter to the stylesheet. If so in the
>>>>>> top of BD's xslt it copies the values into variables and uses them.
>>>>>>
>>>>>> If you mean a parameter to a template, I'm not sure what the problem
>>>>>> would be. You might need to use some quoting magic. Can't remember off the
>>>>>> top of my head how it is done. Maybe ${xyz}???
>>>>>>
>>>>>> If you mean that $interlinearObj is not a string or a number, then
>>>>>> that is likely to be your problem.
>>>>>>
>>>>>>
>>>>>>
>>>>>> However I am able to pass the parameter as an instance method
>>>>>> parameter (so I could pass it to X that would be instantiated during the
>>>>>> XSL transformation, and then invoke the method on it?). Or I could pass in
>>>>>> parameters to initialise it in the XSL, as opposed to passing the
>>>>>> already-instantiated object into the XSL and trying to invoke a method on
>>>>>> that...
>>>>>>
>>>>>> So no biggies... But would have been nicer to provide an object that
>>>>>> the XSL just needs to use, rather than set up as well.
>>>>>> Any ideas?
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>>
>>>>>> On 7 November 2010 00:04, Chris Burrell <chris at burrell.me.uk> wrote:
>>>>>>
>>>>>>> On this last note, I believe we have concurrency issues. I have a two
>>>>>>> column page, displaying one passage each. On load of the page they load up a
>>>>>>> passage each, but this once the passage on the right (only verse 1)
>>>>>>> went to the left (which was requesting just one verse, but from a
>>>>>>> different passage):
>>>>>>>
>>>>>>> left pane: requested Acts 2:10, got Romans 1:1
>>>>>>> right pane: corrupt XML in verse 1, verse 2 seems to be Romans
>>>>>>> 1:2-following
>>>>>>>
>>>>>>> Anyone else come across those issues?
>>>>>>> Chris
>>>>>>>
>>>>>>> On 6 November 2010 20:53, Chris Burrell <chris at burrell.me.uk> wrote:
>>>>>>>
>>>>>>>> Another question too. It seems sometimes, both in Bible Desktop and
>>>>>>>> my current application, the HTML rendered is broken?
>>>>>>>>
>>>>>>>> Any ideas why that might be?
>>>>>>>>
>>>>>>>> For example, I get:
>>>>>>>> "<div class="passageText ui-widget"><div><h2 class="heading">Acts
>>>>>>>> 2:10</h2><span class="verse"><span class="w"><sup
>>>>>>>> class="verseNumber">10</sup></span><span class="w"*><span
>>>>>>>> class="text">emma="strong:G1909" morph="robinson:PREP*"
>>>>>>>> src="4"&gt;upon every soul of man that doeth evil, of the Jew first, and
>>>>>>>> also of the Gentile;</span></span></span> </div></div>"
>>>>>>>>
>>>>>>>> The above in bold shows that it didn't get XSLTed properly.
>>>>>>>>
>>>>>>>> Instead of "<div class="passageText ui-widget"><div><h2
>>>>>>>> class="heading">Acts 2:10</h2><span class="verse"><sup
>>>>>>>> class="verseNumber">10</sup><span class="w"><span class="text">&nbsp;</span>
>>>>>>>> </span><span class="w"><span class="text">&nbsp;</span> </span><span
>>>>>>>> class="w"><span class="text">Phrygia</span></span>, <span class="w"><span
>>>>>>>> class="text">&nbsp;</span> </span><span class="w"><span
>>>>>>>> class="text">and</span></span> <span class="w"><span
>>>>>>>> class="text">Pamphylia</span></span>, <span class="w"><span class="text">in
>>>>>>>> Egypt</span></span>, <span class="w"><span class="text">and</span></span>
>>>>>>>> <span class="w"><span class="text">in the parts</span></span> <span
>>>>>>>> class="w"><span class="text">of Libya</span></span> <span class="w"><span
>>>>>>>> class="text">about</span></span> <span class="w"><span
>>>>>>>> class="text">Cyrene</span></span>, <span class="w"><span
>>>>>>>> class="text">and</span></span> <span class="w"><span
>>>>>>>> class="text">strangers</span></span> <span class="w"><span class="text">of
>>>>>>>> Rome</span></span>, <span class="w"><span class="text">Jews</span></span>
>>>>>>>> <span class="w"><span class="text">&nbsp;</span> </span><span
>>>>>>>> class="w"><span class="text">and</span></span> <span class="w"><span
>>>>>>>> class="text">proselytes</span></span>,</span> </div></div>"
>>>>>>>>
>>>>>>>> So somehow it lost a whole load on the way out of the XSLT? The only
>>>>>>>> difference is that the first one is on startup of the server, the second is
>>>>>>>> with a refresh in the browser. Perhaps something hasn't loaded up
>>>>>>>> correctly/entirely?
>>>>>>>>
>>>>>>>> Chris
>>>>>>>>
>>>>>>>> On 5 November 2010 23:10, Chris Burrell <chris at burrell.me.uk> wrote:
>>>>>>>>
>>>>>>>>> Thanks DM. So I found this page (again)!
>>>>>>>>> http://www.crosswire.org/~dmsmith/interlinear/
>>>>>>>>>
>>>>>>>>> And managed to replicate (and solve?) the issues I found originally
>>>>>>>>> when I looked at it before:
>>>>>>>>>
>>>>>>>>> First, some word stacks in the interlinear only have 1 line (i.e. no
>>>>>>>>> 2nd, 3rd or 4th line). As a result, when the text wraps, it floats below
>>>>>>>>> the first line. As a hack (although one could argue that there is an
>>>>>>>>> empty spot there, rather than nothing), I think we can put in a
>>>>>>>>> <span>&nbsp;</span>, or we could use a height maybe? (Not quite so good,
>>>>>>>>> unless we specify it in ems or exs.)
>>>>>>>>> The second thing is that within a particular word stack, the words might
>>>>>>>>> wrap. I believe this particular issue is only visible in IE. For IE 8, the
>>>>>>>>> fix is to add a "white-space: nowrap" CSS declaration. Not sure if that
>>>>>>>>> helps on IE6 and 7 though? The spec says it should be supported on both
>>>>>>>>> browsers.
>>>>>>>>>
>>>>>>>>> And yup, I'm targeting web environments, and also mobile web
>>>>>>>>> browsers.
>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 5 November 2010 20:09, DM Smith <dmsmith at crosswire.org> wrote:
>>>>>>>>>
>>>>>>>>>> I'm heading out for the weekend. In a few minutes.
>>>>>>>>>> It'll probably be Monday evening when I send it.
>>>>>>>>>>
>>>>>>>>>> The solution uses spans with their display set to block.
>>>>>>>>>>
>>>>>>>>>> -- DM
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 11/05/2010 03:55 PM, Chris Burrell wrote:
>>>>>>>>>>
>>>>>>>>>> DM, you said you might have an interlinear model that worked? I
>>>>>>>>>> had another look to see how I did mine previously, and found that in fact I
>>>>>>>>>> used tables. I think I struggled for quite a while to get a model working
>>>>>>>>>> across browsers using DIVs, but none of them seemed to wrap properly at the
>>>>>>>>>> end of the line. But unfortunately table layouts are slow, and therefore it
>>>>>>>>>> would be better to have divs.
>>>>>>>>>>
>>>>>>>>>> Would you be able to let me have your samples?
>>>>>>>>>> Chris
>>>>>>>>>>
>>>>>>>>>> On 5 November 2010 19:21, Chris Burrell <chris at burrell.me.uk> wrote:
>>>>>>>>>>
>>>>>>>>>>> What's GNT? Greek New Testament? I think we can do more than that
>>>>>>>>>>> too. If other Bible versions have Strong's numbers and/or morphology tags,
>>>>>>>>>>> then we can put those in parallel, and end up having French with English
>>>>>>>>>>> "subtitles", or English with English, as well as English with Greek, etc.
>>>>>>>>>>>
>>>>>>>>>>> So I've had a look at the framework so far and it seems fairly
>>>>>>>>>>> easy not to use Bible Desktop components and still have a good XSLT
>>>>>>>>>>> transformation. So all we would need to add is some helpers that users can
>>>>>>>>>>> easily integrate into their XSLTs. It would be nice to have some sample XSLs
>>>>>>>>>>> for people to use. So for example, I've had to strip out all the CSS and
>>>>>>>>>>> font tags from the Bible Desktop one so as to produce a good XHTML-compliant
>>>>>>>>>>> one.
>>>>>>>>>>>
>>>>>>>>>>> Say we give the XSLT an InterlinearProvider initialised with its
>>>>>>>>>>> version and passage. As it parses the strong/morph option we can then call
>>>>>>>>>>> get($provider, @strong, @morph), which would in turn optionally return the
>>>>>>>>>>> correct words (or the best word, since sometimes you may have multiple
>>>>>>>>>>> options in modules tagged with Strong's numbers only). In fact it would be
>>>>>>>>>>> better to have something like
>>>>>>>>>>> get($provider, osis_verse_id, @strong, @morph). Then,
>>>>>>>>>>> if we don't have the morphology of the word, at least we can limit the
>>>>>>>>>>> lookups to those words that are tagged in a particular verse (that assumes
>>>>>>>>>>> that versification is comparable between versions).
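
The lookup described above could be sketched roughly as follows. This is only
an illustration of the get(verse, strong, morph) idea with a fallback to
Strong's number alone within the same verse; the class name, the string key
format and the sample entries (transliterated words for Rom.2.9) are all
invented here and are not the STEP or JSword implementation.

```java
import java.util.HashMap;
import java.util.Map;

public class InterlinearLookup {

    // Keyed by "verse|strong|morph"; a second entry with an empty morph part
    // is kept per verse and Strong's number as the fallback.
    private final Map<String, String> words = new HashMap<>();

    private static String key(String verseId, String strong, String morph) {
        return verseId + "|" + strong + "|" + (morph == null ? "" : morph);
    }

    /** Registers a word of the target version under its verse, Strong's number and morphology. */
    public void add(String verseId, String strong, String morph, String word) {
        words.put(key(verseId, strong, morph), word);
        // First word seen for this verse and Strong's number becomes the "best word" fallback.
        words.putIfAbsent(key(verseId, strong, null), word);
    }

    /** Exact match on morphology if possible, else the verse-level Strong's match, else "". */
    public String get(String verseId, String strong, String morph) {
        String exact = words.get(key(verseId, strong, morph));
        if (exact != null) {
            return exact;
        }
        String byStrong = words.get(key(verseId, strong, null));
        return byStrong != null ? byStrong : "";
    }

    public static void main(String[] args) {
        InterlinearLookup provider = new InterlinearLookup();
        provider.add("Rom.2.9", "G1909", "PREP", "epi");   // illustrative sample data
        provider.add("Rom.2.9", "G2532", "CONJ", "kai");   // illustrative sample data
        System.out.println(provider.get("Rom.2.9", "G1909", "PREP"));
        System.out.println(provider.get("Rom.2.9", "G2532", null));
    }
}
```

Restricting the fallback to the same verse keeps a bare Strong's number from
matching a word elsewhere in the passage, which is the point made above about
limiting lookups to a particular verse.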
>>>>>>>>>>>
>>>>>>>>>>> We'll want to add options to have tagged information displayed on
>>>>>>>>>>> the side of a word/phrase or below a word/phrase. At the moment the XSLT
>>>>>>>>>>> displays morph and strong tags next to the text. I'll add some
>>>>>>>>>>> transformations to have it on separate lines. Then we can reuse the same
>>>>>>>>>>> transformations to line up text beneath it.
>>>>>>>>>>>
>>>>>>>>>>> DM, I had a look at "flying saucer", but didn't quite
>>>>>>>>>>> understand where it comes in? Would the idea be to use it instead of the
>>>>>>>>>>> XSLT, and have it transform to different UIs?
>>>>>>>>>>>
>>>>>>>>>>> Chris
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 5 November 2010 03:51, Tonny Kohar <tonny.kohar at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Nov 4, 2010 at 11:30 PM, DM Smith <
>>>>>>>>>>>> dmsmith at crosswire.org> wrote:
>>>>>>>>>>>> > Much of the transformations is done in BibleDesktop. Refactoring
>>>>>>>>>>>> > these and putting it into JSword and/or common would be good.
>>>>>>>>>>>> >
>>>>>>>>>>>>
>>>>>>>>>>>> +1
>>>>>>>>>>>> Yes, it would be nice to have this under JSword instead of
>>>>>>>>>>>> BibleDesktop
>>>>>>>>>>>>
>>>>>>>>>>>> Sincerely
>>>>>>>>>>>> Tonny Kohar
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> jsword-devel mailing list
>>>>>> jsword-devel at crosswire.org
>>>>>> http://www.crosswire.org/mailman/listinfo/jsword-devel
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>
>