[jsword-devel] XSLT and enrichment of OSIS Text...

Chris Burrell chris at burrell.me.uk
Sun Nov 21 14:36:36 MST 2010


Created js-125 and attached some sample code...

Chris

On 21 November 2010 13:36, Chris Burrell <chris at burrell.me.uk> wrote:

> Thanks for the answers; I will load up the samples sometime this week. As
> for Crucible, it is a full-blown product rather than a JIRA plugin, but it
> integrates with JIRA, so we can set that up once it's installed.
> Licensing-wise it's the same model as the other Atlassian products: it's
> free for open-source development.
>
> Chris
>
>
>
> On 21 November 2010 13:19, DM Smith <dmsmith at crosswire.org> wrote:
>
>>
>> On Nov 21, 2010, at 5:47 AM, Chris Burrell wrote:
>>
>> Thanks Steven. However, I believe some Books only have strong numbers in
>> one testament? (perhaps I'm wrong here).
>>
>>
>> Sort of right. The Greek NTs only have them in the NT, if at all. All
>> others either have it in both or in none at all. At least for the CrossWire
>> modules. Can't vouch for other modules.
>>
>> And therefore, a user interface would only be able to provide an
>> interlinear on the New Testament or the Old Testament. Same for morphology.
>>
>> Ditto. But regarding morphology, we don't have an OT morphology (like we
>> have Robinson's for the NT), so from a practical perspective, we don't have
>> it for any OT.
>>
>> I believe a number of Books have the information in the New Testament, but
>> not in the Old Testament?
>>
>> I take it from your comments before, DM, on another thread, that it's
>> probably unlikely to be able to determine this very easily? unless one
>> assumes that if a word is tagged with morphological information in the OT,
>> then the whole of the OT is tagged. I assume that's not an assumption I can
>> make...
>>
>> DM, should I create an issue in JIRA and attach some of the interlinear
>> work I've been doing - it's not quite finished yet, but should be shortly.
>>
>>
>> Sure. It's fair to create a JIRA issue at any time. BTW, I've now got all
>> JIRA for JSword and BibleDesktop going here.
>>
>> I'm not particularly happy with my storing/indexing mechanism, but can't
>> think of a more efficient way at the moment. All it would be really, is an
>> XSLT, with a few java objects that do the lookups.
>>
>>
>> That's OK. Jira is a good place to iterate patches. Don't bother deleting
>> the old one; just upload the new one with the same name. A convention on
>> Lucene, to which I have contributed, is to suffix the file with .patch. They
>> also name the files after the Jira issue, e.g. JS-104.patch.
>>
>> I was wondering, it would be nice to have something like Crucible. I
>> would be very happy to set it up for CrossWire. I've used it at work, and it
>> is really very good: http://www.atlassian.com/software/crucible/tour/ I
>> think that would make viewing patches, and so on, much easier :)
>>
>>
>> As long as it is free, or we can get the license for free. I'm the JIRA
>> admin, so I probably need to do the install. Perhaps, you can walk me
>> through it, if needed. I can't get to it today. Not sure when I'll next have
>> some free time. So bug me again later, if I don't get to it.
>>
>>
>>
>> Chris
>>
>> On 15 November 2010 13:16, Mullins, Steven (DMME) <
>> Steven.Mullins at dmme.virginia.gov> wrote:
>>
>>>  Chris,
>>>
>>> Some have lexical forms instead of Strong's numbers.  That is, they give a
>>> "dictionary" form for the inflected word in the text.  These usually
>>> give morphology information too.
>>>
>>> Steve
>>>
>>>  ------------------------------
>>> *From:* Chris Burrell [mailto:chris at burrell.me.uk]
>>> *Sent:* Saturday, November 13, 2010 4:14 AM
>>>
>>> *To:* J-Sword Developers Mailing List
>>> *Subject:* Re: [jsword-devel] XSLT and enrichment of OSIS Text...
>>>
>>> Quick question, does anyone know of a way at the moment to determine
>>> whether Strongs/Morphology  information is available in a "Book", but on a
>>> testament level. Am I correct in thinking some books will have Strongs only
>>> in the New Testament? Or others might have morphology information only in
>>> the New Testament?
>>>
>>> That would be useful for telling the user what the interlinear options are
>>> for any given passage he is looking at...
>>>
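One rough way to probe the question above, as a sketch only: render a sample passage from each testament to OSIS and check whether any w element carries a Strong's lemma or a morphology code. The class and method names below are hypothetical (this is not a JSword API), and a sample only tells you about the verses sampled, not about the whole testament.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of JSword: inspects an OSIS fragment.
final class TaggingProbeSketch {

    // Returns true if any <w> element has the given attribute containing
    // the given marker, e.g. ("lemma", "strong:") or ("morph", "robinson:").
    static boolean anyWordHas(String osisXml, String attribute, String marker) {
        try {
            NodeList words = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(osisXml)))
                    .getElementsByTagName("w");
            for (int i = 0; i < words.getLength(); i++) {
                // getAttribute returns "" when the attribute is absent.
                String value = ((Element) words.item(i)).getAttribute(attribute);
                if (value.contains(marker)) {
                    return true;
                }
            }
            return false;
        } catch (Exception e) {
            // Parsing problems are rethrown unchecked to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }
}
```

Calling anyWordHas(sampleXml, "lemma", "strong:") on one sample per testament would give a first guess at which interlinear options to offer for a passage.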
>>> Chris
>>>
>>>
>>> On 9 November 2010 22:29, Chris Burrell <chris at burrell.me.uk> wrote:
>>>
>>>> I think I fixed the verse issue by having empty spans to make up for the
>>>> height (bit of a hack). Another way might be to set the height of the verse
>>>> to be 1em x the number of displayed lines. I think that helped to an extent
>>>> in the past.
>>>>
>>>> One interesting thing, with the KJV at least, is the punctuation and
>>>> whitespace. (I've now updated this with some approximation of what I mean:
>>>> http://crosswire.org/~chrisburrell - click the config button if you
>>>> want to show in stacks, and you'll have to select kjv and the passage
>>>> again!) You can see straight away there's a slight issue, and that is
>>>> because the punctuation isn't included in the W elements. It's a text()
>>>> node whose direct parent is the verse, which makes it quite hard to handle
>>>> properly in XSLT. Ideally, I reckon, we should try to fold it into the
>>>> previous child. How, I'm not quite sure yet...
>>>>
>>>> Also look at Romans 1:4: you can see at least 4 spaces. That's because
>>>> there are 4 Strong's numbers that have been tagged but don't have any
>>>> associated words. The question, really, is whether to show them at all!?
>>>> They don't tell us much, and almost suggest that these words didn't make
>>>> it into the translation... So I'd be intrigued to see if you can amend your
>>>> interlinear and still have the punctuation and spaces display correctly AND
>>>> have the Ws spaced correctly when there is no punctuation.
>>>>
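One possible way to fold the punctuation into the previous word is to pre-process the DOM in Java before the XSLT runs. This is only a sketch under assumptions: the class name is hypothetical, the punctuation set is a guess, and it assumes the w elements and the stray text are direct children of the verse element.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical pre-processing step, not part of JSword.
final class PunctuationMergeSketch {
    private static final String PUNCTUATION = ",.;:!?"; // assumed set

    static String merge(String verseXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(verseXml)));
            Element verse = doc.getDocumentElement();
            Node child = verse.getFirstChild();
            while (child != null) {
                Node next = child.getNextSibling(); // remember before we mutate
                Node prev = child.getPreviousSibling();
                boolean textAfterWord = child.getNodeType() == Node.TEXT_NODE
                        && prev != null
                        && prev.getNodeType() == Node.ELEMENT_NODE
                        && "w".equals(prev.getNodeName());
                if (textAfterWord) {
                    String text = child.getNodeValue();
                    int i = 0;
                    while (i < text.length() && PUNCTUATION.indexOf(text.charAt(i)) >= 0) {
                        i++;
                    }
                    if (i > 0) {
                        // Move the leading punctuation into the preceding <w>.
                        prev.appendChild(doc.createTextNode(text.substring(0, i)));
                        String rest = text.substring(i);
                        if (rest.isEmpty()) {
                            verse.removeChild(child);
                        } else {
                            child.setNodeValue(rest);
                        }
                    }
                }
                child = next;
            }
            // Serialise the modified DOM back to a string.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this, each word stack would carry its own trailing punctuation, and the whitespace left between stacks stays as plain text. Whether the empty W elements (tagged but with no words) should be kept is a separate decision that this sketch does not touch.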
>>>> As for the bug, this can be discussed on the other thread. This is the
>>>> singleton issue that I believe is in the driver reading the passage, which
>>>> means passages get mixed up. For me, it's definitely a concurrency issue,
>>>> since my left and right panes, when set to the same book, loaded up and
>>>> mixed up the passages between themselves.
>>>>
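To illustrate the kind of problem I suspect (the actual cause in the driver is not confirmed here), this is a sketch of a singleton-style reader with one shared buffer. The class is hypothetical and only mimics the pattern; serialising access with synchronized is one possible remedy, and giving each call its own buffer would be another.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a driver that keeps shared mutable state.
final class SharedDriverSketch {
    // One buffer shared by every caller, as a singleton would have.
    private final StringBuilder buffer = new StringBuilder();

    // Without 'synchronized', two threads could interleave their writes
    // to 'buffer' and each receive a mix of both passages.
    synchronized String read(String reference) {
        buffer.setLength(0);
        buffer.append("<verse osisID='").append(reference).append("'>");
        buffer.append("text of ").append(reference);
        buffer.append("</verse>");
        return buffer.toString();
    }

    // Two "panes" hammer the same driver; returns true if no passage got mixed.
    static boolean stressTest() {
        final SharedDriverSketch driver = new SharedDriverSketch();
        final AtomicBoolean ok = new AtomicBoolean(true);
        Thread left = new Thread(() -> check(driver, "Acts.2.10", ok));
        Thread right = new Thread(() -> check(driver, "Rom.1.1", ok));
        left.start();
        right.start();
        try {
            left.join();
            right.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return ok.get();
    }

    private static void check(SharedDriverSketch driver, String ref, AtomicBoolean ok) {
        String expected = "<verse osisID='" + ref + "'>text of " + ref + "</verse>";
        for (int i = 0; i < 1000; i++) {
            if (!driver.read(ref).equals(expected)) {
                ok.set(false);
            }
        }
    }
}
```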
>>>> As for the parameter, I mean one that gets set in the Transformer before
>>>> the xslt is processed. Let me share some real code.
>>>>
>>>> Top of the stylesheet before anything is declared:
>>>>
>>>> <!-- passed in as a Java object from code -->
>>>> <xsl:param name="InterlinearProvider" />
>>>> <!-- creating an instance in the stylesheet to show the problem -->
>>>> <xsl:variable name="interlinearProviderService"
>>>>     select="jsword:com.tyndalehouse.step.core.xsl.IPSample.new()" />
>>>>
>>>> <!-- this works and I can thereby confirm that InterlinearProvider is
>>>>      an object of the right type -->
>>>> <xsl:variable name="interlinearWord"
>>>>     select="jsword:getWord($interlinearProviderService, $InterlinearProvider)"/>
>>>> <xsl:value-of select="$interlinearWord"/>
>>>>
>>>> <!-- this doesn't work -->
>>>> <xsl:variable name="interlinearWord"
>>>>     select="jsword:getWord($InterlinearProvider, $InterlinearProvider)"/>
>>>> <xsl:value-of select="$interlinearWord"/>
>>>>
>>>> The function signature for the purpose of the test was getWord(Object
>>>> o);
>>>>
>>>> But I've also tried with getWord(): calling getWord($InterlinearProvider)
>>>> doesn't work, while calling getWord($interlinearProviderService) does. I've
>>>> also tried getWord(String,String), etc.
>>>>
>>>> So it seems it works for things that are initialised in the XSLT
>>>> (interlinearProviderService), but not for those passed in from outside,
>>>> even though I can see that the stylesheet thinks of $InterlinearProvider
>>>> as a Java object, because I can pass it in and have a look at its
>>>> properties in debug mode.
>>>>
>>>> I've also tried copying the xsl:param to an xsl:variable first and calling
>>>> getWord($copiedVariable), but that didn't work either.
>>>>
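For reference, this is roughly what "set in the Transformer before the xslt is processed" means on the Java side, using the standard JAXP call Transformer.setParameter(String, Object). It is a minimal sketch, not the real code: the class name and the inline stylesheet are made up for illustration, and it passes a plain String because whether an arbitrary Java object can then be used as the target of an extension-function call depends on the XSLT processor.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class; only the JAXP calls are real API.
final class StylesheetParamSketch {
    // Minimal stylesheet: declares a top-level parameter and prints its value.
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:param name='InterlinearProvider'/>"
      + "<xsl:template match='/'>"
      + "<xsl:value-of select='$InterlinearProvider'/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String render(Object paramValue) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            // The parameter must be set before transform() is called.
            transformer.setParameter("InterlinearProvider", paramValue);
            StringWriter out = new StringWriter();
            transformer.transform(
                    new StreamSource(new StringReader("<verse/>")),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In the real case the value handed to setParameter would be the provider object rather than a String, which is exactly the part that is failing for me.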
>>>> Does that make more sense?
>>>> Chris
>>>>
>>>>
>>>>   On 9 November 2010 20:56, DM Smith <dmsmith at crosswire.org> wrote:
>>>>
>>>>>   On 11/09/2010 11:19 AM, Chris Burrell wrote:
>>>>>
>>>>> Trent/All
>>>>>
>>>>> I have successfully managed to call instance methods on variables set
>>>>> and initialised in the XSL. However, as soon as I change to a parameter that
>>>>> is passed in (using the same format jsword:getWord($interlinearObj, arg1,
>>>>> arg2) ) my stylesheet refuses to compile at runtime. I've also tried copying
>>>>> the xsl:param into an xsl:variable, but that doesn't work either.
>>>>>
>>>>>
>>>>> I'm not sure if you mean a parameter to the stylesheet. If so in the
>>>>> top of BD's xslt it copies the values into variables and uses them.
>>>>>
>>>>> If you mean a parameter to a template, I'm not sure what the problem
>>>>> would be. You might need to use some quoting magic. Can't remember off the
>>>>> top of my head how it is done. Maybe ${xyz}???
>>>>>
>>>>> If you mean that $interlinearObj is not a string or a number, then that
>>>>> is likely to be your problem.
>>>>>
>>>>>
>>>>>
>>>>> However I am able to pass the parameter as an instance method parameter
>>>>> (so I could pass it to X that would be instantiated during the XSL
>>>>> transformation, and then invoke the method on it?). Or I could pass in
>>>>> parameters to initialise it in the XSL, as opposed to passing in the
>>>>> already-instantiated object into the XSL and trying to invoke a method on
>>>>> that...
>>>>>
>>>>> So no biggies... But would have been nicer to provide an object that
>>>>> the XSL just needs to use, rather than set up as well.
>>>>> Any ideas?
>>>>>
>>>>> Chris
>>>>>
>>>>>
>>>>> On 7 November 2010 00:04, Chris Burrell <chris at burrell.me.uk> wrote:
>>>>>
>>>>>> On this last note, I believe we have concurrency issues. I have a
>>>>>> two-column page, displaying one passage in each column. On page load
>>>>>> each column loads its passage, but this once verse 1 of the right-hand
>>>>>> passage went to the left pane (which was requesting just one verse, but
>>>>>> from a different passage):
>>>>>>
>>>>>> left pane: requested Acts 2:10, got Romans 1:1
>>>>>> right pane: corrupt XML in verse 1, verse 2 seems to be Romans
>>>>>> 1:2-following
>>>>>>
>>>>>> Anyone else come across those issues?
>>>>>> Chris
>>>>>>
>>>>>> On 6 November 2010 20:53, Chris Burrell <chris at burrell.me.uk> wrote:
>>>>>>
>>>>>>> Another question too. It seems sometimes, both in bible desktop and
>>>>>>> my current application, the html rendered is broken?
>>>>>>>
>>>>>>> Any ideas why that might be?
>>>>>>>
>>>>>>> For example, I get:
>>>>>>> "<div class="passageText ui-widget"><div><h2 class="heading">Acts
>>>>>>> 2:10</h2><span class="verse"><span class="w"><sup
>>>>>>> class="verseNumber">10</sup></span><span class="w"*><span
>>>>>>> class="text">emma="strong:G1909" morph="robinson:PREP*"
>>>>>>> src="4"&gt;upon every soul of man that doeth evil, of the Jew first, and
>>>>>>> also of the Gentile;</span></span></span> </div></div>"
>>>>>>>
>>>>>>> The above in bold shows that it didn't get XSLTed properly.
>>>>>>>
>>>>>>> Instead of "<div class="passageText ui-widget"><div><h2
>>>>>>> class="heading">Acts 2:10</h2><span class="verse"><sup
>>>>>>> class="verseNumber">10</sup><span class="w"><span class="text">&nbsp;</span>
>>>>>>> </span><span class="w"><span class="text">&nbsp;</span> </span><span
>>>>>>> class="w"><span class="text">Phrygia</span></span>, <span class="w"><span
>>>>>>> class="text">&nbsp;</span> </span><span class="w"><span
>>>>>>> class="text">and</span></span> <span class="w"><span
>>>>>>> class="text">Pamphylia</span></span>, <span class="w"><span class="text">in
>>>>>>> Egypt</span></span>, <span class="w"><span class="text">and</span></span>
>>>>>>> <span class="w"><span class="text">in the parts</span></span> <span
>>>>>>> class="w"><span class="text">of Libya</span></span> <span class="w"><span
>>>>>>> class="text">about</span></span> <span class="w"><span
>>>>>>> class="text">Cyrene</span></span>, <span class="w"><span
>>>>>>> class="text">and</span></span> <span class="w"><span
>>>>>>> class="text">strangers</span></span> <span class="w"><span class="text">of
>>>>>>> Rome</span></span>, <span class="w"><span class="text">Jews</span></span>
>>>>>>> <span class="w"><span class="text">&nbsp;</span> </span><span
>>>>>>> class="w"><span class="text">and</span></span> <span class="w"><span
>>>>>>> class="text">proselytes</span></span>,</span> </div></div>"
>>>>>>>
>>>>>>> So somehow it lost a whole load on the way out of the XSLT? The only
>>>>>>> difference is that the first one is on startup of the server, the second is
>>>>>>> with a refresh in the browser. Perhaps something hasn't loaded up
>>>>>>> correctly/entirely?
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 5 November 2010 23:10, Chris Burrell <chris at burrell.me.uk> wrote:
>>>>>>>
>>>>>>>> Thanks DM. So I found this page (again)!
>>>>>>>> http://www.crosswire.org/~dmsmith/interlinear/
>>>>>>>>
>>>>>>>> And managed to replicate (and solve?) the issues I found originally
>>>>>>>> when I looked at it before:
>>>>>>>>
>>>>>>>> First, some entries in the interlinear have only 1 line (i.e. no 2nd,
>>>>>>>> 3rd or 4th line). As a result, when the text wraps, it floats below the
>>>>>>>> first line. As a hack (although one could argue that there is an empty
>>>>>>>> spot there, rather than nothing), I think we can put a
>>>>>>>> <span>&nbsp;</span>, or we could use a height maybe? (not quite so good,
>>>>>>>> unless we specify it in ems and exs).
>>>>>>>> The second thing is that within a particular word stack, the words might
>>>>>>>> wrap. I believe this particular issue is only visible in IE. For IE 8, the
>>>>>>>> fix is to add a white-space: nowrap CSS declaration. Not sure if that
>>>>>>>> helps on IE6 and 7 though? Spec says it should be supported on both
>>>>>>>> browsers.
>>>>>>>>
>>>>>>>> And yup, I'm targeting web environments, and also web mobile
>>>>>>>> browsers.
>>>>>>>> Chris
>>>>>>>>
>>>>>>>>
>>>>>>>> On 5 November 2010 20:09, DM Smith <dmsmith at crosswire.org> wrote:
>>>>>>>>
>>>>>>>>> I'm heading out for the weekend. In a few minutes.
>>>>>>>>> It'll probably be Monday evening when I send it.
>>>>>>>>>
>>>>>>>>> The solution uses spans with their display set to block.
>>>>>>>>>
>>>>>>>>> -- DM
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 11/05/2010 03:55 PM, Chris Burrell wrote:
>>>>>>>>>
>>>>>>>>> DM, you said you might have an interlinear model that worked? I
>>>>>>>>> had another look to see how I did mine previously, and found that in fact I
>>>>>>>>> used tables. I think I struggled for quite a while to get a model working
>>>>>>>>> across browsers using DIVs, but none of them seemed to wrap properly at the
>>>>>>>>> end of the line.  But unfortunately table layouts are slow and therefore it
>>>>>>>>> would be better to have divs.
>>>>>>>>>
>>>>>>>>> Would you be able to let me have your samples?
>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>> On 5 November 2010 19:21, Chris Burrell <chris at burrell.me.uk> wrote:
>>>>>>>>>
>>>>>>>>>> What's GNT? Greek New Testament? I think we can do more than that
>>>>>>>>>> too. If other Bible versions have strong numbers and/or morphology tags,
>>>>>>>>>> then we can put those in parallel, and end up having French with English
>>>>>>>>>> "subtitles", or English with English, as well as English with Greek, etc.
>>>>>>>>>>
>>>>>>>>>> So I've had a look at the framework so far and it seems fairly
>>>>>>>>>> easy not to use Bible Desktop components and have a good XSLT
>>>>>>>>>> transformation. So all we would need to add is some helpers that users can
>>>>>>>>>> easily integrate into their XSLTs. It would be nice to have some sample XSLs
>>>>>>>>>> for people to use. So for example, I've had to strip out all the CSS and
>>>>>>>>>> font tags from the Bible Desktop one so as to produce a good XHTML compliant
>>>>>>>>>> one.
>>>>>>>>>>
>>>>>>>>>> Say we give the XSLT an InterlinearProvider initialised with its
>>>>>>>>>> version and passage. As it parses the strong/morph option we can then
>>>>>>>>>> call get($provider, @strong, @morph), which would in turn optionally
>>>>>>>>>> return the correct words (or the best word, since sometimes you may have
>>>>>>>>>> multiple options in modules tagged with Strong's numbers only). In fact
>>>>>>>>>> it would be better to have something like
>>>>>>>>>> get($provider, osis_verse_id, @strong, @morph). Then, if we don't have
>>>>>>>>>> the morphology of the word, at least we can limit the lookups to those
>>>>>>>>>> words that are tagged in a particular verse (that assumes that
>>>>>>>>>> versification is comparable between versions).
>>>>>>>>>>
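As a sketch of the lookup described above, assuming the provider is pre-loaded with the tagged words of the second version: the class name, the key format and the fallback to "first word with the same Strong's number in the same verse" are all assumptions for illustration, not existing JSword code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical provider; illustrates the verse-scoped lookup only.
final class InterlinearLookupSketch {
    // Key: osisId|strong|morph, for an exact match.
    private final Map<String, String> byVerseStrongMorph = new HashMap<>();
    // Key: osisId|strong, used when the morphology is unknown or differs.
    private final Map<String, String> byVerseStrong = new HashMap<>();

    // Registers one tagged word from the version shown in the second line.
    void add(String osisId, String strong, String morph, String word) {
        byVerseStrongMorph.put(osisId + "|" + strong + "|" + morph, word);
        // Keep the first word seen for this Strong's number in the verse.
        byVerseStrong.putIfAbsent(osisId + "|" + strong, word);
    }

    // Returns the best matching word, or "" so the stylesheet can render
    // an empty slot.
    String get(String osisId, String strong, String morph) {
        String exact = byVerseStrongMorph.get(osisId + "|" + strong + "|" + morph);
        if (exact != null) {
            return exact;
        }
        String sameVerse = byVerseStrong.get(osisId + "|" + strong);
        return sameVerse != null ? sameVerse : "";
    }
}
```

Scoping the keys by verse keeps the fallback from picking up the same Strong's number from elsewhere in the passage, which is the point of passing osis_verse_id in.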
>>>>>>>>>> We'll want to add options to have tagged information displayed on
>>>>>>>>>> the side of a word/phrase or below a word/phrase. At the moment the XSLT
>>>>>>>>>> displays morph and strong tags next to the text. I'll add some
>>>>>>>>>> transformations to have it on separate lines. Then we can reuse the same
>>>>>>>>>> transformations to line up text beneath it.
>>>>>>>>>>
>>>>>>>>>> DM, I had a look at "flying saucer" , but didn't quite understand
>>>>>>>>>> where it comes in? Would the idea be instead of the XSLT? And have it
>>>>>>>>>> transform to different UIs?
>>>>>>>>>>
>>>>>>>>>> Chris
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 5 November 2010 03:51, Tonny Kohar <tonny.kohar at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Nov 4, 2010 at 11:30 PM, DM Smith <dmsmith at crosswire.org>
>>>>>>>>>>> wrote:
>>>>>>>>>>> > Much of the transformations is done in BibleDesktop. Refactoring
>>>>>>>>>>> > these and putting it into JSword and/or common would be good.
>>>>>>>>>>> >
>>>>>>>>>>>
>>>>>>>>>>> +1
>>>>>>>>>>> Yes, it would be nice to have this under JSword instead of
>>>>>>>>>>> BibleDesktop
>>>>>>>>>>>
>>>>>>>>>>> Sincerely
>>>>>>>>>>> Tonny Kohar
>>>>>>>>>>>
>>>>>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> jsword-devel mailing list
>>>>> jsword-devel at crosswire.org
>>>>> http://www.crosswire.org/mailman/listinfo/jsword-devel
>>>>>
>>>>>
>>>>
>>>
>>
>>
>>
>