[jsword-devel] XSLT and enrichment of OSIS Text...
DM Smith
dmsmith at crosswire.org
Sun Nov 21 06:19:20 MST 2010
On Nov 21, 2010, at 5:47 AM, Chris Burrell wrote:
> Thanks Steven. However, I believe some Books only have strong numbers in one testament? (perhaps I'm wrong here).
Sort of right. The Greek NTs only have them in the NT, if at all. All others either have it in both or in none at all. At least for the CrossWire modules. Can't vouch for other modules.
> And therefore, a user interface would only be able to provide an interlinear on the New Testament or the Old Testament. Same for morphology.
Ditto. But regarding morphology, we don't have an OT morphology (like we have Robinson's for the NT), so from a practical perspective, we don't have it for any OT.
> I believe a number of Books have the information in the New Testament, but not in the Old Testament?
>
> I take it from your comments before, DM, on another thread, that it's probably unlikely to be able to determine this very easily? unless one assumes that if a word is tagged with morphological information in the OT, then the whole of the OT is tagged. I assume that's not an assumption I can make...
>
> DM, should I create an issue in JIRA and attach some of the interlinear work I've been doing - it's not quite finished yet, but should be shortly.
Sure. It's fair to create a JIRA issue at any time. BTW, I've now got all JIRA for JSword and BibleDesktop going here.
> I'm not particularly happy with my storing/indexing mechanism, but can't think of a more efficient way at the moment. All it would be really, is an XSLT, with a few java objects that do the lookups.
That's OK. JIRA is a good place to iterate on patches. Don't bother deleting the old one; just upload the new one with the same name. A convention in Lucene, to which I have contributed, is to suffix the file with .patch. They also name the files after the JIRA issue, e.g. JS-104.patch.
> I was wondering, it would be nice to have something like Crucible. I'd be very happy to set it up for Crosswire. I've used it at work, and it is really very good. http://www.atlassian.com/software/crucible/tour/ I think that would make viewing patches, and so on particularly good :)
As long as it is free, or we can get a license for free. I'm the JIRA admin, so I probably need to do the install. Perhaps you can walk me through it, if needed. I can't get to it today. Not sure when I'll next have some free time. So bug me again later, if I don't get to it.
>
> Chris
>
> On 15 November 2010 13:16, Mullins, Steven (DMME) <Steven.Mullins at dmme.virginia.gov> wrote:
> Chris,
>
> Some have lexical forms instead of Strongs numbers. That is, they give a "dictionary" form for the inflected word in the text. These usually also give morphology information too.
>
> Steve
>
> From: Chris Burrell [mailto:chris at burrell.me.uk]
> Sent: Saturday, November 13, 2010 4:14 AM
>
> To: J-Sword Developers Mailing List
> Subject: Re: [jsword-devel] XSLT and enrichment of OSIS Text...
>
> Quick question, does anyone know of a way at the moment to determine whether Strongs/Morphology information is available in a "Book", but on a testament level. Am I correct in thinking some books will have Strongs only in the New Testament? Or others might have morphology information only in the New Testament?
>
> That would be useful for telling the user what the interlinear options are for any given passage he is looking at...
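
A minimal sketch of one way to probe for tagging, assuming the OSIS for a sample passage is already available as a string. The class and method names (TaggingProbe, hasStrongs, hasMorphology) are hypothetical and not part of JSword; only the JDK's DOM parser is used. It looks for <w> elements carrying lemma="strong:..." or morph="..." attributes, as in the KJV markup quoted later in this thread. A hit on a sample passage only shows that the sample is tagged; it does not prove that the whole testament is.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not a JSword class.
public class TaggingProbe {

    /** True if any <w> in the fragment has a lemma attribute containing "strong:". */
    public static boolean hasStrongs(String osisFragment) {
        return hasTag(osisFragment, "lemma", "strong:");
    }

    /** True if any <w> in the fragment has a non-empty morph attribute. */
    public static boolean hasMorphology(String osisFragment) {
        return hasTag(osisFragment, "morph", "");
    }

    private static boolean hasTag(String xml, String attribute, String marker) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList words = doc.getElementsByTagName("w");
            for (int i = 0; i < words.getLength(); i++) {
                // getAttribute returns "" when the attribute is absent.
                String value = ((Element) words.item(i)).getAttribute(attribute);
                if (!value.isEmpty() && value.contains(marker)) {
                    return true;
                }
            }
            return false;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse OSIS fragment", e);
        }
    }
}
```

Running the probe once on a passage from each testament would give a rough per-testament answer that a user interface could use to enable or disable the interlinear options.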
>
> Chris
>
>
> On 9 November 2010 22:29, Chris Burrell <chris at burrell.me.uk> wrote:
> I think I fixed the verse issue by having empty spans to make up for the height (bit of a hack). Another way might be to set the height of the verse to be 1em x the number of displayed lines. I think that helped to an extent in the past.
>
> One interesting thing, with KJV at least, is the punctuation and whitespace. (I've now updated this with some approximation of what I mean: http://crosswire.org/~chrisburrell - click the config button if you want to show in stacks, and you'll have to select kjv and the passage again!) You can see straight away there's a slight issue. And that is because the punctuation isn't included in the W elements. It's a text() node that has a verse as its direct parent. That makes it quite hard to do properly in XSLT. Ideally, I reckon, we should try and parse it into the previous child. How, not quite sure yet... Also look at Romans 1:4, you can see at least 4 spaces. That's because there are 4 strongs that have been tagged, but don't have any associated words. The question really is whether to show them at all!? They don't tell us much really, and almost suggest that these words didn't make it into the translation... So I'd be intrigued to see if you can amend your interlinear and still have the punctuation and spaces display correctly AND have the Ws spaced correctly when there is no punctuation.
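
One way to "parse it into the previous child" is to do it on the DOM before the XSLT runs. The following is only a sketch under that assumption: PunctuationFolder and its methods are hypothetical names, it uses only the JDK's DOM API, and it treats any leading run of characters that are neither letters, digits, nor whitespace as punctuation.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical pre-processing step, not part of JSword.
public class PunctuationFolder {

    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse fragment", e);
        }
    }

    /**
     * For every text node that is a direct child of the verse and directly
     * follows a <w> element, moves its leading punctuation into that <w>.
     */
    public static void foldPunctuation(Element verse) {
        Node child = verse.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // captured before any removal
            Node prev = child.getPreviousSibling();
            if (child.getNodeType() == Node.TEXT_NODE
                    && prev instanceof Element
                    && "w".equals(prev.getNodeName())) {
                String text = child.getNodeValue();
                int split = 0;
                while (split < text.length() && isPunctuation(text.charAt(split))) {
                    split++;
                }
                if (split > 0) {
                    prev.appendChild(
                            verse.getOwnerDocument().createTextNode(text.substring(0, split)));
                    if (split == text.length()) {
                        verse.removeChild(child); // nothing left but punctuation
                    } else {
                        child.setNodeValue(text.substring(split));
                    }
                }
            }
            child = next;
        }
    }

    private static boolean isPunctuation(char c) {
        return !Character.isLetterOrDigit(c) && !Character.isWhitespace(c);
    }
}
```

With this, "<w>Phrygia</w>, " becomes "<w>Phrygia,</w> ", so the comma travels with its word stack while the separating space stays between stacks. It does not address the untranslated Strong's entries in Romans 1:4.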
>
> As for the bug, this can be discussed on the other thread. This is the singleton issue that I believe is in the driver reading the passage, which means passages get mixed up. For me, it's definitely a concurrency issue, since my left and right panes, when set to the same book, loaded up and mixed up the passages between themselves.
>
> As for the parameter, I mean one that gets set in the Transformer before the xslt is processed. Let me share some real code.
>
> Top of the stylesheet before anything is declared:
>
> <xsl:param name="InterlinearProvider" /> <!-- passed in as a java object from code -->
> <xsl:variable name="interlinearProviderService" select="jsword:com.tyndalehouse.step.core.xsl.IPSample.new()" /> <!-- creating instance in stylesheet to show the problem -->
>
> <!-- this works and I can thereby confirm that InterlinearProvider is an object of the right type -->
> <xsl:variable name="interlinearWord" select="jsword:getWord($interlinearProviderService, $InterlinearProvider)"/>
> <xsl:value-of select="$interlinearWord"/>
>
> <!-- this doesn't work -->
> <xsl:variable name="interlinearWord" select="jsword:getWord($InterlinearProvider, $InterlinearProvider)"/>
> <xsl:value-of select="$interlinearWord"/>
>
> The function signature for the purpose of the test was getWord(Object o);
>
> But I've tried with getWord(): calling getWord($InterlinearProvider) doesn't work, calling getWord($interlinearProviderService) does work. I've also tried getWord(String,String), etc.
>
> So it seems it works for things that are initialised in the XSLT (interlinearProviderService), but not for those from outside, even though I can see that the stylesheet thinks of $InterlinearProvider as a java object, cos I can pass it in and have a look at its properties in debug mode.
>
> I've also tried copying the xsl:param to an xsl:variable first and calling getWord($copiedVariable), but that didn't work either.
>
> Does that make more sense?
> Chris
>
>
> On 9 November 2010 20:56, DM Smith <dmsmith at crosswire.org> wrote:
> On 11/09/2010 11:19 AM, Chris Burrell wrote:
>>
>> Trent/All
>>
>> I have successfully managed to call instance methods on variables set and initialised in the XSL. However, as soon as I change to a parameter that is passed in (using the same format jsword:getWord($interlinearObj, arg1, arg2) ) my stylesheet refuses to compile at runtime. I've also tried copying the "xsl:param" into an xsl:variable, but that doesn't work either.
>
> I'm not sure if you mean a parameter to the stylesheet. If so in the top of BD's xslt it copies the values into variables and uses them.
>
> If you mean a parameter to a template, I'm not sure what the problem would be. You might need to use some quoting magic. Can't remember off the top of my head how it is done. Maybe ${xyz}???
>
> If you mean that $interlinearObj is not a string or a number, then that is likely to be your problem.
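
The pattern DM describes, copying a stylesheet parameter into a variable, can be sketched with plain JAXP as follows. This is an illustration under stated assumptions rather than BibleDesktop's actual stylesheet: the parameter name InterlinearVersion and the class StylesheetParamDemo are hypothetical, and the parameter is deliberately a plain string, the case that is known to work. It says nothing about why passing an arbitrary Java object and invoking methods on it fails.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class; the stylesheet below is a toy, not BD's.
public class StylesheetParamDemo {

    private static final String XSL =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        // Top-level parameter with a default, set from Java before transforming.
        + "<xsl:param name='InterlinearVersion' select=\"'none'\"/>"
        // Copied into a variable, then used through the variable.
        + "<xsl:variable name='version' select='$InterlinearVersion'/>"
        + "<xsl:template match='/'>"
        + "<xsl:value-of select='$version'/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Runs the toy stylesheet; a null version leaves the stylesheet default in place. */
    public static String render(String version) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSL)));
            if (version != null) {
                transformer.setParameter("InterlinearVersion", version);
            }
            StringWriter out = new StringWriter();
            transformer.transform(
                    new StreamSource(new StringReader("<verse/>")),
                    new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }
}
```

Passing strings such as a version name and a passage reference, and letting the stylesheet construct its own helper from them, corresponds to the workaround of initialising the object inside the XSL.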
>
>
>>
>> However I am able to pass the parameter as an instance method parameter (so I could pass it to X that would be instantiated during the XSL transformation, and then invoke the method on it?). Or I could pass in parameters to initialise it in the XSL, as opposed to passing in the already-instantiated object into the XSL and trying to invoke a method on that...
>>
>> So no biggies... But would have been nicer to provide an object that the XSL just needs to use, rather than set up as well.
>> Any ideas?
>>
>> Chris
>>
>>
>> On 7 November 2010 00:04, Chris Burrell <chris at burrell.me.uk> wrote:
>> On this last note, I believe we have concurrency issues. I have a two column page, displaying one passage each. On load of the page they load up a passage each, but then this once, the passage on the right (only verse 1) has gone to the left (which was requesting just one verse, but from a different passage):
>>
>> left pane: requested Acts 2:10, got Romans 1:1
>> right pane: corrupt XML in verse 1, verse 2 seems to be Romans 1:2-following
>>
>> Anyone else come across those issues?
>> Chris
>>
>> On 6 November 2010 20:53, Chris Burrell <chris at burrell.me.uk> wrote:
>> Another question too. It seems sometimes, both in bible desktop and my current application, the html rendered is broken?
>>
>> Any ideas why that might be?
>>
>> For example, I get:
>> "<div class="passageText ui-widget"><div><h2 class="heading">Acts 2:10</h2><span class="verse"><span class="w"><sup class="verseNumber">10</sup></span><span class="w"><span class="text">emma="strong:G1909" morph="robinson:PREP" src="4">upon every soul of man that doeth evil, of the Jew first, and also of the Gentile;</span></span></span> </div></div>"
>>
>> The above in bold shows that it didn't get XSLTed properly.
>>
>> Instead of "<div class="passageText ui-widget"><div><h2 class="heading">Acts 2:10</h2><span class="verse"><sup class="verseNumber">10</sup><span class="w"><span class="text"> </span> </span><span class="w"><span class="text"> </span> </span><span class="w"><span class="text">Phrygia</span></span>, <span class="w"><span class="text"> </span> </span><span class="w"><span class="text">and</span></span> <span class="w"><span class="text">Pamphylia</span></span>, <span class="w"><span class="text">in Egypt</span></span>, <span class="w"><span class="text">and</span></span> <span class="w"><span class="text">in the parts</span></span> <span class="w"><span class="text">of Libya</span></span> <span class="w"><span class="text">about</span></span> <span class="w"><span class="text">Cyrene</span></span>, <span class="w"><span class="text">and</span></span> <span class="w"><span class="text">strangers</span></span> <span class="w"><span class="text">of Rome</span></span>, <span class="w"><span class="text">Jews</span></span> <span class="w"><span class="text"> </span> </span><span class="w"><span class="text">and</span></span> <span class="w"><span class="text">proselytes</span></span>,</span> </div></div>"
>>
>> So somehow it lost a whole load on the way out of the XSLT? The only difference is that the first one is on startup of the server, the second is with a refresh in the browser. Perhaps something hasn't loaded up correctly/entirely?
>>
>> Chris
>>
>> On 5 November 2010 23:10, Chris Burrell <chris at burrell.me.uk> wrote:
>> Thanks DM. So I found this page (again)! http://www.crosswire.org/~dmsmith/interlinear/
>>
>> And managed to replicate (and solve?) the issues I found originally when I looked at it before:
>>
>> 1st: when lines in the interlinear only have 1 line (i.e. no 2nd/3rd or 4th line), the text floats below the first line when it wraps. As a hack (although one could argue that there is an empty spot there, rather than nothing), I think we can put a <span> </span> or we could use a height maybe? (not quite so good, unless we specify in ems and exs). And the second thing is that within a particular word stack, the words might wrap. I believe this particular issue is only visible in IE. For IE 8, the fix is to add a white-space: nowrap CSS declaration. Not sure if that helps on IE6 and 7 though? Spec says it should be supported on both browsers.
>>
>> And yup, I'm targeting web environments, and also mobile web browsers.
>> Chris
>>
>>
>> On 5 November 2010 20:09, DM Smith <dmsmith at crosswire.org> wrote:
>> I'm heading out for the weekend. In a few minutes.
>> It'll probably be Monday evening when I send it.
>>
>> The solution uses spans with their display set to block.
>>
>> -- DM
>>
>>
>> On 11/05/2010 03:55 PM, Chris Burrell wrote:
>>>
>>> DM, you said you might have an interlinear model that worked? I had another look to see how I did mine previously, and found that in fact I used tables. I think I struggled for quite a while to get a model working across browsers using DIVs, but none of them seemed to wrap properly at the end of the line. But unfortunately table layouts are slow and therefore it would be better to have divs.
>>>
>>> Would you be able to let me have your samples?
>>> Chris
>>>
>>> On 5 November 2010 19:21, Chris Burrell <chris at burrell.me.uk> wrote:
>>> What's GNT? Greek New Testament? I think we can do more than that too. If other Bible versions have strong numbers and/or morphology tags, then we can put those in parallel, and end up having French with English "subtitles", or English with English, as well as English with Greek, etc.
>>>
>>> So I've had a look at the framework so far and it seems fairly easy not to use Bible Desktop components and have a good XSLT transformation. So all we would need to add is some helpers that users can easily integrate into their XSLTs. It would be nice to have some sample XSLs for people to use. So for example, I've had to strip out all the CSS and font tags from the Bible Desktop one so as to produce a good XHTML compliant one.
>>>
>>> Say we give the XSLT an InterlinearProvider initialised with its version and passage. As it parses the strong/morph option we can then call get($provider, @strong, @morph), which would in turn optionally return the correct words (or best word, since sometimes you may have multiple options in modules tagged with strong numbers only). In fact it would be better to have something like get($provider, osis_verse_id, @strong, @morph). Since then, if we don't have the morphology of the word, at least we can limit the lookups to those words that are tagged in a particular verse (that assumes that versification is comparable between versions).
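
The verse-scoped lookup proposed here could be sketched as below. Everything about the implementation is an assumption made for illustration: the class name InterlinearLookup, the add method, the in-memory map, and the "/" separator for multiple candidates are all hypothetical, and how the index would be populated from a module is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the proposed provider, not an existing JSword class.
public class InterlinearLookup {

    // verse id -> (strong number -> list of {word, morph} pairs tagged in that verse)
    private final Map<String, Map<String, List<String[]>>> byVerse = new HashMap<>();

    /** Registers a word of the interlinear version under its verse and Strong's number. */
    public void add(String osisVerseId, String strong, String morph, String word) {
        byVerse.computeIfAbsent(osisVerseId, k -> new HashMap<>())
               .computeIfAbsent(strong, k -> new ArrayList<>())
               .add(new String[] {word, morph});
    }

    /**
     * Returns the word(s) tagged with the Strong's number in the given verse.
     * If a morphology tag is supplied and matches, only the matching words are
     * returned; otherwise all candidates are returned, joined by "/".
     */
    public String get(String osisVerseId, String strong, String morph) {
        Map<String, List<String[]>> verse = byVerse.get(osisVerseId);
        if (verse == null) {
            return "";
        }
        List<String[]> candidates = verse.get(strong);
        if (candidates == null) {
            return "";
        }
        List<String> all = new ArrayList<>();
        List<String> exact = new ArrayList<>();
        for (String[] candidate : candidates) {
            all.add(candidate[0]);
            if (morph != null && morph.equals(candidate[1])) {
                exact.add(candidate[0]);
            }
        }
        return String.join("/", exact.isEmpty() ? all : exact);
    }
}
```

Keying on the verse first keeps the candidate list short when only Strong's numbers are available, which is the point made above; it inherits the same assumption that versification is comparable between versions.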
>>>
>>> We'll want to add options to have tagged information displayed on the side of a word/phrase or below a word/phrase. At the moment the XSLT displays morph and strong tags next to the text. I'll add some transformations to have it on separate lines. Then we can reuse the same transformations to line up text beneath it.
>>>
>>> DM, I had a look at "flying saucer" , but didn't quite understand where it comes in? Would the idea be instead of the XSLT? And have it transform to different UIs?
>>>
>>> Chris
>>>
>>>
>>> On 5 November 2010 03:51, Tonny Kohar <tonny.kohar at gmail.com> wrote:
>>> Hi,
>>>
>>> On Thu, Nov 4, 2010 at 11:30 PM, DM Smith <dmsmith at crosswire.org> wrote:
>>> > Much of the transformations is done in BibleDesktop. Refactoring these and
>>> > putting it into JSword and/or common would be good.
>>> >
>>>
>>> +1
>>> Yes, it would be nice to have this under JSword instead of BibleDesktop
>>>
>>> Sincerely
>>> Tonny Kohar
>
>
> _______________________________________________
> jsword-devel mailing list
> jsword-devel at crosswire.org
> http://www.crosswire.org/mailman/listinfo/jsword-devel