<html><head><meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">The other fixer-upper was tag-soup.<div><br><div style=""><div>On Mar 18, 2014, at 8:22 PM, DM Smith &lt;<a href="mailto:dmsmith@crosswire.org">dmsmith@crosswire.org</a>&gt; wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><div><br class="Apple-interchange-newline">On Mar 18, 2014, at 5:02 PM, Chris Burrell &lt;<a href="mailto:christopher@burrell.me.uk">christopher@burrell.me.uk</a>&gt; wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div dir="ltr">Yup - so I was looking at the code tonight.<div><br></div><div>I don't think the problem is quite as bad/hard to fix as you make it sound.</div><div><br></div><div>I think there are two types of issues</div><div>- a verse on its own not producing correct XML</div><div>- a bunch of XML together not producing well-nested XML</div><div><br></div><div>Not sure how to solve the second, but the easy (?) solution on the second one is to amalgamate all the raw text first before parsing it. Now that we pass the whole passage down one more level, it shouldn't be too difficult to do that?</div></div></blockquote><div><br></div>Amalgamation may minimize the problem, especially if we are displaying a chapter at a time. But if we display a search results list, a parallel display or an arbitrary passage chosen by the user, then it may still exhibit problems. 
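<div><br></div><div>To make the single-verse versus amalgamated cases concrete, here is a rough sketch in Python. It is not JSword code: the function name, the event tuples and the two sample verses are all made up for illustration, and it assumes the markup has already been reduced to start/end tag events and that the module as a whole is well-formed.</div>
<pre>
```python
def unmatched_tags(events):
    """Report tags that are unbalanced within one fragment.

    events is a list of ("start", name) or ("end", name) pairs in document
    order. Assumes the module as a whole is well-formed, so any end tag that
    does not match the innermost open element is treated as closing
    something that was opened before this fragment.
    """
    open_stack = []   # start tags seen here and not yet closed
    unopened = []     # end tags whose start lies before this fragment
    for kind, name in events:
        if kind == "start":
            open_stack.append(name)
        elif open_stack and open_stack[-1] == name:
            open_stack.pop()
        else:
            unopened.append(name)
    return unopened, open_stack


# Hypothetical table spanning two verses, reduced to tag events.
verse_1 = [("start", "table"), ("start", "row"), ("start", "cell"), ("end", "cell")]
verse_2 = [("start", "cell"), ("end", "cell"), ("end", "row"), ("end", "table")]

print(unmatched_tags(verse_1))            # verse 1 on its own
print(unmatched_tags(verse_2))            # verse 2 on its own
print(unmatched_tags(verse_1 + verse_2))  # both verses amalgamated
```
</pre>
<div>Either verse alone reports unmatched tags, which is the situation a search results list or a parallel display runs into. The two together balance, which is why amalgamating a whole chapter tends to hide the problem.</div>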
</div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br></div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">The problem is still the same.</div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br></div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">We've also talked about expanding the context of what fails by grabbing an adjacent verse and adding it to the amalgamation and re-parsing.</div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br></div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 
auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">It is not hard to write a parser. That's essentially what we have with the ThML parser. Such a parser could know when it sees an unmatched start or end tag. Presuming that the module is valid and well-formed as a whole, we can either prefix or append the missing tag to the result. This would "solve" the problem. (The ThML parser does not do that.)</div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br><blockquote type="cite"><div dir="ltr"><div><br></div><div>On the second, there may be some nice XML parsers that fix stuff up more gracefully as well...</div></div></blockquote><div><br></div>By definition an XML parser must fail on bad input. I've not seen any that fix up broken XML. Every year I do a survey of available parsers, not just XML ones, to see if there is something that might help. One that caught my eye: JTidy.</div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br></div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">JTidy understands the XHTML spec and can take badly formed HTML and clean it up. 
I was trying to figure out if I could re-write it for another schema, or to take a schema and generate a cleanup technique. It was more complicated than I was willing to get into.</div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br></div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">DM</div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br><blockquote type="cite"><div dir="ltr"><div><br></div><div>Chris</div><div><br><br><div class="gmail_quote">---------- Forwarded message ----------<br>From:<span class="Apple-converted-space"> </span><b class="gmail_sendername">DM Smith</b><span class="Apple-converted-space"> </span><span dir="ltr"><<a href="mailto:dmsmith@crosswire.org">dmsmith@crosswire.org</a>></span><br>Date: 18 March 2014 20:45<br>Subject: Re: [sword-devel] Tables across verse boundaries<br>To:<span class="Apple-converted-space"> </span><a href="mailto:christopher@burrell.me.uk">christopher@burrell.me.uk</a>, SWORD Developers' Collaboration Forum <<a href="mailto:sword-devel@crosswire.org">sword-devel@crosswire.org</a>><br><br><br><div style="word-wrap: break-word;"><div><div class=""><div>On Mar 18, 2014, at 3:29 PM, Chris Burrell <<a 
href="mailto:christopher@burrell.me.uk" target="_blank">christopher@burrell.me.uk</a>> wrote:</div><br><blockquote type="cite"><div dir="ltr">Hi DM<div><br></div><div>1- You're right, it was my mistake about tables across verses. Ezra 1 would be an example where you have 3 rows per verse, and a table over two verses.</div></div></blockquote></div>No problem. It's hard to debug a problem where the text is made up.</div><div><div class=""><br><blockquote type="cite"><div dir="ltr"><div><br></div><div>2- My issue with the markup and having the verse number inside the cell was that I got a 'nesting' warning by mod2osis. Is that something I just ignore? (i.e. "verse sID" in the first cell with "verse eID" in the second cell)</div></div></blockquote><div><br></div></div>The nesting warnings are relatively benign. They indicate that the verse in isolation is not well-formed XML and that when displayed in certain contexts it will have problems.</div><div><br></div><div>That the verse sID is in one cell and the verse eID is in another is not by itself a problem. It is more a question of whether the raw data from the module is a well-formed fragment.<div class=""><br><div></div><br><blockquote type="cite"><div dir="ltr"><div><br></div><div>3- I had another look at the output, and the module does in fact have the table in it. It looks like it wrapped it into verse 8, as expected. So it seems that maybe this is an issue specific to JSword?</div></div></blockquote><div><br></div></div>It is a particularly bad problem with JSword. JSword passes the verse raw data to an XML parser to create an XML fragment, which fails when the data is not well-formed. When the exception is caught, we then strip all markup out of the raw data and re-parse it. </div><div>This is particular to JSword.</div><div><br></div><div>However, when the verse is shown in isolation by any SWORD frontend or in a table cell, it most likely will not display as intended. It's just that JSword does one worse. 
If we wish to discuss JSword's shortcoming more, we should do that on jsword-devel or create an issue for it (if there isn't already one, as we have talked about this problem in the long past.)</div><div><div class="h5"><div><br></div><div><blockquote type="cite"><div dir="ltr"><div><br></div><div>Chris</div><div><br></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On 18 March 2014 13:50, Jonathan Morgan<span class="Apple-converted-space"> </span><span dir="ltr"><<a href="mailto:jonmmorgan@gmail.com" target="_blank">jonmmorgan@gmail.com</a>></span><span class="Apple-converted-space"> </span>wrote:<br><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;"><div dir="ltr">Hi DM,<br><div class="gmail_extra"><br><div class="gmail_quote"><div>On Tue, Mar 18, 2014 at 12:01 PM, DM Smith<span class="Apple-converted-space"> </span><span dir="ltr"><<a href="mailto:dmsmith@crosswire.org" target="_blank">dmsmith@crosswire.org</a>></span><span class="Apple-converted-space"> </span>wrote:<br><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;"><div>On Mar 17, 2014, at 1:07 PM, Chris Burrell <<a href="mailto:chris@burrell.me.uk" target="_blank">chris@burrell.me.uk</a>> wrote:<br><br>> Hello<br>><br>> I'm looking at converting a module that has tables across verse boundaries... Is this supported?<br><br></div>It should be. At least by osis2mod. I don't know if SWORD renderers have code for tables. I'll leave that for someone else to answer. JSword probably will choke on tables. 
I'll go into that in a bit.<br></blockquote><div><br></div></div><div>Last time we discussed OSIS tables they weren't supported by the SWORD renderers.<br>I don't think anything has changed.<br><br>Jon<span class="Apple-converted-space"> </span><br></div><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;"><div><div><br>> I'm using the sword utilities to convert the module, however, I'm seeing that the 'table' element is getting dropped?<br><br></div>I'm presuming that you are using osis2mod. osis2mod should not drop anything. To verify what osis2mod creates I recommend creating a raw module (that is, use no compression flags) and using the -d 2 flag. This will put milestones for the start and end of the verses into the module. Then you can use a text editor (stay away from Notepad as the line endings may not be Windows-friendly) to look at the file and search for the constructs.<br><div><br>> (both using mod2imp to check,<br><br></div>Using mod2imp is also useful because it marks each index entry with the verse slot name. But it may not be necessary if the raw file gives what you wish.<br><div><br>> as well as using JSword).<br><br></div>JSword has some problems going to OSIS. It assumes that each verse is well-formed XML. If it is not, it strips all XML, leaving text (with notes inline).<br><br>This is a fairly safe assumption, but tables will probably make that fail.<br><br>This assumption is something that all SWORD/JSword frontends make at some point. Two examples:<br>Search results lists that show verse content as well as references.<br>Stacked or side-by-side parallel display.<br><div><br>><br>> If this is supported, does someone have some example mark-up that I could use as a starting point?<br><br></div>I'm trying to understand where in a Bible a table would be useful. I can see it in an introduction. But spanning verses? No way. 
There is no tabular data in the Bible. (Please correct me if I'm wrong!)<br><br>I have seen people use tables to control rendering. If this is what is being done, someone needs guidance.<br><br>In a commentary, which is indexed by verse numbers, anything could happen.<br><br>Regarding sample markup, it is analogous to simple HTML tables, but other than &lt;table&gt; the element names are different.<br>The &lt;table&gt; element can be wholly contained within:<br>&lt;div&gt;<br>&lt;chapter&gt;<br>&lt;speech&gt;<br>&lt;note&gt;<br>&lt;cell&gt;<br>&lt;p&gt;<br>Nothing else can be a parent to &lt;table&gt;.<br><br>A table has a few attributes: cols and rows to give dimensions; canonical to indicate whether it contains canonical material; and the standard OSIS attributes.<br>It can contain a &lt;head&gt; and also &lt;row&gt; elements. Both are optional, but it doesn't make sense to have a table without rows.<br><br>I'm not clear on the purpose of head. It can contain much of the same content as a verse.<br><br>The &lt;row&gt; element can only contain &lt;cell&gt; elements and it has a role attribute that can have a value of label or data. It also has a canonical attribute and the standard OSIS attributes.<br><br>The &lt;cell&gt; element can contain pretty much anything that a &lt;div&gt; or a &lt;chapter&gt; can contain except &lt;div&gt; and &lt;chapter&gt;. It also has the same role attribute, but defaults to data. It also has an align attribute with a value from left, right, center, justify, start and end. And of course it has canonical and standard OSIS attributes.<br><br>Since a table cannot be milestoned, the element it is contained within also cannot be milestoned. 
The manual states that for any given element you can choose to use the milestoned version or the container version but not both in the same document.<br><br>I guess a verse can be split across multiple cells and even rows by using the milestoned version of a verse.<br><br>If a &lt;table&gt; only has a single column, a &lt;list&gt; may be a better container.<br><br>Hope this helps.<br><br>Together in His Service,<br> DM<br><br><br></div><div>_______________________________________________<br>sword-devel mailing list:<span class="Apple-converted-space"> </span><a href="mailto:sword-devel@crosswire.org" target="_blank">sword-devel@crosswire.org</a><br><a href="http://www.crosswire.org/mailman/listinfo/sword-devel" target="_blank">http://www.crosswire.org/mailman/listinfo/sword-devel</a><br>Instructions to unsubscribe/change your settings at above page<br></div></blockquote></div><br></div></div><br></blockquote></div><br></div></blockquote></div><br></div></div></div></div><br></div></div>_______________________________________________<br>jsword-devel mailing list<br><a 
href="mailto:jsword-devel@crosswire.org">jsword-devel@crosswire.org</a><br><a href="http://www.crosswire.org/mailman/listinfo/jsword-devel">http://www.crosswire.org/mailman/listinfo/jsword-devel</a><br></blockquote></div></blockquote></div><br></div></body></html>