[jsword-devel] [sword-devel] Log error suggest a module issue?

Chris Burrell chris at burrell.me.uk
Tue Apr 16 13:27:04 MST 2013


Isn't another solution to change the modules, going forward, so that they
use well-formed XML? I guess I'm not understanding...


On 16 April 2013 12:55, DM Smith <dmsmith at crosswire.org> wrote:

> Chris,
>
> There is no issue with UKJV, per se.
>
> osis2mod preserves all module markup, perhaps transformed, except the
> <verse> element. Earlier versions of osis2mod did not transform the
> <chapter> element to its milestoned version.
>
> This should be considered JSword's problem to deal with, which is what
> JSword is doing. Whenever JSword encounters a "verse" (in this case verse
> 0) it uses an XML parser to convert the text into a DOM. All XML parsers
> require well-formed XML and are required to fail otherwise. When the
> text violates that assumption, JSword reports the error and then strips
> the XML markup from the text.
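
A minimal sketch of the parse-then-fall-back behaviour described above, using only the JDK's built-in DOM parser. This is not JSword's actual code: the class name VerseFragmentSketch, the <div> wrapper, the regex used to strip tags, and the sample fragments are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration only; not taken from the JSword source.
class VerseFragmentSketch {

    /** Strictly parses one fragment; returns null if it is not well-formed. */
    static Document parseStrict(String fragment) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            // Wrap the fragment in a single root element (an assumption of this sketch).
            String wrapped = "<div>" + fragment + "</div>";
            return builder.parse(new InputSource(new StringReader(wrapped)));
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return null;
        }
    }

    /** Returns the text of a fragment, stripping markup crudely if parsing fails. */
    static String toPlainText(String fragment) {
        Document doc = parseStrict(fragment);
        if (doc != null) {
            return doc.getDocumentElement().getTextContent();
        }
        // Fallback: drop anything that looks like a tag. Deliberately crude.
        return fragment.replaceAll("<[^>]*>", "");
    }

    public static void main(String[] args) {
        // Made-up fragment with a <chapter> start tag that is never closed.
        String broken = "<chapter osisID=\"Exod.21\">Some heading text";
        System.out.println(parseStrict(broken) == null);
        System.out.println(toPlainText(broken));
    }
}
```

With the made-up fragment above, the strict parse fails because <chapter> is never closed, so the sketch falls back to returning the text with the tag removed.
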
>
> We have an open issue to do a better job with the handling of broken xml.
>
> There are a couple of possible improvements:
> Gather the text to display and convert all of it, instead of converting
> each verse one at a time. This recognizes that a tag opened in one verse
> may be closed in another. However, it does not work for a "verse in
> isolation" (search results, lookups, parallel viewing, ...).
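
To illustrate why converting the gathered text can succeed where verse-by-verse conversion fails, here is a small sketch with two made-up verse fragments that share one <q> element. The class name PassageParseSketch, the <div> wrapper, and the sample text are assumptions for illustration, not JSword code.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration only; not taken from the JSword source.
class PassageParseSketch {

    /** True if the fragment, wrapped in one root element, is well-formed XML. */
    static boolean isWellFormed(String fragment) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader("<div>" + fragment + "</div>")));
            return true;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return false;
        }
    }

    /** Joins all verses first, then checks the passage as a single unit. */
    static boolean isPassageWellFormed(List<String> verses) {
        return isWellFormed(String.join("", verses));
    }

    public static void main(String[] args) {
        // A quotation that opens in one verse and closes in the next (made-up text).
        List<String> verses = Arrays.asList("<q>first part,", " second part.</q>");
        for (String verse : verses) {
            System.out.println("verse alone: " + isWellFormed(verse));
        }
        System.out.println("whole passage: " + isPassageWellFormed(verses));
    }
}
```

Each fragment fails on its own, while the joined passage parses. The "verse in isolation" cases never have the neighbouring verse available, which is the limitation noted above.
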
>
> Use a "lenient" XML parser (by definition there is no such thing) to
> repair the text so that it is well-formed. I found Flying Saucer and
> jsoup, which look promising.
>
> The other possibility is to not use an xml parser at all to create the DOM
> but to do it with our own parsing (like we do for GBF and ThML).
>
> I'll cross-post this to Sword-devel, as that is where this started.
>
> In Him,
>         DM
>
> On Apr 15, 2013, at 4:15 PM, Chris Burrell wrote:
>
> > Hi all
> >
> > There is perhaps an issue with the UKJV module. My logs show me:
> >
> > 2013-04-15 20:31:30,236 INFO   - UKJV:Exo 21:0: Parse UKJV(Exo 21:0)
> > failed: Error on line 1: The element type "chapter" must be terminated by
> > the matching end-tag "</chapter>".
> >
> > Cheers,
> > Chris
>
>