<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#ffffff">
John,<br>
<br>
Sorry for the late reply. This patch looks good and we'll commit it
shortly.<br>
<br>
Regarding using a "real" parser, it is a good idea. But we don't want
SWORD to be dependent on an external parser. The only way I see us
doing it is to implement the SAX interface ourselves but allow for an
alternative implementation to be used. I don't think that would be too
hard or that much of a change.<br>
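<br>
A minimal sketch of what that split could look like, under stated assumptions: every name here (XmlHandler, XmlParser, SimpleParser, RecordingHandler) is hypothetical and none of it exists in SWORD today. The built-in parser only understands attribute-free tags and text, which is enough to show how an alternative implementation could sit behind the same interface.<br>

```cpp
#include <string>
#include <vector>

// Hypothetical SAX-style callback interface (not part of SWORD).
class XmlHandler {
public:
    virtual ~XmlHandler() {}
    virtual void startElement(const std::string &name) = 0;
    virtual void endElement(const std::string &name) = 0;
    virtual void characters(const std::string &text) = 0;
};

// Hypothetical parser interface: the built-in parser and any
// wrapper around an external library would both implement this.
class XmlParser {
public:
    virtual ~XmlParser() {}
    virtual void parse(const std::string &input, XmlHandler &handler) = 0;
};

// Deliberately tiny built-in implementation. It treats everything
// between '<' and '>' as a tag name, so attributes, comments and
// entities are NOT handled; it only illustrates the event flow.
class SimpleParser : public XmlParser {
public:
    void parse(const std::string &input, XmlHandler &handler) override {
        std::string::size_type pos = 0;
        while (pos < input.size()) {
            if (input[pos] == '<') {
                std::string::size_type end = input.find('>', pos);
                if (end == std::string::npos) break;  // unterminated tag: stop
                std::string tag = input.substr(pos + 1, end - pos - 1);
                if (!tag.empty() && tag[0] == '/')
                    handler.endElement(tag.substr(1));
                else
                    handler.startElement(tag);
                pos = end + 1;
            } else {
                std::string::size_type next = input.find('<', pos);
                if (next == std::string::npos) next = input.size();
                handler.characters(input.substr(pos, next - pos));
                pos = next;
            }
        }
    }
};

// Example handler that records the events it receives.
class RecordingHandler : public XmlHandler {
public:
    std::vector<std::string> events;
    void startElement(const std::string &name) override { events.push_back("start:" + name); }
    void endElement(const std::string &name) override { events.push_back("end:" + name); }
    void characters(const std::string &text) override { events.push_back("text:" + text); }
};
```

Code that consumes the events would only ever see XmlHandler and XmlParser, so swapping SimpleParser for a wrapper around an external library would not touch the callers.<br>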
<br>
In Him,<br>
DM<br>
<br>
On 02/04/2010 05:31 AM, John Zaitseff wrote:
<blockquote cite="mid:20100204103152.GA4346@zap.org.au" type="cite">
<pre wrap="">Dear SWORD developers,

Firstly, thanks for developing the SWORD library! I have been using
this library, in conjunction with the BibleTime front-end, for many
years.

I have recently started to develop some OSIS documents of my own.
In doing so, I found that the XML parser in osis2mod is somewhat
fragile---something that you are, no doubt, aware of.

In particular, osis2mod does not handle XML comments at all, nor
does it correctly parse the &lt;header&gt; element. Being able to handle
XML comments is, I think, quite important---I like to document the
SVN revision ID, for example, in an XML comment.

Furthermore, the osis2mod XML parser looks for the first &lt;div&gt; in
the document, no matter where that occurs. In particular, if the
OSIS document includes a &lt;revisionDesc&gt; tag in the header, it will
have &lt;p&gt; tags as well---which will be translated by transformBSP()
into &lt;div&gt; tags---and get used as the starting point for the
document!

For this reason, I have generated a quick patch that will solve
these particular problems. Could you please apply it to the SVN
head for utilities/osis2mod.cpp. Comments are handled similar to
spaces: they are skipped. And handleToken() now looks for the first
&lt;div&gt; after the &lt;/revision&gt; end tag.

In general, I think that (perhaps eventually) the proper way to
parse XML is to use a library like libxml---which is designed
specifically for this purpose.

Yours truly,

John Zaitseff
</pre>
</blockquote>
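For reference, the comment skipping described in the quoted message can be sketched roughly as follows. This is not the code from John's patch: stripXmlComments is a hypothetical helper that works on a whole string, whereas osis2mod processes its input as a stream of tokens.<br>

```cpp
#include <string>

// Hypothetical helper (not the actual osis2mod code): returns a copy
// of the input with every <!-- ... --> comment removed.
std::string stripXmlComments(const std::string &input) {
    std::string out;
    std::string::size_type pos = 0;
    while (pos < input.size()) {
        std::string::size_type start = input.find("<!--", pos);
        if (start == std::string::npos) {
            // No further comments: keep the remainder unchanged.
            out.append(input, pos, std::string::npos);
            break;
        }
        // Keep the text that precedes the comment.
        out.append(input, pos, start - pos);
        std::string::size_type end = input.find("-->", start + 4);
        if (end == std::string::npos) break;  // unterminated comment: drop the rest
        pos = end + 3;  // resume just past the closing "-->"
    }
    return out;
}
```

Running this over the document before looking for the first div would mean a revision ID kept in a comment never reaches the tag-handling code.<br>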
<br>
</body>
</html>