<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

  <meta content="text/html; charset=ISO-8859-1" http-equiv="Content-Type">

</head>

<body text="#000000" bgcolor="#ffffff">

John,<br>

<br>

Sorry for the late reply. This patch looks good and we'll commit it

shortly.<br>

<br>

Regarding using a "real" parser, it is a good idea. But we don't want

SWORD to be dependent on an external parser. The only way I see us

doing it is to implement the SAX interface ourselves but allow for an

alternative implementation to be used. I don't think that would be too

hard or that much of a change.<br>

<br>

In Him,<br>

&nbsp;&nbsp;&nbsp; DM<br>

<br>

On 02/04/2010 05:31 AM, John Zaitseff wrote:

<blockquote cite="mid:20100204103152.GA4346@zap.org.au" type="cite">

  <pre wrap="">Dear SWORD developers,

Firstly, thanks for developing the SWORD library!  I have been using

this library, in conjunction with the BibleTime front-end, for many

years.

I have recently started to develop some OSIS documents of my own.

In doing so, I found that the XML parser in osis2mod is somewhat

fragile---something that you are, no doubt, aware of.

In particular, osis2mod does not handle XML comments at all, nor

does it correctly parse the &lt;header&gt; element.  Being able to handle

XML comments is, I think, quite important---I like to document the

SVN revision ID, for example, in an XML comment.

Furthermore, the osis2mod XML parser looks for the first &lt;div&gt; in

the document, no matter where that occurs.  In particular, if the

OSIS document includes a &lt;revisionDesc&gt; tag in the header, it will

have &lt;p&gt; tags as well---which will be translated by transformBSP()

into &lt;div&gt; tags---and get used as the starting point for the

document!

For this reason, I have generated a quick patch that will solve

these particular problems.  Could you please apply it to the SVN

head for utilities/osis2mod.cpp?  Comments are handled similarly to

spaces: they are skipped.  And handleToken() now looks for the first

&lt;div&gt; after the &lt;/revision&gt; end tag.

In general, I think that (perhaps eventually) the proper way to

parse XML is to use a library like libxml---which is designed

specifically for this purpose.

Yours truly,

John Zaitseff

  </pre>

  <pre wrap="">

<fieldset class="mimeAttachmentHeader"></fieldset>

_______________________________________________

sword-devel mailing list: <a class="moz-txt-link-abbreviated" href="mailto:sword-devel@crosswire.org">sword-devel@crosswire.org</a>

<a class="moz-txt-link-freetext" href="http://www.crosswire.org/mailman/listinfo/sword-devel">http://www.crosswire.org/mailman/listinfo/sword-devel</a>

Instructions to unsubscribe/change your settings at above page</pre>

</blockquote>

<br>

</body>

</html>