[sword-devel] Validating ThML and OSIS modules
Chris Little
chrislit at crosswire.org
Tue Jan 6 23:09:42 MST 2009

Jonathan Morgan wrote:
> On Tue, Jan 6, 2009 at 3:40 PM, Chris Little <chrislit at crosswire.org> wrote:
>> So write it and submit a patch.
>>
>> [Some basic requirements: Don't add library dependencies to Sword itself,
>> make the validator toggleable at runtime, and ensure that the validation
>> library is in C/C++ and can compile under Win32 and with GCC.]
>
> It was the expected answer, but my answer is: no. I do not have time
> to spend on it. I have not strongly complained about validity of
> modules. My statement is that if you care about validity, you would
> better spend your time enforcing validity in the importer than arguing
> about it on a mailing list.

I have just a couple of quick comments, from the perspective of a content
encoder (and ignoring all other roles that I may personally have).

XML validators tend not to be very good for doing document validation at
the editing stage. They are fine for confirming that a document is valid
or not, but are often very unhelpful when bringing a document to the
point of validity. I've tried to use xmllint for that purpose but find
it a long and frustrating process. Any other validation facility based
on libxml2 (like xmllint) would likely be the same. (Xerces might be
considerably better, and I have a feeling one or two of the editors I
mention below use it beneath the surface.)

As a content encoder, XML editors give me the best results when I want
to find encoding errors. They tend to give a better indication of
patterns of errors, whereas validators might only give the first error
they identify and then quit. I use Oxygen now, partly because it is a
Java program so it will run on whatever platform I'm using and they had
a nice student license, but I have used Topologi & XML Spy in the past
and they were fine for my needs. I'm sure there are some good OSS XML
editors out there (though I've seen less encouraging results from jEdit).
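
To illustrate what I mean by "patterns of errors", here is a minimal
Python sketch that groups a validator's error report by element and
message, so a mistake repeated fifty times shows up as one entry with a
count. The line format in ERROR_LINE is an assumption modelled on
typical xmllint schema-validation output, and summarize_errors, the
file name, and the sample messages are all made up for illustration.

```python
import re
from collections import Counter

# Assumed line shape (an approximation of xmllint's schema-validation output):
#   file.xml:12: element foo: Schemas validity error : <message>
ERROR_LINE = re.compile(
    r"^(?P<file>[^:]+):(?P<line>\d+): element (?P<element>[^:]+): "
    r"Schemas validity error : (?P<message>.*)$"
)


def summarize_errors(report):
    """Count validity errors per (element, message) pair, most frequent first."""
    counts = Counter()
    for raw in report.splitlines():
        match = ERROR_LINE.match(raw.strip())
        if match:  # lines that do not fit the assumed shape are skipped
            counts[(match["element"], match["message"])] += 1
    return counts.most_common()


# Made-up sample report; the file name and messages are illustrative only.
SAMPLE_REPORT = """\
genesis.xml:14: element note: Schemas validity error : Element 'note': This element is not expected.
genesis.xml:52: element note: Schemas validity error : Element 'note': This element is not expected.
genesis.xml:90: element w: Schemas validity error : Element 'w', attribute 'lemma': value is not valid.
"""

if __name__ == "__main__":
    for (element, message), count in summarize_errors(SAMPLE_REPORT):
        print(f"{count:3d} x <{element}>: {message}")
```

This only helps if the validator reports more than the first error it
finds; a good editor gives you roughly this view without the extra step.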

Secondly, OSIS that is invalid according to the schema isn't _always_
invalid according to what we meant the schema to express. That is
to say, we know there is one outstanding bug in the OSIS schema, and
there may be others. As for our TEI P5 schema, I maintain it
myself and it's quite experimental. I've expressed willingness in the
past to add additional TEI modules to our schema or even to add/adjust
elements or attributes if we need them. So, in some cases it may be
important for the encoder to overrule the judgment of the
validator/schema so that he can encode and import a document he knows to
be correctly encoded.
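
As a rough sketch of what overruling could look like if an importer had
a runtime-toggleable validator, the Python below models three modes:
off, warn (report but still import), and strict (refuse on any error).
Every name here (ValidationMode, may_import) is hypothetical; this is
not how the existing Sword importers behave.

```python
from enum import Enum


class ValidationMode(Enum):
    """Hypothetical runtime switch for an importer's validator."""
    OFF = "off"        # validation results are ignored entirely
    WARN = "warn"      # report errors, but let the encoder overrule them
    STRICT = "strict"  # refuse to import a document with any error


def may_import(errors, mode, report=print):
    """Return True if the import should go ahead.

    `errors` is a list of messages from whatever validator is in use.
    """
    if mode is ValidationMode.OFF or not errors:
        return True
    for message in errors:
        report(f"validation: {message}")
    # In WARN mode the encoder's judgment wins over the schema's.
    return mode is ValidationMode.WARN


if __name__ == "__main__":
    found = ["element 'note' is not expected here"]  # made-up message
    print(may_import(found, ValidationMode.WARN))
    print(may_import(found, ValidationMode.STRICT))
```

The warn mode is the one that matters for the schema-bug case: the
encoder still sees what the validator objects to, but is not blocked.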

All of this is to say that a validator within the importer has some value
(and I've suggested adding one in the past), but it's not the most
useful feature for content encoders. A good XML editor is (IMO).

--Chris