[sword-devel] Calvin's commentaries, and ThML to OSIS conversion

Luke Plant L.Plant.98 at cantab.net
Wed Jul 11 08:45:08 MST 2007


Hi all,

I've been trying to create a Sword module containing all of Calvin's 
commentaries, using the ThML sources from CCEL.  I've made good 
progress, and some of my work should be reusable for other projects.  
I've read that Sword is trying to move away from ThML to OSIS, so the 
module will be an OSIS module.  I've been careful to avoid any manual 
editing, so that everything can be regenerated automatically from the 
ThML (in case of any updates to the sources).

The main steps are:
 
1) Make some corrections to the ThML (use of scripCom tag in 
particular) - DONE (implemented using Python script)

2) Combine all the ThML files into a single ThML source - DONE (Python)

3) Convert to OSIS.  I've done this using XSLT, and I'm intending to 
release my thml2osis.xslt as a separate project.  It is about 90% done 
(at least in terms of translating Calvin's Commentaries), and has tests 
and so on.  It should be a useful and portable utility for converting 
other CCEL sources.  (The test suite is currently executed using Unix 
tools, which would be a problem for Windows developers.)

4) Import as a Sword module.  The problem here is that osis2mod is 
basically for importing Bibles only -- it expects you to use <div 
type="book">, <div type="chapter"> (or <chapter>) and <verse>.  These 
are not really natural or semantic ways to mark up a commentary. A more 
obvious and natural way to do it is like this:

<div type="section" annotateType="commentary" 
annotateRef="Bible:Gen.1.1"><p>Blah blah...</p></div>

I do actually have a Python script which converts this markup to that 
expected by osis2mod, but it uses DOM, and memory usage for the 45 MB 
input OSIS file is prohibitive.  Anyway, I think creating a version 
of osis2mod for commentaries is the better way to handle this (I did 
find an old message in sword-devel saying that an importer would be 
written if OSIS commentaries were provided).
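
For illustration only, here is a minimal sketch of how the commentary 
markup above could be read as a stream instead of through a full DOM, 
so that memory use does not grow with the size of the input.  It is not 
the script mentioned above.  The function name 
iter_commentary_sections is hypothetical, and the sketch assumes that 
commentary <div> elements are not nested inside each other.

```python
import io
import xml.etree.ElementTree as ET


def iter_commentary_sections(source):
    """Yield (annotateRef, text) for each commentary <div>, streaming the input.

    Hypothetical helper: assumes commentary divs are not nested.
    """
    stack = []  # currently open elements, so a finished div can be detached
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            stack.append(elem)
            continue
        stack.pop()  # elem has just been closed
        tag = elem.tag.rsplit("}", 1)[-1]  # ignore any XML namespace
        if tag == "div" and elem.get("annotateType") == "commentary":
            yield elem.get("annotateRef"), "".join(elem.itertext()).strip()
            # Free the finished section so the tree never holds the whole file.
            elem.clear()
            if stack:
                stack[-1].remove(elem)
```

Each yielded pair could then be written out in whatever form the 
importer needs; only one section is held in memory at a time.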

I would write the osis2mod modifications myself, but I've looked at 
osis2mod and the main function that needs modifying, handleToken(), is 
a bit of a beast -- about 400 lines, about 20 local variables etc.  I'm 
not confident enough with Sword to be able to refactor it properly, and 
I don't want to do large amounts of copy and paste.

So, is someone willing to help out with this final step?

Also, is there a place where I should release this stuff?  I think Sword 
needs a 'sword contrib' project, or at least a section on the wiki that 
details how to get these different things.  I get the impression that 
the main Sword developers have various scripts to help them, and a 
central repository for these kinds of tools would be very helpful.  A 
Bazaar repository would probably be ideal -- I could put up a 
publicly readable one for my stuff.

Regards,

Luke

-- 
Sometimes I wonder if men and women really suit each other. Perhaps 
they should live next door and just visit now and then. (Katharine 
Hepburn)

Luke Plant || http://lukeplant.me.uk/


