joe at eireneh.com
Fri Jun 11 14:49:37 MST 2004
I'm not sure if there are any good ways to make the modules simpler, but
that aside, a mod2osis in Java would be trivially easy.
This is how easy:
    Passage ref = PassageFactory.createPassage("Gen-Rev");
    Book bible = Books.installed().getBookMetaData(BIBLE_NAME).getBook();
    BookData data = bible.getData(ref);
    SAXEventProvider osis = data.getSAXEventProvider();
    String text = XMLUtil.writeToString(osis);
I would submit that this isn't that complex!
Alter line 1 to bite off smaller chunks. "Gen" will do the whole of
Genesis, "Gen-Mal" will do the O.T., "Gen 1:1" will do a single verse, and
so on. You can do non-contiguous regions too: "Gen 1:1, Mat 5, Rev".
Alter line 2 to choose a different Bible.
This will convert GBF, ThML and broken Sword OSIS into well-formed and
(mostly) valid OSIS.
We work quite hard to patch up the broken XML by fixing or deleting
broken entities and illegal characters. There is probably more we could do
in guessing how to close unclosed tags and so on.
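To give a feel for the "deleting illegal characters" part, here is a minimal sketch. It is not the code in CVS; the class and method names (XmlCharCleaner, stripIllegalChars) are made up for illustration. It only assumes the XML 1.0 rule for which code points are allowed in a document and drops everything else.

```java
// Illustrative only: XmlCharCleaner is a hypothetical helper, not a JSword class.
final class XmlCharCleaner {

    // True if the code point is allowed by the XML 1.0 "Char" production:
    // tab, LF, CR, U+0020..U+D7FF, U+E000..U+FFFD, U+10000..U+10FFFF.
    static boolean isLegalXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns a copy of the input with every illegal code point removed.
    // Unpaired surrogates fall in U+D800..U+DFFF and are dropped as well.
    static String stripIllegalChars(String in) {
        StringBuilder out = new StringBuilder(in.length());
        in.codePoints()
          .filter(XmlCharCleaner::isLegalXmlChar)
          .forEach(out::appendCodePoint);
        return out.toString();
    }
}
```

Running module text through something like this before it reaches the XML parser avoids the parser rejecting the whole passage over a stray control character. Fixing broken entities is a separate job and is not covered here.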
The current CVS does not attempt to delete nodes that don't exist in the
OSIS spec. So the result may not be 100% valid.
However, I only know of one common example of illegal nodes: resp, which
appears as an element when the spec says it is only an attribute.
A simple stylesheet should be able to fix that.
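A rough sketch of such a stylesheet, driven from Java with the standard javax.xml.transform API, might look like the following. The assumptions are mine: it simply drops any element whose local name is resp (along with its content) and copies everything else, including resp attributes, unchanged. Whether the element's text ought to be moved into an attribute instead depends on what the modules actually contain. RespElementStripper is a made-up name.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Illustrative only: RespElementStripper is a hypothetical helper, not a JSword class.
final class RespElementStripper {

    // Identity transform plus one empty template that swallows <resp> elements.
    // local-name() is used so the match works whatever namespace the document uses.
    private static final String XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:template match='@*|node()'>"
        + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
        + "</xsl:template>"
        + "<xsl:template match=\"*[local-name()='resp']\"/>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet to an XML string and returns the result as a string.
    static String stripRespElements(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("could not apply stylesheet", e);
        }
    }
}
```

The string produced in line 5 of the snippet at the top of this mail could be passed straight to stripRespElements.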
The last release (0.9.7) used JAXB, which would only output totally valid
OSIS, although at the expense of a considerably larger download.
> I think that J-Sword's handling of modules is complex. Perhaps I am not
> the best to comment but here is one anyway:
> The general architecture of J-Sword is that there is a translator/parser
> for each encoding of a module. When a passage is displayed it is parsed
> and then converted into OSIS. The parsers are written to be somewhat
> fault tolerant, but to my knowledge, they do not fix bad data. I have
> found that in some cases the parsers just get confused. There is an
> attempt to log faults that are found so that the original can be cleaned up.
> I think I read in the archives that when Joe and others did the parser
> work, they had to fix up a lot of the modules to get j-sword to work.
> It should be fairly easy to write a mod2osis using j-sword. There is
> sample code on how to do it for a passage in
> tmp at nitwit.de wrote:
>> Sword's mod2osis utility apparently is somewhat buggy; at least it does
>> not create valid XML. I'm new to jsword, but as its source base is quite
>> complex I want to ask if somebody has written a replacement for mod2osis
>> or written a parser that corrects the invalid files.