joseph.walker at gmail.com
Mon Feb 28 16:37:48 MST 2005
2 things to add - I *think* that the large majority of the GBF and ThML
tags are supported, however I am very sure that we are not doing
sensible things with all of them, and there is probably no single
correct answer to what to do with them either, since they both seem to
include elements of style (bold etc.) where OSIS sensibly avoids them.
On the speed issue - I've not done tests, but I could well imagine
that creating a JDom/OSIS tree from GBF could be *faster* than from
a pure OSIS/XML data source. The reason being that both create the
tree using 'new Element("osis"); ...'. The OSIS parser creates SAX
events from the input stream. Without writing our own XML parser, we
are forced to use one that incorporates code to check for all sorts of
error conditions and act in the Official Way. GBF on the other hand
has its own parser that only needs to create an OSIS tree and can
therefore be simpler and maybe quicker.
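
To make the two routes concrete, here is a minimal sketch. It is not
JSword code: it uses the JDK's built-in W3C DOM in place of JDom so that
it runs without extra jars, and the class name, method names and the
one-verse document are made up for illustration. It only shows that both
routes end in the same kind of tree; it measures nothing about speed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical illustration only; W3C DOM stands in for JDom here.
public class OsisTreeSketch {

    // Route 1: build the tree node by node, the way a hand-written
    // GBF parser could, with no general-purpose XML parsing involved.
    static Document buildDirectly(String osisID, String text) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element osis = doc.createElement("osis");
            Element verse = doc.createElement("verse");
            verse.setAttribute("osisID", osisID);
            verse.setTextContent(text);
            osis.appendChild(verse);
            doc.appendChild(osis);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Route 2: hand OSIS text to a full XML parser, which tokenizes the
    // input and checks well-formedness before the tree exists.
    static Document parseFromXml(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Flatten the single-verse tree to a string so the two can be compared.
    static String describe(Document doc) {
        Element root = doc.getDocumentElement();
        Element verse = (Element) root.getElementsByTagName("verse").item(0);
        return root.getTagName() + "/" + verse.getTagName()
                + "[" + verse.getAttribute("osisID") + "]="
                + verse.getTextContent();
    }

    public static void main(String[] args) {
        Document direct = buildDirectly("Gen.1.1", "In the beginning");
        Document parsed = parseFromXml(
                "<osis><verse osisID=\"Gen.1.1\">In the beginning</verse></osis>");
        System.out.println(describe(direct));
        System.out.println(describe(parsed));
    }
}
```

Whether route 1 really is quicker in JSword would still need the tests
mentioned above.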

However, I think that converting everything to OSIS is a very good idea
just because the quality of the GBF and ThML sources is so very poor.
Re-generating them would enable us to clean them up a lot.

On Mon, 28 Feb 2005 18:00:10 -0500, DM Smith <dmsmith555 at yahoo.com> wrote:
> I think your suggestion is a distinct possibility for the future. But I
> don't think we are there yet. Before we could use JSword's engine to
> transform modules in other formats into OSIS as a mechanism to migrate
> the modules, I think we will need to do very extensive certification of the
> process and the resulting modules.
> Also, OSIS is a moving target. When I started, we were conforming to 1.1.1
> and shortly after that we upgraded to 1.5. At that time 2.0 was a draft
> proposal. Now it is final. I just added support for a few more elements in
> Once we migrate to 2.x (post JSword 1.0), we will probably revisit the
> mapping of GBF and ThML to OSIS to make sure that our transformation is not
> lossy (we do get all the verse text out, but do we get all other markup?
> Don't know. For example, if anything is marked up as italic or bold, we
> don't support that) and that the transformations make sense (Some of the
> verses look ugly after our transformations and sometimes there is a lot of
> extra whitespace).
> That said, it takes a lot to parse OSIS as text into OSIS as JDom elements
> and then transform that. Currently the performance is about the same or just
> a bit slower than other formats. So it would not be a big win. We do need to
> look at directly using the OSIS text, but that is scheduled post 1.0.
> Read, James C wrote:
> These are good examples. To get OSIS from a Bible module you will do
> something like the following:
>
>     // Get a list of all the installed books
>     Books books = Books.installed();
>     // Get a particular installed book
>     Book bible = books.getBookMetaData("KJV");
>     // Create a range of what you want
>     Key key = bible.getKey("Gen-Rev");
>     // Get the data
>     BookData data = bible.getData(key);
>     // Get a SAX stream for the OSIS document
>     SAXEventProvider osissep = data.getSAXEventProvider();
>     // Write the OSIS to a string
>     String wholeBible = XMLUtil.writeToString(osissep);
>     // Print it out
>     System.out.println(wholeBible);
>
> Thank you. :) That's just what I needed.
>
> One humble suggestion though. Seeing as all the basic operations with the
> modules seem to involve first converting them into OSIS why don't we convert
> all the modules to OSIS for the JSword package and save the end user the
> waiting time for such a conversion and us the repetition of code.