[jsword-devel] Re: [sword-devel] radical idea

Daniel Glassey dglassey at gmail.com
Fri Jun 17 12:57:43 MST 2005

On 16/06/05, DM Smith <dmsmith555 at yahoo.com> wrote:
> I did an exercise for the online publisher that I worked for where we
> took xhtml, removed the forms, substituted cals table for html tables,
> and removed all the formatting elements other than class. On an as
> needed basis we added some style elements (strong, em and the like but
> not bold, italic and the like). Then we added elements for content
> markup (e.g. chapter, section, footnote).
> My realization from this exercise was that html does not offer
> significant semantic markup. In looking at TEI, Dublin Core, ThML
> additions, LegalML, OSIS, and the like, it appears that others think
> that HTML is not a good document markup language. I think HTML is *a*
> good delivery markup. But this should be the transformation from a
> content markup to a delivery markup targeted for the user's specific
> device (e.g. Church projector, web browser, print, hard copy book, pda,
> phone, stand-alone application, text reader for the blind, ...)

That is what OSIS is for - the xslts can be customised from the
original for whatever delivery markup people want.
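
As a rough illustration of one content markup feeding more than one delivery markup, here is a minimal sketch. It uses Python's ElementTree rather than xslt, purely to stay self-contained. The fragment is a simplified, namespace-free OSIS-like snippet, and the function names and CSS class names are made up for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Simplified, namespace-free OSIS-like fragment (illustrative only).
SOURCE = (
    '<div type="chapter">'
    '<verse osisID="Gen.1.1">In the beginning'
    '<note>Or: When God began</note> God created.</verse>'
    '</div>'
)

def to_html(verse):
    """Render one verse element as HTML, with classes as hooks for CSS."""
    parts = [escape(verse.text or "")]
    for child in verse:
        if child.tag == "note":
            parts.append('<span class="note">%s</span>' % escape(child.text or ""))
        # Text following the child element still belongs to the verse.
        parts.append(escape(child.tail or ""))
    return '<p class="verse" id="%s">%s</p>' % (
        escape(verse.get("osisID")), "".join(parts))

def to_plain_text(verse):
    """Render the same verse for a text-only device, dropping notes."""
    parts = [verse.text or ""]
    for child in verse:
        parts.append(child.tail or "")
    return "%s %s" % (verse.get("osisID"), "".join(parts))

root = ET.fromstring(SOURCE)
verse = root.find("verse")
html_out = to_html(verse)
text_out = to_plain_text(verse)
print(html_out)
print(text_out)
```

The point is only that both outputs come from the same source; with xslt each `to_*` function would be a separate stylesheet.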

But if sword wants to target more kinds of delivery, then it should
use xslt internally. I can't think of why sword would want to do that,
but I'd be interested in hearing opinions on it.

Current delivery formats for sword are:
websites - e.g. swordweb, the java based webapp
HTML based frontends - MacSword, BibleTime, GnomeSword
(RTF based frontends - SPFW)

Anything else?

I guess what we need to know is whether CSS alone is sufficient to
produce the differences in visual output on all the systems.

I don't know if that is going to be the case for the website frontend.

> JSword converts modules on the fly into OSIS and then uses xslt to
> transform it into HTML with a clear separation of content and
> presentation (at least as far as Java's HTML renderer allows). My
> experience has been that we have tweaked the xsl to get better and
> better rendering of the original.

That's a good point - you don't have to be stuck with the decisions of
the person importing wrt presentation.

> We have also added filtering into the xslt
> (small/large verse numbers, notes on/off, strongs on/off, morph on/off,
> verses on separate lines, ...) 

Hmmm, indeed, that part of the filtering probably isn't best handled
by CSS, though I guess it might be possible to use divs and spans to
hide the bits you don't want. I guess verses on separate lines may be
possible as well. (IANAHE: I am not an HTML expert.)
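
For what it's worth, the CSS approach could look something like the sketch below: the HTML always contains everything, and a small generated stylesheet hides what is switched off. The class names (`note`, `strongs`, `morph`, `verse`) and the `build_css` helper are hypothetical and are not taken from any existing frontend.

```python
# Hypothetical class names; a real frontend would use whatever classes
# its HTML output actually carries.
TOGGLE_CLASSES = {
    "notes": "note",
    "strongs": "strongs",
    "morph": "morph",
}

def build_css(show, verses_on_separate_lines=False):
    """Return CSS that hides every feature that is switched off.

    `show` maps a feature name from TOGGLE_CLASSES to True/False;
    features that are not mentioned stay visible.
    """
    rules = []
    for feature, css_class in sorted(TOGGLE_CLASSES.items()):
        if not show.get(feature, True):
            rules.append(".%s { display: none; }" % css_class)
    # Verses are assumed to be <span class="verse"> elements.
    if verses_on_separate_lines:
        rules.append(".verse { display: block; }")
    return "\n".join(rules)

css = build_css({"notes": False, "strongs": True},
                verses_on_separate_lines=True)
print(css)
```

One caveat: the hidden content is still generated and handed to the renderer, so this changes what is displayed, not how much work is done.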

> and we allow the users to define their
> own style sheet (granted they need to be developers) or pick from
> alternates (at least till we merge them into the parameter driven
> standard stylesheet).

That's good. I can't see the sword lib moving to using libxslt/libxml2
at runtime, so that isn't going to be the 'core' way to do it.

Though, if sword is using xml I'd rather it used xml tools to do the job. ;)

The current filters in the library would remain as the backup, but a
new filter could use xslt to do the filtering.

> If the module were to change from a rich content markup to HTML, this
> would circumvent this advantage. Having it in HTML would make it harder
> to target it to alternate interfaces such as PDF 

I don't see what a PDF interface would be useful for that the original
OSIS doc wouldn't serve better.

> or PDA. (On my PDA, tables often force horizontal scrolling for the
> entire page. This is not a good thing.)

So you use CSS to specify the width of the table?

PDAs are an interesting problem - in general they are constrained by
CPU, memory and storage. So for them I would want a solution that does
as much pre-processing as possible, with a low lib memory footprint
and as small a module as possible.

I'm not sure that OSIS and xslt runtime processing give that.

> For a post 1.0 release we plan to add support for using OSIS directly.
> So if an HTML module is generated from an OSIS original, I think that it
> may require us to go with the original.

As long as the copyright holder allows that (I wouldn't expect them
to), that works. I think it is more likely that they would provide
(or help us to produce) OSIS as an import format for sword, which they
could also use to create other delivery formats, but not allow the
OSIS itself to be redistributed.

OK, here's another variation on the idea. The modules still get
distributed as a variation on OSIS - the importer converts from ThML
or GBF or whatever.
The lib would provide the frontends with a way to make a preprocessed
version of the module in presentation HTML.

Um, not sure about that but I'll just throw that thought out there. It
may not be practical as there are the verse/section indexes and lucene
search indexing to think about.
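
To make the thought a bit more concrete, here is a minimal sketch of pre-rendering with a per-verse index, so that lookup by verse key still works on the preprocessed data. This is not how the sword lib stores modules. `render_verse`, `preprocess` and `lookup` are hypothetical names, and the rendering step is a stand-in for whatever the real transformation would be.

```python
import io

def render_verse(key, text):
    """Stand-in for the real content-to-HTML step (hypothetical)."""
    return '<span class="verse" id="%s">%s</span>' % (key, text)

def preprocess(verses):
    """Render every verse once; return (blob, index).

    `index` maps a verse key to (offset, length) in bytes within `blob`,
    so a frontend can fetch one verse without re-rendering or scanning.
    """
    blob = io.BytesIO()
    index = {}
    for key, text in verses:
        data = render_verse(key, text).encode("utf-8")
        index[key] = (blob.tell(), len(data))
        blob.write(data)
    return blob.getvalue(), index

def lookup(blob, index, key):
    """Return the pre-rendered HTML for one verse key."""
    offset, length = index[key]
    return blob[offset:offset + length].decode("utf-8")

verses = [("Gen.1.1", "In the beginning..."),
          ("Gen.1.2", "And the earth...")]
blob, index = preprocess(verses)
print(lookup(blob, index, "Gen.1.2"))
```

This covers the verse index only. A search index would presumably still have to be built from the unrendered text, which the sketch does not address.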


> Daniel Glassey wrote:
> >well, you have been warned in the title. This is just thinking based
> >loosely on a discussion this afternoon as we need to decide on
> >storage issues. This is a medium-term suggestion - post BibleCS 1.5.8.
> >
> >The first part of radical idea is to throw away the RTF output filters
> >(BibleCS will need to use an HTML widget or have someone implement a
> >windows frontend with an HTML widget which doesn't seem unreasonable
> >given that it is ok to wait 2 years between releases anyway ;) ).
> >
> >Then, since all other frontends use HTML, why not use HTML (or
> >XHTML) as the basis of the storage format in sword?
> >
> >CSS would be used for styling.
> >
> >OSIS isn't designed for presentation so sword has to transform the
> >stored markup to the presentational form - HTML. It seems more
> >efficient to do this at module creation time rather than at runtime.
> >
> >If there is any additional markup that is needed that can't be
> >implemented in straight HTML then it could be simple additions that
> >sword can recognise.
> >
> >OSIS would still be the preferred base document. You would use xslt in
> >a new sword import program osis2work to transform it into an HTML
> >based sword work together with the sword indexes.
> >And a work2osis would have to be written to do the reverse transformation.
> >
> >ThML is close enough to HTML that that conversion shouldn't be hard.
> >I'm not sure how to convert GBF to this, but doing it either via a
> >filter or via OSIS makes sense.
> >
> >The objection I can come up with is that it really isn't nice to the
> >JSword guys who afaiu use OSIS internally, but I'll wait for that
> >opinion. :)
> >
> >Regards,
> >Daniel
