[sword-devel] OSIS markup for gen books and devotionals

David "Judah's Shadow" Blue yudahsshadow at gmx.com
Tue Sep 23 11:32:45 MST 2014

On September 6, 2014 10:35:43 AM EDT, Laurie Fooks <laurie.fooks at gmail.com> wrote:
>The issue still stands as to what markup is common to front ends and
>at what level should that apply - engine or front end.- sorry, I am
>not aufait with how the sword system works at the program level

Sorry for the delayed reply; this sat unfinished, and apparently forgotten, in my drafts folder for a good while.

I would say that there is some confusion about what the sword engine does as far as getting content out of modules and into frontends. The engine lets you, as the frontend developer, receive the module text and tagging via various filters, depending on your needs/desires. It reads the module, then outputs the requested passage formatted according to the filter that the frontend program has requested.

For Xiphos, that means requesting the XHTML filter and sending the resulting XHTML "document" to its rendering code for display. The main Windows frontend (commonly called BibleCS on this list) requests the filter for RTF formatting codes and passes the result to its rendering code for display (or at least that's how it used to work). Diatheke, the command line front end, last I played with it, lets you specify the output filter to use as an argument (but defaulted to plain text with /no/ special formatting).

BibleTime, on the other hand, gets the raw module tagging and transforms it into HTML itself, so it can have finer-grained control over how things are displayed. It does this via display templates (which are really just style sheets created specifically for the HTML tags BibleTime sends to render), which cover things like highlighting the current verse and formatting footnotes, cross references, etc. in a uniform way (determined again by the CSS). It also changes some formatting if you request a passage be printed, so as to be more print friendly. If there isn't code to handle a particular OSIS/GBF/ThML/TEI/etc. tag, poetry for instance (currently anyway), then AFAIK it gets the tags from the engine.
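
To make the filter idea concrete, here is a toy sketch in Python. It is /not/ the SWORD API; the names (RAW_OSIS, plain_filter, xhtml_filter, render) and the CSS class are made up for illustration. It only shows the shape of the thing: one piece of module text, and the frontend choosing which filter it comes back through.

```python
import re

# Toy stand-in for one module entry; a real module is read by the engine.
RAW_OSIS = 'Jesus said, <q who="Jesus">Follow me.</q>'

def plain_filter(raw):
    """Strip every tag and return the bare text."""
    return re.sub(r"<[^>]+>", "", raw)

def xhtml_filter(raw):
    """Turn the semantic <q who="Jesus"> element into a presentational span."""
    out = raw.replace('<q who="Jesus">', '<span class="wordsOfJesus">')
    return out.replace("</q>", "</span>")

# Hypothetical filter table; "raw" hands the tagging through untouched.
FILTERS = {
    "plain": plain_filter,
    "xhtml": xhtml_filter,
    "raw": lambda raw: raw,
}

def render(raw, fmt):
    """The 'engine' step: apply whichever filter the frontend asked for."""
    return FILTERS[fmt](raw)

print(render(RAW_OSIS, "plain"))
print(render(RAW_OSIS, "xhtml"))
```

A frontend that wants to do its own rendering would ask for "raw" and take it from there.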

Now all this might sound pedantic, and you might think front-ends should just render what the engine sends, but imagine a frontend that sends the text through a TTS engine for visually impaired persons. This frontend would have no use for HTML formatting, but it would care about the underlying markup that this HTML represents. A TTS frontend wouldn't care that words are "red", it would care /why/ they are "red", so this front end would want the raw OSIS to be able to understand that the text is a quote from Jesus and perhaps change the voice used. Or perhaps this frontend would just want the raw text to read verbatim rather than caring what tags are present.
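
A rough sketch of what such a TTS frontend might do with the raw OSIS, again in Python and again purely illustrative. The function name speech_segments and the voice labels are hypothetical, and it assumes the quote is tagged as <q who="Jesus"> and that the fragment is flat, well-formed XML.

```python
import xml.etree.ElementTree as ET

def speech_segments(osis_fragment):
    """Return (voice, text) pairs for a small, flat OSIS fragment."""
    # Wrap the fragment so it parses as a single XML element.
    root = ET.fromstring("<verse>" + osis_fragment + "</verse>")
    segments = []
    if root.text and root.text.strip():
        segments.append(("narrator", root.text.strip()))
    for child in root:
        # The semantic attribute, not a colour, decides the voice.
        is_jesus = child.tag == "q" and child.get("who") == "Jesus"
        text = "".join(child.itertext()).strip()
        if text:
            segments.append(("jesus" if is_jesus else "narrator", text))
        # Text following the element belongs to the narrator again.
        if child.tail and child.tail.strip():
            segments.append(("narrator", child.tail.strip()))
    return segments

sample = 'Jesus said, <q who="Jesus">Follow me.</q> And he rose.'
for voice, text in speech_segments(sample):
    print(voice, "->", text)
```

None of this is possible if all the frontend receives is a span with a colour on it.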

And then there is the case of frontends that aren't Bible reading programs at all. For instance, one might use graph theory to analyze word usage and compare it to the Strong's Numbers tagged in, or analyze cross references, or some other such thing.
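
As a small illustration of that kind of non-reading frontend, the sketch below just tallies Strong's numbers from raw OSIS, which would be the first step before any real analysis. It assumes the numbers are tagged as <w lemma="strong:G2316">; the function name strongs_counts and the sample fragment are invented for the example.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def strongs_counts(osis_fragment):
    """Count Strong's numbers found in <w lemma="strong:..."> elements."""
    root = ET.fromstring("<verse>" + osis_fragment + "</verse>")
    counts = Counter()
    for w in root.iter("w"):
        # A lemma attribute may hold several space-separated values.
        for lemma in w.get("lemma", "").split():
            if lemma.startswith("strong:"):
                counts[lemma[len("strong:"):]] += 1
    return counts

# Invented fragment, not actual module text.
sample = (
    '<w lemma="strong:G2316">God</w> so <w lemma="strong:G25">loved</w> '
    'the <w lemma="strong:G2889">world</w>, and '
    '<w lemma="strong:G2316">God</w> gave'
)
print(strongs_counts(sample).most_common())
```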

This is why OSIS is semantic markup, not presentation markup. It is also why you are discouraged from using <l> and <lg> for presentational line breaks, or to delimit prose paragraphing. It may work (for now), but what happens when a frontend comes along that does something with poetry that makes sense for poetry, but breaks your blank line? I recall someone saying that some publishers are not wanting to license their modules for use unless they can be assured the modules will display the way they want. So, a fix to a frontend for that case may break the presentation you had in mind.

>If we don't have a high level of commonality then I am concerned that
>we are losing the purpose of having a common "engine"

Well, no, not necessarily. The purpose of the engine is to read the modules and then provide the content in a way that the front end can meaningfully use for its purpose. This can, of necessity, create /some/ disparity in how things are handled. I'll go into more detail below, but each frontend knows what it needs to best display text for its purpose. The mobile phone apps, for example, may not wish to pad things with extra spacing like a laptop/desktop (or even tablet) app because of the smaller screen real estate. Or take my above-mentioned possible TTS frontend that just wants raw text with no formatting whatsoever.

You can look at it this way: if the engine determined how frontends should display the text, what would the point of multiple front ends be?

>The documentation pointed me to OSIS as the ongoing supported
>standard, but sadly I am finding that is not the reality in genbooks
>for anything but basics.

Well yes, most work is for biblical texts; that's where the big efforts lie. I do believe I was around before there /were/ genbooks. So in the long history of sword they are relatively young. And not all front-ends will give them priority to ensure rendering is correct. As far as BibleTime goes, IIRC, genbooks should get the same treatment as bibles: read the tags, output HTML, style according to template.

>In some situations, it would be useful to make more use of media and
>other advantages of an electronic document - sword is currently
>limited and I'm not able to help with programming, but I would like to
>be able to maximise use of what is available and have it usable on
>whatever sword affiliated front end the end user prefers.

Well yes, that is the goal of the engine. Personally I'm sad that the module makers haven't been making much but exact screen reproductions of print modules and not taking advantage of the other options being digital gives. But until people need the additional features, they won't be added. Images are a good example: support for them is very new to the code, and it wasn't put in until someone felt the need.

>As I have said, I do appreciate the work many people have contributed
>- I am also trying to help by giving my test results on what I have
>found as a new user. I do feel that the system / project would benefit
>from establishing (or publishing/clarifying same) minimum standards
>for markup functionality

This is a laudable goal (with caveats for the aforementioned cases where presentation is not visual or doesn't care about formatting for other reasons). But this is very difficult to do. BibleTime used to have code to render poetry, but it ran into problems because of the way HTML works: poetry would be fine until you selected the first verse, then the whole rest of the chapter would get highlighted like the current verse. The problem rests in how you divide up the text. We can't even get module makers to agree on this, let alone frontend rendering (I'm speaking of BCV and kin, not versification schemes). The one thing you have to be aware of as well is the type of module. Technically there are only 4: bible, commentary, lexicon/dictionary, and book. Any other distinction is a classification within those 4 types. So devotionals, for instance, are a special case of lexicon/dictionary modules keyed to dates. And devotional modules should be created according to lexicon guides, rather than book guides.
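
To show what "a lexicon keyed to dates" amounts to, here is a toy Python sketch. It is not how the engine stores anything: the dict, the function names, and the entries are hypothetical, and the "MM.DD" key format is my assumption for the example. The point is only that a devotional lookup is a dictionary lookup where the key happens to be derived from a date.

```python
from datetime import date

# Hypothetical entries; the "MM.DD" key format is an assumption here.
DEVOTIONAL = {
    "01.01": "Entry for New Year's Day.",
    "09.23": "Entry for 23 September.",
}

def key_for(day):
    """Build the lexicon-style key for a given calendar day."""
    return day.strftime("%m.%d")

def entry_for(day, module=DEVOTIONAL):
    """Look the day up exactly as one would look up a dictionary headword."""
    return module.get(key_for(day))

print(key_for(date(2014, 9, 23)))
print(entry_for(date(2014, 9, 23)))
```

Seen this way, there is no tree of sections to navigate as in a genbook, which is why the lexicon guides are the ones that apply.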