[sword-devel] Food for thought regarding OSIS and some of its alternatives...

Chris Little chrislit at crosswire.org
Tue Feb 7 22:13:27 MST 2006



Kahunapule Michael Johnson wrote:
> Troy A. Griffitts wrote:
>>     I disagree that OSIS has slowed down development here.  It COULD
>> HAVE slowed down development here if we tried to actually work on our
>> osis2mod converter to handle a broader range of legal OSIS markup, but
>> up to now, we pretty much encode our OSIS texts the way we want and
>> that pretty much is defined by what our OSIS importer expects.

> Yes, but you miss the point... you aren't making a fully valid OSIS
> reader. If you did, it almost certainly would have slowed you down. As
> it is now, you really can't be sure that you can read any OSIS text
> anyone else might generate after reading the schema and documentation
> themselves. This is hardly a good situation for a "standard" for
> Scripture text markup. Sounds pretty subjective to me... maybe your file
> will import. Maybe not. Sure, it is OSIS, but not our dialect...

We use as much of OSIS as we need and preserve the markup we do not use. As an 
example, we don't currently do anything with <name> elements. We could 
create some sort of hyperlink to a dictionary of names. If we had any 
texts that marked names with the <name> element, we might very well do 
this. However, at present we do nothing with these elements other than 
preserve them in our databases.
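
To illustrate "preserve but ignore", here is a minimal Python sketch. It is not how Sword does it; the names render, WRAPPERS and STORED are made up for the example, and the sample omits the OSIS namespace for brevity. The renderer acts only on the tags it knows about and passes the text of any other element through, while the stored markup stays untouched.

```python
import xml.etree.ElementTree as ET

# Hypothetical table of elements the front end acts on: tag -> (prefix, suffix).
# Anything not listed (for example <name>) is rendered as its plain text.
WRAPPERS = {"q": ('"', '"')}

def render(stored):
    """Render stored markup to plain text without modifying the stored form."""
    root = ET.fromstring(stored)
    parts = []

    def walk(elem):
        before, after = WRAPPERS.get(elem.tag, ("", ""))
        parts.append(before + (elem.text or ""))
        for child in elem:
            walk(child)
            # Tail text belongs to the parent, so emit it after the child.
            parts.append(child.tail or "")
        parts.append(after)

    walk(root)
    return "".join(parts)

# Illustrative verse only; the <name> element survives in storage.
STORED = ('<verse osisID="Gen.17.5">He said, <q>thy name shall be '
          '<name>Abraham</name></q>.</verse>')

print(render(STORED))
```

If we later wanted names to link to a dictionary, adding a "name" entry to the table would be enough, because the element was never thrown away.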

We have done imports of data I've converted using my own scripts 
(according to my concept of OSIS), that Troy has converted using his own 
scripts (according to his concept of OSIS), that Todd Tillinghast has 
converted using his own scripts (according to his concept of OSIS), and 
probably from other sources. None of them had any flaws, though neither 
Todd nor I had any intention of writing converters specifically for the 
purpose of importing texts into Sword.

Like any NEW technology, it requires that we do work. To that extent, it 
is more work than we would have had to do had we just stuck with GBF, 
ThML, etc. But I would rather move to OSIS alone than stick with those 
formats because OSIS can actually handle markup needs that are 
impossible in either of those formats.

> OSIS milestones are overly complex and prone to error when manually
> coded. That is compounded by the fact that OSIS waffles on what is
> primary (book/chapter/verse or book/section/paragraph or poetry
> stanza/verse) in terms of XML encoding, allowing the choice of milestone
> or container at the whim of the encoder. The decoder must deal with a
> multitude of possibilities, including bizarre situations like
> overlapping verses that don't actually happen in any Bible text I'm
> aware of. (I'm not talking about alternate verse marking systems applied
> to the same text, but one verse marking system overlapping its own
> verses.) It all looks like an afterthought patchwork to me. I can (and
> did) do better than OSIS.

Best practice states that BSP (book/section/paragraph) is primary and 
that CV (chapter/verse) is to be milestoned. As a programmer and an 
encoder, I have no problem supporting both containers and milestones.
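
As a rough sketch of why reading both forms is manageable, the Python below collects verse text whether a verse is written as a container (<verse osisID="...">text</verse>) or as a milestone pair (<verse sID="..."/> ... <verse eID="..."/>). This is not the osis2mod code; verse_texts and SAMPLE are hypothetical names, the sample text is illustrative, and the OSIS namespace is left out to keep it short.

```python
import xml.etree.ElementTree as ET

def verse_texts(xml_string):
    """Map verse ID -> text, accepting container and milestone verses."""
    root = ET.fromstring(xml_string)
    texts = {}
    open_ids = []  # verses currently "open" in document order

    def add(text):
        if text:
            for vid in open_ids:
                texts[vid] = texts.get(vid, "") + text

    def walk(elem):
        is_verse = elem.tag == "verse"
        sid = elem.get("sID") if is_verse else None
        eid = elem.get("eID") if is_verse else None
        container_id = None
        if sid:                      # milestone start
            open_ids.append(sid)
        elif eid:                    # milestone end
            if eid in open_ids:
                open_ids.remove(eid)
        elif is_verse:               # container verse
            container_id = elem.get("osisID")
            if container_id:
                open_ids.append(container_id)
        add(elem.text)
        for child in elem:
            walk(child)
            add(child.tail)
        if container_id:
            open_ids.remove(container_id)

    walk(root)
    # Collapse whitespace left over from indentation between elements.
    return {vid: " ".join(t.split()) for vid, t in texts.items()}

# Gen.1.1 is a container; Gen.1.2 is milestoned and crosses a paragraph break.
SAMPLE = """<div type="book" osisID="Gen">
  <p><verse osisID="Gen.1.1">In the beginning.</verse>
  <verse sID="Gen.1.2" osisID="Gen.1.2"/>And the earth</p>
  <p>was without form.<verse eID="Gen.1.2"/></p>
</div>"""

print(verse_texts(SAMPLE))
```

The same walk handles both encodings because a container is treated as a start and an end event around its own content, which is all a milestone pair is.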

Based on your previous work, not to mention your total arrogance, I 
really doubt you did better than OSIS.

> I'd rather support a simple markup that handles complex situations when
> necessary, but makes all the easy stuff truly easy. Sure, a common
> markup is valuable, but not when it is not the best available markup.
> Persisting in promoting one markup that hasn't really caught on because
> of its problems when a better one is available won't necessarily make it
> more likely that you will get a widely-accepted standard. Indeed, you
> could kill off a better standard only to see your favorite pet die of
> its own birth defects. It is hard to dispassionately look logically at
> your "baby" project and decide to go with something better...

Indeed. In OSIS, easy stuff is easy, hard stuff is possible. OSIS has, 
indeed, caught on to a reasonable extent in multiple organizations. USFX 
hasn't. USFM is impractical for us, as it is not XML. I'm unclear on 
what you really expect in terms of OSIS adoption. It is the most widely 
used open standard for Bible encoding (I wouldn't really count USFM as 
open). It probably follows only (U)SFM and LGM (maybe ThML, but that 
isn't very practical for Bibles) in terms of encoded texts, though it 
has been around for far, far less time.

--Chris


