[sword-devel] WEB update request; OSIS

Chris Little chrislit at crosswire.org
Tue Aug 10 23:18:27 MST 2004


Kahunapule Michael P. Johnson wrote:

> Hello, Adrian, Chris, and all:
> 
> I probably sounded a bit harsh in my opinion of OSIS, right now. Please forgive me if I have offended anyone who has poured effort into trying to make OSIS work. I'm mostly frustrated because I really want OSIS (or at least a good open XML Bible format) to become widely adopted and used. However, there are some serious issues that will most likely kill OSIS before it gets started if they aren't corrected. Therefore, even though OSIS is only likely to be a real, commonly used standard in the future, it may suffer the fate of XSEM and never see that future.
>
> 1. OSIS does not properly preserve quotation punctuation in all cases, as currently documented. Furthermore, the keepers of the standard don't seem to think that quotation punctuation is important to preserve, but they seem to believe that such punctuation should always be generated from markup according to modern English rules of grammar, independent of the way the translators punctuated their work. Therefore, it is impossible to code a Bible translation in OSIS that differs in the way quotations are punctuated and expect that OSIS readers and renderers will render the quotation punctuation correctly. Just ignoring the <q> tag doesn't work if you want to mark text for possible use in a "red letter" edition.

We've covered this at length, including the downsides of your 
suggestions.  The best solution will probably lie in identifying, 
presumably in the header or on each <q> element, how the quotation 
marks are to be rendered.  Encoding quotation marks directly in the text 
is just not a good solution.  If you're willing to entertain the 
possibility that others who have opinions differing from your own just 
might have good ideas too, you can re-read the previous threads in 
sword-devel or the discussions in osis-core (I think it is publicly 
archived).
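
As a rough illustration of the "identify the rendered form on each <q> 
element" idea, here is a minimal Python sketch.  It is not taken from the 
OSIS schema or from the earlier threads: the "marker" attribute, the 
sample verse, and the function names are all assumptions made up for this 
example.  It shows a renderer emitting whatever punctuation the 
translators recorded while still being able to pick out a speaker's 
words for a "red letter" edition.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: "marker" is an assumed attribute recording the
# opening punctuation the translators used (empty = no punctuation).
SAMPLE = (
    '<verse>Jesus said, <q who="Jesus" marker="\u201c">Follow me.</q> '
    'Then <q marker="">he got up</q> and left.</verse>'
)

# Closing mark paired with each opening mark (illustrative, not exhaustive).
CLOSERS = {"\u201c": "\u201d", "\u2018": "\u2019", '"': '"', "": ""}


def render(elem, default_marker="\u201c"):
    """Flatten an element's content to text, adding the recorded marks."""
    parts = [elem.text or ""]
    for child in elem:
        inner = render(child, default_marker)
        if child.tag == "q":
            # Use the recorded marker; fall back only if none was given.
            opener = child.get("marker", default_marker)
            inner = opener + inner + CLOSERS.get(opener, opener)
        parts.append(inner)
        parts.append(child.tail or "")
    return "".join(parts)


def red_letter_spans(elem, speaker="Jesus"):
    """Return the text of every <q> attributed to the given speaker."""
    return [render(q) for q in elem.iter("q") if q.get("who") == speaker]


verse = ET.fromstring(SAMPLE)
rendered = render(verse)
```

Under these assumptions the markup carries both pieces of information: 
the punctuation comes from the recorded marker rather than from 
English-specific rules, and the speaker attribution stays available even 
where no punctuation is printed.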

> 2. The OSIS web site is not being kept current with the current OSIS schema and documentation. I dare you to start at the main web site and find the current OSIS 2.0 schema in less than 10 clicks. Indeed, you will probably come away satisfied that version 1.1 is the latest.

There's a "User Manual and Schema" link in both the left side menu and 
in the footer menu on the BTG site.  It has a link to the 2.0 schema. 
Granted, there is also a 2.0.1 schema, but it differs from 2.0 only in 
extremely minor ways.

> 3. OSIS is much more complex than it has to be to generate and use, so it will probably never catch on without good applications to hide this complexity from the user. Greater complexity brings greater opportunity for errors and variations in interpretation. Just using an XML editor to deal with OSIS itself won't cut it-- not with ordinary working linguists. This can be fixed two ways without abandoning OSIS: (1) write more high-quality programs that use OSIS directly, or (2) use an intermediate, simpler format, then convert to and from OSIS.

I consider myself a pretty ordinary linguist, and I don't mind firing up 
an XML editor to edit a bit of OSIS.  (Strictly speaking, I barely do 
any programming and most of it consists of variations on a single theme 
in Perl.  Lots of linguists use Perl, Python, etc. so I wouldn't be 
surprised to see them use scripting to convert their existing data.)

But I think you're right.  We do need applications, like the Word 2003 
plug-in that is already available.  More applications would be great. 
But conversion is also an acceptable technique.  And that's why we have 
SFM to OSIS converters.

OSIS was designed with at least one principle in common with Perl: 
"Common things should be easy; advanced things should at least be possible."

> 4. OSIS has some other minor flaws that I think the keepers of the OSIS standard actually understand and intend to correct, so I'm not too concerned about them. (This includes things like the inability to mark supplied text in a Psalm Hebrew title, which is required to encode KJV, NKJV, etc. This has a work-around that may be good enough for some applications: just encode the Psalm Hebrew title as regular text instead of a title.)
> 
> 5. OSIS expects a lot of metadata not found in many existing Scripture texts to be added to it to comply with higher levels of conformance. This may slow or prevent the conversion of some texts to OSIS.

No one expects every document to conform to the highest levels of 
conformance.

> 6. None of the OSIS texts that I have seen are both high quality and fully conformant to the current OSIS schema. They tend to omit things like poetry line breaks and paragraph marks, or they are missing some markup, or they use some markup in ways not intended by the keepers of the OSIS standard. If OSIS were really good at doing what I expect it to do, I would have expected to see much better quality and quantity of Bibles in OSIS.

Badly encoded documents exist; that's true.  I don't know whether it is 
possible to prevent that.  But high quality, fully conformant documents 
do exist (though perhaps not conformant to the highest levels, which 
have some pretty steep requirements).  Converted documents can only be 
of quality comparable to their sources.  Most publicly available 
documents don't include poetry markup and many lack paragraphing.  The 
types of documents that include poetry, paragraphing, pericope, 
different types of notes, consistently parsable references, etc. exist, 
and some have been converted to OSIS.  The fact that you haven't seen 
them is largely the result of their not being publicly distributed.  I 
believe some small samples of completely marked content are available on 
the BTG site.  These represent excerpts from complete Bibles.

> Much of the support for OSIS is only an illusion, and only extends to a small number of people. I like the idea of having a common, open XML Bible text interchange standard that is widely accepted and works properly. Of the problems above, #1 is very serious, in my opinion. The rest of it can be overcome. However, I'm not going to bet on OSIS succeeding unless some things change.

I think there is more support than you know of.  Again, just because no 
one bothered to tell you about it doesn't mean it doesn't exist.  I 
believe an OSIS document was used earlier this year as the basis for a 
new Bible's first printing.  There are applications for editing, 
converting, rendering, and reading OSIS documents.  And there are around 
a dozen fully marked OSIS Bibles that I know of (and that excludes all 
of CCEL's documents that are converted by stylesheet and CrossWire's 
documents converted by exporters).

The rumors of OSIS' death are greatly exaggerated.

--Chris



