<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 TRANSITIONAL//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=UTF-8">
<META NAME="GENERATOR" CONTENT="GtkHTML/3.26.0">
</HEAD>
<BODY>
Much more I could say on this, and no doubt others will jump in; but let me answer one key question that affects all the others:<BR>
<BR>
OSIS *does* use a pre-existing XML vocabulary: OSIS is almost entirely a pure subset of TEI. The extensions are tiny, and very specific to Biblical materials (for example, a very specific encoding for Biblical references).<BR>
<BR>
TEI has many millions of dollars, over 20 years, and many thousands of expert hours of labor invested in it. It is almost universally used for serious encoding of literary, linguistic, and historical texts. You can easily verify this via Google or at your local university. If someone wants a grant to encode some important work, say from the National Endowment for the Humanities, the Mellon Foundation, or other large-scale funders, using anything *but* TEI is so unusual that they need to make a specific case for it in their proposals. Certainly that can be done in a very specialized case; but TEI has proven so valuable and so effective that it had better be a very specialized case before one gives up the huge advantages of TEI. There are countless projects using TEI throughout, and thus lots of tools and expertise available.<BR>
<BR>
Also, a lot of the TEI data has important connections to the data OSIS people care about -- the collected works of important theologians, historians, and philosophers; the Greek and Latin classics; English and other literature that explores Biblical themes (Dostoevsky and Milton, to name two of the most obvious examples). Few if any serious projects relating to any of this use HTML or XHTML for their data. Of course almost everybody delivers HTML to browsers; but it's trivial to convert TEI to HTML or XHTML, and extremely non-trivial to go the other way.<BR>
<BR>
XHTML5 is a fine thing, obviously far better than HTML itself. But it gives you no rules about the things that OSIS specifies. It gives you almost no semantics for the things it defines (other than layout). And it lacks tons of specific things: poetic markup, epistolary units of all kinds, Biblical and other formal reference schemes. TEI and OSIS provide all of this.<BR>
<BR>
If you go with "XHTML5", you will inevitably find yourself re-inventing OSIS-like conventions: What names and abbreviations will you use for books, translations, and the like? How will you punctuate references? What syntax will you use for range references? How will you represent the various kinds of notes, and where will you place them? What will you do when verses and paragraphs overlap? How will you distinguish the canonical text from notes, headings, and so on?<BR>
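<BR>
To make the reference questions concrete, here is a minimal Python sketch. It assumes dotted identifiers of the form Book.Chapter.Verse, with a hyphen joining the two ends of a range, which is the general shape OSIS references take. The names VerseId, parse_id, and parse_ref are hypothetical and belong to no OSIS tool, and the sketch deliberately ignores work prefixes, grains, and chapter-only references.<BR>
<PRE>
```python
import re
from typing import NamedTuple


class VerseId(NamedTuple):
    """One verse address: book abbreviation, chapter, verse."""
    book: str
    chapter: int
    verse: int


# Assumed shape: Book.Chapter.Verse, e.g. "John.3.16" or "1Cor.13.4".
_ID_PATTERN = re.compile(r"^([A-Za-z0-9]+)\.(\d+)\.(\d+)$")


def parse_id(text):
    """Parse a single Book.Chapter.Verse identifier into a VerseId."""
    match = _ID_PATTERN.match(text.strip())
    if match is None:
        raise ValueError("not a Book.Chapter.Verse identifier: " + repr(text))
    return VerseId(match.group(1), int(match.group(2)), int(match.group(3)))


def parse_ref(text):
    """Parse a reference into (start, end); a single verse is a range of one."""
    start_text, hyphen, end_text = text.partition("-")
    start = parse_id(start_text)
    end = parse_id(end_text) if hyphen else start
    return start, end
```
</PRE>
Under those assumptions, parse_ref("Gen.1.1-Gen.1.5") yields the pair Gen 1:1 and Gen 1:5, while a string such as "John 3:16" is rejected. Every project that invents its own punctuation has to write and maintain a parser like this one, which is exactly the duplication a shared convention avoids.<BR>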
<BR>
Countless such questions arise, and if you go with XHTML5 (or XHTML349.2, for that matter), you will have to make up your own answer to each. At that point, it shouldn't surprise you that every other project comes up with a slightly different set of answers. And that means that every time you pass data from project A to B, the developers of either A or B (or both) have to write converters. Sounds like a waste of time (= poor stewardship) to me. At the very start of a project many of these questions may seem trivial or irrelevant; but as your project grows they'll all arise, and you'll either make a decision or decide not to decide -- which is itself a decision against consistency, portability, and verifiability.<BR>
<BR>
It seems to me inaccurate to say that there is some massive range of tools for XHTML but not for XML. There are lots of HTML tools, but if you look at their output you'll find that almost all of them produce HTML so messy (often invalid, seldom XHTML, and sometimes not even well-formed) that you'll either end up with data that can't be used in much of anything *except* browsers, or you'll end up writing all that conversion/cleanup code again. If I were a wagering man, I'd wager a lot of money that you've already had to do some of that. If you've got the development skills to modify open-source XHTML tools (which were you thinking of?) to support your own extensions, then you could modify them to do OSIS with little more work (and if you use XML tools, you get most of that support for free with XML Schema, Schematron, etc.).<BR>
<BR>
Is there any XHTML5 tool out there that can't deal with arbitrary XML? Not many, because it would be a silly move on the developers' part to make one: the incremental work is trivial. If you already support styling tag X a certain way when X is a member of the fixed list of XHTML tags, you already know how to support styling tag X when X is *not* a member of that fixed list. There is also a vast range of general XML tools out there, and in general they provide far more functionality than HTML or XHTML tools (simply because they have to, to avoid being laughed out of the XML marketplace).<BR>
<BR>
Steve DeRose<BR>
<BR>
<BR>
<BR>
<BR>
On Sat, 2009-12-19 at 00:22 -0800, Weston Ruter wrote:<BR>
<BLOCKQUOTE TYPE=CITE>
Thank you so much, Stephen. Your historical information is extremely helpful.<BR>
<BR>
Is anyone able to address the current state of OSIS and future plans for the standard? Namely, how is it currently addressing Stephen's points:
<OL TYPE=1>
<LI TYPE=1 VALUE=1>OSIS not being designed for delivery of partial documents,
<LI TYPE=1 VALUE=2>Its large metadata overhead,
<LI TYPE=1 VALUE=3>Ability to include “virtual” elements, as is required for partial documents.
</OL>
Furthermore:<BR>
<BR>
<BLOCKQUOTE>
For the ESV Study Bible in 2008, we again considered using OSIS as the primary XML format for the notes and quickly decided to go with XHTML5 instead. There are so many more tools for dealing with HTML designed to solve real-world problems; it was more efficient to use HTML even though it didn't map perfectly to our domain.<BR>
</BLOCKQUOTE>
<BR>
This identifies a concern I have about OSIS and how it relates to other XML vocabularies, namely XHTML5. OSIS defines many elements (a, abbr, figure, header, table, date, div, hi, list, p, q, etc.) which are already assigned rich semantics and presentational logic in the XHTML namespace: why not reuse existing XML vocabularies instead of independently (re)defining them? If OSIS depended on XHTML:
<OL TYPE=1>
<LI TYPE=1 VALUE=1>It would make OSIS able to be directly embedded into (X)HTML web pages and be properly understood by the browser: Bible websites could extend their existing HTML websites with OSIS markup to make them more semantically rich, readable both to machines and web browsers.
<LI TYPE=1 VALUE=2>Existing WYSIWYG HTML editors could be more easily extended to support the additional OSIS-specific markup.
<LI TYPE=1 VALUE=3>Having OSIS rely on XHTML would also greatly reduce the size of the OSIS specification, and new authors would require much less time to get up to speed because the spec would only define the elements unique to scriptural markup.
</OL>
So I wonder if an OSIS 3.0 could then explicitly reference the relevant elements from other XML vocabularies, especially XHTML5? Thoughts?<BR>
<BR>
Is there anyone currently active at the Bible Technologies Group?<BR>
<BR>
Blessings,<BR>
Weston<BR>
<BR>
<BR>
</BLOCKQUOTE>
<BLOCKQUOTE TYPE=CITE>
2009/12/16 Stephen Smith &lt;<A HREF="mailto:stephen.smith@gmail.com">stephen.smith@gmail.com</A>&gt;<BR>
<BLOCKQUOTE>
There are several reasons why Crossway's XML differs from OSIS:<BR>
<BR>
1. As David Eyk notes, we created the existing XML documents in May-<BR>
June 2002, when OSIS was still in flux. In particular, the milestoning<BR>
process was much more complicated.<BR>
2. We were working from initial XML files provided by a vendor and<BR>
didn't want to change them too much.<BR>
3. OSIS is paragraph-based, rather than verse-based, making it<BR>
difficult to meet our immediate need--loading the data into a<BR>
relational database.<BR>
4. At the time, OSIS had some mandatory structural elements that we<BR>
weren't able to create.<BR>
5. I was hoping that someone else would take the XML from the web<BR>
service and write an XSLT to transform it into OSIS so we didn't have<BR>
to.<BR>
6. OSIS wasn't designed for delivery of partial documents: it wasn't<BR>
immediately clear to me how to structure the metadata in a response<BR>
when someone is only looking at, say, John 3:16. Further, the metadata<BR>
overhead in such a request, as compared to the desired content, was<BR>
prohibitive. Partial documents also require the use of "virtual"<BR>
elements--you need to add beginning and ending paragraph tags if<BR>
you're looking at a verse that appears in the middle of a paragraph,<BR>
for example, and open/close quotes properly. I don't believe that OSIS<BR>
has a handy facility for including these kinds of elements.<BR>
<BR>
As for mapping the Crossway XML onto OSIS, it should be<BR>
straightforward. Everything we did with the ESV we did with the goal<BR>
of producing a world-class OSIS ESV by 2012; I tried to do one big<BR>
project per year to create metadata required by OSIS. Between 2002 and<BR>
2007, we created metadata and evolved the schema to map cleanly to<BR>
OSIS--upgrading the quotation system, classifying footnotes, adding<BR>
catchwords, categorizing names, identifying speakers of quotes. All<BR>
this metadata uses OSIS vocabulary where possible. (Most of this<BR>
metadata isn't available through the API.) Even after this work, it<BR>
will still take many more hours to produce a document that fully<BR>
conforms to OSIS at the "Scholarly" level defined in the spec.<BR>
<BR>
The goal has always been to move away from the Crossway XML to a<BR>
compliant OSIS document. I just never felt we could produce documents<BR>
that conformed to the Scholarly OSIS Document / Trusted Quality<BR>
requirements. I saw no point in releasing anything at a lower<BR>
conformance level unless, as I mentioned, someone wanted to create an<BR>
interim XSLT. Further, as nearly all consumption of the ESV API was<BR>
through the HTML format, there wasn't a lot of demand for the XML.<BR>
<BR>
For the ESV Study Bible in 2008, we again considered using OSIS as the<BR>
primary XML format for the notes and quickly decided to go with XHTML5<BR>
instead. There are so many more tools for dealing with HTML designed<BR>
to solve real-world problems; it was more efficient to use HTML even<BR>
though it didn't map perfectly to our domain.<BR>
<BR>
I hope that answers your historical questions.<BR>
<BR>
Stephen
</BLOCKQUOTE>
</BLOCKQUOTE>
<BLOCKQUOTE TYPE=CITE>
<BLOCKQUOTE>
<BR>
On Dec 16, 4:02 am, Weston Ruter &lt;<A HREF="mailto:westonru...@gmail.com">westonru...@gmail.com</A>&gt; wrote:<BR>
> Greetings Crossway, CrossWire, the Bible Technologies Group, SBL, and<BR>
> esteemed members of the Bible+Tech community:<BR>
><BR>
> I am researching data formats used to represent scripture—including XML<BR>
> vocabularies, DB schemas, and *ad hoc* text file formats—with the hope of<BR>
> contributing towards the development of a standard API that is able to<BR>
> commonly represent all of the constructs used by each. With such a standard<BR>
> API, the hope is that (web) developers would be able to access scriptural<BR>
> data from the array of Bible societies (e.g. Bible.org) using one<BR>
> standardized web service interface (i.e. that mashups of multiple<BR>
> translations from different sources would become easy to implement, for<BR>
> example: &lt;<A HREF="http://pixelfaith.com/bible/#Luke/2">http://pixelfaith.com/bible/#Luke/2</A>&gt;).<BR>
><BR>
> I have been studying the Crossway XML format and I am curious as to why<BR>
> Crossway didn't use OSIS. Were there any limitations in OSIS that caused you<BR>
> to develop your own XML vocabulary? Furthermore, why does development of OSIS<BR>
> seem to have ceased, with the last revision being over three years ago (6<BR>
> March 2006)? Moving forward, has any discussion happened regarding merging<BR>
> Crossway XML into an OSIS 3.0?<BR>
><BR>
> More to the crux of my inquiry, has Crossway considered any collaboration to<BR>
> standardize an API such as you provide to access the ESV? Or is anyone aware<BR>
> of any such effort currently being worked on? I am aware through Troy<BR>
> Griffitts of the web service API the CrossWire Bible Society has developed<BR>
> in coordination with the development of OSIS, and I am in no way wanting to<BR>
> supplant their excellent work. But I am interested in looking at what a<BR>
> Web-centric API would look like built from the ground up using the latest<BR>
> Internet standards with an eye for Ajax applications, web mashups, and (most<BR>
> importantly) semantically Linked Data. (I would hope any efforts in this<BR>
> area simply flow back into CrossWire's efforts for the next version of their<BR>
> API, which could perhaps then be more widely adopted.)<BR>
><BR>
> What OSIS seeks to do for markup, I would like to see done with an API to<BR>
> give developers a standard way of accessing the data in the texts on the<BR>
> Web. In other words and in short, I am interested in the development of a<BR>
> standardized web service API and Document Object Model (DOM) for OSIS.<BR>
><BR>
> I am presenting this topic at the BibleTech:2010 Conference.<BR>
><BR>
> Obviously, any such standardization effort would have to be a joint effort<BR>
> by all of us. Looking forward to hearing from you!<BR>
><BR>
> Blessings and Merry Christmas!<BR>
> Weston Ruter<BR>
> OpenScriptures.org<BR>
<BR>
<BR>
</BLOCKQUOTE>
</BLOCKQUOTE>
<BLOCKQUOTE TYPE=CITE>
<BLOCKQUOTE>
<FONT COLOR="#888888">--</FONT><BR>
<BR>
<FONT COLOR="#888888">You received this message because you are subscribed to the Google Groups "Open Scriptures" group.</FONT><BR>
<BR>
<BR>
</BLOCKQUOTE>
</BLOCKQUOTE>
</BODY>
</HTML>