[sword-devel] XML idea: modular spec

David Burry sword-devel@crosswire.org
Fri, 31 Aug 2001 00:43:15 -0700


Yes, that is a very good URL!  I got excited the first time I read 
that!  With the ideal modular system I'm envisioning, it should be possible 
(though extremely lengthy if the source data is very rich) to record the 
complete data in one XML file in a few different ways for exchange with 
other software, depending on how much data is in "the complete data," and 
also acceptable to provide just small glimpses or slices of that data for 
normal everyday and internal use.  In many cases it would be more efficient, 
space-wise or speed-wise or both, for Bible software to develop its own 
binary format for storage, and still provide input/output from/to the full 
XML spec as well as smaller slices.

I'd be happy to join some working group if I can benefit the community, but 
my time is limited...  I've been working for TAGnet, a Christian 
organization dedicated to helping Christian ministries get online to 
develop their database and their XML interchange formats, heavy into XML 
and XSLT and PHP, etc., actually hiring a software company in India to write 
custom software for them according to the detailed specs we've 
designed.  Plus I have this full-time job "on the side" ;o) at Adobe 
Systems, part of which has been to help develop a consistent XML spec for 
all our departments to represent online (web) documents using templates to 
dynamically (with early binding) convert the data to HTML in a variety of 
consistent ways.  If I were to join something, I would want to do so 
representing TAGnet since that is the Christian organization I'm aligned 
with, though technically in a volunteer capacity.  TAGnet hosts the "web 
bible" program that I wrote about 3 years ago at 
http://www.tagnet.org/bible/ before I knew about sword.

Dave

At 07:28 PM 8/30/2001 -0700, Troy A. Griffitts wrote:
>Thank you again!  You and Patrick
>(http://www.sbl-site2.org/Extreme2001/Concur.html) seem to be on the
>same page.  This facet of the implementation seems to benefit in many
>respects from your approach.
>
>The other facet, the WHAT, for which your approach helpfully facilitates
>dynamic change, is probably what the Bible domain experts are
>interested in and can contribute to.
>
>As techies, the HOW is where we find ourselves at home, and you have
>some excellent suggestions.  You should consider joining one or more
>working groups!
>
>For OSIS to achieve its goals and for us to realize its benefit (to
>provide interchangeable solutions to organizations that meet the
>challenges of producing Bible-related texts), at least the base 'what'
>must also be defined, or all we've done is generally extend XML
>methodologies to all domains (which isn't a bad thing either).
>
>         -Troy.
>
>
>
>David Burry wrote:
> >
> > okey dokey...
> >
> > I kind of already mentioned this idea before, but here're some more 
> detailed thoughts and explanations on it:
> >
> > XML is ideally suited to represent hierarchical slices of very complex 
> data for exchange with other programs or display engines etc, but not 
> necessarily to efficiently store that extra complex data in its complete 
> form.  To illustrate what I mean by "slices":  suppose you have a 3D cube 
> with 3 dimensional data in it, and you want to be able to access any of 
> that data.  The final rendering of the data will be 2D (say for a flat 
> picture), so you can take many different views of that data, all 
> different, but none of them will completely (and efficiently!!) store 
> that data, without duplicating your whole 2D spatial grid a bazillion 
> times for that 3rd dimension!  And even then diagonal/rotated slices 
> could only be approximated from that source data or else the whole thing 
> done again another bazillion times for each possible rotation!  (Rotation 
> of the slice being almost like a 4th dimension.)
> >
> > Ok, that's the visual/mathematical idea, now how it works with us:  If 
> we **need** book/chapter/verse granularity for a particular end user 
> application, it's best to give that app an XML representation that does 
> exactly that, no more and no less.  No need for it to bother with verb 
> tenses and historical contexts and fancy quoting dealies, etc, unless the 
> app actually needs it and can handle it.  Likewise, if the 
> application reads the text aloud using computer-synthesized 
> voices, you may want book/chapter & quote granularity, 
> for instance, because that's what you need in that case.  Using the 
> information about who's saying what, it could speak each part in a different 
> voice/accent, and give the ability to jump to any chapter but not any 
> verse.  Another application that does complicated word analysis (e.g. 
> verbs/nouns/predicates/tenses/etc) would want much more detailed info 
> about each individual word, and it may require book/paragraph/sentence
> > granularity above the word level **not** chapter/verse, even though it 
> may still want chapter/verse markers (i.e. this time not part of the 
> nested tree structure) so it can give visual indicators to the end user 
> where they are.
> >
> > Above I've listed 3 different example slice types, and they do not 
> necessarily mix well together in one single cohesive rigid XML format, 
> but each **does** work quite well as its own independent XML 
> representation of the same general data underneath.
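
A minimal sketch of the "slices" idea above, in Python with the standard-library ElementTree module. Everything here is an assumption made for illustration: the flat record layout, the field names, the element names, and the placeholder text are hypothetical and are not part of SWORD or OSIS. The same records are rendered once with book/chapter/verse nesting and once with book/chapter/quote nesting.

```python
import xml.etree.ElementTree as ET

# Hypothetical flat store: one record per text fragment (placeholder text only).
RECORDS = [
    {"book": "Example", "chapter": 1, "verse": 1, "speaker": "narrator",  "text": "placeholder one"},
    {"book": "Example", "chapter": 1, "verse": 1, "speaker": "speaker-a", "text": "placeholder two"},
    {"book": "Example", "chapter": 1, "verse": 2, "speaker": "speaker-a", "text": "placeholder three"},
]

def get_or_add(parent, tag, **attrs):
    """Reuse the last child if it has the same tag and attributes, else append a new one."""
    if len(parent) and parent[-1].tag == tag and parent[-1].attrib == attrs:
        return parent[-1]
    return ET.SubElement(parent, tag, attrs)

def append_text(element, text):
    """Append a fragment to an element's text, separated by a single space."""
    element.text = ((element.text or "") + " " + text).strip()

def verse_slice(records):
    """Slice 1: book > chapter > verse, ignoring who is speaking."""
    root = ET.Element("bible")
    for r in records:
        book = get_or_add(root, "book", name=r["book"])
        chapter = get_or_add(book, "chapter", n=str(r["chapter"]))
        append_text(get_or_add(chapter, "verse", n=str(r["verse"])), r["text"])
    return root

def quote_slice(records):
    """Slice 2: book > chapter > quote, ignoring verse boundaries."""
    root = ET.Element("bible")
    for r in records:
        book = get_or_add(root, "book", name=r["book"])
        chapter = get_or_add(book, "chapter", n=str(r["chapter"]))
        append_text(get_or_add(chapter, "quote", speaker=r["speaker"]), r["text"])
    return root

if __name__ == "__main__":
    print(ET.tostring(verse_slice(RECORDS), encoding="unicode"))
    print(ET.tostring(quote_slice(RECORDS), encoding="unicode"))
```

In this sample the second quote runs across the boundary between verse 1 and verse 2, so the two trees cannot simply be nested inside each other. That is the reason each slice is produced as its own document.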
> >
> > So the question is.... are you trying to define one single rigid 
> everything-for-everyone-forever XML format or a more modular extensible 
> approach that can represent it in any way needed at the moment?  You can 
> probably see I'd very much prefer the latter  ;o)  Let the engine/library 
> decide what it wants to use for ultimate storage underneath, it may not 
> even be XML (and yet it may be XML if it wants, just that it may instead 
> be some custom binary/etc format for speed and/or space optimization on 
> disk/memory/etc).
> >
> > But if the library can transform this stored data into whatever XML 
> format is required on the fly, then that would be really cool, as the end 
> user app would only need light already-built robust tools to mine the 
> data out that it needs since the "picture" of the data it's getting is 
> already suited to its needs.  For instance, a simple XSLT could be 
> employed to create any HTML rendition you want.
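
As a rough illustration of how little the end-user app has to do once the slice already fits its needs, here is a sketch of the HTML step. It uses a short Python function standing in for the XSLT stylesheet mentioned above (the Python standard library has no XSLT processor). The input element names and the HTML layout are assumptions chosen for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical verse-granularity slice as the library might hand it to the app.
SLICE = (
    '<book name="Example"><chapter n="1">'
    '<verse n="1">placeholder text one</verse>'
    '<verse n="2">placeholder text two</verse>'
    '</chapter></book>'
)

def to_html(xml_text):
    """Render one <book> slice as a <div> with a heading and paragraph per chapter."""
    book = ET.fromstring(xml_text)
    div = ET.Element("div")
    for chapter in book.findall("chapter"):
        heading = ET.SubElement(div, "h2")
        heading.text = f"{book.get('name')} {chapter.get('n')}"
        paragraph = ET.SubElement(div, "p")
        for verse in chapter.findall("verse"):
            # Verse number as a superscript, verse text following it.
            number = ET.SubElement(paragraph, "sup")
            number.text = verse.get("n")
            number.tail = " " + (verse.text or "") + " "
    return ET.tostring(div, encoding="unicode", method="html")

if __name__ == "__main__":
    print(to_html(SLICE))
```

Because the slice carries only book, chapter, and verse, the renderer needs no logic for skipping markup it does not understand.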
> >
> > The only drawback might be that the app needs a way to declare to the library 
> what format it needs; perhaps pass it a DTD or something?  XPointer and 
> XPath don't seem suitable for that...  A DTD seems like overkill to 
> me, but perhaps not if the standard is kept simple...  but the app also 
> should be passing the library reference ranges and search queries, so...  hmm
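
One possible shape for that declaration, sketched in Python under stated assumptions: instead of a DTD, the app passes a small request object naming the nesting levels it wants plus a reference range. The `SliceRequest` class, the `fetch` function, the record fields, and the element names are all hypothetical and do not correspond to any existing SWORD API. Search queries are left out.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET

@dataclass
class SliceRequest:
    levels: tuple   # nesting order, outermost first, e.g. ("chapter", "verse")
    first: tuple    # inclusive start of the range as (chapter, verse)
    last: tuple     # inclusive end of the range as (chapter, verse)

# Hypothetical store with placeholder text; each record knows every level it belongs to.
RECORDS = [
    {"chapter": 1, "verse": 1, "paragraph": 1, "text": "placeholder a"},
    {"chapter": 1, "verse": 2, "paragraph": 1, "text": "placeholder b"},
    {"chapter": 1, "verse": 3, "paragraph": 2, "text": "placeholder c"},
    {"chapter": 2, "verse": 1, "paragraph": 2, "text": "placeholder d"},
]

def fetch(request, records=RECORDS):
    """Return the records in range, nested according to request.levels."""
    root = ET.Element("slice")
    for r in records:
        if not (request.first <= (r["chapter"], r["verse"]) <= request.last):
            continue
        node = root
        for level in request.levels:
            value = str(r[level])
            # Reuse the current container while the level's value is unchanged.
            if len(node) and node[-1].tag == level and node[-1].get("n") == value:
                node = node[-1]
            else:
                node = ET.SubElement(node, level, n=value)
        ET.SubElement(node, "seg").text = r["text"]
    return root

if __name__ == "__main__":
    by_verse = SliceRequest(levels=("chapter", "verse"), first=(1, 2), last=(2, 1))
    by_paragraph = SliceRequest(levels=("paragraph",), first=(1, 1), last=(2, 1))
    print(ET.tostring(fetch(by_verse), encoding="unicode"))
    print(ET.tostring(fetch(by_paragraph), encoding="unicode"))
```

A list of level names is far less expressive than a DTD, but it may be enough if the set of allowed levels is kept small, and it travels in the same call as the reference range.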
> >
> > Anyway, that's where I'm at with this idea right now, you can see in 
> the last paragraph that the idea isn't complete yet, but I think it's 
> enough of a start that it deserves consideration.
> >
> > If this idea doesn't make it into SWORD, I will likely put it into my 
> own existing side project eventually anyway, sample URL was sent earlier. 
> Probably with my own evolving XML spec if others don't agree with me--not 
> that doing my own is bad, it's a natural part of the evolution of these 
> things for someone to jump in and try it first, and then if it works well 
> for others to come on board and eventually for a better designed spec to 
> be agreed upon that is designed by the community at large.  That's 
> especially true for radical things that go against tradition! ;o)
> >
> > Dave
> >
> > At 05:07 PM 8/30/2001 -0700, Troy A. Griffitts wrote:
> > >        But, since you seem to be so xml proper :) ...  WE NEED YOUR
> > >(everyone's) FEEDBACK as to what tags should go into an XML markup
> > >standard.