[osis-core] OSIS editor
Patrick Durusau
osis-core@bibletechnologieswg.org
Tue, 03 Dec 2002 06:22:50 -0500
Harry,
Harry Plantinga wrote:
>Some talk was batted about at the Toronto meeting
>on modifying OpenOffice.org Writer to act as an
>OSIS editor. One issue is how you would "validate"
>markup as it is entered with the Writer interface.
>
>I thought y'all might be interested in the following
>Microsoft tidbit as it pertains to this issue:
>
>http://www.microsoft.com/office/xdocs/default.asp
>
>Basically, it sounds to me like an XML editor that
>supports schemas and allows you to build "rich
>forms" for data entry. It will be a part of Microsoft
>Office starting mid-2003.
>
Not much information is available, but I suspect from the description
that it relies upon fairly simple schemas to build the forms. In other
words, once inside a paragraph (our <p>), I doubt rather seriously that
you will have the choice of a <list>, <lg>, or other possible container
element. It might, but then the form would have to be able to reconfigure
itself based upon user choices. That is not difficult to imagine or
implement, but I would be surprised if it has that level of customization.
Most XML schemas for data entry are fairly straightforward, since data
files tend to be rather simple. (At least when compared to documents.)
>
>It remains to be seen whether it will be as easy
>to use as a word processor for editing OSIS documents,
>but I'm not all that hopeful -- it'll probably be
>a lot like XMetaL or XML Spy. Though a debugged,
>faster XML Spy could actually be a fairly decent
>environment for OSIS editing by casual users.
>
>But my suspicion is that to make OSIS editing truly
>easy and error-proof, it will take a custom
>application with user-interface elements designed
>especially for OSIS.
>
Actually I am planning on testing your proposition after I return from
XML 2002 (too much to do between now and then) and would appreciate your
comments (and everyone else's) on the following proposal:
I am simplifying the initial problem by assuming that all the header
information would be developed by an XML-aware person and software, so I
am beginning with elements inside <osisText>.
1. I will create styles in OpenOffice that map to the various OSIS
elements. (I will post some material to their list to see if the styles
available within another style can be restricted dynamically, but
assuming not for the rest of these steps.)
2. I will take one of your texts and one from Sword and mark each up twice
using the styles created in #1, once with what I think is "valid" OSIS
markup and once with deliberately "invalid" (Todd, no comments please!)
OSIS markup. ;-)
3. To process the markup from the saved file format, I will write XSLT
stylesheets to key on the styles that were imposed in OpenOffice.
4. Part of the stylesheet will be a function that constructs attribute
values, such as osisIDs, from styled information in the text. For
example,
<style="verse-cite">Gen.1.1</style><style="verse">When in the
beginning...</style>
would result in: <verse osisID="Gen.1.1">When in the beginning...</verse>
Whereas,
Gen.1.1<style="verse">When in the beginning...</style>
would not transform, because the required content for the osisID is not
present. An error message would be returned, saying that the osisID
was not found, along with the portion of text that needs the osisID (to
help locate it in the text).
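As a rough illustration of step 4, here is a minimal sketch in Python rather than the XSLT proposed above. It assumes the saved file has already been reduced to a list of (style, text) runs; that input shape and the name runs_to_verses are hypothetical, chosen only for this example.

```python
from xml.sax.saxutils import escape, quoteattr

def runs_to_verses(runs):
    """Turn (style, text) runs into <verse> elements.

    Assumption: a "verse" run must be immediately preceded by a
    "verse-cite" run, whose text becomes the osisID.
    Returns a pair (elements, errors).
    """
    elements, errors = [], []
    pending_id = None  # osisID supplied by the most recent verse-cite run
    for style, text in runs:
        if style == "verse-cite":
            pending_id = text.strip()
        elif style == "verse":
            if pending_id:
                elements.append(
                    "<verse osisID=%s>%s</verse>"
                    % (quoteattr(pending_id), escape(text))
                )
            else:
                # Include the start of the text to help locate the problem.
                errors.append(
                    "osisID not found for verse text: %r" % text[:40]
                )
            pending_id = None
        else:
            # Unstyled or other text cannot supply an osisID.
            pending_id = None
    return elements, errors
```

With the two examples above, the styled citation yields a <verse> element carrying osisID="Gen.1.1", while the unstyled "Gen.1.1" yields no element and one error message quoting the verse text.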
Admittedly this is not dynamic validation while the user is working but
it would allow a fairly gross level of markup to be imposed by very
inexperienced users.
This would not help with word-level annotation, but most of that should
be done automatically in any event.
In the latter part of December I am actually going to be testing this
technique for markup on a portion of the Chicago Assyrian Dictionary,
which in some ways is more complex than OSIS documents to date but also
more fixed in terms of structure.
Even if we can't limit the use of styles within styles, I suspect even
before transformation into OSIS markup, we could actually write a script
that enforces an order in which styles must be used. Hmmm, will have to
think about that one. In other words, use container styles to denote
what other styles may be found therein. Again, not dynamic error
checking but easier for inexperienced users and good for gross level
markup (the bulk of markup in a document).
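The container-style idea might be sketched as follows, again in Python and only as an illustration. The ALLOWED table is hypothetical and is not taken from the OSIS schema; the (style, children) tuple shape for the styled document is likewise an assumption.

```python
# Hypothetical containment table: which styles may appear directly
# inside which container style. Illustrative only, not the OSIS schema.
ALLOWED = {
    "div": {"p", "list", "lg"},
    "p": {"verse", "verse-cite", "list", "lg"},
    "list": {"item"},
    "lg": {"l"},
}

def check_nesting(node, path=""):
    """Return error messages for styles used where they are not allowed.

    A node is a (style, children) tuple; children is a list of nodes.
    """
    style, children = node
    here = path + "/" + style
    errors = []
    allowed = ALLOWED.get(style, set())
    for child in children:
        child_style = child[0]
        if child_style not in allowed:
            errors.append(
                "%s: style %r not allowed inside %r"
                % (here, child_style, style)
            )
        # Keep checking below the child so all problems are reported.
        errors.extend(check_nesting(child, here))
    return errors
```

Run over the whole document after the user saves, a check like this reports every misplaced style with the path of its container, so it is batch checking rather than dynamic validation.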
Might buy us some acceptance to have a free, WYSIWYG encoding for OSIS
documents while waiting on a true OSIS editor. Any thoughts on whether
the limitation of styles by context would be a good way to provide a
user interface for such an editor? Could have an active box with
attributes that can take a value for that "style" (read element).
The latest round of papers and submissions and reviews should be over by
December 18th (major presentation at the SBL) so look for postings on my
experiments between then and the end of the year.
Would appreciate any comments or suggestions on this proposal before
then as well. (References to other works would probably be something one
has to add in on a per document basis, i.e., match this string, then
output this osisRef, assuming the document had a fairly consistent style
of citing other works. Or perhaps just so marked in the transformation
for further processing in the next stage, i.e., the values are reported
with XPath locations so you can create another stylesheet to go through
and do all of those separately.)
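For the per-document reference matching, a minimal sketch in Python could look like the following. The citation pattern ("Gen 1:1", "Exod. 20:2") and the three-book list are assumptions made for the example, and character offsets stand in for the XPath locations mentioned above.

```python
import re

# Hypothetical citation style for one document: "Gen 1:1", "Exod. 20:2".
# A real document would need its own pattern and a full book list.
CITATION = re.compile(r"\b(Gen|Exod|Lev)\.? (\d+):(\d+)\b")

def find_references(text):
    """Return (offset, matched string, osisRef) for each citation found."""
    found = []
    for m in CITATION.finditer(text):
        book, chapter, verse = m.groups()
        osis_ref = "%s.%s.%s" % (book, chapter, verse)
        found.append((m.start(), m.group(0), osis_ref))
    return found
```

The returned list can either be used to write the osisRef values directly or be kept as a report for a later pass over the same locations.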
Thoughts, comments?
Patrick
--
Patrick Durusau
Director of Research and Development
Society of Biblical Literature
pdurusau@emory.edu