[sword-devel] OSIS Schema

DM Smith dmsmith at crosswire.org
Sun Oct 14 15:19:44 MST 2012


The OSIS schema is a bit convoluted in how it allows two different document models. I've been thinking that it might make sense to have three distinct OSIS schemas. The one we have now would be one of the three. The other two would each cover one of the two document models.

The problem I'm coming up against is that because nearly every "container" element has a milestone form, anything goes. Some examples:
1) milestoned elements allow for overlapping containers. e.g. <div sID="x"/><lg sID="y"/><div eID="x"/><lg eID="y"/>
2) text is allowed where it should not be. e.g. <lg sID="x"/>text<lg eID="x"/>
3) elements are allowed where they should not be. e.g. <div><l>...</l></div>
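
A minimal sketch of how case 1 could be detected, assuming the document is well-formed XML and that every milestone pair uses matching sID/eID values. The function name find_overlaps and the osisText wrapper in the sample are illustrative only; this is not part of SWORD or JSword. It only finds milestone pairs that overlap each other, not milestones that cross container boundaries.

```python
import xml.etree.ElementTree as ET


def find_overlaps(xml_text):
    """Return (closed_id, still_open_id) pairs for overlapping milestone spans."""
    root = ET.fromstring(xml_text)
    open_ids = []   # sID values currently open, in the order they were opened
    overlaps = []
    for elem in root.iter():            # iter() walks in document order
        sid = elem.get("sID")
        eid = elem.get("eID")
        if sid is not None:
            open_ids.append(sid)
        elif eid is not None and eid in open_ids:
            idx = open_ids.index(eid)
            # Anything opened after this span and not yet closed overlaps it.
            for later in open_ids[idx + 1:]:
                overlaps.append((eid, later))
            del open_ids[idx]
    return overlaps


sample = ('<osisText><div sID="x"/><lg sID="y"/>'
          '<div eID="x"/><lg eID="y"/></osisText>')
print(find_overlaps(sample))
```

Properly nested milestones (y closed before x) produce an empty list, so only the interleaved case is reported.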

When these things happen, the SWORD and JSword engines may not produce the desired results, and the resulting problems are very hard to diagnose.

As a best practice in creating an OSIS document, we recommend that book, chapter, div, lg, l, ... not be milestoned, and that verse elements be milestoned. We call this BSP (Book/Section/Paragraph).
I think one of the schemas should properly represent this.

The following allow for milestones:
abbr
chapter
closer
div
foreign
lg
l
q
salute
seg
signed
speech
verse

The "rule" is that within a document an element be used either in milestoned form or in container form, but not both.
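
A rough sketch of a check for this rule, assuming well-formed XML. The function name mixed_form_elements is hypothetical. It keys on the element name only, so it treats any element carrying sID or eID as milestoned and everything else as a container; it does not look at the type attribute.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def mixed_form_elements(xml_text):
    """Return names of elements used in both milestone and container form."""
    root = ET.fromstring(xml_text)
    forms = defaultdict(set)
    for elem in root.iter():
        # Drop any "{namespace}" prefix so "verse" matches with or without one.
        name = elem.tag.rsplit("}", 1)[-1]
        is_milestone = (elem.get("sID") is not None
                        or elem.get("eID") is not None)
        forms[name].add("milestone" if is_milestone else "container")
    return sorted(name for name, seen in forms.items() if len(seen) == 2)


sample = ('<div><lg><l>a</l></lg>'
          '<lg sID="x"/><l>b</l><lg eID="x"/></div>')
print(mixed_form_elements(sample))
```

In the sample, lg appears once as a container and once as a milestone pair, so it is the only name reported.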

The <div> element is funny in that the schema requires that the div not be milestoned, but allows for milestoned markup. I take this to mean that the combination of an element with the value of type should be used to determine the form.

Regarding a BSP OSIS schema, the verse element would be milestonable.

Of the other elements above, I don't see that one would ever have to milestone abbr, closer, foreign, salute, signed.

"q" for quotes serves two purposes: marking quotations (what the marks are and where they go) and designating who is speaking. The latter is used to mark the words of Jesus. The <milestone> element is a mechanism to mark continuing quotes. These need to be allowed to be milestoned. It is highly likely that a richly structured document will have at least one occurrence that requires it.

Since speech is an analogous form for q, it will need to be milestonable.

Poetry (lg, l) can certainly cross chapters, but it can be artificially started and stopped so as to not cross boundaries.

seg is problematic. The OSIS manual defines it as part of <word> [sic, they meant <w>] and for marking inline text with a type, e.g. type="benediction". I don't see that it needs to be milestoned.

I've seen one example where a chapter is crossed by a div (the last verse of John 7 and the first 11 verses of chapter 8, marked as "problematic text"), but I'm not sure that chapter needs to be milestonable.
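
To make the discussion above concrete, here is a sketch of a BSP-style check, assuming well-formed XML. The default set (verse, q, speech) is only one reading of the points above, not an agreed rule, so it is passed as a parameter. The names BSP_MILESTONABLE and bsp_violations are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: one possible reading of the discussion, not a settled BSP rule.
BSP_MILESTONABLE = frozenset({"verse", "q", "speech"})


def bsp_violations(xml_text, allowed=BSP_MILESTONABLE):
    """Return (element_name, id) for each milestone not in the allowed set."""
    root = ET.fromstring(xml_text)
    found = []
    for elem in root.iter():
        sid, eid = elem.get("sID"), elem.get("eID")
        if sid is None and eid is None:
            continue                      # container form, nothing to check
        # Drop any "{namespace}" prefix from the tag.
        name = elem.tag.rsplit("}", 1)[-1]
        if name not in allowed:
            found.append((name, sid if sid is not None else eid))
    return found


sample = ('<div><verse sID="Gen.1.1"/>In the beginning'
          '<verse eID="Gen.1.1"/><lg sID="p1"/><lg eID="p1"/></div>')
print(bsp_violations(sample))
```

Passing allowed={"verse", "q", "speech", "chapter"} would model the variant where chapter is also allowed to be milestoned.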

Any thoughts?

In Him,
	DM



