[osis-core] USFM mapping issues

Chris Little osis-core@bibletechnologieswg.org
Tue, 03 Feb 2004 01:49:59 -0600


The following are based on my notes from the meeting between Todd, Kees, 
and myself, in which we went through the USFM manual and identified 
those USFM elements that could not easily map to OSIS equivalents.

-----------------
1) Block quotations

USFM uses various paragraph formatting conventions (e.g. 
indented/unindented block quotes) to mark letters, inscriptions, and 
potentially other large quotations from external sources.  These should 
be indicated somehow in OSIS.

My initial suggestion was to use <q type="letter|inscription|...">, 
deprecating <inscription> and simplifying the model.  There are 
instances in the Apocrypha where very large inscriptions are quoted, 
which include paragraphs themselves.  These aren't possible in the 
current content model.

Steve suggested that we use new <div> types to mark these.

Troy thinks <q> should only refer to quotations of speech.  (Evidenced 
previously in his objection to marking OT quotations in the NT with <q> 
and more recently by his objection to marking commentary annotants with 
<q>.)


2) Selah and right/left justified & centered poetry

At the time we discussed it, my suggestion was to use <q 
level="center|right|selah"> or <q level="n">, where n is positive for 
left justification, negative for right justification, and 0 for centered.

Between Friday's and Saturday's meetings, I think we have settled on:
<l type="selah"> for \qs
<l type="unknown" subType="start|center|end"> for \q1, \qc, & \qr, 
respectively, assuming LtR text.
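
As a rough illustration of the mapping above (a sketch only, not part of 
any agreed converter), the table could be expressed in Python as follows. 
The helper name poetry_line is hypothetical, and "unknown" is simply the 
placeholder type from the proposal.

```python
import xml.etree.ElementTree as ET

# Marker-to-attribute table copied from the proposal above.  "unknown" is
# the placeholder type used there, not a settled OSIS value.
POETRY_MAP = {
    "qs": {"type": "selah"},
    "q1": {"type": "unknown", "subType": "start"},
    "qc": {"type": "unknown", "subType": "center"},
    "qr": {"type": "unknown", "subType": "end"},
}


def poetry_line(marker, text):
    """Serialize one poetic line as an OSIS <l> element (hypothetical helper)."""
    # Accept the marker with or without its leading backslash.
    attrs = POETRY_MAP[marker.lstrip("\\")]
    elem = ET.Element("l", attrs)
    elem.text = text
    return ET.tostring(elem, encoding="unicode")


print(poetry_line("\\qc", "Praise the LORD!"))
print(poetry_line("\\qs", "Selah"))
```

The start/end values (rather than left/right) keep the table usable for 
RtL text; only the rendering of "start" and "end" would change.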


3) Acrostic letters

There is a USFM element for marking acrostic letters in a line of 
poetry, by which the first letters of consecutive lines spell a word or 
are consecutive letters.  This is distinct from acrostic sections, as in 
Ps.119.

For this, <seg type="acrostic"> seemed the simplest way to handle the 
element.


4) Right/left justified & centered table cells

The contents of table cells, like poetic lines, may be right or left 
justified or centered.

A solution similar to that for <l> is one possibility.


5) Liturgical notes

USFM supports a \lit element that specifically marks liturgical notes. 
Adding a "liturgical" value to osisNotes seems the obvious solution.


6) Verse numbers in notes

Similar to the problem of unbalanced quotes, <note> elements may include 
verse numbers (likely in embedded <catchWord> or <rdg> elements).  These 
should not be marked by <verse> since they do not mark actual verses in 
the text.

The best solution we came up with was: <seg type="verseNumber">1</seg>

A very yucky solution, which assumes all verse numbers are rendered as 
superscript, would be: <hi type="super">1</hi>.
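
To make the <seg> proposal concrete, here is a minimal Python sketch that 
builds a note whose leading verse number is wrapped in 
<seg type="verseNumber"> rather than <verse>.  The function name 
note_with_verse_number is hypothetical, and a real note would normally 
carry further attributes that are omitted here.

```python
import xml.etree.ElementTree as ET


def note_with_verse_number(number, text):
    """Build a <note> that starts with a verse number marked by <seg>.

    Hypothetical helper: the number is display text only, so it is not
    wrapped in <verse> and does not mark a verse boundary in the text.
    """
    note = ET.Element("note")
    seg = ET.SubElement(note, "seg", {"type": "verseNumber"})
    seg.text = str(number)
    # The note text follows the <seg> as its tail.
    seg.tail = " " + text
    return ET.tostring(note, encoding="unicode")


print(note_with_verse_number(1, "Some manuscripts omit this verse."))
```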


7) Identification of deutero-canonical material

USFM provides \dc, \fdc, and \xdc for marking deutero-canonical 
material, notes related to deutero-canonicals, and cross-references to 
deutero-canonical books respectively.

In most of these cases, <seg type="deuterocanonical">...</seg>, would be 
a good solution.  Identifying deutero-canonical books/sections might 
require a new <div> type.

The canonical attribute would not be sufficient to handle all of these 
cases since, whether it refers to proto-canonical or deutero-canonical 
material, a note is never canonical.
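
The <seg> suggestion in this item could be sketched in Python roughly as 
follows.  This assumes that \dc, \fdc, and \xdc each occur as a span 
closed by the same marker followed by an asterisk (e.g. \dc ...\dc*); 
the function name mark_deuterocanonical is hypothetical, and text outside 
the marked spans is passed through untouched.

```python
import re
from xml.sax.saxutils import escape

# Assumption: each marker opens a span that is closed by the same marker
# followed by "*", e.g. "\dc added words\dc*".
DC_SPAN = re.compile(r"\\(dc|fdc|xdc) (.*?)\\\1\*")


def mark_deuterocanonical(usfm_text):
    """Wrap \\dc, \\fdc, and \\xdc spans in <seg type="deuterocanonical">."""
    def repl(match):
        # Escape &, <, and > inside the span so the result stays well-formed.
        return ('<seg type="deuterocanonical">'
                + escape(match.group(2))
                + '</seg>')
    return DC_SPAN.sub(repl, usfm_text)


print(mark_deuterocanonical(r"He prayed \dc and gave thanks\dc* that day."))
```

Because all three markers collapse to the same <seg> type here, the 
distinction between text, footnote, and cross-reference would have to be 
recovered from the enclosing element.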


8) Study notes types

The USFM manual's material on study notes is still under development, 
but, as it stands, it has an element for study notes, each of which 
specifies its own type.

The types that it names are: history, glossary, & textual criticism. 
These could be handled either by new note types or note subTypes for the 
study type.


9) Sidebars

USFM provides an element for identifying material that will appear in a 
sidebar.  <div type="sidebar"> would be the obvious solution, but that 
is rather presentation-oriented.  Another solution would be to force 
users to decide what types of material they are presenting in sidebars 
(e.g. background notes) and to indicate that.