[osis-core] USFM mapping issues
Chris Little
osis-core@bibletechnologieswg.org
Tue, 03 Feb 2004 01:49:59 -0600
The following are based on my notes from the meeting between Todd, Kees,
and myself, in which we went through the USFM manual and identified
those USFM elements that could not easily map to OSIS equivalents.
-----------------
1) Block quotations
USFM uses various paragraph formatting conventions (e.g.
indented/unindented block quotes) to mark letters, inscriptions, and
potentially other large quotations from external sources. These should
be somehow indicated.
My initial suggestion was to use <q type="letter|inscription|...">.
(Deprecating <inscription> and simplifying the model. There are
instances in the Apocrypha where very large inscriptions are quoted,
which include paragraphs themselves. These aren't possible in the
current content model.)
Steve suggested that we use new <div> types to mark these.
Troy thinks <q> should only refer to quotations of speech. (Evidenced
previously in his objection to marking OT quotations in the NT with <q>
and more recently by his objection to marking commentary annotants with
<q>.)
2) Selah and right/left justified & centered poetry
At the time we discussed it, my suggestion was to use <q
level="center|right|selah"> or <q level="n">, where n is positive for
left justification, negative for right justification, and 0 for centered.
Between Friday's and Saturday's meetings, I think we have settled on:
<l type="selah"> for \qs
<l type="unknown" subType="start|center|end"> for \q1, \qc, & \qr,
respectively, assuming LtR text.
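To make the mapping above concrete, here is a minimal Python sketch of how a converter might emit it. The table copies the marker-to-attribute pairs exactly as listed above, including the literal type value "unknown". The names POETRY_MAP and poetry_line are hypothetical, invented for illustration; they are not part of any existing USFM or OSIS tool.

```python
from xml.sax.saxutils import escape

# Mapping from item 2, assuming left-to-right text.
# Keys are USFM markers; values are (type, subType) for the OSIS <l> element.
POETRY_MAP = {
    "\\qs": ("selah", None),
    "\\q1": ("unknown", "start"),
    "\\qc": ("unknown", "center"),
    "\\qr": ("unknown", "end"),
}


def poetry_line(marker, text):
    """Return an OSIS <l> element for one USFM poetry line (hypothetical helper)."""
    l_type, sub_type = POETRY_MAP[marker]
    attrs = f' type="{l_type}"'
    if sub_type is not None:
        attrs += f' subType="{sub_type}"'
    # Escape &, < and > so the line content stays well-formed XML.
    return f"<l{attrs}>{escape(text)}</l>"


print(poetry_line("\\qs", "Selah"))
print(poetry_line("\\qc", "Praise the LORD!"))
```

A marker that is not in the table raises KeyError, which seems preferable to guessing a justification.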
3) Acrostic letters
There is a USFM element for marking acrostic letters in a line of
poetry, by which the first letters of consecutive lines spell a word or
are consecutive letters. This is distinct from acrostic sections, as in
Ps.119.
For this, <seg type="acrostic"> seemed the simplest way to handle the
element.
4) Right/left justified & centered table cells
The contents of table cells, like poetic lines, may be right or left
justified or centered.
A solution similar to that for <l> is one possibility.
5) Liturgical notes
USFM supports a \lit element that specifically marks liturgical notes.
Adding a "liturgical" value to osisNotes seems the obvious solution.
6) Verse numbers in notes
Similar to the problem of unbalanced quotes, <note> elements may include
verse numbers (likely in embedded <catchWord> or <rdg> elements). These
should not be marked by <verse> since they do not mark actual verses in
the text.
The best solution we came up with was: <seg type="verseNumber">1</seg>
A very yucky solution, one that assumes all verse numbers are rendered
as superscript, would be: <hi type="super">1</hi>.
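As an illustration of the <seg type="verseNumber"> proposal, the following Python sketch builds a note whose catchWord carries a verse number without using <verse>. The function name note_with_verse_number and the sample text are assumptions made up for this example, and the exact nesting inside <note> is only one plausible arrangement.

```python
import xml.etree.ElementTree as ET


def note_with_verse_number(verse_no, catch_word, body):
    """Build a <note> whose <catchWord> starts with a verse number.

    The number is wrapped in <seg type="verseNumber"> rather than <verse>,
    since it does not mark an actual verse in the text (hypothetical helper).
    """
    note = ET.Element("note")
    cw = ET.SubElement(note, "catchWord")
    seg = ET.SubElement(cw, "seg", {"type": "verseNumber"})
    seg.text = str(verse_no)
    seg.tail = " " + catch_word   # text after </seg>, still inside <catchWord>
    cw.tail = " " + body          # text after </catchWord>, inside <note>
    return ET.tostring(note, encoding="unicode")


print(note_with_verse_number(1, "In the beginning", "Or, When God began"))
```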
7) Identification of deutero-canonical material
USFM provides \dc, \fdc, and \xdc for marking deutero-canonical
material, notes related to deutero-canonicals, and cross-references to
deutero-canonical books respectively.
In most of these cases, <seg type="deuterocanonical">...</seg> would be
a good solution. Identifying deutero-canonical books/sections might
require a new <div> type.
The canonical value would not be sufficient to handle all of these cases
since, whether it refers to proto-canonical or deutero-canonical
material, a note is never canonical.
8) Study notes types
The USFM manual's material on study notes is still under development,
but, as it stands, it has a single element for study notes, each of
which specifies its own type.
The types that it names are: history, glossary, & textual criticism.
These could be handled either by new note types or note subTypes for the
study type.
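The two options can be compared side by side with a small Python sketch. Everything here is an assumption for illustration: the helper name study_note, the attribute value "study", and the spellings of the three type values (in particular "textualCriticism") are not taken from the OSIS schema or the USFM manual.

```python
from xml.sax.saxutils import escape

# Hypothetical attribute values for the three study note types named above.
STUDY_KINDS = {"history", "glossary", "textualCriticism"}


def study_note(kind, text, as_subtype=True):
    """Encode a study note either as a subType of a 'study' note type
    or as a note type in its own right (both spellings are assumptions)."""
    if kind not in STUDY_KINDS:
        raise ValueError(f"unknown study note kind: {kind}")
    if as_subtype:
        return f'<note type="study" subType="{kind}">{escape(text)}</note>'
    return f'<note type="{kind}">{escape(text)}</note>'


print(study_note("history", "Built under Solomon."))
print(study_note("glossary", "A unit of weight.", as_subtype=False))
```

The subType form keeps all study notes findable by one type value; the separate-type form is flatter but adds three values to the note type list.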
9) Sidebars
USFM provides an element for identifying material that will appear in a
sidebar. <div type="sidebar"> would be the obvious solution, but that
is rather presentation-oriented. Another solution would be to force
users to decide what types of material they are presenting in sidebars
(e.g. background notes) and to indicate that.