[sword-devel] Designing a modular versification system

Arnaud Vié unas.zole+avie at gmail.com
Mon Feb 19 12:49:52 EST 2024


Hi everyone,

As mentioned in other threads, I'd like to design a new, standard way to
specify "versifications" that would make it easy to build a custom
versification for each individual bible, while retaining and improving the
ability to map and align different bibles with one another.
I won't dive into actual format discussions yet : I was originally thinking
of going for XML in order to integrate easily into OSIS headers, but using
and extending the JSON format of the Copenhagen Alliance
<https://github.com/Copenhagen-Alliance/versification-specification/> (if
possible, even integrating directly with their standard) could be nice.

The goal of this mail is to present, from a "functional" point of view, the
core principles on which such a versification system would be built.
I try both to use generic terms applicable to any type of document
("reference system") and to explain how they map concretely to OSIS bibles
("versification").

*Principle 1 : A "reference system" specifies only a set of unique IDs and
a unique meaning for each ID.*

A reference system is basically a map "ID -> Meaning".
In two documents that use the same reference system, elements that have the
same ID must have the same meaning.

Concretely, for OSIS bibles, the ID will typically be an OSIS ID, and the
meaning will typically be defined by a specific text extracted from a
reference bible.
Between two bibles using the same versification, two elements with the same
ID must be translations of the same sequence of words.

We could for example define a Rahlfs-LXX versification, backed by the URL
of a place where the original text can be found
<https://archive.org/details/alfred-rahlfs-the-septuagint-lxx-with-apocrypha-morphological-data> :
this reference clearly defines the "meaning" for each ID.
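
As a rough illustration of Principle 1, a reference system can be modelled as
nothing more than a named map from IDs to meanings. This is only a Python
sketch: the class name `ReferenceSystem`, the helper `same_meaning`, the
`rahlfs:` locator strings and the sample IDs are all placeholders invented for
the example, not part of any existing format.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ReferenceSystem:
    """Hypothetical model: a name plus a map 'ID -> meaning'."""
    name: str
    # OSIS ID -> locator of the passage in the reference text (placeholder scheme)
    meanings: Mapping[str, str]

    def defines(self, osis_id: str) -> bool:
        return osis_id in self.meanings


# Illustrative entries only, not a real Rahlfs-LXX listing.
rahlfs_lxx = ReferenceSystem(
    name="Rahlfs-LXX",
    meanings={
        "Gen.1.1": "rahlfs:Gen/1/1",
        "Ps.1.1": "rahlfs:Ps/1/1",
    },
)


def same_meaning(system_a: ReferenceSystem, system_b: ReferenceSystem, osis_id: str) -> bool:
    """Elements of two documents can be assumed equivalent only when both
    documents use the same reference system and that system defines the ID."""
    return system_a == system_b and system_a.defines(osis_id)
```

Nothing in this model says which ID comes "before" another: it only records
which IDs exist and what each one means.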

*Consequence 1-1 : Ordering defined by documents, not by the reference
system.*

This addresses an inconsistency in the design of the current versification
system, which provides two possibly contradictory sources for the ordering
of elements within a text.

Any document obviously has a natural order : the order in which the
elements appear in the document source.
But the current sword implementation of versifications also attaches a
specific ordering of elements to the reference system, which notably leads
to versifications that differ ONLY by the order of the books (see MT and
Leningrad).

A clean reference system should not contain any notion of ordering : the
only source of truth for ordering is the document itself.
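
To make this separation concrete, here is a small Python sketch in which the
reference system is an unordered set of IDs and each document carries its own
order. The IDs, the two book orders and the function names are made up for
illustration.

```python
# The reference system: an unordered set of IDs (hypothetical sample).
system_ids = frozenset({"Gen.1.1", "Ruth.1.1", "Mal.1.1", "2Chr.1.1"})

# Two documents using that same system, each with its own source order.
doc_a = ["Gen.1.1", "Ruth.1.1", "2Chr.1.1", "Mal.1.1"]
doc_b = ["Gen.1.1", "Ruth.1.1", "Mal.1.1", "2Chr.1.1"]


def conforms(document_ids, system):
    """A document conforms if every ID it uses is defined by the system."""
    return set(document_ids) <= system


def next_element(document_ids, osis_id):
    """'What comes next' is answered by the document, never by the system."""
    position = document_ids.index(osis_id)
    if position + 1 < len(document_ids):
        return document_ids[position + 1]
    return None
```

Both documents conform to the same system, yet `next_element` gives a
different answer for each of them, so a single versification would be enough
for both.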

*Consequence 1-2 : No "compromised" or ambiguous versifications*

Currently, some versifications are explicitly "compromised" (see LXX for
example), in that they try to cover many possible bibles, each with minor
differences.
Principle 1 requires a unique meaning for each ID, usually specified by a
single reference text, preventing this.

Similarly, in the current system, versifications are ambiguous, in that
actual documents may or may not use "0" IDs for pre-verse canonical
contents, like the psalm canonical titles in KJV.
Principle 1 requires that each ID is explicitly defined : if a
versification defines a meaning for ID 0, it must do so explicitly.

*Principle 2 : A reference system may be defined as a subset of another
reference system*

In the previous example, if a Rahlfs-LXX versification is defined, we may
define a Rahlfs-LXX-Psalms by considering only the IDs that belong to the
book of psalms.
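
A minimal Python sketch of such a subset, assuming a reference system is held
as a plain "ID -> meaning" dictionary. The functions `subset` and `in_book`,
the sample IDs and the `rahlfs:` locator strings are hypothetical.

```python
def subset(meanings, keep):
    """Return a new 'ID -> meaning' map restricted to the IDs accepted by `keep`."""
    return {osis_id: meaning for osis_id, meaning in meanings.items() if keep(osis_id)}


def in_book(book):
    """Predicate selecting OSIS IDs whose first component is `book`."""
    return lambda osis_id: osis_id.split(".")[0] == book


# Illustrative entries only; the locator strings are placeholders.
rahlfs_lxx = {
    "Gen.1.1": "rahlfs:Gen/1/1",
    "Ps.1.1": "rahlfs:Ps/1/1",
    "Ps.151.1": "rahlfs:Ps/151/1",
}

rahlfs_lxx_psalms = subset(rahlfs_lxx, in_book("Ps"))
```

The subset keeps the meanings of the parent system untouched, so anything
already known about an ID in Rahlfs-LXX stays valid in Rahlfs-LXX-Psalms.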

*Principle 3 : A reference system may be defined as an aggregation of
several others*

In that case, all IDs defined in one of the underlying ref systems are
valid in the resulting one, and map to the same meaning as in their
original ref system.

For example, if we have a Rahlfs-LXX-Psalms versification, a Vulg-Esth
versification, and others for each book, they may be combined to build a
versification covering a full bible.

The only requirement here is the uniqueness of meaning for each ID : we
can't aggregate two ref systems that define a common ID. We must first
subtract this ID from all aggregated systems except one, to remove any
ambiguity.
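
The aggregation rule and its uniqueness check could look roughly like this in
Python. It is a sketch under the assumption that each system is a plain
"ID -> meaning" dictionary; `subtract`, `aggregate` and the sample data are
invented for the example.

```python
def subtract(meanings, ids_to_remove):
    """Return a copy of the system without the given IDs."""
    return {k: v for k, v in meanings.items() if k not in ids_to_remove}


def aggregate(*systems):
    """Merge several 'ID -> meaning' maps, refusing any ID defined twice."""
    result = {}
    for system in systems:
        clash = result.keys() & system.keys()
        if clash:
            raise ValueError(f"IDs defined by more than one system: {sorted(clash)}")
        result.update(system)
    return result


# Placeholder data: both systems happen to define Esth.1.1.
lxx_part = {"Ps.1.1": "rahlfs:Ps/1/1", "Esth.1.1": "rahlfs:Esth/1/1"}
vulg_esth = {"Esth.1.1": "vulg:Esth/1/1"}

try:
    aggregate(lxx_part, vulg_esth)
    conflict_detected = False
except ValueError:
    conflict_detected = True

# Resolve the ambiguity first, then aggregate.
full = aggregate(subtract(lxx_part, {"Esth.1.1"}), vulg_esth)
```

Failing loudly on a shared ID, rather than letting the last system win, keeps
the choice of which meaning survives in the hands of whoever writes the
versification.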

*Principle 4 : A reference system may be defined by a mapping table to
another reference system*

This mapping table defines the set of IDs defined in this new ref system
(left hand side), and which IDs from the base ref system they correspond to
(right hand side).

"One-to-many" and "many-to-one" mappings should be possible - to represent
verses that are split or merged between the base and new versifications.

The general idea of the mapping table is similar to jsword's current
versification mapping files
<https://github.com/AndBible/jsword/blob/develop/src/main/resources/org/crosswire/jsword/versification/Catholic2.properties>,
except that in jsword the right hand side is always KJV (or KJVA since my
contribution to the AndBible fork).
Here, it can be any other versification.
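
One possible in-memory shape for such a table, sketched in Python. The rows
are invented and not checked against any edition; `new_to_base` and `invert`
are hypothetical names. Each left-hand ID lists the base IDs it corresponds
to, and a many-to-one case simply shows up as the same base ID appearing in
several rows.

```python
from collections import defaultdict

# Left: IDs defined by the new system. Right: corresponding IDs in the base system.
new_to_base = {
    "Esth.1.1": ["Esth.1.1"],               # one-to-one
    "Esth.1.2": ["Esth.1.2", "Esth.1.3"],   # one-to-many: two base verses merged
    "Esth.1.3": ["Esth.1.4"],               # many-to-one: base verse split in two
    "Esth.1.4": ["Esth.1.4"],
}


def invert(table):
    """Build the reverse lookup: base ID -> IDs of the new system covering it."""
    inverse = defaultdict(list)
    for new_id, base_ids in table.items():
        for base_id in base_ids:
            inverse[base_id].append(new_id)
    return dict(inverse)


base_to_new = invert(new_to_base)

# The keys of the table are exactly the IDs the new system defines.
defined_ids = set(new_to_base)
```

This sketch only works at whole-verse granularity; expressing that a new verse
covers only part of a base verse would need a richer right-hand side.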

*Practical application*

The practical application of these principles leads to the following setup :
- One specific "root" versification can be chosen by CrossWire and embedded
in sword, to be used as the central point for mapping. That could be KJVA
(as it's already the current central point for mapping), defined by
referencing a specific edition of KJVA.
- A small set of "major" versifications is defined by CrossWire and
embedded in sword, along with an accurate mapping to that root
versification. These major versifications should be for versions that we
consider "very influential", i.e. many bibles mostly follow their verse
splits (e.g. Rahlfs LXX, possibly one MT version, etc.)
- Finally, each bible can either reuse one of these major versifications
directly, or embed a custom versification built by
subtracting/aggregating/mapping to any of the major ones.

This allows each bible to accurately define its own versification without
ambiguity, while still inheriting as much of the mappings as possible from
the "major" versifications.

For example, one bible may use Rahlfs' LXX for all books except Esther, and
define a specific versification for Esther with explicit mapping to KJVA.

Another example : we no longer need to explicitly maintain NRSV and NRSVA :
these bibles can just reuse KJVA with one small mapping for the only
difference.
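
To show how two bibles could be aligned through the root, here is a short
Python sketch. It assumes each versification carries a table "own ID -> root
IDs"; the function names `invert` and `align` and all the rows are invented
for the example and do not reflect real mapping data.

```python
from collections import defaultdict


def invert(table):
    """Reverse a table 'left ID -> right IDs' into 'right ID -> left IDs'."""
    inverse = defaultdict(list)
    for left_id, right_ids in table.items():
        for right_id in right_ids:
            inverse[right_id].append(left_id)
    return dict(inverse)


def align(id_a, a_to_root, b_to_root):
    """Translate an ID of versification A into IDs of versification B via the root."""
    root_to_b = invert(b_to_root)
    result = []
    for root_id in a_to_root.get(id_a, []):
        for id_b in root_to_b.get(root_id, []):
            if id_b not in result:  # keep first-seen order, drop duplicates
                result.append(id_b)
    return result


# Invented rows: both tables map onto the same (hypothetical) root IDs.
a_to_root = {"Esth.1.1": ["Esth.1.1", "Esth.1.2"], "Esth.1.2": ["Esth.1.3"]}
b_to_root = {"Esth.1.1": ["Esth.1.1"], "Esth.1.2": ["Esth.1.2", "Esth.1.3"]}
```

With these tables, `align("Esth.1.1", a_to_root, b_to_root)` yields the two B
verses that overlap it, and an ID unknown to A yields an empty list. Neither
bible needs a direct mapping to the other, only to the root.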


And that's all for today, I think that description is long enough already !

Let me know your thoughts !
If we have a consensus on these principles, we can then start working on
defining an actual format.

Regards,

Arnaud Vié