<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
</head>
<body bgcolor="#FFFFFF" text="#000000">
I've been working on a more automated conversion of USFM -> USFX
-> OSIS -> Sword modules. I made some progress, but more
remains to be done. Please note that this conversion is only for
Scriptures, not commentaries, general books, or kitchen sinks. Some
noncanonical stuff gets converted, but I'm focusing on what is
actually used by Sword in Bible modules. It doesn't have 100%
coverage of the USFM standard, but it does have more coverage than
any other converter I've seen so far. It does something appropriate
with every tag in the 274 live projects in my test suite. Sometimes
the appropriate action is to skip the tag, its content, or both,
because the Sword front ends have no way to represent what the
translators meant when they inserted those tags. The essentials are
there, though:
everything canonical, footnotes, cross references, subtitles, psalm
headings, poetry and prose structure, and common special text.<br>
<br>
The OSIS output isn't, strictly speaking, OSIS that fully complies
with the OSIS User Manual, because of some minor deviations from
that document, so I'm calling it Modified OSIS (MOSIS). It does,
however, validate against the OSIS schema. The deviations are, as
far as I know, irrelevant to Sword usage, which is my only purpose
for generating MOSIS. For example, words of Jesus are marked on a
per-verse basis (like USFM does) and not just at the very beginning
and end of each quotation. That actually makes it easier to properly
display text that is retrieved from the back end on a
verse-by-verse basis.<br>
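<br>
As a rough illustration of the per-verse marking (a Python sketch for
brevity, not the Haiola code, which is C#): each verse is handled on
its own, and every USFM \wj ... \wj* span inside it becomes a
self-contained quotation element. The function name wj_to_osis and the
exact attributes on &lt;q&gt; are assumptions made for this example.<br>

```python
import re
from xml.sax.saxutils import escape

# Matches one USFM words-of-Jesus span: \wj text\wj*
WJ_SPAN = re.compile(r"\\wj\s+(.*?)\\wj\*", re.DOTALL)

def wj_to_osis(verse_text):
    """Wrap each \\wj span of a single verse in its own <q who="Jesus"> element.

    Because every verse is closed independently, a front end can fetch
    any one verse and still get balanced markup.
    """
    out = []
    pos = 0
    for match in WJ_SPAN.finditer(verse_text):
        # Text before the span is ordinary narration; escape it for XML.
        out.append(escape(verse_text[pos:match.start()]))
        out.append('<q who="Jesus" marker="">' + escape(match.group(1)) + "</q>")
        pos = match.end()
    out.append(escape(verse_text[pos:]))
    return "".join(out)
```

A quotation that runs over several verses simply yields one such
element per verse.<br>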
<br>
One big missing link, right now, is &lt;reference&gt; tags, which
are needed to easily make hot links from Scripture references. USFM
doesn't have an equivalent of those. All references in USFM cross
references, footnotes, introductions, etc., are intended for human
readers in the vernacular only. We can work out the vast majority of
these programmatically, insert &lt;ref&gt; tags into the
USFX, and from those create OSIS &lt;reference&gt; tags. (The
&lt;ref&gt; tags in USFX are a new addition.) The other missing link
is automatically selecting the best versification system to use
based on what is actually in the translation. There are many verse
bridges (a range of verses translated as a unit), but I don't think
those are a problem. There are several sparse translations (i.e.,
translations of only selected passages). Those might be a problem.
Most of the translations are fairly close to a subset of NRSVA
versification. However, there probably needs to be a new
versification system supported to properly handle the Septuagint and
translations of the Septuagint.<br>
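<br>
To give an idea of what finding references programmatically could
look like, here is a minimal Python sketch (again, not the Haiola
code). It assumes a hypothetical per-language table BOOKS that maps
vernacular book names to OSIS book IDs, and it only recognizes the
simple "Book chapter:verse" and "Book chapter:verse-verse" patterns.
Real vernacular text needs far more patterns than this, and the input
is assumed to be already XML-escaped.<br>

```python
import re

# Hypothetical lookup table: vernacular book name -> OSIS book ID.
# A real converter would need one such table per language.
BOOKS = {"Genesis": "Gen", "Psalm": "Ps", "Matthew": "Matt", "John": "John"}

# Longest names first so a longer name is never shadowed by a shorter one.
_NAMES = "|".join(sorted(map(re.escape, BOOKS), key=len, reverse=True))
REF = re.compile(r"\b(" + _NAMES + r") (\d+):(\d+)(?:-(\d+))?")

def tag_references(text):
    """Wrap recognized references in OSIS-style <reference> elements."""
    def repl(match):
        book = BOOKS[match.group(1)]
        chapter, first, last = match.group(2), match.group(3), match.group(4)
        osis_ref = f"{book}.{chapter}.{first}"
        if last:
            # A verse range is written as start-end, both fully qualified.
            osis_ref += f"-{book}.{chapter}.{last}"
        return f'<reference osisRef="{osis_ref}">{match.group(0)}</reference>'
    return REF.sub(repl, text)
```

The visible text is left exactly as the translators wrote it; only the
machine-readable target is added around it.<br>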
<br>
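One possible way to pick a versification automatically is to count,
for each candidate system, how many verses actually present in the
translation fall outside that system, and take the system with the
fewest misses. The Python sketch below assumes each system is given as
a hypothetical table of the highest verse number per (book, chapter);
the names and numbers in the usage note are toy data, not real
versification tables.<br>

```python
def out_of_range(verses, system):
    """Count verses that the given versification system cannot hold.

    verses: iterable of (book, chapter, verse) tuples found in the text.
    system: dict mapping (book, chapter) -> highest verse number.
    """
    return sum(1 for book, chapter, verse in verses
               if verse > system.get((book, chapter), 0))

def best_versification(verses, systems):
    """Return the name of the system with the fewest out-of-range verses.

    Only verses that are present are checked, so a sparse translation
    is not penalized for what it leaves out. Ties go to the system
    listed first.
    """
    return min(systems, key=lambda name: out_of_range(verses, systems[name]))
```

For example, with systems = {"A": {("Ps", 3): 8}, "B": {("Ps", 3): 9}}
and a translation containing Ps 3:1 through 3:9, the function picks
"B", because system "A" has no room for verse 9.<br>
<br>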
I'm doing the conversion in C# because (1) it is my favorite
programming language, and (2) it can do all of the strange things
required to convert between fundamentally different Scripture
formats. This means transforming from flat, simple, and terse to
more deeply structured, complex, and verbose; and making some
implicit information explicit and making some explicit information
part of the structure instead of the elements and attributes. The
transformation is similar to translating from one natural human
language to another, since the tags (words) have different ranges of
meanings (semantic domains) and the structure (grammar) is
different. Several USFM tags just don't have useful standard OSIS
equivalents. You can represent almost ANYTHING in OSIS, but that
doesn't mean anyone else will know what you meant by it or what to
do with it, let alone write software to handle it, so I try to stick
to the actual main standard as much as possible, and hope that you
guys do so, too... or at least let me know where the variations are.<br>
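<br>
To make the flat-versus-structured point concrete, here is a toy
Python sketch (not the Haiola code) that turns a flat run of USFM \c
and \v markers into a nested tree. The element names book, chapter,
and verse and the function usfm_to_tree are invented for this example
and are not USFX or OSIS elements; all other USFM markers are ignored.<br>

```python
import re
import xml.etree.ElementTree as ET

# A chapter (\c) or verse (\v) marker followed by its number.
MARKER = re.compile(r"\\(c|v)\s+(\d+)\s*")

def usfm_to_tree(book_id, usfm):
    """Build a nested book/chapter/verse tree from flat \\c and \\v markers."""
    book = ET.Element("book", {"id": book_id})
    chapter = None
    # split() yields: [text before first marker, kind, number, text, kind, ...]
    parts = MARKER.split(usfm)
    for i in range(1, len(parts), 3):
        kind, number, text = parts[i], parts[i + 1], parts[i + 2]
        if kind == "c":
            # In USFM the chapter is only a marker; here it becomes a container.
            chapter = ET.SubElement(book, "chapter", {"n": number})
        elif chapter is not None:
            # The verse ID makes explicit what USFM leaves implicit in position.
            verse_id = f"{book_id}.{chapter.get('n')}.{number}"
            verse = ET.SubElement(chapter, "verse", {"id": verse_id})
            verse.text = text.strip()
    return book
```

Even this tiny case shows the shift: the book and chapter a verse
belongs to are implied by position in USFM, but have to be spelled out
in the structured form.<br>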
<br>
If you are really curious, you can try out my partially done USFM ->
USFX -> MOSIS conversion using the free and open source (LGPL 3)
Haiola software from <a class="moz-txt-link-freetext" href="http://haiola.org">http://haiola.org</a>. The software isn't very
user-friendly, but it works.<br>
<br>
<div class="moz-signature">-- <br>
<p><font color="#000000">Aloha,<br>
<i>Michael Johnson</i></font><br>
<font color="#000070"><a href="http://mljohnson.org">mljohnson.org</a><br>
PO BOX 5278<br>
KAILUA KONA HI 96745-5278<br>
USA<br>
<br>
Phone: +1 808-333-6921<br>
Skype: kahunapule</font></p>
</div>
</body>
</html>