<html>

  <head>

    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    On 03/05/2012 08:07 AM, Peter von Kaehne wrote:

    <blockquote cite="mid:4F5500D8.7070504@gmx.net" type="cite">

      <pre wrap="">On 05/03/12 17:33, Greg Hellings wrote:

</pre>

      <blockquote type="cite">

        <pre wrap="">On Mon, Mar 5, 2012 at 11:28 AM, Kahunapule Michael Johnson

<a class="moz-txt-link-rfc2396E" href="mailto:kahunapule@mpj.cx">&lt;kahunapule@mpj.cx&gt;</a> wrote:

</pre>

        <blockquote type="cite">

          <pre wrap="">On 03/05/2012 03:20 AM, Greg Hellings wrote:

</pre>

          <blockquote type="cite">

            <pre wrap="">You seem quite taken with USFM, but remember that CrossWire and SWORD

do not support USFM as an import or display format. Therefore

information beyond just how to convert USFM into OSIS or ThML or GBF

which are supported is not really of importance.

</pre>

          </blockquote>

          <pre wrap="">

USFM is the format that literally hundreds of minority-language Bible translations exist in. Are you saying that the Sword Project is not interested in importing those?

</pre>

        </blockquote>

      </blockquote>

      <pre wrap="">

I am not entirely clear what you are aiming at and I must say I do get

somewhat irritated with your tone. I do have a feeling over the last few

days that you are itching to get a fight. Why is that? Is this simply a

misunderstanding?</pre>

    </blockquote>

    <br>

    It is most likely a misunderstanding. Perhaps I have also been

    misunderstanding some of the messages that seem to be opposed to

    USFM. I'm not trying to suggest that USFM be made an additional

    internal format for Sword for Bible search and display, like GBF and

    OSIS.<br>

    <br>

    Please let me be clear about what my goals, agenda, and purpose

    really are.<br>

    <br>

    I have many USFM Bible texts in many languages. I will soon have

    access to many more. I would like to convert them to various formats

    for distribution and use, publishing them in ways that maximize

    their usefulness, accessibility, and study by many people in their

    own languages. My primary focus is on minority languages, although

    I will also be converting a few translations in languages that have

    many more speakers. Sword is one of many possible outputs for

    these Scriptures.<br>

    <br>

    Because of the large number of translations involved, and frequent

    updates in the case of translations in progress, I'm not interested

    in manual processes. I am only interested in automated processes

    that are reasonably efficient and very reliable.<br>

    <br>

    It doesn't matter to me what formats you

    store or display Bibles in. It can be the current Sword format set

    defined by your API. It can be COBOL code and structured Latin if

    you can make it work. What I do care about is that when I convert a

    Bible (or portion) translation into one of your import formats, and

    you import it and display it, that:<br>

    <ol>

      <li>You accurately preserve all of the original text and

        punctuation (including quotation punctuation) exactly as it was

        in the original USFM. This involves the complete process from

        module creation to display in all front ends. This is an

        absolute requirement with respect to the canonical text. If this

        condition isn't met, then I don't have permission to convert

        these Scriptures to Sword format, nor do you have such

        permission.<br>

      </li>

      <li>I would prefer to have formatting such as prose and poetry

        preserved, and to have noncanonical text such as introductions

        and subtitles passed through for display in a way that

        differentiates it from the canonical text, although it would

        probably be acceptable to strip this information out or display

        it conditionally.</li>

      <li>I would prefer to have footnotes displayed in a format that

        makes sense for the platform.<br>

      </li>

    </ol>

    In other words, I care about the end-to-end system, primarily.<br>

    <br>

    I don't care if the Sword Project ever supports USFM in any way

    except to import it, directly or indirectly through OSIS or another

    format, into Sword. I never suggested using USFM or its XML kin in

    any other way within the Sword project. I don't care how you display

    USFM on your web sites, wiki or otherwise, or what formats you use

    internally to the Sword project, as long as it works end to end

    without losing a single jot or tittle. However, I do think it is

    important that you document the best ways to convert USFM to a

    format you can import. I think you do, too, really.<br>

    <br>

    I am aware that you have some tools to import a small subset of USFM

    to a form of OSIS that works with osis2mod, and have created some

    modules with it. I'm also aware of the OSIS manual section that

    contains a list of OSIS near equivalents for most (but not all) of

    the current USFM tags that actually appear in the Bibles I'm working

    with. My tests using those tools so far have found them wanting. I'm

    going to try to fix that by doing my own conversion from USFM to

    OSIS. Please forgive what may have appeared to be criticism without

    a constructive purpose. I'm trying to convert Scripture files at a

    scale and speed that are apparently unprecedented.<br>

    <br>

    I intend to write a USFX-to-OSIS converter that produces output that

    should validate against the current OSIS schema, and which will

    import correctly into Sword modules. (GBF might be an option, too,

    but I think that if the difficulties with OSIS can be overcome, it

    would be better to use OSIS.) At least that is what I'm going to

    try. If I succeed, you need never deal with USFM and its XML kin

    directly. You can just send people to a different

    open source project for that piece of important functionality.<br>

    <br>

    There are some things that I will do that may not fit the way some

    members of the committee that designed OSIS envisioned things. For

    example, in the OSIS files that I generate, all of the quotation

    punctuation will be left as part of the Bible text, and never

    included in a &lt;q&gt; marker, either implicitly or explicitly with

    a "marker" attribute. If I need to mark direct quotes of Jesus

    Christ in a particular translation, I will do so by converting USFM

    \wj ...\wj* markers directly into &lt;q who="Jesus" marker=""

    sID=""/&gt;...&lt;q marker="" eID=""/&gt;, where the marker

    attribute is always empty. This should, according to

    <a class="moz-txt-link-freetext" href="http://crosswire.org/wiki/OSIS_Bibles#Marking_Quotations">http://crosswire.org/wiki/OSIS_Bibles#Marking_Quotations</a>, result in

    lossless display of the proper quotation punctuation in all front

    ends that comply with that same interpretation. I don't plan to use

    &lt;q&gt; for anything other than direct quotes of Jesus. This usage

    is philosophically compatible with USFM and OXES. It is also

    actually easier to render, since the Paratext interpretation of USFM

    does not allow \wj ...\wj* markers to cross verse boundaries.

    Therefore, you don't have to process beyond the beginning of the

    current verse to determine if you should turn on an optional red

    attribute or not, even in an extended quotation like <i>The Sermon

      on the Mount.</i><br>

    <br>
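    A minimal Python sketch of the \wj conversion described above might look like the following. It is an illustration under stated assumptions, not the converter itself: the function names and the sID/eID numbering scheme are hypothetical, and the verse text is assumed to be XML-escaped already. The milestones are built with ElementTree so that attribute quoting is handled by the library.<br>
    <pre wrap="">
```python
import re
import xml.etree.ElementTree as ET

# Matches one USFM words-of-Jesus span: \wj text\wj*
WJ_SPAN = re.compile(r"\\wj\s(.*?)\\wj\*", re.DOTALL)


def q_milestone(attrs):
    """Serialize an empty OSIS q milestone element with the given attributes."""
    return ET.tostring(ET.Element("q", attrs), encoding="unicode")


def convert_wj(verse_text, verse_id):
    """Replace each words-of-Jesus span in one verse with a q milestone pair.

    The enclosed text, including its quotation punctuation, is left untouched,
    and the marker attribute is always the empty string.
    """
    count = 0

    def replace(match):
        nonlocal count
        count += 1
        # Hypothetical ID scheme: verse ID plus a running counter.
        qid = f"{verse_id}.wj{count}"
        start = q_milestone({"who": "Jesus", "marker": "", "sID": qid})
        end = q_milestone({"marker": "", "eID": qid})
        return start + match.group(1) + end

    return WJ_SPAN.sub(replace, verse_text)
```
</pre>
    Calling convert_wj on the text of a single verse returns that text with each \wj span replaced by a start milestone, the untouched quotation, and an end milestone. Working one verse at a time is sufficient here only because \wj spans are taken not to cross verse boundaries.<br>
    <br>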

    Another thing I will do is convert legacy (deprecated) "display"

    markup for bold and italics directly from USFM to &lt;hi

    type="bold"&gt; and &lt;hi type="italic"&gt; markers. The reason for

    that is that I have translations where I have tried to replace

    "display" markup with the appropriate "semantic" markup, only to

    find that USFM does not have a suitable replacement for the way

    certain translators have chosen to use these text attributes.

    Fidelity to the translation and deference to the translation

    committees wins out over abstract arguments about separation of

    semantics from presentation forms. In essence, these attributes that

    are considered in some languages to be mostly a presentation issue

    are actually a semantic issue in other languages. This is not a

    winnable argument, so I just perpetuate the use of this kind of

    markup and hope that front ends will honor that markup. The

    consequence of not doing so is writing that is presented less

    clearly and looks ugly in the subject languages. There may also be

    cases where I preserve the bold and italic markup just because it is

    too time-consuming to try to figure out what it should have been in

    each case, based on where it is, but in a language I can't read.<br>

    <br>
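    The bold and italic pass-through could be sketched along the same lines. This Python fragment is a hedged illustration only: it assumes the legacy display markers are the USFM \bd and \it character markers, that spans are not nested, and that the input is plain, unescaped text. The function name and the mapping table are hypothetical.<br>
    <pre wrap="">
```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Assumed mapping from USFM display markers to OSIS hi types.
HI_TYPES = {"bd": "bold", "it": "italic"}

# Matches \bd text\bd* or \it text\it*; the backreference pairs open and close.
DISPLAY_SPAN = re.compile(r"\\(bd|it)\s(.*?)\\\1\*", re.DOTALL)


def convert_display_markup(text):
    """Return an OSIS fragment with bold and italic spans wrapped in hi elements."""
    parts = []
    pos = 0
    for match in DISPLAY_SPAN.finditer(text):
        # Plain text between spans is XML-escaped and otherwise passed through.
        parts.append(escape(text[pos:match.start()]))
        hi = ET.Element("hi", {"type": HI_TYPES[match.group(1)]})
        hi.text = match.group(2)  # ElementTree escapes this on serialization
        parts.append(ET.tostring(hi, encoding="unicode"))
        pos = match.end()
    parts.append(escape(text[pos:]))
    return "".join(parts)
```
</pre>
    The markup is carried over as-is, with no attempt to guess a semantic replacement, which matches the pass-through approach described above. Nested or combined markers would need a real tokenizer rather than a single regular expression.<br>
    <br>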

    I hope this helps...<br>

    <br>

    Shalom,<br>

    Michael<br>

    <br>

    <br>

  </body>

</html>