[osis-users] OSIS cross-reference questions

davidtroidl at aol.com davidtroidl at aol.com
Wed Nov 21 13:57:54 MST 2012

Deut. 32:15,17,22–26 could be marked up
<reference osisRef="Deut.32.15">Deut. 32:15</reference>, <reference osisRef="Deut.32.17">17</reference>, <reference osisRef="Deut.32.22-Deut.32.26">22-26</reference>

osisIDs are meant to be unique identifiers for the book, chapter and verse elements in a Bible.  They should not be used in notes, and they should not contain multiple references.
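
For anyone generating this markup from printed reference lists, the pattern above (one <reference> per verse or contiguous range) can be produced mechanically. Below is a minimal Python sketch. The function name `references_for` and its parameters are made up for illustration and are not part of OSIS or any library, and it assumes every verse in the list sits in the same book and chapter.

```python
import re

def references_for(book_osis, book_label, chapter, verses):
    """Build one <reference> element per verse or contiguous range.

    `verses` is the verse part of a printed reference, e.g. "15,17,22-26".
    """
    prefix = f"{book_osis}.{chapter}."
    parts = []
    for i, item in enumerate(p.strip() for p in verses.split(",")):
        # Accept either a hyphen or an en dash between range endpoints.
        bounds = re.split(r"[-\u2013]", item)
        if len(bounds) == 2:
            osis_ref = f"{prefix}{bounds[0]}-{prefix}{bounds[1]}"
        else:
            osis_ref = f"{prefix}{bounds[0]}"
        # Only the first item carries book and chapter in its visible text.
        label = f"{book_label} {chapter}:{item}" if i == 0 else item
        parts.append(f'<reference osisRef="{osis_ref}">{label}</reference>')
    return ", ".join(parts)

print(references_for("Deut", "Deut.", 32, "15,17,22-26"))
```

Each osisRef stays a single verse or a single contiguous range, so nothing here depends on whether space-separated lists are valid in osisRef.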








-----Original Message-----
From: Markku Pihlaja <markku.pihlaja at sempre.fi>
To: osis-users <osis-users at crosswire.org>
Sent: Wed, Nov 21, 2012 11:34 am
Subject: Re: [osis-users] OSIS cross-reference questions

Well, no replies to my previous message.
My explanations might have been too long for anyone to take the trouble to read them; sorry about that. So now I'll try to shorten and simplify the questions that I still need answered. See the earlier messages below if you need more details.

1) How do I distinguish between a list of three separate cross-references and a single compound cross-reference that consists of three separate verses (or even ranges)?
Deut. 32:15; Deut. 32:17; Deut. 32:22–26

Deut. 32:15,17,22–26

I emphasize that the second example is just a single reference to a non-contiguous set of verses, and can also be one of several separate references on a list like the first example.


2) Is listing multiple individual verses separated by a space really allowed in a) osisIDs b) osisRefs? 
I tried reading the OSIS schema for osisIDRegex and osisRefRegex but couldn't find anything that would allow this; I probably just missed the crucial character somewhere. The Durusau manual does give an example of a) under "15.4. Grouping". But the manual says that "a single osisRef cannot identify a discontiguous range of a work", so the answer to b) is probably "no", and the latter of the two examples below is incorrect?


<note type="crossReference" osisID="Deut.32.15 Deut.32.17">

<reference osisRef="Deut.32.15 Deut.32.17">


3) If the answer to 2a) is yes, what is allowed with a compound ID like that? Specifically, can I use sub-identifiers? 
If I name a note with such a grouped osisID, can I append !crossReference to it, and where should I place it? After an extra space following the last verse listed, or attached directly to the last verse (in which case it looks like it applies only to that verse)?

<note type="crossReference" osisID="Deut.32.15 Deut.32.17 !crossReference">

<note type="crossReference" osisID="Deut.32.15 Deut.32.17!crossReference">

or something else?

Merely <note type="crossReference" osisID="Deut.32.15!crossReference"> won't do, because I need to distinguish between a note attached to just verse 15 and one attached to verses 15 and 17 together. And I do need to refer to the note itself rather than to the verses; that's why I need the sub-identifier.
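
To make the placement problem in question 3 concrete, here is a small Python sketch of how a naive consumer might read the two candidate spellings. It assumes a token-by-token reading (split on whitespace, then split each token at its first "!"). That reading is an assumption for illustration, not something the OSIS manual is known to prescribe, and `split_extension` is a hypothetical helper.

```python
def split_extension(osis_id):
    """Split each whitespace-separated token at its first '!'.

    Returns a list of (identifier, extension) pairs; extension is None
    when the token carries no '!' part.
    """
    result = []
    for token in osis_id.split():
        base, _, ext = token.partition("!")
        result.append((base, ext or None))
    return result

# Extension glued to the last verse: it is seen on that verse only.
print(split_extension("Deut.32.15 Deut.32.17!crossReference"))
# [('Deut.32.15', None), ('Deut.32.17', 'crossReference')]

# Extension after an extra space: it becomes a token with no verse at all.
print(split_extension("Deut.32.15 Deut.32.17 !crossReference"))
# [('Deut.32.15', None), ('Deut.32.17', None), ('', 'crossReference')]
```

Under this reading neither spelling says "the cross-reference note for verses 15 and 17 together", which is exactly the difficulty described above.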


4) If the answer to 2b) is yes, what is allowed in that compound ref? Specifically, are ranges allowed in such compound refs?

<reference osisRef="Deut.32.15 Deut.32.17 Deut.32.22-Deut.32.26">

I guess this goes back to question 1) especially if 2b) was wrong.
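
Whether or not such a compound value is valid, a consumer that wanted to accept it could tokenise it roughly as follows. This is a Python sketch under stated assumptions: parts are separated by whitespace, a range uses a single "-" between two full identifiers, and `split_osis_ref` is a hypothetical name.

```python
def split_osis_ref(value):
    """Split a whitespace-separated reference string into (start, end) pairs.

    A single verse yields (verse, verse); a range "A-B" yields (A, B).
    """
    pairs = []
    for token in value.split():
        start, sep, end = token.partition("-")
        pairs.append((start, end if sep else start))
    return pairs

print(split_osis_ref("Deut.32.15 Deut.32.17 Deut.32.22-Deut.32.26"))
# [('Deut.32.15', 'Deut.32.15'), ('Deut.32.17', 'Deut.32.17'), ('Deut.32.22', 'Deut.32.26')]
```

The sketch only shows that the mixed form can be read unambiguously; it says nothing about whether the schema permits it.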


5) If 2b) (and thus also 4) is wrong, how do I make a cross-reference to a note whose source passage consists of non-contiguous verses? Also, since annotateRef takes an osisRef value, how can I indicate a non-contiguous source in that?

Example, apparently with at least an invalid osisRef:
<note type="crossReference" osisID="Deut.32.15 Deut.32.17 !crossReference">
<reference osisRef="Deut.32.15 Deut.32.17 !crossReference">Deut. 32.15,17</reference>


6) Is it possible to have a reference's osisRef with a sub-identifier without a corresponding osisID having that (or any) sub-identifier?

<verse osisID="Deut.32.15" sID="Deut.32.15" />Jeshurun grew fat and kicked; filled with food, he became heavy and sleek. He abandoned the God who made him and rejected the Rock his Savior.<verse eID="Deut.32.15" />
<reference osisRef="Deut.32.15!part2">


Complicated questions, I hope you have some answers or at least workarounds!


2012/11/19 Markku Pihlaja <markku.pihlaja at sempre.fi>

Thanks DM,

(Others are also welcome to share their views! And also to check the one new question at the end, after the second  "-----------" marker)

That didn't quite solve my problem. You say I shouldn't nest references. But I do need some way of distinguishing between a compound reference and a list of separate references. An example:

In Gen. 46:12, we have three references: 
Gen. 38:7,10; Num. 26:19-21; 1. Chr. 4:1

The first one, to Gen., is indeed just one reference even though it refers to separate verses. As far as I can figure out, an unnested note wouldn't be able to tell whether Gen.38.7 and Gen.38.10 are parts of the same reference or two independent references:
<note type="crossReference">
        <reference osisRef="Gen.38.7">Gen. 38:7</reference>,
        <reference osisRef="Gen.38.10">10</reference>;
        <reference osisRef="Num.26.19-Num.26.21">Num. 26:19-21</reference>;
        <reference osisRef="1Chr.4.1">1. Chr. 4:1</reference>
</note>

Of course, to a human those first two refs would probably look like one reference, but the computer needs to rely solely on the markup and not what's between.

If, on the other hand, I list that as three consecutive notes, the semicolons wouldn't be embedded in any tags and thus would be rendered even when reference notes should be hidden.

<note type="crossReference">
        <reference osisRef="Gen.38.7">Gen. 38:7</reference>,
        <reference osisRef="Gen.38.10">10</reference>
</note>
<note type="crossReference">
        <reference osisRef="Num.26.19-Num.26.21">Num. 26:19-21</reference>
</note>
<note type="crossReference">
        <reference osisRef="1Chr.4.1">1. Chr. 4:1</reference>
</note>

I guess what you wrote about note tags is also true: they represent the marker(s) in the text (even though most of our printed Finnish Bibles don't include markers within the text; instead, the notes are listed after certain passages with references to the position of the note). This too would imply that I shouldn't use the latter example with three consecutive notes.

You mentioned one more approach, listing all parts of the compound reference in one osisRef. That would seem to work somehow:

<note type="crossReference">
        <reference osisRef="Gen.38.7 Gen.38.10">Gen. 38:7,10</reference>;
        <reference osisRef="Num.26.19-Num.26.21">Num. 26:19-21</reference>;
        <reference osisRef="1Chr.4.1">1. Chr. 4:1</reference>
</note>

This osisRef / osisID style, however, is missing from Durusau's User Manual. There is section "15.4 Grouping" that does give an example of such notation with osisIDs, but "Appendix J - osisIDs: Construction Rules" doesn't say anything about this. And I've found nothing whatsoever about osisRefs like this. So is this certainly valid markup?

Also, assuming "Gen.38.7 Gen.38.10" is a valid osisRef, would "Gen.38.7 Gen.38.10-Gen.38.12", for example, be valid as well? We also have a few compound references consisting of separate verses AND one or more ranges.


As for my question number 3) - the subdivision of  a referenced verse - I tried to explain that there is no automatic or even easy manual way of determining where each subdivision of the verse begins. We would need a Bible content expert to do that, and we don't have one for this project.

So referring to a fine-grained position within a verse is not an option, since we don't know where each exact position would be.

I'll refine my question:
Is there any way of determining a "vague" division of a verse? For example, does the extension part of an osisRef always need to have a corresponding osisID somewhere? Or could we have a verse like this:
  <verse osisID="Xxx.2.14" sID=.... />
  Some text here. Some more text here. Even some more text here. And more and more text.
  <verse eID="... />

and then have a reference like this:

  <reference osisRef="Xxx.2.14!c">Xxx 2:14</reference>

with just the osisID "Xxx.2.14" declared but not "Xxx.2.14!c"?

I know this is vague, but so is our current notation, and I'm trying to find some means of including the info in the current notation also in the markup. My plan B would then be to just encode all the references to the whole verse and let only the | separators indicate to the reader that the references point to different parts of the verse, just as in the printed versions now.
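
One way an application could cope with a "vague" extension like this is to fall back to the bare verse when the extended identifier is not declared anywhere. The Python sketch below shows only that fallback. It is an assumed consumer behaviour, not something OSIS is known to require, and `resolve` and `declared_ids` are hypothetical names.

```python
def resolve(osis_ref, declared_ids):
    """Return the declared osisID a reference should land on, or None.

    An exact match wins; otherwise the part before '!' is tried.
    """
    if osis_ref in declared_ids:
        return osis_ref
    base, sep, _ext = osis_ref.partition("!")
    if sep and base in declared_ids:
        return base
    return None

declared = {"Xxx.2.14"}
print(resolve("Xxx.2.14!c", declared))  # Xxx.2.14
print(resolve("Xxx.2.15!c", declared))  # None
```

With such a fallback the "!c" part survives in the markup as a hint for readers or smarter software, while simpler software still lands on the whole verse.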


And now for one new somewhat related question.

We also have something that could be called indirect references. Our notation
Gen. 24:7+
tells us that this reference doesn't refer to Gen. 24:7 itself, but instead shares the references listed for that verse. For example, Gen. 24:7 has these references:
<note osisID="Gen.24.7!crossReference">
    <reference osisRef="Gen.50.24">Gen. 50:24</reference>;
    <reference osisRef="Deut.1.8">Deut. 1:8</reference>;
    <reference osisRef="Josh.1.6">Josh. 1:6</reference>;
    <reference osisRef="Judg.2.1">Judg. 2:1</reference>
</note>


Now when another verse lists "Gen. 24:7+" as its reference, it means that this reference list should be used as the reference list for this verse, too. Unfortunately replacing the plus notation with the complete list isn't an option here - apparently the fact that these verses share the same references is of importance itself.

In normal cases, this would probably be rather simple: refer to the note in Gen. 24:7 with
<reference osisRef="Gen.24.7!crossReference" ...>Gen. 24:7+</reference>.

But things get tricky when the referred verse in the plus notation is more than a single verse. We have notations like 
"Deut. 4:41,43+". or "Gen. 15:19–21+".

We might be able to cope with the first one, assuming the "Grouping" notation discussed earlier is valid. But is it ok to add the sub-identifier "!crossReference" to an ID like this: "Deut.4.41 Deut.4.43", and where do I add it?

But it gets worse with the latter notation, since ranges aren't allowed in osisIDs - and thus I also can't create an osisRef "Gen.15.19-Gen.15.21!crossReference". Or that osisRef might still be valid, but at least the corresponding osisID wouldn't, and thus that reference wouldn't make sense.

One solution would obviously be to use the osisID of just the first verse - that would mean "Deut.4.41!crossReference" or "Gen.15.19!crossReference" in my examples. But that is not possible since there might already be references for that verse alone. Also, omitting the other verses from the ID would mean that nothing at all in the markup would tell that this note is related to more than one verse:
<note osisID="Gen.15.19!crossReference">
    <reference osisRef="Exod.3.8">Exod. 3:8</reference>
</note>

Any ideas?

Phew, these things are complicated to explain in an understandable manner... And impossible to do it with only a few short lines.

Once again, thanks in advance to those who take the effort of reading all this!


2012/11/14 DM Smith <dmsmith at crosswire.org>

On Nov 14, 2012, at 8:54 AM, Markku Pihlaja <markku.pihlaja at sempre.fi> wrote:

I'll also need to return to some questions that already got answered ages ago - halfway to meet my final needs, as it now turned out.

2012/4/26 David Troidl <DavidTroidl at aol.com>

            How should I encode cross-references to non-contiguous verse ranges? For example, I have this reference (in our standard notation): Matt. 27:17,22. This is formally just one reference to verses 17 and 22, not two separate references. OSIS requires that "a single osisRef cannot identify a discontiguous range of a work". So how should this be done? Making one note that contains two references might be a step towards what I want, but there would still be two separate references.

Here is the way to encode discontiguous references:
    <note type="crossReference"><reference osisRef="Matt.27.17">Matt. 27:17</reference>, <reference osisRef="Matt.27.22">22</reference></note>

So, when I have a list of separate references, some of which are non-contiguous ones such as above, should I create a nested note to contain the different notes?

For example, if I have the following three references for one verse:
Matt. 27:17,22 ; 2. Sam. 7:16; Matt. 9:27

should that be coded as:

<note type="crossReference">
        <note type="crossReference">
                <reference osisRef="Matt.27.17">Matt. 27:17</reference>,
                <reference osisRef="Matt.27.22">22</reference>
        </note>
        <note type="crossReference">
                <reference osisRef="2Sam.7.16">2. Sam. 7:16</reference>
        </note>
        <note type="crossReference">
                <reference osisRef="Matt.9.27">Matt. 9:27</reference>
        </note>
</note>


No. Don't nest.
You can also use references such as <reference osisRef="Matt.27.17 Matt.27.22 2Sam.7.16 Matt.9.27">Matt 27:17,22; 2 Sam 7:16; Matt 9:27</reference>.
Note that some systems (e.g. SWORD Project) cannot handle this. And having 4 refs is better.

Putting all the <reference>'s within just one <note> container would to me mean one reference to extremely non-contiguous verses. And if I omit the outer <note> tags, then the semicolon separators between the different notes would fall outside any note and be rendered even when notes are hidden.

If that suggestion was right, what should we do in simpler cases where there is a group of contiguous references? Should I still enclose them in a second level of <note>'s for consistency, or would it be ok to use only one level like this (assuming here that there is no 27:22 in the first reference):

<note type="crossReference">
        <reference osisRef="Matt.27.17">Matt. 27:17</reference>;
        <reference osisRef="2Sam.7.16">2. Sam. 7:16</reference>;
        <reference osisRef="Matt.9.27">Matt. 9:27</reference>
</note>


Just one level. Just like this.

            Our cross-references are currently listed on a verse-by-verse basis in a separate file. Each verse might have a number of references, most of them separated by a semi-colon. However, in some cases the separator is the vertical line character, | (or the pipe sign). This indicates a fine grained division of the source verse. That's *source*, not target. For example,
               Luuk. 2:4-7 ¦ Dan. 1:20
            would say that the beginning of the referring verse refers to Luke 2:4-7, and the end to Daniel 1:20. There can be up to 4 divisions like this in one verse. However, there is no automatic way of determining what the exact division of the source verse is. In fact, in some cases even I can't read the verse and tell the division without reading the referenced verses first.
            This means that in any case I'll probably need to leave the OSIS coding vague in this respect. My question here: is there a way to somehow indicate the existence of this division within the tags, or is the only way to continue marking it like it was done until now, like this:
            <reference section1a.... />; <reference section1b.... /> | <reference section2.... /> | <reference section3a.... />; <reference section3b.... />
            Could that be done by using osisID's like
            Matt.1.1!crossReference.section1.b etc.
            or is there a better way?
    I'm not exactly clear what you are asking here.  If you want to mark up the notes, without changing the markup of the Bible text, you could use word numbers within the verse, to indicate where the note applies.

And I'm not quite clear if I got your point :).
Let me give you a quite precise example.

This is Acts 3:13:
"The God of Abraham, Isaac and Jacob, the God of our fathers, has glorified his servant Jesus. You handed him over to be killed, and you disowned him before Pilate, though he had decided to let him go."

For that verse, we have three different references which are marked like this:
Exod. 3:6 |  Isa. 52:13 | Luke 23:16
The | separators (as opposed to semicolons that are normally used as separators in reference lists) indicate that the Exodus reference is related to the beginning of our verse, the Isaiah reference to the middle part and the Luke reference to the end.

As you can see, even though the reference list implies that there are three sections in the verse, there is no automatic way of determining exactly what "the beginning", "the middle" and "the end" (sections 1, 2 and 3) of that verse are. In some cases it remains unclear even after you've carefully read the verse and the references and tried to work out the sections manually from the content. So placing the reference notes in the text exactly where they belong is practically impossible.

My question is: is there a way of indicating in a reference itself that the source of the reference is some sub-part of the verse? In this way, applications might be able to e.g. show an extra tag "from middle of verse" or something like that. Could we use subdivided osisID's for this purpose, like this:

<note type="crossReference">
                <reference osisID="Acts.3.13!crossReference.1" osisRef="Exod.3.6">Exod. 3:6</reference> |
                <reference osisID="Acts.3.13!crossReference.2" osisRef="Isa.52.13">Isa. 52:13</reference> |
                <reference osisID="Acts.3.13!crossReference.3" osisRef="Luke.23.16">Luke 23:16</reference>
</note>

and with even further fine-tuning if there were for example two references before the first "|":
osisID="Acts.3.13!crossReference.1.a" and osisID="Acts.3.13!crossReference.1.b" ?
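
If identifiers of this shape turn out to be acceptable, they could be derived directly from the existing reference lists. Here is a minimal Python sketch of the numbering scheme proposed above: a section number per "|" group, plus a letter only when a group holds more than one reference. `section_ids` is a hypothetical helper, and the "!crossReference.N.x" form is the proposal from this message, not a confirmed OSIS convention.

```python
import string

def section_ids(verse_id, ref_list):
    """Pair each printed reference with a sectioned identifier.

    Sections are separated by '|', references within a section by ';'.
    """
    out = []
    sections = [s.split(";") for s in ref_list.split("|")]
    for n, refs in enumerate(sections, start=1):
        for letter, ref in zip(string.ascii_lowercase, refs):
            # Add the letter only when the section has several references.
            suffix = f"{n}.{letter}" if len(refs) > 1 else str(n)
            out.append((f"{verse_id}!crossReference.{suffix}", ref.strip()))
    return out

for osis_id, label in section_ids("Acts.3.13", "Exod. 3:6 | Isa. 52:13 | Luke 23:16"):
    print(osis_id, "->", label)
```

This records only that a section boundary exists, not where in the verse it falls, which matches the vagueness of the printed notation.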
