[osis-users] OSIS cross-reference questions

Markku Pihlaja markku.pihlaja at sempre.fi
Mon Nov 26 09:22:22 MST 2012

2012/11/24 DM Smith <dmsmith at crosswire.org>

> Haven't been able to reply earlier.

No problem, great to hear from you even now! Or well, Saturday.

> <note type="crossreference">
>         <reference osisRef="Gen.38.7">Gen. 38:7</reference>,
>         <reference osisRef="Gen.38.10">10</reference>;
>         <reference osisRef="Num. 26:19-21">Num. 26:19-21</reference>;
>         <reference osisRef="1Chr.4.1">1. Chr. 4:1</reference>
> </note>
> Close. The osisRef range has to have 2 osisIDs separated by a dash. So,
> Num.26.19-Num.26.21. Also, the separator for an osisID or an osisRef is
> never a colon, but only a period.

Oops... My mistake, didn't convert that "human reference" to an osis
reference. I wrote that after a long day at work :).
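
The rule quoted above (a range is two full osisIDs joined by a dash, and the parts of an osisID are joined by periods, never a colon) can be sketched in a few lines of Python. This is only an illustration: the helper name to_osis_ref and its parameters are hypothetical, and it covers just a single verse or a range within one chapter.

```python
def to_osis_ref(book, chapter, first_verse, last_verse=None):
    """Build an osisRef for one verse or one contiguous range in a chapter.

    Hypothetical helper for illustration; not part of any OSIS tool.
    """
    start = f"{book}.{chapter}.{first_verse}"  # periods only, never a colon
    if last_verse is None:
        return start
    # A range repeats the full osisID on both sides of the dash.
    return f"{start}-{book}.{chapter}.{last_verse}"


print(to_osis_ref("Num", 26, 19, 21))  # Num.26.19-Num.26.21
print(to_osis_ref("Gen", 38, 7))       # Gen.38.7
```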

> If, on the other hand, I list that as three subsequent notes, the
> semicolons wouldn't be embedded in any tags and thus would be rendered even
> when reference notes should be hidden.
> <note type="crossreference">
>         <reference osisRef="Gen.38.7">Gen. 38:7</reference>,
>         <reference osisRef="Gen.38.10">10</reference>
> </note>
> ;
>  <note type="crossreference">
>         <reference osisRef="Num. 26:19-21">Num. 26:19-21</reference>
> </note>
> ;
>  <note type="crossreference">
>         <reference osisRef="1Chr.4.1">1. Chr. 4:1</reference>
> </note>
> I guess it is also true what you wrote about note tags: they represent the
> marker(s) in the text (even though most of our printed Finnish Bibles don't
> include markers within the text; the notes are listed after certain
> passages with references to the position of the note instead). Also this
> would imply that I shouldn't use the later example with three subsequent
> notes.
> This won't work as the ; are now part of the main text. It appears here
> that you are trying to get three foot note markers separated by semi-colon.

Yes, just as I assumed in the text before the example. I wasn't trying to
suggest a correct way here but rather demonstrate the problem in the
"obvious" solution to my original problem.

> The <note> element specifies the placement of a footnote marker and its
> content is the content of the footnote. It is really as simple as that.

Yes, exactly!

> ...listing all parts of the compound reference in one osisRef. That would
> seem to work somehow:
>  <note type="crossreference">
>         <reference osisRef="Gen.38.7 Gen.38.10">Gen. 38:7,10</reference>;
>         <reference osisRef="Num. 26:19-21">Num. 26:19-21</reference>;
>         <reference osisRef="1Chr.4.1">1. Chr. 4:1</reference>
> </note>
> ...So is this certainly valid markup?
> It is valid, but not a good idea. There is a wide variety of software that
> handles OSIS, e.g. SWORD and JSword. The former is focused on chapter at a
> time presentation, so expects each reference to be a contiguous range,
> presenting the chapter and perhaps highlighting the first contiguous range.
> JSword takes each reference as a verse list and presents the contents of
> each of the verses.

Did I understand correctly? JSword can handle even osisRefs such as
<reference osisRef="Gen.38.7 Gen.38.10 Num.26.19-Num.26.21">Gen.
38:7,10, Num. 26:19-21</reference>?

> At this time, I'm not aware of other open source OSIS software.

Ok, this was valuable information. I haven't really found any extensive
lists of commercial or open source OSIS software, so this was good to know.

> Also, assuming "Gen.38.7 Gen.38.10" would be a valid osisRef, would also
> for example "Gen.38.7 Gen.38.10-Gen.38.12" be? We also have a few
> compound references consisting of separate verses AND one or more ranges.
> Yes this is valid. Any number of verses and ranges are allowed in osisRefs.

But not the best possible idea, as you mentioned, since SWORD can't handle
it properly (as JSword can), right? Or were you talking here about a
different case from the one we just talked about?
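
To make the difference concrete, here is a minimal Python sketch of the two reading strategies described in the quoted reply: taking every verse and range in the osisRef, versus taking only the first contiguous range. It is not code from SWORD or JSword. The function name split_osis_ref is an assumption made for this example, and it only relies on what was said above: items are separated by spaces and a range is two osisIDs joined by a dash.

```python
def split_osis_ref(osis_ref):
    """Split an osisRef into (start, end) osisID pairs, one per verse or range.

    Hypothetical helper for illustration; not taken from SWORD or JSword.
    """
    ranges = []
    for item in osis_ref.split():
        start, _, end = item.partition("-")
        ranges.append((start, end or start))  # a single verse is a range of one
    return ranges


ref = "Gen.38.7 Gen.38.10-Gen.38.12"
all_ranges = split_osis_ref(ref)  # what a verse-list style reader would use
first_range = all_ranges[0]       # what a first-range-only reader would use
print(all_ranges)
print(first_range)
```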

> For example, does the extension part of an osisRef always need to have a
> corresponding osisID somewhere? Or could we have a verse like this:
>   <verse osisID="Xxx.2.14" sID=.... />
>   Some text here. Some more text here. Even some more text here. And more
> and more text.
>   <verse eID="... />
> and then have a reference like this:
>   <reference osisRef="Xxx.2.14!c">Xxx 2:14</reference>
> with just the osisID "Xxx.2.14" declared but not "Xxx.2.14!c"?
> Yes. This is allowed. Any work can have references to another work using the
> reference system of that other work. As a result, there is no required
> referential integrity between an osisRef and an osisID.

Will the reference to Xxx.2.14!c render as a reference to Xxx.2.14 or be
ignored as referring to an unknown point? Well, that probably depends on
the software, but how about SWORD and JSword?
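
One possible behaviour, sketched in Python purely as an assumption (nothing in this thread says that SWORD or JSword work this way): if the full osisRef is not a known osisID, strip the "!" extension and fall back to the plain verse. The function name resolve and the known_ids set are hypothetical.

```python
def resolve(osis_ref, known_ids):
    """Return the osisID to display for osis_ref, or None if nothing matches.

    Assumed fallback policy for illustration only: try the exact reference
    first, then the same reference without its "!" extension.
    """
    if osis_ref in known_ids:
        return osis_ref
    base, _, _extension = osis_ref.partition("!")
    return base if base in known_ids else None


known_ids = {"Xxx.2.14"}                  # only the verse itself is declared
print(resolve("Xxx.2.14!c", known_ids))   # falls back to Xxx.2.14
print(resolve("Xxx.2.15!c", known_ids))   # None: no such verse either
```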

> I'd suggest that you'd determine why the master text has it that way and to
> what you are targeting the OSIS and how to best represent it in OSIS. I'd
> suggest that you fully encode the note in both spots and somehow indicate
> either in markup or text that the list is the same as in another location.

I'm afraid I might not have the luxury of being allowed to do that. Either
our translation committee in the late 1980s or someone even earlier created
the different conventions of marking cross-references, including this
indirect type of reference.

These conventions with our official translation are quite strict, and
making changes to them would unfortunately require a decision from our
General Synod which assembles twice a year...

When I first saw these different reference types when starting this work, I
immediately asked if the indirect references could be made direct by
duplicating the target reference. But the immediate answer was no.

> Final suggestion, think of markup as a language. You are translating from
> one language (the master text) into another language (OSIS). The
> translation, as you are finding out, is not one-to-one, but rather
> thought-for-thought. It sounds like your master text is structured for
> print-only. OSIS is meant to be neutral to the target, not presuming paper,
> phone, tablet, computer, ....

This is exactly how I'm trying to think of this project. I'm not a Bible
expert, nor a printed book expert but mainly a web expert and thus think
exactly in terms of flexibility and even future unknown application types -
as much as possible.

Yes, our master text, published in 1992, was (obviously) structured for
print. It means that we can only dream of marking up quotes, for example,
since there are no consistent start and end markers in cases of multi-level
nested quotes.

But on the other hand, little of the markup that does exist in the source is
so print-specific that it couldn't be reproduced digitally - most of it can
actually be implemented better digitally than in print. The "vague"
reference is probably the only one that doesn't fit well into the digital
world. All the other cross-reference-related issues discussed here are very
well suited for electronic publishing and hyperlinks, even though OSIS - or
at least some OSIS implementations - have a hard time handling them.

About that "vague" reference: I am currently considering dropping that - or
actually not dropping but just implementing the "vagueness" pretty much the
same way as in print: using the "|" separators between these references
instead of semi-colons, but dropping the vagueness from the actual
osisRef. That's loyal to our source and good OSIS, too.

But I think I'll have to stretch the boundaries of OSIS (or at least
current applications) a little with the indirect references. As you
mentioned, all (or even any) ready OSIS software might not be able to
handle that.

Our main goal is not to produce a perfect OSIS source file but find a
format that is exact and can contain all the structural information our
translation contains. Being able to strictly conform to some format that
already has ready-made tools to handle everything would be a bonus, but
that comes only second to preserving all structure (including indirect or
compound references, for example).

This is indeed preparing for the future instead of just ancient print:
using markup that might not be handled by any current software, to mark
structures that may yet get user-friendly implementations in the future.
In some Bible-reading software, the indirect reference might be for
example "See references of Matt. 8:1" and only after that provide a list or
popup with those references. But in our OSIS file we'll have to stick to
providing just the indirect reference, and the rest is up to the
application. And I believe we are stretching the OSIS boundaries but not
crossing them - you did say that I'm "free to encode it as you like.
However, software that I'm familiar with won't handle exotic uses of OSIS."
I'll certainly document well all exotic uses.

> *I.* How do I mark up a single but compound cross-reference that refers to
> non-adjacent verses or ranges, so that it (structurally) differs from a
> (more typical) note containing separate references to the same
> verses/ranges?
> There is no such thing as a single but compound cross-reference.

I guess that's a matter of definitions. In our notation, these two examples
mean quite different things:

Lev. 3:17; Lev. 7:26-27
Lev. 3:17, 7:26-27

The former one is a list of two separate cross-references (indicated by the
semi-colon and space separating the two verses/ranges). The latter one
(indicated by the comma separator) is a single reference consisting
of cross-references to Lev. 3:17 and Lev. 7:26-27. Of course, some
definition of "cross-reference" might not accept the "recursion": a
cross-reference consisting of several cross-references, but that's just
terminology. For us, a single compound cross-reference is something real.
And that's how the software needs to understand it, too. Once again, it
might not be readily understood by some (or any) current OSIS
implementation, but that's a price we have to pay in our case: limiting the
selection of ready-made software that can be used.
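
As a rough sketch of how the two cases could be kept structurally different in the markup, the Python below builds one note with two <reference> elements and another with a single <reference> whose osisRef lists both parts. The function name crossref_note is hypothetical, the type value is copied from the examples earlier in this thread, and the multi-part osisRef follows the form that was confirmed above to be valid though not handled equally by all software.

```python
import xml.etree.ElementTree as ET


def crossref_note(references):
    """Build a cross-reference note from (osisRef, label) pairs.

    Each pair becomes exactly one <reference> element; "; " separates them.
    Hypothetical helper for illustration only.
    """
    note = ET.Element("note", {"type": "crossreference"})
    for index, (osis_ref, label) in enumerate(references):
        ref = ET.SubElement(note, "reference", {"osisRef": osis_ref})
        ref.text = label
        if index < len(references) - 1:
            ref.tail = "; "
    return ET.tostring(note, encoding="unicode")


# "Lev. 3:17; Lev. 7:26-27": two separate cross-references.
separate = crossref_note([
    ("Lev.3.17", "Lev. 3:17"),
    ("Lev.7.26-Lev.7.27", "Lev. 7:26-27"),
])

# "Lev. 3:17, 7:26-27": one compound cross-reference.
compound = crossref_note([
    ("Lev.3.17 Lev.7.26-Lev.7.27", "Lev. 3:17, 7:26-27"),
])

print(separate)
print(compound)
```

A consumer that counts <reference> elements would then see two list entries in the first note and one in the second, which is the distinction our notation needs to keep.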

> *II.* How do I mark up a reference to a note whose source is more complex
> than just one verse or a contiguous range?
> As separate multiple ones.

I hope my example above, with Lev. 3:17 and 7:26-27, demonstrated why this
is not possible in our case. That compound reference might in turn be one
element on a list of separate references, and it simply must not break
apart into two independent elements of that list.

To sum up:

We probably won't be able to produce OSIS code that's 100% compatible with
all current implementations. For us OSIS is not a value in
itself (sorry!) - no one has asked for the digital Bible source file to
be exactly OSIS. However, OSIS is one of the rather few ready formats we
could consider to supply the necessary structural information with the
Bible text, and flexible enough to be rather easily converted to any
further format that users of this source file (print or digital publishers
etc.) might need.

Even though we might need to take some small steps off the perfect OSIS path,
talking to you about these things has been very valuable in order to find
ways that will keep us as close to that path and enable us to produce
as ready-usable OSIS as possible. And we are very thankful to all you for

Blessings to all,


PS. I still might get back to you on some issue, but hopefully not too
many times now.