[osis-users] Validation related OSIS questions

Markku Pihlaja markku.pihlaja at sempre.fi
Tue Nov 13 09:29:46 MST 2012

Thanks again!

I'll first give you one further question and then comment on your previous

What would be a good way of including localized versions of verse and
chapter IDs in the markup? I previously confirmed here that osisIDs have to
use the standard keywords and syntax. But I'd love to be able to supply the
Finnish abbreviation of each verse as additional information. That is: when
the osisID of a verse is "Gen.3.8", it would make life much easier for
users of this OSIS file if the verse also somehow contained the Finnish
standard notation "1. Moos. 3:8".

The "obvious" way would be to add a new attribute to the verse
tag, like:
<verse osisID="Gen.3.8" sID="Gen.3.8" FI_ID="1. Moos. 3:8" />
but that probably isn't possible, is it? Or can I somehow declare new
custom attributes like Chris declared new custom dash entities in his last
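
A minimal sketch of one workaround that needs no new attribute: keep the
osisID as it is and derive the Finnish notation from it outside the markup.
The table FINNISH_BOOKS and the function osis_to_finnish are hypothetical
names, and only the "Gen" entry comes from the example above; the other
books would have to be filled in.

```python
# Hypothetical lookup table: OSIS book keyword -> Finnish abbreviation.
# Only "Gen" is taken from the example in this mail.
FINNISH_BOOKS = {"Gen": "1. Moos."}


def osis_to_finnish(osis_id):
    """Turn an osisID such as "Gen.3.8" into the Finnish notation."""
    book, chapter, verse = osis_id.split(".")
    return f"{FINNISH_BOOKS[book]} {chapter}:{verse}"


print(osis_to_finnish("Gen.3.8"))  # prints: 1. Moos. 3:8
```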

And now for the previous answers. There's also at least one new question,
at the end of the comment to Peter's answer.

2012/11/9 Chris Little <chrislit at crosswire.org>
> The encoding is usually indicated with a line like:
> <?xml version="1.0" encoding="UTF-8"?>
> The doctype shouldn't be necessary since you'll generally want to indicate
> the schema itself, but I think you can add a doctype declaration like the
> following if you want to:
> <!DOCTYPE osis>
> This won't help you to use the w3 validator since that's just for HTML,
> XHTML, & other web format (unless there's a validator I haven't found).

Silly me... Didn't someone mention something about HTML experts? ;)
Of course I was trying to use an irrelevant validator and that's why I was
wondering how the doctype declaration (required by that validator) could be

I downloaded the 30-day trial version of oXygen, and that will probably do
just the job I need. The full version is a bit too expensive for this
single project, but if there are more XML projects in the future, I'll
certainly consider purchasing it.

> If you don't want to encode the characters as Unicode, you can use &#x2013;
> for the en dash and &#x2014; for em dash. I believe you could also declare
> your own entities in the DTD:
> <!DOCTYPE osis [
>         <!ENTITY ndash   "&#x2013;">
>         <!ENTITY mdash   "&#x2014;">
> ]>

Yes, this worked!
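
A minimal sketch of what the declaration does, assuming Python's standard
xml.etree.ElementTree parser and a made-up two-element document (a real
OSIS file also carries a namespace and schema reference, omitted here): the
parser replaces each declared entity with its character.

```python
import xml.etree.ElementTree as ET

# Made-up sample document; only the DOCTYPE block is taken from the mail.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE osis [
        <!ENTITY ndash   "&#x2013;">
        <!ENTITY mdash   "&#x2014;">
]>
<osis><p>4:44&ndash;11:32 &mdash; end</p></osis>"""

root = ET.fromstring(SAMPLE)

# After parsing, the entity references are gone and only characters remain.
expanded = root.find("p").text
print(expanded == "4:44\u201311:32 \u2014 end")
```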

2012/11/8 Peter von Kaehne <refdoc at gmx.net>

> > I'd like to be able to use some code or entity instead of actual dash
> > characters (– or —), at least in some places, since we have two
> > different semantics for the dashes and I'd like to keep them separate in
> > the code.
> Don't have an answer for that, but what is the semantic and is there not a
> better way to code it than the somewhat arbitrary length of a dash
> character?

That's a fair question. Indeed it would be nice to find a better way (I'm
not using the length to separate these cases, just different notations
of the same length), but I haven't found one yet.

The two cases are normal em dashes used as punctuation within sentences
– just like the dashes in this sentence – and dashes that indicate
a range of chapters and verses in some headings. The latter is not in the
markup but in the content to be printed (or otherwise shown to the reader).
For example: "Second Speech of Moses (4:44–11:32)" just before Deut.4.44.
The range has been included in the official translation by the translation
committee and thus cannot be omitted.

At least in Finnish we nowadays use the em dash to indicate ranges as well
as punctuation. And I'd just like to enable the users of this OSIS file to
search for one or the other without getting ambiguous or extra results.

My solution right now is to use Chris's way of declaring the &mdash; entity
myself and use that for punctuation, and use the actual character "–" for
ranges. Not very elegant but it does the job.
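
A minimal sketch of the search this convention allows. It works on the
unparsed text, since an XML parser normally replaces entity references with
their characters. The sample line and the function name count_dashes are
made up for illustration.

```python
# Made-up source line: "&mdash;" marks punctuation, the literal character
# U+2013 marks a chapter/verse range.
RAW = "Second Speech of Moses (4:44\u201311:32) &mdash; an aside &mdash; ends."


def count_dashes(raw_text):
    """Count punctuation dashes and range dashes in unparsed OSIS text."""
    return {
        "punctuation": raw_text.count("&mdash;"),
        "range": raw_text.count("\u2013"),
    }


print(count_dashes(RAW))  # prints: {'punctuation': 2, 'range': 1}
```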

If you have a semantically better suggestion, I'll be happy to use it.

Actually... would something like this work?
<milestone type="x-punctuation-dash" marker="mdash" />
<milestone type="x-range-dash" marker="mdash" />
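
A minimal sketch of how a consumer could tell the two milestones apart,
assuming Python's xml.etree.ElementTree and a made-up fragment without the
OSIS namespace. Whether the schema accepts these type values is not settled
here; the sketch only shows that the type attribute makes the two cases
searchable. The function name dashes_of is hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up fragment using the two milestone forms proposed above.
FRAGMENT = (
    "<div>"
    "<title>Second Speech of Moses (4:44"
    '<milestone type="x-range-dash" marker="mdash" />'
    "11:32)</title>"
    "<p>He came "
    '<milestone type="x-punctuation-dash" marker="mdash" />'
    " and left.</p>"
    "</div>"
)

root = ET.fromstring(FRAGMENT)


def dashes_of(tree, kind):
    """Return every <milestone> element whose type attribute equals kind."""
    return [m for m in tree.iter("milestone") if m.get("type") == kind]


print(len(dashes_of(root, "x-range-dash")))
print(len(dashes_of(root, "x-punctuation-dash")))
```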

2012/11/9 Chris Little <chrislit at crosswire.org>

>> How would you suggest that an exception like this should be coded? Add
>> some custom type attribute value to indicate special handling in layout?
> This was exactly the case for which <chapter> was made milestonable. You
> can switch all of your chapter elements to milestones:

I was hoping for some other solution. My impression is that these milestone
versions of structure indicators weaken the value and usability of markup:
I'd guess there are numerous tools that assume "strong" markup where at
least the basic structures are marked with proper start and end tags
instead of milestones.

But I guess it has to be done like that, and we do already have the other
basic structure block (verses) marked with milestones, so the tools will
need to understand milestones anyway.

I've done the chapters with container tags now, but it's quite simple to
convert them to milestones.
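
A minimal sketch of such a conversion, assuming Python's
xml.etree.ElementTree, chapters that sit directly under their parent
element, and a made-up fragment without the OSIS namespace. The function
name chapters_to_milestones is hypothetical.

```python
import xml.etree.ElementTree as ET


def chapters_to_milestones(parent):
    """Replace each container <chapter> directly under parent with a start
    milestone, the chapter's former children, and an end milestone."""
    new_children = []
    for child in list(parent):
        is_container = (
            child.tag == "chapter"
            and child.get("sID") is None
            and child.get("eID") is None
        )
        if not is_container:
            new_children.append(child)  # leave everything else untouched
            continue
        osis_id = child.get("osisID")
        start = ET.Element("chapter", {"osisID": osis_id, "sID": osis_id})
        start.tail = child.text  # text that opened the container
        end = ET.Element("chapter", {"eID": osis_id})
        end.tail = child.tail  # text that followed the container
        new_children.append(start)
        new_children.extend(list(child))
        new_children.append(end)
    parent[:] = new_children
    return parent


# Made-up sample: one container chapter holding one milestoned verse.
book = ET.fromstring(
    '<div osisID="Gen"><chapter osisID="Gen.1">'
    '<verse osisID="Gen.1.1" sID="Gen.1.1"/>In the beginning'
    '<verse eID="Gen.1.1"/></chapter></div>'
)
chapters_to_milestones(book)
print(ET.tostring(book, encoding="unicode"))
```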

