[sword-devel] OSIS ids and refs

DM Smith dmsmith at crosswire.org
Wed Feb 11 11:51:06 MST 2015


I'm still working on examining how JSword handles modules and am finding errors in modules.

The OSIS specification gives an osisID a very well-defined vocabulary. Briefly (this is a summary, not an exact statement):
Words consist of letters, numbers and _ (and, in some parts, escaped non-whitespace characters).
Spaces are not allowed (except to separate one osisID from another).
'.' is used as a separator between words within a part.
An osisID has up to three parts:
Work:
	(((\p{L}|\p{N}|_)+)((\.(\p{L}|\p{N}|_)+)*)?:)?
	an optional prefix consisting of one or more words terminated by a colon 
Single reference:
	((\p{L}|\p{N}|_|(\\[^\s]))+)((\.(\p{L}|\p{N}|_|(\\[^\s]))+)*)?
	a required list of one or more words
	The significant addition here is \\[^\s], which says any non-whitespace character can be escaped with a backslash.
Grain:
	(!((\p{L}|\p{N}|_|(\\[^\s]))+)((\.(\p{L}|\p{N}|_|(\\[^\s]))+)*)?)?
	an optional suffix preceded by a ! and consisting of one or more words
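
To make the three parts concrete, here is a rough Python sketch that composes them into one pattern. It is not JSword's code and not the schema itself. Python's re module has no \p{L} or \p{N}, so the sketch assumes that the Unicode-aware \w is a close enough stand-in for (\p{L}|\p{N}|_). The names WORD, ESC_WORD and is_osis_id are made up for illustration.

```python
import re

# Assumption: in Python 3, \w (Unicode letters, digits and _) approximates
# the schema's (\p{L}|\p{N}|_). It is not guaranteed to be an exact match.
WORD = r"\w+"
# A word that may also contain a backslash followed by any non-whitespace.
ESC_WORD = r"(?:\w|\\\S)+"

WORK = rf"(?:{WORD}(?:\.{WORD})*:)?"           # optional prefix ending in ':'
REFERENCE = rf"{ESC_WORD}(?:\.{ESC_WORD})*"    # required dotted words
GRAIN = rf"(?:!{ESC_WORD}(?:\.{ESC_WORD})*)?"  # optional suffix after '!'

OSIS_ID = re.compile(WORK + REFERENCE + GRAIN)


def is_osis_id(text):
    """Return True if the whole string matches the sketched osisID pattern."""
    return OSIS_ID.fullmatch(text) is not None
```

With this sketch, is_osis_id("Gen.1.1!a") is accepted, while a string containing a bare / or a space is rejected.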

An osisRef allows a single osisID, two single osisIDs separated by a - (a range), or a space-separated list of osisRefs.

In the GenBook module I'm seeing spaces and slashes in osisRefs, such as:
osisRef="BaxterPastor:The Reformed Pastor/C3/S2/PI"

I may be reading the above regex incorrectly, but I don't think this will validate, failing on /.

The following would pass:
osisRef="BaxterPastor:The Reformed Pastor\/C3\/S2\/PI"
But because spaces separate osisIDs, this is actually 3 osisIDs:
BaxterPastor:The
Reformed
Pastor\/C3\/S2\/PI
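
The split can be checked with a small Python sketch. It only illustrates how I read the spec and is not how JSword parses refs. It assumes \w approximates (\p{L}|\p{N}|_), and the helper name check_osis_ref is hypothetical. Ranges written with - are not handled here.

```python
import re

# Assumption: \w stands in for the schema's (\p{L}|\p{N}|_).
_W = r"(?:\w|\\\S)+"  # word characters or backslash-escaped non-whitespace
OSIS_ID = re.compile(
    rf"(?:\w+(?:\.\w+)*:)?{_W}(?:\.{_W})*(?:!{_W}(?:\.{_W})*)?"
)


def check_osis_ref(ref):
    """Split a ref on whitespace and report whether each piece is an osisID."""
    return [(piece, OSIS_ID.fullmatch(piece) is not None)
            for piece in ref.split()]


# Each piece is reported with True (valid) or False (invalid).
for piece, ok in check_osis_ref(r"BaxterPastor:The Reformed Pastor\/C3\/S2\/PI"):
    print(piece, ok)
```

The escaped form yields three pieces that each pass on their own, which is why it validates without meaning what the module author intended. With the unescaped slashes the third piece fails.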

The upshot is that JSword, which is written against the OSIS spec, chokes on this osisRef.

What should it be?
It seems best to follow the intention of the OSIS spec.
Use _ instead of spaces.
Use . instead of / (or \/) to separate path parts.
If a . is in the path, escape it: \.
If an _ is in the path, escape it: \_

This would then be
osisRef="BaxterPastor:The_Reformed_Pastor.C3.S2.PI"
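
The four rules above could be applied mechanically. Here is a minimal Python sketch, assuming the GenBook key is a /-separated path. The function name path_to_osis_ref is hypothetical, and the sketch covers only the four rules listed. Other punctuation, including a literal backslash in the path, would need escaping too and is not handled.

```python
def path_to_osis_ref(work, path):
    """Build an osisRef from a work name and a '/'-separated GenBook path."""
    parts = []
    for part in path.split("/"):
        # Escape literal '.' and '_' first, so they are not confused with
        # the separators introduced in the next step.
        part = part.replace(".", r"\.").replace("_", r"\_")
        # Spaces are not allowed inside an osisID, so use '_' instead.
        part = part.replace(" ", "_")
        parts.append(part)
    # '.' separates the path parts; ':' terminates the work prefix.
    return work + ":" + ".".join(parts)


print(path_to_osis_ref("BaxterPastor", "The Reformed Pastor/C3/S2/PI"))
# prints BaxterPastor:The_Reformed_Pastor.C3.S2.PI
```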

Or is this a "bug" in the OSIS spec that needs to be fixed?

In Him,
	DM




