[sword-devel] [bt-devel] Judges: Judg or Jud? and the handling of invalid references

Sat Mar 28 02:22:46 MST 2009

ISBE is now updated, BTW (in beta).

Jonathan Marsden wrote:
> Chris Little wrote:
> 
>> There's no mention of which version of ISBE was tested here, but I'll
>> address this as if it were the latest edition that is being discussed
>> since the problem does exist there. (But it's my suspicion that this
>> report is actually based on the previous version.)
> 
> I only got involved a couple of months ago, so I have no SWORD modules I
> downloaded earlier than that.  I used the ISBE, downloaded from
> CrossWire, that says in isbe.conf that it is:
> 
>   Version=1.6
>   SwordVersionDate=2008-11-26
> 
> I'm retesting now on a fresh download from the CrossWire repository,
> just in case something "unique" happened earlier.

That's the latest public version, but we have a more recent version in beta.

>> The problem, in any event, is in the module (at least for now).
> 
> OK... I think it would be "nice" (helpful, and appropriate) if the front
> end software could warn the end user of such problems (since this would
> both make the user aware of the immediate issue, and also probably lead
> to feedback to module creators fairly quickly, and so to getting the
> relevant modules fixed).
> 
> Since I expect SWORD can take a reference of the form Book X:Y and
> convert it to some internal SWORD key format, and also convert that
> internal format back again into a textual reference... testing whether
> any given Book X:Y reference is canonical could be as simple as
> converting to internal form and back, and comparing the two strings for
> equality??

Converting a textual reference ("Book X:Y") to the internal
representation is a many-to-one mapping. Going from the internal
representation back to textual form is in principle one-to-many, but we
support only a single textual form per internal reference (one-to-one).

So we take Judg 1:1, Judges 1:1, Jdg 1:1, etc. and map them all to 
(let's say... though this is not actually correct) a reference like 
Judg.1.1. But if we want to turn that into a textual reference, we can 
only map that to Judges 1:1 (or its localized equivalent). So this kind 
of string comparison definitely would not work.
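
As a toy illustration of why the round trip fails (the alias table and
the function names here are made up for this example; this is not the
SWORD API, and Judg.1.1 is the same not-quite-correct stand-in for the
internal form used above):

```python
# Hypothetical alias table: many textual spellings map to one internal book key.
ALIASES = {
    "judg": "Judg",
    "judges": "Judg",
    "jdg": "Judg",
    "jude": "Jude",
}

# Only one textual form is supported per internal book key.
CANONICAL_NAME = {"Judg": "Judges", "Jude": "Jude"}


def parse(ref):
    """Map 'Book X:Y' to a key like 'Judg.1.1'; return None if it cannot be parsed."""
    book, _, chapter_verse = ref.strip().rpartition(" ")
    chapter, _, verse = chapter_verse.partition(":")
    key = ALIASES.get(book.lower())
    if key is None or not (chapter.isdigit() and verse.isdigit()):
        return None
    return f"{key}.{int(chapter)}.{int(verse)}"


def render(internal):
    """Map an internal key back to its single supported textual form."""
    book, chapter, verse = internal.split(".")
    return f"{CANONICAL_NAME[book]} {chapter}:{verse}"


def round_trips(ref):
    """The proposed check: parse, render, and compare with the input string."""
    internal = parse(ref)
    return internal is not None and render(internal) == ref


for ref in ("Judg 1:1", "Jdg 1:1", "Judges 1:1"):
    print(ref, "->", parse(ref), "->", render(parse(ref)), round_trips(ref))
```

In this sketch all three spellings parse to the same key, but only
"Judges 1:1" survives the string comparison, so the check would flag
perfectly good references as bad.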

> Of course, once the norm for input source is an OSIS XML document, then
> one can (and should!) validate that XML; but even before that becomes
> the norm, surely the above simpler approach, or something like it, could
> be run over the existing module set and catch this kind of thing,
> improving overall accuracy and utility of the module collection?

A couple of small points. First (and this is just a technical point),
we're not talking about OSIS here but rather TEI, though we are using
OSIS references within TEI documents.

More importantly, OSIS references are not validated beyond their 
conformance to a very general format. They aren't even validated to 
confirm that they conform to a (bookName).number.number type of format 
or that the bookName portion matches some set of pre-defined book names. 
Because of the broad range that an osisRef must cover, stricter
validation of that kind wouldn't even be possible.
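
To make that concrete, here is a sketch of what a purely format-level
check looks like. The pattern below is a deliberately loose stand-in
that I made up for illustration; it is not the pattern from the OSIS
schema, and looks_like_ref is a hypothetical helper:

```python
import re

# Hypothetical, deliberately loose pattern: dot-separated alphanumeric segments.
# This is NOT the OSIS schema's pattern, only a stand-in for "a very general format".
LOOSE_REF = re.compile(r"[A-Za-z0-9_]+(\.[A-Za-z0-9_]+)*")


def looks_like_ref(value):
    """Return True if the whole string fits the loose dotted format."""
    return LOOSE_REF.fullmatch(value) is not None


for candidate in ("Judg.1.1", "Foo.99.bar", "Judg 1:1"):
    print(candidate, looks_like_ref(candidate))
```

A check like this accepts "Foo.99.bar" as happily as "Judg.1.1": it
knows nothing about book names or about which segments ought to be
numbers, so it cannot catch a wrong or ambiguous book abbreviation.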

> I found what is probably "the standard".  On the OSIS web site, per the
> OSIS 2.1.1 User Manual, Appendix C, Judg and Jude are apparently
> considered "normative".

Sure, but SWORD has to be able to take input from the user and try to
guess at their intended meaning. All of the OSIS book names will
definitely work if the user inputs them, but we obviously can't assume
that all users will use OSIS refs. The references that appear in ISBE
1.6 are simply parsed by the engine through the same mechanism that
parses user input.

--Chris