[sword-devel] Parsing osisID and osisRef
Weston Ruter
westonruter at gmail.com
Fri Nov 20 08:27:49 MST 2009
A year ago I did some initial work on a PHP implementation of such a
parser:
http://open-scriptures.googlecode.com/svn/branches/php-prototypes/reference-parser.lib.php
Maybe the algorithms would be of use to you. I only refined it for New
Testament references.
On Fri, Nov 20, 2009 at 5:28 AM, DM Smith <dmsmith at crosswire.org> wrote:
> SWORD's ability to parse arbitrary input into a list of verses is awesome.
> It is far more powerful than what is needed for an osisID or an osisRef.
>
> The structure of these for biblical references is very well defined.
>
> Here is a partial BNF for it. (I've extended the BNF with [ ] to mark
> optional parts, instead of using ε for the empty production, and I allow
> the brackets to appear anywhere.)
>
> # An osisRef is a space-separated list of osisRefs,
> # a single osisID, or two osisIDs separated by a dash
> osisRef ::= <osisRef> " " <osisRef>
> | <osisID> [ "-" <osisID> ]
>
> # An osisID is a reference with an optional work prefix and/or grain.
> osisID ::= [ <workPrefix> ":" ] <reference> [ "!" <grain> ]
>
> # A reference has a book name, optionally followed by a chapter and a verse,
> # separated by a period '.'
> reference ::= <bookname>
> | <bookname> "." <number>
> | <bookname> "." <number> "." <number>
>
> # Book names are normalized to a particular list, including the
> # deuterocanonical books.
> bookname ::= "Gen" | "Exod" | "Lev" | ... skipping for brevity ... | "Rev"
>
> # Numbers are nonzero and never have leading zeros
> number ::= <nzdigit> [ <digits> ]
>
> nzdigit ::= "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
> digit ::= "0" | <nzdigit>
>
> digits ::= <digit> [ <digits> ]
>
> workPrefix ::= .....
> grain ::= .....
>
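The grammar above maps fairly directly onto a small hand-written parser. What follows is a minimal Python sketch, not SWORD code. The names parse_osis_id, parse_osis_ref, OsisID and BOOKS are hypothetical, BOOKS holds only the four book names spelled out above, and workPrefix and grain (left unspecified in the grammar) are kept as opaque strings on the assumption that they contain no space, "-", ":" or "!".

```python
from typing import List, NamedTuple, Optional, Tuple

# Placeholder subset: the post elides the full normalized book list.
BOOKS = {"Gen", "Exod", "Lev", "Rev"}


class OsisID(NamedTuple):
    work: Optional[str]
    book: str
    chapter: Optional[int]
    verse: Optional[int]
    grain: Optional[str]


def parse_number(text: str) -> int:
    """number ::= <nzdigit> [ <digits> ]  (nonzero, no leading zero)."""
    # isascii() keeps out non-ASCII digits that isdigit() alone would accept.
    if not (text.isascii() and text.isdigit()) or text[0] == "0":
        raise ValueError(f"bad number: {text!r}")
    return int(text)


def parse_osis_id(text: str) -> OsisID:
    """osisID ::= [ <workPrefix> ":" ] <reference> [ "!" <grain> ]"""
    work = grain = None
    # Assumption: workPrefix and grain are opaque; only the separators matter.
    if ":" in text:
        work, text = text.split(":", 1)
    if "!" in text:
        text, grain = text.split("!", 1)
    parts = text.split(".")
    if not 1 <= len(parts) <= 3 or parts[0] not in BOOKS:
        raise ValueError(f"bad reference: {text!r}")
    numbers = [parse_number(p) for p in parts[1:]]
    chapter = numbers[0] if len(numbers) > 0 else None
    verse = numbers[1] if len(numbers) > 1 else None
    return OsisID(work, parts[0], chapter, verse, grain)


def parse_osis_ref(text: str) -> List[Tuple[OsisID, OsisID]]:
    """osisRef ::= space-separated items, each <osisID> [ "-" <osisID> ]"""
    ranges = []
    for item in text.split(" "):
        start, dash, end = item.partition("-")
        first = parse_osis_id(start)
        # A lone osisID is returned as a range whose ends are equal.
        last = parse_osis_id(end) if dash else first
        ranges.append((first, last))
    return ranges
```

With these definitions, parse_osis_ref("Gen.1.1-Gen.1.3 Rev.22") returns two ranges, the second being a single chapter whose start and end are equal, while inputs such as "Gen.01" or "Gen.0" raise ValueError.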
> I'd like to write parseOsisRef and parseOsisID and use them within osis2mod.
> Right now, I have to munge the osisRefs and osisIDs into a form that
> ParseVerseList will understand.
>
> The code will be much simpler and much faster than ParseVerseList. Here are
> some of the ParseVerseList features that the new code would not need to handle:
> a) It understands internationalized book names
> b) It understands all kinds of abbreviations for book names
> c) It allows roman numerals in book names.
> d) It does not require a book name for a reference, but uses the last seen
> reference's book name as a basis.
> e) Likewise, it does not require a chapter number for a verse reference,
> but uses the last seen reference's book and chapter as a basis.
> f) It allows special constructs such as "v 3", "c 4", "9f" and "12ff"
> for a verse, a chapter, a verse plus the next verse, and a verse through
> the end of the chapter. (There are other special constructs.)
>
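To illustrate how much narrower this input is than what ParseVerseList accepts, a regular expression built from the reference rule alone already rejects forms of the kind listed above. This is only a sketch in Python: the name is_strict_reference and the shortened book list are assumptions, and it covers the reference rule only, not the work prefix, grain, or ranges.

```python
import re

# Hypothetical subset; the normalized book list in the post is longer.
BOOK = r"(?:Gen|Exod|Lev|Rev)"
NUMBER = r"[1-9][0-9]*"  # nonzero, no leading zeros
REFERENCE = re.compile(rf"{BOOK}(?:\.{NUMBER}(?:\.{NUMBER})?)?")


def is_strict_reference(text: str) -> bool:
    """True only for <bookname>[.<number>[.<number>]]."""
    return REFERENCE.fullmatch(text) is not None


# Free-form inputs of the kind described in (a) through (f).
REJECTED = ["Genesis 1:1", "II Kings 3", "3:16", "v 3", "c 4", "9f", "12ff"]
assert not any(is_strict_reference(s) for s in REJECTED)
assert all(is_strict_reference(s) for s in ["Gen", "Exod.20", "Rev.22.21"])
```

Because nothing depends on locale data or on previously seen references, the check has no state to carry between calls.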
> Any input?
>
> In Him,
> DM