A year ago I did some initial work developing a PHP implementation for such a parser: <a href="http://open-scriptures.googlecode.com/svn/branches/php-prototypes/reference-parser.lib.php">http://open-scriptures.googlecode.com/svn/branches/php-prototypes/reference-parser.lib.php</a><br>
<br>Maybe the algorithms will be of use to you. I only refined the implementation for New Testament references.<br><br><div class="gmail_quote">On Fri, Nov 20, 2009 at 5:28 AM, DM Smith <span dir="ltr"><<a href="mailto:dmsmith@crosswire.org">dmsmith@crosswire.org</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">SWORD's ability to parse arbitrary input into a list of verses is awesome. It is far more powerful than what is needed for an osisID or an osisRef.<br>
<br>
The structure of these for biblical references is very well defined.<br>
<br>
Here is a partial BNF for it. (I've simplified/extended the BNF by using [ ] to mark optional parts, instead of ε for the empty production, and by allowing them anywhere.)<br>
<br>
# An osisRef can be a space-separated list of osisRefs,<br>
# or a single osisID optionally followed by a dash and a second osisID<br>
osisRef ::= <osisRef> " " <osisRef><br>
| <osisID> [ "-" <osisID> ]<br>
<br>
# An osisID is a reference with an optional work prefix and/or grain.<br>
osisID ::= [ <workPrefix> ":" ] <reference> [ "!" <grain> ]<br>
<br>
# A reference has a book name and can be followed by a chapter and a verse, separated by a period '.'<br>
reference ::= <bookname><br>
| <bookname> "." <number><br>
| <bookname> "." <number> "." <number><br>
<br>
# Book names are normalized to a particular list, including the deuterocanonical books.<br>
bookname ::= "Gen" | "Exod" | "Lev" | ... skipping for brevity... | "Rev"<br>
<br>
# Numbers are nonzero and never have leading zeros<br>
number ::= <nzdigit> [ <digits> ]<br>
<br>
nzdigit ::= "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"<br>
digit ::= "0" | <nzdigit><br>
<br>
digits ::= <digit> [ <digits> ]<br>
<br>
workPrefix ::= .....<br>
grain ::= .....<br>
<br>
I'd like to write parseOsisRef and parseOsisID and use them within osis2mod.<br>
Right now, I have to munge the osisRefs and osisIDs to a form that ParseVerseList will understand.<br>
<br>
The code will be much simpler and much faster than ParseVerseList. Here are some of the specialties of ParseVerseList that don't need to be handled.<br>
a) It understands internationalized book names<br>
b) It understands all kinds of abbreviations for book names<br>
c) It allows roman numerals in book names.<br>
d) It does not require a book name for a reference, but uses the last seen reference's book name as a basis.<br>
e) Likewise, it does not require a chapter number for a verse reference, but uses the last seen reference's book and chapter as a basis.<br>
f) It allows special constructs such as "v 3", "c 4", "9f" and "12ff", meaning verse, chapter, the following verse, and through the end of the chapter, respectively. (There are other special constructs.)<br>
<br>
Any input?<br>
<br>
In Him,<br>
DM<br>
</blockquote></div><br>
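<br>For what it's worth, the grammar quoted above is small enough that a single regular expression plus two short functions can cover it. Here is a minimal sketch in Python (it is not SWORD code). The names parse_osis_id and parse_osis_ref, the shortened book list, and the patterns used for workPrefix and grain are all assumptions made for illustration, since the grammar leaves those parts out.<br>
<br>

```python
import re

# Assumption: only the book names spelled out in the grammar are listed;
# the full normalized list is elided there and is not filled in here.
BOOKS = {"Gen", "Exod", "Lev", "Rev"}

# number ::= <nzdigit> [ <digits> ]   (nonzero, no leading zeros)
NUMBER = r"[1-9][0-9]*"

# osisID ::= [ <workPrefix> ":" ] <reference> [ "!" <grain> ]
# Assumption: workPrefix and grain are undefined in the grammar, so they are
# treated as runs of characters without whitespace or "-" (and, for the
# work prefix, without ":" or "!").
OSIS_ID = re.compile(
    r"(?:(?P<work>[^:!\s-]+):)?"
    r"(?P<book>[A-Za-z0-9]+)"
    rf"(?:\.(?P<chapter>{NUMBER})(?:\.(?P<verse>{NUMBER}))?)?"
    r"(?:!(?P<grain>[^\s-]+))?"
)


def parse_osis_id(text):
    """Parse one osisID into its parts, or raise ValueError."""
    m = OSIS_ID.fullmatch(text)
    if m is None or m.group("book") not in BOOKS:
        raise ValueError(f"not a valid osisID: {text!r}")
    return {
        "work": m.group("work"),
        "book": m.group("book"),
        "chapter": int(m.group("chapter")) if m.group("chapter") else None,
        "verse": int(m.group("verse")) if m.group("verse") else None,
        "grain": m.group("grain"),
    }


def parse_osis_ref(text):
    """Parse a space-separated osisRef into a list of (first, last) pairs."""
    ranges = []
    for item in text.split(" "):
        # <osisID> [ "-" <osisID> ]: a lone osisID is a range of one.
        start, dash, end = item.partition("-")
        first = parse_osis_id(start)
        last = parse_osis_id(end) if dash else first
        ranges.append((first, last))
    return ranges
```

<br>
In this sketch, parse_osis_ref("Gen.1.1-Gen.1.5 Rev.22") yields two ranges, and anything outside the grammar (an unknown book name, a leading zero as in "Gen.01", a dangling dash) raises ValueError instead of being guessed at. None of the ParseVerseList conveniences (abbreviations, carried-over book or chapter, "ff") are attempted.<br>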