[sword-devel] Verse parsing

Troy A. Griffitts sword-devel@crosswire.org
Tue, 17 Jun 2003 15:45:35 -0700


The problem is resolved by adding an additional
"1 BOOK..." entry for each of the
"1. BOOK..." entries.

I would think these would have been there anyway, since some users may 
be lazy and not include the '.'.  For English, we include all kinds of 
variations, including, for example, "1 John", "1John", "I John", 
"IJohn", so adding a "1 BOOK" entry shouldn't be a problem; if anything, 
it is an enhancement to the locale.

JUST TO REITERATE.  THE END USER CAN STILL USE "1. BOOK"; it's the 
locale that must include "1 BOOK" syntax.
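
To make the point concrete, here is a minimal sketch of the behaviour described in this thread. It is not the engine's actual code: the function names, the dictionary-based locale tables, and the "Johannes" entries are assumptions made purely for illustration.

```python
import re

# Hypothetical locale tables: book name as listed in the locale -> book key.
# The first table lists only the dotted form; the second lists both forms.
LOCALE_DOTTED_ONLY = {"1. JOHANNES": "1John"}
LOCALE_WITH_BOTH = {"1. JOHANNES": "1John", "1 JOHANNES": "1John"}


def normalize_book(text):
    """Drop a '.' that directly follows a leading number, then uppercase."""
    stripped = re.sub(r"^\s*(\d+)\.\s*", r"\1 ", text)
    # Collapse runs of whitespace so "1.  Johannes" and "1 Johannes" agree.
    return " ".join(stripped.upper().split())


def lookup_book(text, locale):
    """Return the book key, or None when the locale has no matching entry."""
    return locale.get(normalize_book(text))
```

With these tables, lookup_book("1. Johannes", LOCALE_DOTTED_ONLY) finds nothing, because the parser has already reduced the input to "1 JOHANNES". The same call against LOCALE_WITH_BOTH resolves, whether the user typed the '.' or not.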

Our verse parser, which parses all kinds of formats, should AT LEAST be 
able to parse basic osisRef syntax.  If you want to extend it to support 
the [work:] modifier for an osisRef, then more power to ya :)

Currently, since we don't have that 'concept' in the engine, I'm not 
worried about the current requirement that the client of the parser 
would be required to strip any [work:] prefix from the reference and do 
what they need with it accordingly.
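
As a rough sketch of what that client-side step could look like, assuming the work prefix is everything before the first ':' and that the remainder is a plain Book.Chapter.Verse reference: both helper names and the "Bible.KJV" work identifier are hypothetical, and ranges or other osisRef extensions are deliberately not handled.

```python
def split_work_prefix(osis_ref):
    """Split 'work:ref' into (work, ref); work is None when no prefix exists."""
    work, sep, rest = osis_ref.partition(":")
    if sep:
        return work, rest
    return None, osis_ref


def parse_basic_osis_ref(ref):
    """Parse 'Book.Chapter.Verse'; chapter and verse are optional."""
    parts = ref.split(".")
    book = parts[0]
    chapter = int(parts[1]) if len(parts) > 1 else None
    verse = int(parts[2]) if len(parts) > 2 else None
    return book, chapter, verse
```

A client would call split_work_prefix("Bible.KJV:1Jn.1.1"), keep the work part for its own purposes, and pass only "1Jn.1.1" on to the verse parser.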

Judging from your first statement, below, I think you may have missed my 
comment that the end user will still be perfectly able to use this '.' 
indicator.

	-Troy.


PS.  Joachim has already updated the de.conf file.
PPS.  Martin, yes, any other locale that ONLY includes '1. BOOK' syntax 
should also have '1 BOOK' syntax added, if you'd like to help update those.



Chris Little wrote:
> As Martin said, using '.' as an indicator of an ordinal value is 
> EXTREMELY common cross-linguistically.
> 
> On the other hand, I don't see why OSIS parsing needs to occur in the 
> same parser.  OSIS references are completely controlled and don't 
> require the complexity of our general parser.  The occurrence of OSIS 
> references is also controlled: they only occur within osisRef/osisID 
> attributes in specific elements of OSIS documents.  Besides that, 
> without looking at the code, I imagine our parser does not currently 
> parse the work portion of an osisID/osisRef, which is completely 
> inappropriate for the verse parser.  So why combine them?
> 
> --Chris
> 
> 
> Troy A. Griffitts wrote:
> 
>> Joachim and Martin,
>>     I spent some time trying to track down the German verse parsing error
>> today and have some information for you.
>>
>>     To handle the "1. Book" syntax, the parser decides if it has a book
>> name yet.  If it does not, it strips the '.' and continues parsing.
>> This leaves the book name as "1 Book".  The German locale does not have
>> entries that resolve for this.  Maybe you have an argument that the
>> parser shouldn't strip the '.', but we are doing it now to conform to
>> OSIS, which reserves the '.' for a BOOK, CHAPTER, VERSE separator (e.g.
>> osisRef syntax for 1 John 1:1 is '1Jn.1.1').  So, for now, could we just
>> get the German locale updated with entries for books without the '.'?
>> The user will still be able to enter '1. Johan', but the locale must be
>> able to match '1 Johan'.  Let me know if this wasn't clear to you and I
>> can explain further.
>>
>>     Thanks for all your great work!  Looking forward to a release soon!
>>
>>         -Troy.