xml:whitespace: was Re: [osis-core] <hi> types

Troy A. Griffitts osis-core@bibletechnologieswg.org
Thu, 21 Aug 2003 12:16:54 -0700


Chris, Todd, and Patrick,

	Thank you for your detailed responses.

> In other words, the XML parser does not "do" anything to the whitespace, 
> but merely passes it along and gives the application notice of it, along 
> with how it "should" be processed by the application.
> 
> There is no guarantee that the application will honor this intention. 
> Note that browsers are a good example of applications that have default 
> rules for handling whitespace.

Right, thank you for correcting my 'xml:whitespace' to 'xml:space'.
I was asking whether we should HONOR the attribute if it is set.  Is this 
something we, as an OSIS body of recommenders, want to say:

	'If xml:space="preserve" is set, an OSIS application is expected to 
honor this.'

Also, if we want to use this attribute, we would presumably do what the W3C 
says we must and declare it per their specification.  I would suggest it be 
declared only on something high up, as we don't really want to check for it 
everywhere and toggle different behaviour at any given tag.
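
As a rough illustration of what "honoring" the attribute could mean in an 
application, here is a minimal sketch using Python's standard ElementTree. 
The function name normalize and the sample element names are only 
illustrative, and collapsing runs of whitespace to a single space is an 
assumed default behaviour, not something OSIS or the W3C prescribes. The 
setting is inherited from an ancestor, so declaring it once high up is enough.

```python
import re
import xml.etree.ElementTree as ET

# ElementTree exposes xml:space under the reserved XML namespace.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def normalize(elem, preserve=False):
    """Collapse whitespace runs unless xml:space="preserve" is in effect.

    Hypothetical helper: the collapse-to-one-space default is an assumption.
    """
    mode = elem.get(XML_SPACE)
    if mode == "preserve":
        preserve = True
    elif mode == "default":
        preserve = False

    if not preserve and elem.text:
        elem.text = re.sub(r"\s+", " ", elem.text)

    for child in elem:
        normalize(child, preserve)
        # Text following a child belongs to this element's content,
        # so this element's setting governs it.
        if not preserve and child.tail:
            child.tail = re.sub(r"\s+", " ", child.tail)


kept = ET.fromstring(
    '<osisText xml:space="preserve"><p>Tempe, AZ  85280-2528</p></osisText>'
)
normalize(kept)

collapsed = ET.fromstring("<osisText><p>Tempe, AZ  85280-2528</p></osisText>")
normalize(collapsed)
```

With the attribute set on the root, the double space between STATE and ZIP 
survives; without it, this sketch collapses it to one space.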


> Why would I need "2 spaces between sentences?"

Because it's proper English. :)


>  >     2 spaces between STATE and ZIP in an address?
> 
> Or here?

Because it's proper English. ;)


>  >     Extra spaces before GOD in Chinese?
> 
> Assume this is a rendering requirement? Suggest 
> <divineName>God</divineName>, assuming you can mark occurrences with a 
> script for imposition of the style.

Not to reopen the issue with divineName, but divineName is ONLY for the 
anomaly when the Hebrew Tetragrammaton is changed in the text to 'LORD', 
'GOD', or OTHER because the Jews were AFRAID to misspeak the name of God.

Chinese does not place 2 spaces before the Tetragrammaton, but they use 
2 spaces like we capitalize God, Him, et al.  We don't strip our CAPS, 
why should we remove their double spaces?


> 
>  >     Preserve TABs?
> 
> Do you mean as in tables? That's an ugly problem. Seems like I saw a 
> partial solution to that years ago, let me poke around in my SGML 
> archives for a while.

In an excerpt like this (OK, I guess now I'm looking for OSIS markup...):


Here is my contact information:

	P. O. Box 2528
	Tempe, AZ  85280-2528

	Phone:	(602) 628-7771
	Fax:	(602) 628-7771


	Sincerely,



		Troy A. Griffitts
		Director
		CrossWire Bible Society
		http://www.crosswire.org



> 
>  >     Preserve NewLines?
> 
> Not sure what you mean here?

See above for New Lines.
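
One possible cleanup for an excerpt like the one above is to turn each 
newline into an <lb/> element, so the line structure survives even if an 
application normalizes whitespace. The sketch below assumes Python's standard 
ElementTree; the helper name lines_to_lb and the choice of <p> as the 
container are hypothetical, and it does nothing about the leading TABs.

```python
import xml.etree.ElementTree as ET


def lines_to_lb(text, tag="p"):
    """Wrap text in an element, replacing each newline with an <lb/> child.

    Hypothetical helper for illustration only.
    """
    elem = ET.Element(tag)
    lines = text.split("\n")
    elem.text = lines[0]
    for line in lines[1:]:
        lb = ET.SubElement(elem, "lb")
        # In ElementTree, the text after an element is stored as its tail.
        lb.tail = line
    return elem


address = lines_to_lb("P. O. Box 2528\nTempe, AZ  85280-2528")
markup = ET.tostring(address, encoding="unicode")
```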


> Don't think we should dismiss your idea of large amounts of litely 
> marked texts nor ignore Chris's suggestion that some cleanup is probably 
> not that hard. It really isn't an either/or situation.

Right, I agree that we can probably find some solution with <lb/>.
I don't like the 'Unicode NBSP (0xA0)' suggestion for double spaces 
because it would introduce what I consider binary, non-human-readable 
codes into an otherwise 'plaintext' editable document.  But maybe we 
need to rethink what we consider a 'plaintext' document.  I still tend 
to think not yet.
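
For what it is worth, XML allows a character to be written as a numeric 
character reference, so NBSP can appear in the source as the ASCII string 
&#xA0; rather than as a raw 0xA0 code point. The sketch below, using Python's 
standard ElementTree, only demonstrates that parser behaviour; the sample 
markup is made up, and whether such a reference counts as human-readable is 
still a judgment call.

```python
import xml.etree.ElementTree as ET

# The source stays plain ASCII; NBSP is spelled as a character reference.
source = "<p>Tempe, AZ&#xA0; 85280-2528</p>"

parsed = ET.fromstring(source)

# The parser hands the application the real U+00A0 character.
has_nbsp = "\u00a0" in parsed.text
source_is_ascii = source.isascii()
```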


> Suggestion: Can we find a place for migration to OSIS on next week's 
> calendar?

Yes.  I would love to deal with these issues.  They present very real and 
immediate problems that we are trying to resolve in our processing of 
OSIS texts.  Chris has some good suggestions on best practices of WHERE 
to handle them and possibly HOW, but I think we need to struggle with 
these ourselves, and present our recommendations.


	Thanks again for your attention and time,
		-Troy.