[osis-core] whitespace
Chris Little
osis-core@bibletechnologieswg.org
Fri, 08 Aug 2003 15:59:02 -0700
Troy A. Griffitts wrote:
>> Realizing that I probably hold the minority position, ;-), I would
>> recommend normalizing as part of the application (note not the XML
>> parser), all the white space in your example to single spaces.
>
> no, no; I know of at least one other that might agree with you.
That would be me. Any run of contiguous whitespace, of whatever type,
should be equivalent to a single space.
My best reason for saying that is that encoders will treat the situation
as such if they have knowledge of HTML. Editors like XMLSpy also
happily insert whitespace for pretty formatting (though they might quit
doing that if xml:space="preserve" were assigned).
I think Troy's example should reduce to:
<seg osisID="entry">This is an entry. I was just going to make 2 points:
o this is point 1 o and this is point 2</seg>
and that the person who encoded this should be chided, harshly. Adding
linebreak elements is simple and retains most of the important
formatting of this.
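
To make that reduction concrete, here is a minimal Python sketch of
application-level normalization. The raw input is my own assumed
reconstruction of loosely formatted markup, not Troy's actual example,
and normalize_ws is a hypothetical helper name. The XML parser hands the
whitespace through untouched; the application collapses it afterwards.

```python
import re
import xml.etree.ElementTree as ET

# The four XML whitespace characters: space, tab, carriage return, line feed.
_XML_WS = re.compile(r"[ \t\r\n]+")

def normalize_ws(text):
    """Collapse each run of XML whitespace to one space and trim the ends."""
    return _XML_WS.sub(" ", text).strip(" ")

# Assumed input for illustration only: line breaks and indentation used
# for visual layout inside the element.
raw = """<seg osisID="entry">This is an entry.  I was just going to make 2 points:

    o this is point 1
    o and this is point 2
</seg>"""

seg = ET.fromstring(raw)        # the parser keeps the whitespace as written
print(normalize_ws(seg.text))   # the application does the normalizing
```

Nothing here depends on xml:space; it only shows the step happening
after parsing rather than inside the parser.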
All that said, I also foresee that there ARE a very few instances where
contiguous whitespace itself needs to be encoded. Stylistic conventions
such as a double space between sentences are one. Another might be
encoding some kind of manuscript or document facsimile where multiple
spaces are interpreted to exist within the original.
Adding an &nbsp; entity seems like a pretty painless and helpful
shortcut (since people could already use 0xA0), but might send
the wrong message by encouraging presentation formatting. Adding an
element like HTML's <pre> would be another (extremely unpleasant, in my
opinion) possibility.
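
As a rough illustration of why 0xA0 already works, here is a Python
sketch (collapse_xml_ws is a hypothetical name, and the sample sentence
is made up). The no-break space is not one of the four XML whitespace
characters, so a normalizer restricted to those leaves it alone. One
caveat I am sure of: Python's \s does match U+00A0 on str patterns, so a
naive \s+ collapse would swallow it.

```python
import re

# Only the XML whitespace characters; U+00A0 is deliberately not included.
_XML_WS = re.compile(r"[ \t\r\n]+")

def collapse_xml_ws(text):
    """Collapse runs of XML whitespace to a single space."""
    return _XML_WS.sub(" ", text)

NBSP = "\u00a0"

# A double space between sentences, encoded as no-break space plus space.
double_spaced = "End of one." + NBSP + " Start of next."

kept = collapse_xml_ws(double_spaced)

# Unicode-aware \s treats the no-break space as whitespace and merges it.
lost = re.sub(r"\s+", " ", double_spaced)

print(kept == double_spaced)   # the NBSP and the space both survive
print(NBSP in lost)            # the naive version has dropped the NBSP
```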
--Chris