[sword-devel] XML whitespace - significant and insignificant?

David Haslam dfhdfh at protonmail.com
Sat Feb 9 08:42:43 MST 2019


That's a good point to include!


Best regards,

David

Sent with ProtonMail Secure Email.

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Saturday, 9 February 2019 15:40, DM Smith <dmsmith at crosswire.org> wrote:

> I’d also add that using XSLT can introduce unwanted white space.
>
> — DM Smith
>
> > On Feb 9, 2019, at 9:41 AM, DM Smith dmsmith at crosswire.org wrote:
> > There are several things in play with respect to whitespace in an OSIS document as it pertains to a CrossWire module rendered by SWORD or JSword to a frontend.
> >
> > 1.  osis2mod’s handling of whitespace.
> >     1a) The parser that osis2mod uses to read the OSIS document is not a validating parser. This means that whitespace between elements is always considered important.
> >     1b) Newlines \n are replaced by a space. Note: carriage returns \r (part of Windows-style line endings) and tabs are legal XML whitespace, but osis2mod does not convert them. If present, they are passed as is.
> >     1c) Multiple spaces are folded into a single space.
> >     1d) Verses are trimmed of leading and trailing space.
> >     1e) Verses in the index have a trailing DOS newline (\r\n), even if not present in the input.
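
As a rough illustration of rules 1b to 1e, here is a minimal Python sketch. It is not osis2mod's actual code (osis2mod is written in C++), and the function name normalize_verse is hypothetical; it only mimics the behaviour as described above.

```python
import re


def normalize_verse(text: str) -> str:
    """Mimic the whitespace rules 1b-1e described above (illustrative only)."""
    text = text.replace("\n", " ")      # 1b: newline -> space; \r and \t pass through untouched
    text = re.sub(r" {2,}", " ", text)  # 1c: fold runs of spaces into a single space
    text = text.strip(" ")              # 1d: trim leading and trailing space
    return text + "\r\n"                # 1e: DOS newline appended for the index


if __name__ == "__main__":
    print(repr(normalize_verse("  In the beginning\n   God created  ")))
    # prints 'In the beginning God created\r\n'
```

Note that under these rules a newline between two elements never disappears: it becomes a space, which is how pretty printing can change what the reader sees.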
> >
> > 2.  Rendering
> >     2a) The parser that SWORD uses to render an OSIS module is not a validating parser. This means that whitespace between elements is always considered important.
> >     2b) HTML and RTF are different beasts. In HTML, elements such as <div>, <p>, and <br> produce line breaks in the output, and how they are rendered is governed by CSS, possibly the implicit default stylesheet. In RTF, layout is precise and controlled by the document itself.
> >
> > 3.  Pretty print of an OSIS XML document.
> >     3a) Nearly all pretty printers will introduce spaces between elements.
> >     <?xml version="1.0" ?>
> >     <List name="Fruit List">
> >     <Item>Apple</Item>
> >     <Item>Banana</Item>
> >     <Item>Pear</Item>
> >     </List>
> >     This introduces whitespace text between the elements.
> >     If the pretty printer had instead put the newlines and spaces inside the tags themselves (before each closing >), it would not have introduced extra content.
> >     <?xml version="1.0" ?>
> >     <List name="Fruit List"
> >       ><Item>Apple</Item
> >       ><Item>Banana</Item
> >       ><Item>Pear</Item
> >     ></List>
> >
> > 3b) Some pretty printers will introduce spaces at the beginning and end of text.
> > <?xml version="1.0" ?>
> > <List name="Fruit List">
> > <Item>
> > Apple
> > </Item>
> > <Item>
> > Banana
> > </Item>
> > <Item>
> > Pear
> > </Item>
> > </List>
> > If the pretty printer had instead put the newlines and spaces inside the tags themselves (before each closing >), it would not have introduced extra content.
> > <?xml version="1.0" ?>
> > <List name="Fruit List"
> >   ><Item
> >     >Apple</Item
> >   ><Item
> >     >Banana</Item
> >   ><Item
> >     >Pear</Item
> > ></List>
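
The difference between the layouts in section 3 can be checked with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree purely as a convenient stand-in (it is not the parser SWORD or osis2mod uses), and the helper name text_content is hypothetical. It compares the character data of a compact document, a conventionally pretty-printed one, and one with the line breaks placed inside the tags.

```python
import xml.etree.ElementTree as ET

# No whitespace anywhere between elements.
compact = (
    '<List name="Fruit List">'
    "<Item>Apple</Item><Item>Banana</Item><Item>Pear</Item>"
    "</List>"
)

# Conventional pretty print: newlines and indentation between elements.
pretty = """<List name="Fruit List">
  <Item>Apple</Item>
  <Item>Banana</Item>
  <Item>Pear</Item>
</List>"""

# Line breaks and indentation placed inside the tags instead.
in_tag = """<List name="Fruit List"
  ><Item
    >Apple</Item
  ><Item
    >Banana</Item
  ><Item
    >Pear</Item
></List>"""


def text_content(xml: str) -> str:
    """Concatenate all character data of the document in document order."""
    return "".join(ET.fromstring(xml).itertext())


if __name__ == "__main__":
    print(repr(text_content(compact)))  # 'AppleBananaPear'
    print(repr(text_content(pretty)))   # '\n  Apple\n  Banana\n  Pear\n'
    print(repr(text_content(in_tag)))   # 'AppleBananaPear'
```

Only the conventional pretty print adds character data; the in-tag layout is visually broken up but carries exactly the same content as the compact form.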
> >
> > Best advice for an OSIS module:
> > One verse per line.
> > Don’t put spaces or newlines after an opening <div>.
> > In Him,
> > DM
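
A quick way to check the second piece of advice is to scan the OSIS source for opening <div> tags that are immediately followed by whitespace. The following Python sketch is only an assumption about how such a check might look; the function name is hypothetical, it is not part of any CrossWire tool, and the regular expression is a heuristic (it skips self-closing milestone divs and does not handle a ">" inside an attribute value).

```python
import re

# An opening <div ...> tag (not self-closing) that is directly followed by whitespace.
_DIV_THEN_SPACE = re.compile(r"<div\b(?:[^>]*[^/>])?>(?=\s)")


def divs_followed_by_whitespace(osis: str) -> list:
    """Return the character offsets of opening <div> tags followed by whitespace."""
    return [m.start() for m in _DIV_THEN_SPACE.finditer(osis)]


if __name__ == "__main__":
    sample = (
        '<div type="book">\n'
        '<div type="chapter"><verse osisID="Gen.1.1">In the beginning</verse></div>'
        "</div>"
    )
    print(divs_followed_by_whitespace(sample))  # [0]
```

An empty result means no opening <div> in the input is followed by a space or newline.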
> >
> > > On Feb 8, 2019, at 2:02 PM, David Haslam dfhdfh at protonmail.com wrote:
> > > Here's a question that I'd like our OSIS experts to ponder.
> > > In XML, there's a longstanding topic relating to whitespace.
> > > See http://usingxml.com/Basics/XmlSpace
> > > When we make a module from an OSIS file, are there any aspects of XML whitespace that can make a significant difference to how the module displays text or features?
> > > E.g. Might we inadvertently get a space inserted between a tagged word and a note tag?
> > > i.e. As maybe the result of performing a "pretty print" operation on the OSIS source text.
> > > cf. I'm sure you can think of other potential areas of interest.
> > > AFAIK, this has never been discussed before among us.
> > > With various software tools available for making "innocuous" changes to XML files, it's certainly the case that there's nothing to dissuade module providers from using them to "prettify" the OSIS file, even though there might - theoretically at least - be consequences.
> > > Best regards,
> > > David
> > >
> > > sword-devel mailing list: sword-devel at crosswire.org
> > > http://www.crosswire.org/mailman/listinfo/sword-devel
> > > Instructions to unsubscribe/change your settings at above page
