[sword-devel] Fix for &
Chris Little
chrislit at crosswire.org
Wed Sep 27 22:27:10 MST 2006
Karl Kleinpaste wrote:
> DM Smith <dmsmith555 at yahoo.com> writes:
>> Entities that are not handled via html should not be passed
>> through. So, if there were an entity &disclaimer; for example, it
>> should be stripped.
>
> I believe I disagree. If some &symbol; is unknown to Sword (say,
> because some new HTML standard has come along, already implemented in
> GtkHTML [which GS uses], so that a Sword module is produced which
> contains it, yet Sword itself has not been updated to recognize it),
> why shouldn't Sword simply pass it through? The fact that Sword
> doesn't know about &disclaimer; is no guarantee that both the module
> author and the end-line HTML renderer can't be perfectly happy with
> it -- Sword may quite possibly be behind the curve.
Sword XML content comes in two flavors: ThML and OSIS. Both have static
sets of defined entities. The entities of ThML are exactly those present
in HTML 4.0. The entities of OSIS are exactly those present in XML (a
tiny subset of those present in HTML).
Anything not in the HTML 4.0 set is an encoding error.
What is more, there is absolutely no necessity to use any entity other
than &amp; and &lt; in Sword. Entities other than the XML set (&amp;,
&quot;, &apos;, &lt;, &gt;) are not supported at all in Sword and should
not be used. There is no good reason to do so.
Any other character should be encoded directly as UTF-8, not as a named
entity.
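
As a rough illustration of the policy described above (keep the five XML
entities, write any other HTML 4.0 named entity as its literal character,
and treat anything else as an encoding error), here is a minimal Python
sketch. It is not Sword code: the function name normalize_entities and the
choice to leave unknown entities in place while reporting them are
assumptions made for illustration. It relies on html.entities.name2codepoint
from the Python standard library, which maps the HTML 4 entity names to
code points.

```python
import re
from html.entities import name2codepoint

# The five entities predefined by XML; these stay as entities.
XML_ENTITIES = {"amp", "lt", "gt", "quot", "apos"}

# Matches named entities only, e.g. &eacute; (numeric references are ignored).
_ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def normalize_entities(text):
    """Hypothetical helper: return (converted_text, unknown_entity_names).

    - XML entities are kept unchanged.
    - Other HTML 4 named entities become literal characters, so the
      result can be stored as UTF-8.
    - Names outside both sets are left untouched and reported, since
      they would be encoding errors in module source.
    """
    unknown = []

    def replace(match):
        name = match.group(1)
        if name in XML_ENTITIES:
            return match.group(0)
        if name in name2codepoint:
            return chr(name2codepoint[name])
        unknown.append(name)
        return match.group(0)

    return _ENTITY_RE.sub(replace, text), unknown


if __name__ == "__main__":
    converted, errors = normalize_entities("caf&eacute; &amp; &disclaimer;")
    print(converted)
    print(errors)
```

Whether an unknown entity such as &disclaimer; should then be stripped,
passed through, or rejected is a separate decision; the sketch only
reports it so that the module author can fix the source.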
> And in fact, it surely is, in a few small areas. For example,
> WinSword/BibleCS doesn't implement <u> or <font color=...>, though it
> implements <b> and <i>. Conversely, GtkHTML implements <u> and <font
> color=...> but does not have support for <sup>. So pass the source
> material and let the renderer take its best shot.
These are ThML elements. No ThML modules currently use these elements
and we no longer support active development of the ThML code because our
emphasis is on OSIS development.
--Chris