[sword-devel] OSIS and XML entities

DM Smith dmsmith555 at yahoo.com
Tue May 10 07:11:58 MST 2005

Yeah, I forgot &amp; in my list.

I agree that entities should not be used where UTF-8 will suffice. Guess 
that needs to be added to the Sword OSIS best practice page ;) Perhaps 
on the Module encoding pages, too.

Everywhere in HunUj, &nbsp; is used around <reference 
...>...</reference>. It appears to be there to bind the reference to the 
adjacent words, so that no line break falls between them.

HunUj also uses character entities (numeric character references such 
as &#59;) frequently.

Deu 26.2 is an example of both. Here is the source:
akkor vedd a föld termésének a legjavát, amelyet behordasz földedről, 
melyet Istened, az ÚR ad neked&#59; tedd egy kosárba, és menj el arra a 
helyre, amelyet kiválaszt Istened, az ÚR, hogy ott lakjék az ő 
neve.&nbsp;<reference osisRef="Exod.23.19">2Móz 23:19</reference>&nbsp;

I took a look at it in Sword 1.5.6 and 1.5.8pre3 (both look the same) 
and it appears that the &nbsp; and &#59; are being dropped.
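
To see how widespread this is before deciding anything, the source can be scanned for entity references. What follows is only a minimal Python sketch: audit_entities is a hypothetical helper, not part of Sword or JSword, and the "handled" set is simply the five entities Chris lists in the message quoted below.

```python
import re
from collections import Counter

# Matches named entities (&nbsp;) and decimal/hex character references (&#59;, &#x3B;).
ENTITY_RE = re.compile(r"&(?:#[0-9]+|#[xX][0-9A-Fa-f]+|[A-Za-z][A-Za-z0-9]*);")

# Assumption: the five entities reported as handled by Sword in this thread.
SWORD_HANDLED = {"&amp;", "&apos;", "&lt;", "&gt;", "&quot;"}


def audit_entities(text):
    """Count every entity reference in text, split into handled and other."""
    counts = Counter(ENTITY_RE.findall(text))
    handled = {e: n for e, n in counts.items() if e in SWORD_HANDLED}
    other = {e: n for e, n in counts.items() if e not in SWORD_HANDLED}
    return handled, other


if __name__ == "__main__":
    verse = ('neked&#59; tedd egy kosárba ... neve.&nbsp;'
             '<reference osisRef="Exod.23.19">2Móz 23:19</reference>&nbsp;')
    handled, other = audit_entities(verse)
    print("handled:", handled)
    print("other:", other)
```

Run over the Deu 26.2 source above, this reports &nbsp; twice and &#59; once, and neither is in the handled set.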

Should the module be re-encoded to remove as many entities as possible?

Should osis2mod do this kind of normalization?
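
As a rough idea of what such a normalization could look like, here is a minimal Python sketch. It is not how osis2mod works; normalize_entities is a hypothetical name, and the named-entity table is an assumption (the HTML 4 set from Python's standard library plus apos). Every entity is decoded to its character, except that the five characters XML needs escaped are written back as the predefined entities so the output stays well-formed.

```python
import re
from html.entities import name2codepoint

# Assumption: named entities are the HTML 4 set plus XML's apos.
NAMED = dict(name2codepoint, apos=39)

# Characters that are always emitted as predefined XML entities.
PREDEFINED = {"&": "&amp;", "<": "&lt;", ">": "&gt;",
              '"': "&quot;", "'": "&apos;"}

ENTITY_RE = re.compile(r"&(#[0-9]+|#[xX][0-9A-Fa-f]+|[A-Za-z][A-Za-z0-9]*);")


def normalize_entities(text):
    """Replace entity references with literal characters where that is safe."""
    def replace(match):
        body = match.group(1)
        try:
            if body[0] == "#":
                # Numeric reference: hex if it starts with #x, else decimal.
                code = int(body[2:], 16) if body[1] in "xX" else int(body[1:])
            else:
                code = NAMED[body]
            char = chr(code)
        except (KeyError, ValueError, OverflowError):
            return match.group(0)  # unknown or out of range: leave untouched
        return PREDEFINED.get(char, char)
    return ENTITY_RE.sub(replace, text)


if __name__ == "__main__":
    verse = ('neked&#59; tedd egy kosárba ... neve.&nbsp;'
             '<reference osisRef="Exod.23.19">2Móz 23:19</reference>&nbsp;')
    print(normalize_entities(verse))
```

With this, &#59; becomes a plain semicolon and &nbsp; becomes the literal U+00A0 character, while &amp; and &lt; (and numeric forms such as &#38;) remain escaped.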

Since JSword is a "port" of Sword, I will get it to handle the same set 
of entities. Perhaps all of them.

Also, setting Options -> Cross References -> Off does not work.

I looked at the OSIS schema and it allows <reference> to be embedded 
almost any place.

And the links for some adjacent references are handled badly. You can 
see this in Deu 26.5.

Chris Little wrote:

> amp, apos, lt, gt, and quot are the only entities handled by Sword. In 
> practice, however, lt and amp are the only ones that should ever be 
> used. There is no good reason to incur the extra processing needed to 
> handle other entities, so they should be encoded as UTF-8.
> --Chris
> DM Smith wrote:
>> What entities can a Sword OSIS module contain?
>> I am guessing that it is the standard 5 (i.e. &nbsp; &lt; &gt; &quot; 
>> &apos;) and character entities (e.g. &#55;)
>> Anything else?
>> I am asking because JSword needs to be modified to handle them and I 
>> found them in HunUj.
