[sword-devel] OSIS and XML entities
DM Smith
dmsmith555 at yahoo.com
Tue May 10 07:11:58 MST 2005
Yeah, I forgot &amp; in my list.
I agree that entities should not be used where UTF-8 will suffice. Guess
that needs to be added to the Sword OSIS best practice page ;) Perhaps
on the Module encoding pages, too.
Everywhere in HunUj, a non-breaking space entity is used around <reference
...>...</reference>. It appears that it is to bind the reference to the
adjacent words.
Also, it is using character entities frequently.
Deu 26.2 is an example of both. Here is the source:
akkor vedd a föld termésének a legjavát, amelyet behordasz földedről,
melyet Istened, az ÚR ad neked; tedd egy kosárba, és menj el arra a
helyre, amelyet kiválaszt Istened, az ÚR, hogy ott lakjék az ő
neve. <reference osisRef="Exod.23.19">2Móz 23:19</reference>
I took a look at it in Sword 1.5.6 and 1.5.8pre3 (both look the same)
and it appears that the and ; are being dropped.
Should the module be re-encoded to remove as many entities as possible?
Should osis2mod do this kind of normalization?
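For illustration only, here is a rough Python sketch of what such a normalization pass might look like. It is not how osis2mod works; the function name normalize_entities is hypothetical, and resolving named entities such as nbsp through Python's HTML entity table is an assumption (XML itself only predefines five names). The five XML-special characters are kept escaped so the markup stays well-formed; everything else becomes a literal character to be saved as UTF-8.

```python
import re
from html.entities import name2codepoint  # HTML names (nbsp, oacute, ...): an assumption

# Characters that stay escaped so text and attribute values remain well-formed.
KEEP_ESCAPED = {"&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&apos;"}

# Decimal reference, hex reference, or named entity.
ENTITY = re.compile(r"&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")


def normalize_entities(text):
    """Replace entity references with literal characters where it is safe."""

    def replace(match):
        body = match.group(1)
        try:
            if body.startswith(("#x", "#X")):
                char = chr(int(body[2:], 16))
            elif body.startswith("#"):
                char = chr(int(body[1:]))
            elif body in name2codepoint:
                char = chr(name2codepoint[body])
            else:
                return match.group(0)  # unknown name: leave it alone
        except (ValueError, OverflowError):
            return match.group(0)  # code point out of range: leave it alone
        # Re-escape the XML-special characters, emit everything else literally.
        return KEEP_ESCAPED.get(char, char)

    return ENTITY.sub(replace, text)


print(normalize_entities("az &#337; neve &amp; 2M&oacute;z"))
```

The output still has to be written out as UTF-8 by whatever tool runs this, and a reference that cannot be resolved is passed through unchanged rather than dropped.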
Since JSword is a "port" of Sword, I will get it to handle the same set
of entities. Perhaps all of them.
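As a sketch of what "the same set" could mean in practice, the following Python decodes the five predefined entities plus numeric character references and leaves anything else untouched. This is only an illustration under those assumptions, not JSword or Sword code; decode_entities is a hypothetical name.

```python
import re

# The five entities predefined by XML.
PREDEFINED = {"amp": "&", "apos": "'", "lt": "<", "gt": ">", "quot": '"'}

# Decimal reference, hex reference, or a named entity.
REFERENCE = re.compile(r"&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z]+);")


def decode_entities(text):
    """Decode predefined and numeric references in a single pass."""

    def replace(match):
        body = match.group(1)
        if body in PREDEFINED:
            return PREDEFINED[body]
        if body.startswith("#"):
            digits = body[1:]
            try:
                if digits[0] in "xX":
                    return chr(int(digits[1:], 16))
                return chr(int(digits))
            except (ValueError, OverflowError):
                return match.group(0)  # not a valid code point
        return match.group(0)  # unknown named entity: keep it visible

    return REFERENCE.sub(replace, text)


print(decode_entities("a &lt; b, &#337; and &#x151;"))
```

Doing the substitution in one pass matters: decoding &amp; first and then rescanning would turn &amp;lt; into < instead of the literal text &lt;.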
Also, the Options -> Cross References -> Off does not work.
I looked at the OSIS schema and it allows <reference> to be embedded
almost any place.
And the links for some adjacent references are handled badly. You can
see this in Deu 26.5.
Chris Little wrote:
> amp, apos, lt, gt, and quot are the only entities handled by Sword. In
> practice, however, lt and amp are the only ones that should ever be
> used. There is no good reason to incur the extra processing needed to
> handle other entities, so they should be encoded as UTF-8.
>
> --Chris
>
> DM Smith wrote:
>
>> What entities can a Sword OSIS module contain?
>> I am guessing that it is the standard 5 (i.e. &lt; &gt; &quot;
>> &apos;) and character entities (e.g. 7)
>> Anything else?
>>
>> I am asking because JSword needs to be modified to handle them and I
>> found them in HunUj.
>