<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">There are 5 standard entities predefined for XML: &lt;, &gt;, &amp;, &quot; and &apos;. XML also allows numeric character references of the form &#ddd; (decimal) or &#xhhh; (hexadecimal). Any other entities need to be defined in a DTD. A schema (an XSD in the case of OSIS) does not allow entities to be defined. (I’m not familiar with other schema types.)<div class=""><br class=""></div><div class="">Regarding parsing and validation: an XML document may be well-formed but not valid. The former is the responsibility of the parser; the latter is the responsibility of a validator. A validator takes its content from the parser, possibly as an in-memory tree, and compares it to a schema or DTD. As far as I know, what the validator receives already has the entities resolved.</div><div class=""><br class=""></div><div class="">-- DM</div><div class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Dec 12, 2014, at 9:01 AM, Greg Hellings <<a href="mailto:greg.hellings@gmail.com" class="">greg.hellings@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><p dir="ltr" class="">If that's the case, how does it handle escaping <>? I believe entity replacement is after XML validation but before passing them to a transformer or such.</p>
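<div class="">A minimal sketch of the parser side of this question, using Python's standard xml.etree.ElementTree (chosen only for illustration, it is not mentioned in this thread). The w element and the gloss values are made-up test data. XML 1.0 predefines lt, gt, amp, quot and apos; a name such as nbsp has to be declared in a DTD before it can be used:</div>
<pre class="">
```python
import xml.etree.ElementTree as ET

# Predefined entities are resolved by the parser without any DTD.
predefined = ET.fromstring('<w gloss="&lt; &gt; &amp; &quot; &apos;"/>')
resolved = predefined.get("gloss")

# Decimal and hexadecimal character references are resolved as well.
numeric = ET.fromstring('<w gloss="18&#160;wheeler&#xA0;truck"/>')
with_nbsp = numeric.get("gloss")

# &nbsp; is an HTML entity. With no DTD declaring it, the document is
# not well-formed, so the parser rejects it before any validation.
try:
    ET.fromstring('<w gloss="18&nbsp;wheeler"/>')
    nbsp_accepted = True
except ET.ParseError:
    nbsp_accepted = False
```
</pre>
<div class="">In this sketch the application only ever sees the resolved characters, and the undeclared entity never gets as far as a validator.</div>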
<div class="gmail_quote">On Dec 12, 2014 7:52 AM, "DM Smith" <<a href="mailto:dmsmith@crosswire.org" class="">dmsmith@crosswire.org</a>> wrote:<br type="attribution" class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word" class="">Best I can recall:<div class="">Nope. An entity is merely an alternate way of specifying a character. The XML parser is supposed to replace the entity with the corresponding code point before the value is evaluated against the schema.</div><div class=""><br class=""></div><div class=""><div class=""><blockquote type="cite" class=""><div class="">On Dec 12, 2014, at 8:49 AM, Greg Hellings <<a href="mailto:greg.hellings@gmail.com" target="_blank" class="">greg.hellings@gmail.com</a>> wrote:</div><br class=""><div class=""><p dir="ltr" class="">It should be possible to escape any such characters with an XML entity, no?</p>
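<div class="">To illustrate the answer above, here is a small sketch (Python's xml.etree.ElementTree, picked for illustration; the w element and gloss value are hypothetical) showing that a colon written as a character reference reaches the application as an ordinary colon. ElementTree does not validate, but a validator working from the parsed value would presumably see the same string:</div>
<pre class="">
```python
import xml.etree.ElementTree as ET

# The same attribute, once with a literal colon and once with the colon
# written as the character reference &#58;.
plain = ET.fromstring('<w gloss="en_US:18 wheeler"/>').get("gloss")
escaped = ET.fromstring('<w gloss="en_US&#58;18 wheeler"/>').get("gloss")

# After parsing, the two values are indistinguishable.
same_value = plain == escaped

# Code that splits on the first colon therefore still finds a prefix.
prefix, _, text = escaped.partition(":")
```
</pre>
<div class="">So an entity or character reference cannot serve as an escape for the prefix separator; the escape would have to be part of the attribute's own syntax.</div>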
<div class="gmail_quote">On Dec 12, 2014 7:44 AM, "DM Smith" <<a href="mailto:dmsmith@crosswire.org" target="_blank" class="">dmsmith@crosswire.org</a>> wrote:<br type="attribution" class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br class="">
> On Dec 12, 2014, at 8:26 AM, Peter Von Kaehne <<a href="mailto:refdoc@gmx.net" target="_blank" class="">refdoc@gmx.net</a>> wrote:<br class="">
><br class="">
> Sent: Friday, 12 December 2014 at 13:16<br class="">
> From: "Troy A. Griffitts" <<a href="mailto:scribe@crosswire.org" target="_blank" class="">scribe@crosswire.org</a>><br class="">
><br class="">
>> Not sure, but I thought we used optional prefixes to specify the kind of gloss if there are multiple, e.g., gloss="en_US:18&nbsp;wheeler en_UK:articulated&nbsp;lorry"<br class="">
><br class="">
> Should there be an option to escape colons?<br class="">
<br class="">
IMHO:<br class="">
Yes.<br class="">
<br class="">
The definition of gloss in the schema is xs:string, not osisGenRegex.<br class="">
The former places no semantic constraints on the content and allows for an empty string.<br class="">
<br class="">
If gloss should have semantics, then it should be changed in the OSIS spec.<br class="">
<br class="">
The latter is used by lemma and morph and is specified as:<br class="">
((((\p{L}|\p{N}|_)+)(\.(\p{L}|\p{N}|_))*:)?([^:\s])+)<br class="">
which basically is work:value.<br class="">
If I read this right, it does not allow : to be escaped. I know we allow lemma=“x:a y:b”, but I don’t see that this allows the pattern to be repeated, separated by spaces.<br class="">
<br class="">
The pattern would need to change ([^:\s])+ to (\\:|[^:\s])+ [not tested]<br class="">
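<br class="">
A rough way to try the change, sketched in Python rather than in an XSD validator, so it rests on some assumptions: Python's re has no \p{L} or \p{N}, so \w stands in for (\p{L}|\p{N}|_); XSD's \s is spelled out as space, tab, LF and CR; and fullmatch() mimics the implicit anchoring of XSD patterns. The sample values are made up.<br class="">
<pre class="">
```python
import re

# Approximation of osisGenRegex as quoted above (see assumptions).
CURRENT = re.compile(r"((((\w)+)(\.(\w))*:)?([^: \t\n\r])+)")
# Same pattern with the value part changed to accept a backslash-colon.
PROPOSED = re.compile(r"((((\w)+)(\.(\w))*:)?(\\:|[^: \t\n\r])+)")


def accepts(pattern, value):
    """Return True if the whole value matches the pattern."""
    return pattern.fullmatch(value) is not None


samples = ["strong:G2424", "x:a y:b", r"ref:Gen\:1", "a:b:c"]
current_results = [accepts(CURRENT, s) for s in samples]
proposed_results = [accepts(PROPOSED, s) for s in samples]
```
</pre>
Under these assumptions only the escaped-colon sample changes from rejected to accepted; the space-separated value and the unescaped second colon are rejected by both patterns.<br class="">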
<br class="">
In His Service,<br class="">
DM<br class="">
_______________________________________________<br class="">
sword-devel mailing list: <a href="mailto:sword-devel@crosswire.org" target="_blank" class="">sword-devel@crosswire.org</a><br class="">
<a href="http://www.crosswire.org/mailman/listinfo/sword-devel" target="_blank" class="">http://www.crosswire.org/mailman/listinfo/sword-devel</a><br class="">
Instructions to unsubscribe/change your settings at above page</blockquote></div>
</div></blockquote></div><br class=""></div></div><br class=""></blockquote></div>
</div></blockquote></div><br class=""></div></body></html>