[sword-devel] overline

Troy A. Griffitts scribe at crosswire.org
Fri Sep 20 06:19:40 MST 2013


Chris,

Thanks for the feedback. I've implemented some code toward your 
suggestion. The use case for this is the NA28 module used as the base 
text for transcribing and collating here at the institute. Our TEI 
WYSIWYG web component editor initially used markup like this for nomina 
sacra:

<abbr type="nomSac"><hi rend="ol">θν</hi></abbr>

This is pulled into the editor to seed a transcription of a particular manuscript from the service at:

http://crosswire.org/study/fetchdata.jsp?key=Jn.1.1&mod=NA28&format=raw

(view source to see the TEI)

The OSISPlain filter generates overline combining characters from these, so the plain-text output can be used in the collation process when we compare the base text to all other transcriptions:

http://crosswire.org/study/fetchdata.jsp?key=Jn.1.1&mod=NA28&format=plain
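
To illustrate the idea (this is not the actual OSISPlain filter code, just a rough Python sketch), the conversion amounts to placing U+0305 COMBINING OVERLINE after each character inside the <hi rend="ol"> element and then dropping the remaining tags. The function names and the regex-based tag handling below are assumptions made for the example; a real filter would use a proper XML parser.

```python
import re

# U+0305 COMBINING OVERLINE; a combining mark follows the character it modifies.
COMBINING_OVERLINE = "\u0305"

# Assumption for this sketch: the legacy markup appears exactly as <hi rend="ol">...</hi>.
HI_OL = re.compile(r'<hi rend="ol">(.*?)</hi>', re.DOTALL)
ANY_TAG = re.compile(r"<[^>]+>")


def overline(text):
    """Return text with a combining overline appended to every character."""
    return "".join(ch + COMBINING_OVERLINE for ch in text)


def to_plain(markup):
    """Overline the content of <hi rend="ol"> elements, then strip all remaining tags."""
    with_overlines = HI_OL.sub(lambda m: overline(m.group(1)), markup)
    return ANY_TAG.sub("", with_overlines)


if __name__ == "__main__":
    print(to_plain('<abbr type="nomSac"><hi rend="ol">θν</hi></abbr>'))
```

Because the overline ends up in the character data itself, a nomen sacrum in the base text compares as different from the same letters written without the overline, which is what the collation step relies on.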


Here's the final service which does all the comparisons.  You can ask for the finished collation in a number of different formats:

http://ntvmr.uni-muenster.de/community/vmr/api/collate/?documentGroupID=22&regUserID=intfadmin&baseText=NA28&verse=Mark.3.4&&format=text
http://ntvmr.uni-muenster.de/community/vmr/api/collate/?documentGroupID=22&regUserID=intfadmin&baseText=NA28&verse=Mark.3.4&&format=atable
http://ntvmr.uni-muenster.de/community/vmr/api/collate/?documentGroupID=22&regUserID=intfadmin&baseText=NA28&verse=Mark.3.4&&format=tei
http://ntvmr.uni-muenster.de/community/vmr/api/collate/?documentGroupID=22&regUserID=intfadmin&baseText=NA28&verse=Mark.3.4&&format=graph
http://ntvmr.uni-muenster.de/community/vmr/api/collate/?documentGroupID=22&regUserID=intfadmin&baseText=NA28&verse=Mark.3.4&&format=graphml
http://ntvmr.uni-muenster.de/community/vmr/api/collate/?documentGroupID=22&regUserID=intfadmin&baseText=NA28&verse=Mark.3.4&&format=dot
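
As a small sketch of how a client might build these requests, the Python below assembles the same URL for each output format. The parameter names and values are copied from the URLs above; the helper name build_collate_url and the FORMATS tuple are made up for this example, and nothing here fetches anything from the service.

```python
from urllib.parse import urlencode

# Endpoint and parameter names are taken from the example URLs above.
COLLATE_ENDPOINT = "http://ntvmr.uni-muenster.de/community/vmr/api/collate/"

# The output formats shown in the example URLs.
FORMATS = ("text", "atable", "tei", "graph", "graphml", "dot")


def build_collate_url(verse, fmt, document_group_id="22",
                      reg_user_id="intfadmin", base_text="NA28"):
    """Build a collation request URL for one verse and one output format."""
    if fmt not in FORMATS:
        raise ValueError("unknown format: " + fmt)
    query = urlencode({
        "documentGroupID": document_group_id,
        "regUserID": reg_user_id,
        "baseText": base_text,
        "verse": verse,
        "format": fmt,
    })
    return COLLATE_ENDPOINT + "?" + query


if __name__ == "__main__":
    for fmt in FORMATS:
        print(build_collate_url("Mark.3.4", fmt))
```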

I realize this doesn't satisfy your justified desire to make our code conform to the specification, but I wanted to explain why things are there. I have moved the code closer to what you requested, and commented very clearly that we do not intend to support this markup. Once we get our modules cleaned up here, I will deprecate the code.
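
For the module cleanup, a rough sketch of the kind of rewrite involved might look like the Python below. It assumes the legacy markup always appears as <hi rend="ol"> with no other attributes, and that the target is the <hi type="x-overline"> form you suggested for OSIS; the function name is hypothetical and this is not an existing SWORD tool.

```python
import re

# Assumption: the legacy element carries only the rend="ol" attribute.
LEGACY_HI = re.compile(r'<hi\s+rend="ol"\s*>')


def normalize_overline(osis):
    """Rewrite legacy <hi rend="ol"> start tags to <hi type="x-overline">."""
    return LEGACY_HI.sub('<hi type="x-overline">', osis)


if __name__ == "__main__":
    print(normalize_overline('<abbr type="nomSac"><hi rend="ol">θν</hi></abbr>'))
```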

Troy

  


The editor has since been updated to use the proper markup but when we made the base text, we



On 09/11/2013 03:26 PM, Chris Little wrote:
> Troy,
>
> In r2973, you added some handling for <hi rend="ol">...</hi> in OSIS 
> documents being converted to plaintext that I would like to see 
> disappear quickly and forever on account of being all kinds of bad.
>
> First, this is not valid OSIS. rend is not an attribute on any OSIS 
> element. Switching rend to type at least brings it closer to OSIS 
> validity.
>
> But "ol" is still not a valid type for hi, and doesn't conform to the 
> standards for naming character styles in OSIS (or TEI). In both OSIS 
> and TEI, the corresponding CSS property value is used, so it should be 
> "overline" in either case. I see there's apparently a bug in OSIS and 
> overline was left out of the hi types, so you'd need to use 
> "x-overline" for OSIS (or "overline" for a corresponding TEI document).
>
> More generally, the implementation of overline markup to plaintext 
> handling has some undesirable features. Using the filter inserts 
> characters into the character string to simulate a textual style with 
> actual additional characters. I can't imagine the use case in which 
> interleaving real character data with presentation-simulating 
> characters is the right course of action. We don't do any sort of 
> presentation-simulation for other hi types, but the nearest thing I 
> can imagine to appropriate plaintext presentation would be something 
> like markdown, e.g. *bold*, /italic/, _underline_, ¯overline¯.
>
> \uXXXX escapes are only supported by C++11. Using this needlessly cuts 
> off support for older compilers, which are likely to still be in use 
> on plenty of aging platforms.
>
> I see now that the OSIS to (X)HTML filters got support for <hi 
> rend="ol"> back in 2011, and I would likewise strenuously advocate the 
> removal of that in favor of support for markup that is at least valid 
> OSIS (<hi type="x-overline">).
>
> --Chris
>
> _______________________________________________
> sword-devel mailing list: sword-devel at crosswire.org
> http://www.crosswire.org/mailman/listinfo/sword-devel
> Instructions to unsubscribe/change your settings at above page



