<html><body><div dir="ltr"><div>
</div><div><div dir="ltr">The wiki is and has always been a work in progress, and everybody is invited to improve it. David and I wrote the bulk of it at one point. For this and similar aspects we used, as our base, the code as we understood it, the mailing list, and the old documentation site.</div><div dir="ltr">We have been quite explicit about that, including the fact that neither of us is a programmer.</div><div dir="ltr"><br></div><div dir="ltr">In that sense it is a bit upsetting and annoying to read this from you here, Jaak. You could easily have pointed these problems out at the time, so that we could either have improved the page or at least noted our lack of clarity and any confusion or shortcomings in the code.</div><div dir="ltr"><br></div><div dir="ltr">That all said, much of the coverage of all the filters has always been a moving target: whenever someone with a good use case asked for improved or clarified coverage of something, it was often added. I certainly did a lot of that in the SWORD filters, making them comply better with OSIS, XHTML, RTF and so on.</div><div dir="ltr"><br></div><div dir="ltr">OK, my moan is over. I suggest you turn your list of moans into a list of bugs; then they might get squashed one by one.</div><div dir="ltr"><br></div><div id="ms-outlook-mobile-signature">Sent from <a href="https://aka.ms/o0ukef">Outlook for iOS</a></div>
<div> </div><hr style="display:inline-block;width:98%" tabindex="-1"><div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif"><b>From:</b> sword-devel <sword-devel-bounces@crosswire.org> on behalf of Jaak Ristioja <jaak@ristioja.ee><br><b>Sent:</b> Friday, April 26, 2024 2:12 AM<br><b>To:</b> sword-devel@crosswire.org <sword-devel@crosswire.org><br><b>Subject:</b> Re: [sword-devel] RTF in conf files<div> </div></font></div>When I tried to write a similar parser some years ago (or rewrite the
<br>libsword parser(s) in Sword++), I discovered to my dismay that the wiki
<br>page is quite insufficient. The lack of a formal specification for the
<br>configuration format leads to various serious ambiguities or questions
<br>when wanting to write a parser. Some examples:
<br>
<br> * How should different parsing errors be handled?
<br> * What are the phases for parsing? Should the output of each phase be
<br>a single string, or a list of strings parsed separately by the next phases
<br>(e.g. lines in case of continuations)?
<br> * Should continuations be handled in a phase before or after parsing
<br>RTF? How should "\\\\\n\n" be parsed?
<br> * How to include a literal backslash? If escaped, in which phase of
<br>parsing?
<br> * Should official Microsoft RTF syntax rules be used for RTF control
<br>word tokenization and semantics? Which version(s) of RTF exactly? The
<br>rules on the Crosswire wiki page might differ from RTF specs.
<br> * The wiki page states that "using the actual UTF-8 character is
<br>preferred" to RTF "\u" escapes, but the RTF syntax only allows 7-bit
<br>ASCII characters. Does this mean that all UTF-8 characters should be
<br>converted to "\u"-style RTF escapes before handing off to the RTF
<br>parser? Since the "\u" escapes can only handle code points U+0000 to
<br>U+FFFF, how should other UTF-8 code points beyond U+FFFF be handled?
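<br>
<br>On the last point, a minimal sketch of one possible answer, not a statement of what libsword does: as far as I understand the Microsoft RTF specification, the number after "\u" is a signed 16-bit UTF-16 code unit, so a code point beyond U+FFFF is written as two consecutive "\u" escapes forming a surrogate pair. The function name rtf_u_escape and the "?" fallback character are assumptions made for illustration. Backslashes and braces already present in the input are deliberately left alone, since how they should be escaped is exactly one of the open questions above.
<br>

```python
import struct


def rtf_u_escape(text, fallback="?"):
    # Hypothetical helper: rewrite every non-ASCII character as RTF \uN
    # escapes, each followed by one fallback character for old readers.
    out = []
    for ch in text:
        if ord(ch) < 128:
            # 7-bit ASCII passes through unchanged (no brace/backslash escaping).
            out.append(ch)
            continue
        # UTF-16 gives one code unit for BMP characters and a surrogate
        # pair (two code units) for anything beyond U+FFFF.
        data = ch.encode("utf-16-le")
        # "h" unpacks each unit as a signed 16-bit integer, so units above
        # 32767 come out negative, which is the form RTF expects.
        units = struct.unpack("<%dh" % (len(data) // 2), data)
        for unit in units:
            out.append("\\u%d%s" % (unit, fallback))
    return "".join(out)
```

<br>With this sketch U+1F600 becomes the two escapes \u-10179? and \u-8704?, so a reader that only understands 16-bit "\u" values still has a chance of reassembling the character.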
<br>
<br>The original libsword implementation also seemed to suffer from various
<br>issues and was not of much help to me, thus I eventually ended up
<br>abandoning this effort.
<br>
<br>J
<br>
<br>On 16.04.24 10:20, domcox wrote:
<br>>
<br>> Only a very small, restricted subset of RTF markup is supported, see:
<br>> https://wiki.crosswire.org/DevTools:conf_Files#RTF
<br>>
<br>>
<br>> "David \"Judah's Shadow\" Blue" <yudahsshadow@gmx.com> writes:
<br>>
<br>>> I'm working on an info command to display some basic info about
<br>>> modules, and I
<br>>> ran into the fact that, at least in the About entry, the conf file can
<br>>> contain
<br>>> RTF formatting. As it stands I strip out \pard, replace \par with \n, and
<br>>> strip out the tag portion of any anchor/link tags found. My question
<br>>> is, are
<br>>> there any other tags that are likely to appear in conf entries that I
<br>>> should
<br>>> be either handling or stripping (since my front end does no formatting
<br>>> of text
<br>>> whatsoever)?
<br>>
<br>>
<br>
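<br>The stripping approach from the original question (drop \pard, turn \par into a newline, remove the tag portion of anchors) can be sketched roughly as below for a front end that does no formatting. This is only an illustration under assumptions: the function name conf_rtf_to_plain is made up, \qc and "\u" escapes are included because I recall them from the wiki's short RTF list, and the fallback character after a "\u" escape is assumed to be a literal "?". Line continuations and escaped backslashes are not handled at all.
<br>

```python
import re
import struct

# One or more consecutive \uN escapes, each optionally followed by "?".
U_RUN = re.compile(r"(?:\\u-?\d+\??)+")
U_NUM = re.compile(r"\\u(-?\d+)")
# \pard and \qc carry no meaning for plain text; a single trailing space
# is treated as the control word delimiter and removed with it.
PARD_QC = re.compile(r"\\(?:pard|qc)(?![a-z]) ?")
PAR = re.compile(r"\\par(?![a-z]) ?")
# Opening and closing anchor tags. \x3c and \x3e stand for the angle
# brackets so that this pattern can be pasted into an HTML mail as is.
ANCHOR = re.compile(r"\x3c/?a\b[^\x3e]*\x3e", re.IGNORECASE)


def _decode_u_run(match):
    # Map signed 16-bit values back to unsigned UTF-16 code units and decode
    # the whole run at once, so that surrogate pairs are rejoined.
    units = [int(n) % 65536 for n in U_NUM.findall(match.group(0))]
    data = struct.pack("<%dH" % len(units), *units)
    return data.decode("utf-16-le", errors="replace")


def conf_rtf_to_plain(value):
    # Hypothetical helper: reduce a conf entry such as About to plain text.
    value = U_RUN.sub(_decode_u_run, value)
    value = PARD_QC.sub("", value)
    value = PAR.sub("\n", value)
    value = ANCHOR.sub("", value)
    return value
```

<br>Anything outside this small subset is passed through untouched, which at least makes unexpected markup visible instead of silently dropping it.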
<br></div></div></body></html>