<div dir="ltr">OK, all good then. We are covered; this is a different use case.<div><br></div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Dec 18, 2023 at 3:46 PM Matěj Cepl &lt;<a href="mailto:mcepl@cepl.eu">mcepl@cepl.eu</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Mon Dec 18, 2023 at 2:38 PM CET, Kristof Szabo wrote:<br>
> I wrote some time back <a href="https://github.com/krisek/sword-test" rel="noreferrer" target="_blank">https://github.com/krisek/sword-test</a>, with quite a<br>
> few test cases, which, I think, covers your use case as well.<br>
<br>
A couple of differences at first glance:<br>
<br>
1. Functionally, I prefer my script, which stops as soon as the first<br>
unpaired character is found, so the problem can be fixed right away.<br>
2. I use the SAX API (xml.sax from the standard library), which seems<br>
to me better suited to Bible processing than the<br>
traditional DOM (or lxml) interface. It nicely hides all the<br>
hard work going on in the background and lets me work only on<br>
what’s relevant to my task. See<br>
<a href="https://gitlab.com/crosswire-bible-society/CzeCSP/-/blob/master/CEPtoOSIS.py" rel="noreferrer" target="_blank">https://gitlab.com/crosswire-bible-society/CzeCSP/-/blob/master/CEPtoOSIS.py</a><br>
for an example of much more complicated processing (it is also<br>
roughly ten times faster than processing<br>
with Java and Saxon/XSLT).<br>
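<br>
To illustrate both points, here is a minimal sketch of a SAX handler that<br>
stops at the first unpaired character. It is not the actual script: the<br>
names PAIRS, PairChecker, UnpairedCharacter and check are hypothetical, and<br>
the set of paired characters is only an assumption.<br>

```python
import xml.sax

# Hypothetical pairs to check; the real script may track other characters.
PAIRS = {"(": ")", "[": "]", "\u201e": "\u201c"}  # last pair: Czech quotes
CLOSERS = set(PAIRS.values())


class UnpairedCharacter(Exception):
    """Raised at the first character that breaks the pairing."""


class PairChecker(xml.sax.ContentHandler):
    """Checks pairing in character data only; markup is handled by SAX."""

    def __init__(self):
        super().__init__()
        self.stack = []      # expected closing characters, innermost last
        self.locator = None  # set by the parser before parsing starts

    def setDocumentLocator(self, locator):
        self.locator = locator

    def characters(self, content):
        for ch in content:
            if ch in PAIRS:
                self.stack.append(PAIRS[ch])
            elif ch in CLOSERS:
                if not self.stack or self.stack.pop() != ch:
                    line = self.locator.getLineNumber() if self.locator else None
                    # Raising here aborts the parse at the first problem.
                    raise UnpairedCharacter(f"unexpected {ch!r} near line {line}")

    def endDocument(self):
        # Anything still on the stack was opened but never closed.
        if self.stack:
            raise UnpairedCharacter(f"missing {self.stack[-1]!r} at end of input")


def check(xml_text):
    """Return None if all pairs match, else a message for the first problem."""
    try:
        xml.sax.parseString(xml_text.encode("utf-8"), PairChecker())
    except UnpairedCharacter as exc:
        return str(exc)
    return None
```

Calling check() on a document returns None when everything is paired and a<br>
short message otherwise. The reported line number is only approximate, because<br>
the parser may deliver character data in chunks.<br>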
<br>
> > Temporarily the script is in its own repo<br>
> > (<a href="https://gitlab.com/mcepl/bible-freq-counter" rel="noreferrer" target="_blank">https://gitlab.com/mcepl/bible-freq-counter</a>) and attached to<br>
> > this message, but I would like to submit it to sword-utils. How<br>
> > to do it?<br>
<br>
Just an update … I have moved the script to<br>
<a href="https://git.crosswire.org/mcepl/bible-freq-counter" rel="noreferrer" target="_blank">https://git.crosswire.org/mcepl/bible-freq-counter</a>.<br>
<br>
Best,<br>
<br>
Matěj<br>
<br>
-- <br>
<a href="http://matej.ceplovi.cz/blog/" rel="noreferrer" target="_blank">http://matej.ceplovi.cz/blog/</a>, @mcepl@floss.social<br>
GPG Finger: 3C76 A027 CA45 AD70 98B5 BC1D 7920 5802 880B C9D8<br>
<br>
Nemo plus iuris ad alium transferre potest quam ipse habet.<br>
</blockquote></div>