[sword-devel] Python script for checking pairwise characters (PROOF-OF-CONCEPT)

Matěj Cepl mcepl at cepl.eu
Tue Dec 19 04:47:56 EST 2023


On Tue Dec 19, 2023 at 1:30 AM CET, Timothy Allen wrote:
> As a data point, when I was writing scripts for manipulating and 
> updating the BSB module, I found the `xml.etree.ElementTree` module in 
> the Python standard library to be many times faster than the SAX API. 
> The SAX API is perhaps a bit more convenient, because you can just 
> subscribe to whatever events are meaningful for whatever processing you 
> want to do, but ElementTree is just so much faster I found it was worth it.

I have heard good things about XMLPullParser, but I have never
tried to use it in anger.
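
For anyone who has not seen it, a minimal sketch of how
`xml.etree.ElementTree.XMLPullParser` is driven: you feed it chunks
and pull `(event, element)` pairs as they become available. The
helper name `iter_events` and the sample element names are made up
for illustration; this is not code from either script discussed here.

```python
from xml.etree.ElementTree import XMLPullParser


def iter_events(chunks):
    """Feed XML text chunk by chunk and yield (event, element) pairs.

    `iter_events` is a hypothetical helper, not part of the standard library.
    """
    parser = XMLPullParser(events=("start", "end"))
    for chunk in chunks:
        parser.feed(chunk)
        # Hand out whatever events the parser has produced so far.
        yield from parser.read_events()
    # close() flushes the parser; pick up any events it still produces.
    parser.close()
    yield from parser.read_events()


def verse_texts(chunks, tag="verse"):
    """Collect the leading text of every finished element named `tag`."""
    # By the time the "end" event arrives, elem.text is complete, even if
    # the text was split across several fed chunks.
    return [
        elem.text
        for event, elem in iter_events(chunks)
        if event == "end" and elem.tag == tag
    ]
```

The input does not have to be split at element boundaries, so the
chunks can come straight from reading a file in fixed-size blocks.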

I am not sure how the plain ElementTree (and I suppose you mean
its `findall()` method) could help me here. My main focus is on
the SAX `characters()` method, and here ElementTree with its
`.text` and `.tail` attributes doesn’t help much, although if
there were a `findalltext()` method, it could get interesting.
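
For comparison, a minimal sketch of the two approaches on a toy
fragment. The SAX handler collects everything passed to
`characters()`; the ElementTree version uses `Element.itertext()`,
which walks `.text` and `.tail` in document order. The sample markup
and the names `TextCollector`, `sax_text` and `etree_text` are
assumptions for illustration only, and whether `itertext()` is enough
for the pairwise-character check is a separate question, since it does
not tell you which element each piece of text came from.

```python
import xml.sax
import xml.etree.ElementTree as ET

# Hypothetical sample, loosely OSIS-like; not taken from a real module.
SAMPLE = "<verse>In the <w>beginning</w> God created</verse>"


class TextCollector(xml.sax.ContentHandler):
    """SAX handler that stores every chunk of character data it receives."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def characters(self, content):
        # SAX may deliver one text node in several pieces; keep them all.
        self.parts.append(content)


def sax_text(xml_string):
    """Return all character data as seen by the SAX characters() callback."""
    handler = TextCollector()
    xml.sax.parseString(xml_string.encode("utf-8"), handler)
    return "".join(handler.parts)


def etree_text(xml_string):
    """Return all inner text via ElementTree, in document order."""
    root = ET.fromstring(xml_string)
    # itertext() yields each element's .text and its children's .tail.
    return "".join(root.itertext())
```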

> LXML is probably faster again, but that's a third-party dependency, and 
> that adds enough hassle for people who aren't Python developers that I 
> drew the line there.

Certainly, I am a big fan of writing just in the confines of the
Python standard library.

> If you've already written things using the SAX API that work well for 
> you, there's probably no point rewriting them, but if you're writing 
> more tools in the future, you might want to give it a try!

I know ElementTree and it is very useful for simple searches
in the XML tree, but I am not sure how it would help with this
project or with my CzeCSP conversion.

Blessings,

Matěj

-- 
http://matej.ceplovi.cz/blog/, @mcepl at floss.social
GPG Finger: 3C76 A027 CA45 AD70 98B5  BC1D 7920 5802 880B C9D8
 
“Anything essential is invisible to the eyes”, the little prince
repeated, in order to remember.
“It’s the time you spent on your rose that makes your rose so
important.”
“It’s the time I spent on my rose …,” the little prince
repeated, in order to remember.
“People have forgotten this truth.” the fox said. “But you
mustn’t forget it.  You become responsible forever for what
you’ve tamed. You’re responsible for your rose…”
“I’m responsible for my rose…,” the little prince repeated, in
order to remember.
    -- Antoine de Saint-Exupéry: The Little Prince