[sword-devel] Creating a "SWORD-over-network" protocol for remote SWORD repo access?

Mon Jul 29 04:10:17 EDT 2024

On Sun, 28 Jul 2024 23:08:33 -0500
Greg Hellings <greg.hellings at gmail.com> wrote:

> So looking at this, you need to understand the goal of the libsword
> library and its repository system.
> 
> The goal for a repository is to support the simplest methods of
> access. Especially to support access by people who have no network so
> that an entire repository can be loaded directly onto a CD, DVD, USB
> stick, or other external media and passed around. Libsword can then
> install directly from that. FTP was first implemented because it
> allowed the same super simplistic process of pointing an FTP server
> at a working repository and then anyone who can access that FTP
> server can also access and install modules from there.
> 
> The goal of repository access and installs is not to create or define
> a standard. It is, rather, to have the very simplest and easiest
> access possible. Others have come through and implemented some
> parsing for the HTML served up over HTTP/HTTPS which can be used if
> libsword is compiled with the optional libcurl support. When I first
> helped contribute to those code bases they strove to support both the
> Apache and Nginx form of HTML that was served up by the automatic
> indexing those servers offer. Again, the goal was not to provide
> cryptographic security, it was not to sign files, it was not to
> define a standardized server process, or use some more robust
> standard like WebDAV or what have you. The goal was simplicity.
> Initially to allow iPhone users access to remote repositories while
> on cell networks where FTP was blocked by many carriers and possibly
> the device itself to some extent. Again, the goal was simplicity of
> access - pointing the root of a folder to a working repository
> installation would allow someone to remotely access all of the
> resources on that remote repository.
> 
> So the rest of the concerns were not addressed because they are not
> part of the goal. It's not the goal of the process to specify a
> strict format the repository needs to be exposed by over the network
> nor to ensure cryptographically signed files are transmitted. If
> those are needs for someone's use cases, then they should be
> implemented outside of the SWORD library and its native support. The
> goal of the library is to be very small, very fast, and as broadly
> portable as possible to more or less any device for which there is a
> C compiler available, and the goal of its support for remote
> repositories is to make it as simple as possible to get the data onto
> those devices. Thus, no standardized parser is required (though
> anyone using the library is free to extend its code to use one)
> because that becomes less portable and more heavyweight. Libcurl
> isn't even required - though without it access to HTTP/HTTPS sources
> vanishes because the library does not provide an implementation of
> that.
> 
> Again, small size, speed, and nimbleness are the goals of the library.
> Anything else that needs to be implemented for someone's requirements
> is up to them to implement above the library's level. Nothing stops
> someone from writing an application that connects over WebDAV to a
> server, fetches the SWORD files, checks them against cryptographic
> signatures, and uses well known libraries to handle all of that. But
> it's not the goal of libsword to offer that. That is much higher
> friction than the goal of the underlying library.

Good background information to have, thank you!

> ------
> 
> Now, to switch to the idea of a specialized SWORD protocol to address
> the user who does not want to fetch the entirety of a module: why?
> The library can already generate HTML documents and document
> fragments. Just do the rendering on the server and pass the fragment
> to the client over HTTP. Wrap the rendered string into a JSON object
> if you need to. Why try to pass the binary blob of some random data
> to the remote unit when you could already render it on the server?

The idea is to let *existing* SWORD clients access data on remote
servers without downloading the whole thing. I laid out some reasons
why this is helpful in certain use cases in my first email. Existing
SWORD clients are meant to retrieve information from libsword and then
render it in some way; thus, to maximize the possibility of adoption,
my hope was to implement in libsword the
ability to fetch "raw" data from a remote server and then pass it
through to the client, which already has code for rendering it however
the client chooses. Ideally a client should need to do nothing more
than point an SWMgr object at the remote server and then use it exactly
the same way it would use a local repository (perhaps with some extra
error checks for things like timeouts, interrupted connections, and
whatnot).

> A simple REST library written in something like Go could easily be
> linked to the libsword C library. It could query libsword to get the
> list of modules and expose them, along with certain query parameters
> specifying the format request. Then serve the resulting text over
> HTTP. So a client library could hit something like
> http://mylibrary.com/texts/KJV/Gen/1/1?format=html and it will get back
> {"osisRef": "Gen.1.1", "text": "<p>In the beginning...</p>"}. You
> wouldn't need to write some low level application protocol. You would
> save the client device from needing to render the text and have extra
> knowledge of the module. You wouldn't have to alter the library in
> any fashion.

This is similar to what I was thinking. I wasn't sure if JSON was the
best wrapper to do it in, but I don't see any reason to use
anything else, other than SWORD's apparent preference for XML-like
formats. However, my "text" field would probably look more like "text":
"$$$Revelation of John 22:19\n<w lemma="strong:G2532 lemma.TR:και"
morph="robinson:CONJ" src="1">And</w>..." or some such (this is what
mod2imp spit out when I used it to get an example).
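
To make the "raw text in a JSON wrapper" idea concrete, here is a rough
sketch in Python. Everything in it is an assumption for illustration:
the wrap_entry helper is hypothetical, the "osisRef" value is my guess
at how the key would be written, and the markup is only the single word
element quoted above, not a full entry. The point is just that a JSON
encoder takes care of escaping the embedded quotes, so the markup
reaches the client byte-for-byte.

```python
import json


def wrap_entry(osis_ref, raw_markup):
    """Wrap one raw module entry in a JSON object without altering the markup.

    Hypothetical helper: a real server would get raw_markup from libsword.
    """
    # ensure_ascii=False keeps the Greek lemma readable instead of \uXXXX escapes.
    return json.dumps({"osisRef": osis_ref, "text": raw_markup}, ensure_ascii=False)


# Only the fragment quoted in this email; the rest of the verse is omitted.
raw = '<w lemma="strong:G2532 lemma.TR:και" morph="robinson:CONJ" src="1">And</w>'

payload = wrap_entry("Rev.22.19", raw)
print(payload)

# The client gets the original markup back unchanged after decoding.
decoded = json.loads(payload)
```

The client would then hand decoded["text"] to whatever rendering code
it already has.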

> A simple application like this could be written up, distributed in a
> static binary, and anyone would be able to hit it for a REST
> accessed, rendered format of a given text. Going back to the goal of
> simplicity: this application could be run by anyone on any computer
> where a SWORD library already existed, and it could serve the
> baseline of those peoples' needs.
> 
> That's just an idea I've had bouncing around in my head for a long
> time. I just have no need to access the scripture over REST or I
> would have already written it. All the bits are already out there.
> There are lots of good REST frameworks, every language with them has
> the ability to encode JSON, and most of the popular ones we have
> bindings for the language in (Python, PHP, Java) or it can easily be
> integrated directly (CGO).

This is a really good idea, and if this is going beyond what libsword
is designed for, that's probably the route I'll take. I have a
preference for C# for development tasks along these lines so I'll
probably try to resurrect that first (the actual VS solution in
libsword is *old* but the SWIG code should be up-to-date, so I don't
imagine it will be *too* hard to get it going again - failing that,
C++/Qt is probably my next choice, though Qt is a bit of a strange
choice for a server application). Then I'll probably implement something
more-or-less like what you're mentioning here. Might not catch on, but
if nothing else it will be interesting.
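
For what it's worth, the shape of such a service could be sketched like
this. It is in Python rather than C# only to keep it short, and it is a
sketch under assumptions, not a design: FAKE_TEXTS and lookup() are
hypothetical stand-ins for real libsword calls, the route layout just
copies the example URL above, and the "plain" default format is made up.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit, parse_qs

# Hypothetical stand-in for module data; a real service would ask libsword.
FAKE_TEXTS = {("KJV", "Gen.1.1"): "In the beginning..."}


def lookup(module, osis_ref, fmt):
    """Return the entry text in the requested format, or None if unknown."""
    text = FAKE_TEXTS.get((module, osis_ref))
    if text is None:
        return None
    return "<p>" + text + "</p>" if fmt == "html" else text


def handle_path(raw_path):
    """Map /texts/<module>/<book>/<chapter>/<verse>?format=... to (status, body)."""
    parts = urlsplit(raw_path)
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) != 5 or segments[0] != "texts":
        return 404, json.dumps({"error": "unknown route"})
    module, book, chapter, verse = segments[1:]
    osis_ref = ".".join((book, chapter, verse))
    fmt = parse_qs(parts.query).get("format", ["plain"])[0]
    text = lookup(module, osis_ref, fmt)
    if text is None:
        return 404, json.dumps({"error": "no such entry"})
    return 200, json.dumps({"osisRef": osis_ref, "text": text})


class TextHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = handle_path(self.path)
        data = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def make_server(port=8080):
    """Build the server; call .serve_forever() on the result to start listening."""
    return HTTPServer(("127.0.0.1", port), TextHandler)
```

Keeping the routing in a plain function (handle_path) means it can be
checked without opening a socket, e.g.
handle_path("/texts/KJV/Gen/1/1?format=html").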

Hope you're doing well,
Aaron

> --Greg