[sword-devel] Creating a "SWORD-over-network" protocol for remote SWORD repo access?

Jaak Ristioja jaak at ristioja.ee
Sun Jul 14 14:08:08 EDT 2024


Hello,

+1, however this is not a small feat. Having also considered this, I 
would like to share some thoughts on this topic which I hope you find useful.

As far as I understand libsword, it tries to support both FTP and 
HTTP(S) repositories.
   * Libsword seems to include a hand-written parser to parse the 
non-standardized FTP directory listings in order to figure out the 
modules present on the remote repository.
   * Similarly for HTTP(S), libsword expects the web server to 
provide (Apache HTTPD style?) HTML directory indexes, for which it seems 
to include an overly simplistic hand-written parser.

Reliance on these non-standardized server-specific index files/directory 
listings is very fragile, as slight deviations in server output might 
cause the respective parsing in libsword to be unreliable. The quality 
of these (and other[*]) hand-written parsers in libsword is 
questionable, and I would not be surprised to find bugs in them which put 
users in danger. ;(
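
To illustrate the fragility, here is a minimal Python sketch. It is not libsword's code: the pattern, the function name and both HTML snippets are made up for this example. It only shows how a simplistic hand-written matcher can work for one server's listing style and silently miss the same entry in another's.

```python
import re

# Hypothetical naive matcher: NOT libsword's parser, illustration only.
NAIVE_LINK = re.compile(r'<a href="([^"]+)">')

def naive_list_entries(html: str) -> list[str]:
    """Return link targets found by a simplistic pattern match."""
    return NAIVE_LINK.findall(html)

# A row in the style of an Apache HTTPD directory index (made-up sample).
apache_style = '<tr><td><a href="kjv.conf">kjv.conf</a></td></tr>'

# The same file listed by a different (made-up) server:
# single-quoted attributes and an extra attribute before href.
other_style = "<li><a class='file' href='kjv.conf'>kjv.conf</a></li>"

print(naive_list_entries(apache_style))  # the entry is found
print(naive_list_entries(other_style))   # the entry is silently missed
```

Both listings are valid HTML describing the same file, yet the second yields no entries and no error, so the client would simply conclude that the module is not there.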

Cryptographic signing of Sword modules and/or repository index files 
would only marginally alleviate the situation while also introducing 
bigger problems such as public key distribution and secure handling of 
private keys. This might still be a good optional feature in some later 
design, but more important things first...

Another problem is that a single Sword module consists of multiple 
files: the configuration file and one or more files with the actual 
content or content indexes (e.g. old testament content, old testament 
content index, new testament content, new testament content index). 
These are distributed in different repository directories and require 
multiple client requests to download. The module file and directory 
names do not contain a version identifier, nor is there any checksumming 
between the files. So if a server updates a module while a client is in 
the middle of downloading these files, this might cause the client to 
download files pertaining to different versions of the module or 
download partially uploaded files, leading to all kinds of nasty 
problems. Proper versioning in filenames and checksumming could help 
alleviate this.
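
A minimal Python sketch of what versioned filenames plus checksumming could look like on the client side. The naming scheme (`KJV-2.9.zip`), the version number and the choice of SHA-256 are assumptions made for this example, not an existing SWORD convention.

```python
import hashlib

def archive_name(module: str, version: str) -> str:
    """Build a versioned archive filename (hypothetical naming scheme)."""
    return f"{module}-{version}.zip"

def verify_archive(data: bytes, expected_sha256: str) -> bool:
    """Compare downloaded bytes against the checksum the repository published."""
    return hashlib.sha256(data).hexdigest() == expected_sha256.lower()

# Stand-in for a downloaded archive; a real client would read the file bytes.
payload = b"example module archive bytes"
# Stand-in for the checksum the repository would publish for this version.
published = hashlib.sha256(payload).hexdigest()

print(archive_name("KJV", "2.9"))
print(verify_archive(payload, published))       # complete download
print(verify_archive(payload[:-1], published))  # truncated download is rejected
```

Because the version is part of the filename, an update adds a new file instead of overwriting one that a client may be fetching, and the checksum catches partially uploaded or truncated files.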

It might be a blocker that libsword does not support having multiple 
versions of a single module installed.

It might be a blocker that libsword does not have a namespacing scheme 
for modules, e.g. there can only be one module named "KJV" and it might 
be problematic if two repositories (vendors) provide their own different 
"KJV" modules. And it would probably be a bad idea to try to reserve the 
use of identifiers like "KJV" for specific vendors, e.g. by using some 
kind of registry.

Another obstacle to defining a new repository format/protocol is that 
there is no complete and sound formal specification for the module 
configuration file format and its fields. The descriptions in the SWORD 
wiki are incomplete and contain ambiguity.

While perhaps not strictly a blocker to creating a new repository 
format/protocol, there are also no formal specifications for the module 
content and content index files. I remember these formats having been 
described as internal libsword details which don't require 
specification, because the format and libsword might change. However, I 
think this reasoning is incorrect, because files of these formats are 
exchanged over the wire, used in multiple repositories not all of which 
are managed by Crosswire, and libsword wants to retain backwards 
compatibility with older modules as well.

In my opinion the repository format should not depend much on the 
underlying transport protocol (HTTP(S), FTP, local filesystem) and 
should not require special handling on the server side. For HTTP this 
means that all repository files may be served statically on a regular 
web server without requiring extra server-side scripting. Just files and 
directories, no parsing of directory indexes, only retrieval of regular 
files by their path.

In the simplest case, the client would retrieve the (root) index file 
from a fixed location in the repository (e.g. using HTTP GET), parse it, 
and proceed to download selected modules, where each module version is a 
single archive file in the repository. Various specific repository 
(directory) layouts are possible. Since SWORD repositories are 
relatively small, a single (root) index file containing all necessary 
metadata from all the module archives in the repository would probably 
suffice. I recommend using JSON for index files (for interoperability) 
and defining an extensible, versioned JSON schema.
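
As a rough Python sketch of that simplest case: the index layout, the field names (`schemaVersion`, `modules`, `archive`, `sha256`), the version number and the repository URL below are all invented for illustration, since no such schema exists yet, and the checksum is a placeholder.

```python
import json
from urllib.parse import urljoin

# Hypothetical root index; every field name and value here is an assumption.
EXAMPLE_INDEX = """
{
  "schemaVersion": 1,
  "modules": [
    {"name": "KJV", "version": "2.9",
     "archive": "modules/KJV-2.9.zip",
     "sha256": "0000000000000000000000000000000000000000000000000000000000000000"}
  ]
}
"""

SUPPORTED_SCHEMA = 1

def parse_index(text: str) -> dict:
    """Parse the root index and map module names to their entries."""
    index = json.loads(text)
    # Refuse indexes written for a schema version this client does not know.
    if index.get("schemaVersion") != SUPPORTED_SCHEMA:
        raise ValueError("unsupported index schema version")
    return {m["name"]: m for m in index["modules"]}

def archive_url(repo_base: str, entry: dict) -> str:
    """Resolve a module's archive path relative to the repository root."""
    return urljoin(repo_base, entry["archive"])

modules = parse_index(EXAMPLE_INDEX)
# A real client would first fetch the index with an HTTP GET from a fixed path.
print(archive_url("https://repo.example.org/sword/", modules["KJV"]))
```

The client only ever requests regular files by path (the index, then the archive), so the same repository could be served over HTTP(S), FTP or a local filesystem without any server-side logic.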


Best regards,
Jaak


[*] Rewriting just the repository logic would not prevent other libsword 
parser bugs from being exploited.

On 13.07.24 07:30, Aaron Rainbolt wrote:
> As it stands, SWORD users face some disadvantages when accessing SWORD
> resources - they have to be downloaded in their entirety, installed
> onto the end user's system, and then stay there for as long as the
> user wishes to access them. While it is possible and even easy to copy
> modules from one system to another in theory, the system that views
> the modules must still *have* the modules in order to view them.
> 
> Since SWORD already is basically a universal Bible-related data access
> system, it seems to me like it could be useful to take the concept one
> step further - allowing access to SWORD modules over the network,
> where a viewing device must only request the part of a module it wants
> to view, and simply discards it when it's done with it.
> 
> Some advantages of this over what SWORD already does:
> 
> * The device used for viewing no longer has to be the device used for
> storing the modules. Individuals can stand up a "SWORD server" and
> then access the modules from any network-capable SWORD client on any
> device.
> * The device used for module storage can be located on the Internet,
> allowing individuals to access a potentially large library of modules
> without installation.
> * Assuming a properly secured, encrypted connection can be established
> and the server is not obvious as a SWORD server, individuals in
> persecuted countries could potentially access SWORD modules over the
> Internet, allowing them to access the Bible without leaving a trace on
> their devices.
> * Organizations with permission to redistribute copyrighted texts
> could provide those texts via a SWORD server, allowing them to be
> accessed by network-capable SWORD clients (i.e., this could
> potentially allow people to legally access texts such as the ESV and
> NIV in their favorite SWORD client rather than being forced to resort
> to clunky websites, proprietary software, or piracy). Server-side,
> open-source DRM measures could be enforced to make downloading entire
> modules for offline use more difficult, providing some level of
> peace-of-mind to copyright owners.
> 
> Advantages this would have over existing "access the Bible online" solutions:
> 
> * It would provide a standardized interface for accessing the Bible
> and Bible-related resources over the Internet, rather than every
> project coming up with its own storage conventions and network
> protocols.
> * People could theoretically use almost any SWORD client to access the
> modules, allowing access to the Bible using a native desktop or mobile
> application, rather than having to resort to a web browser, clunky
> "cross-platform" (read: doesn't work quite right anywhere) app, or a
> tracker-laden mess like YouVersion.
> * Since the SWORD server itself would essentially be a SWORD client
> that provided access to its modules over the network, one SWORD server
> could daisy-chain to another one, thus acting as a proxy. This way
> small, non-suspicious websites could provide access to major SWORD
> servers via a proxy, making it easier to help individuals in
> persecuted situations to access the Bible.
> * Given the above proxying mechanism, blocking access to SWORD servers
> could become very difficult, for much the same reasons why blocking
> access to Matrix chat is very difficult. (In a world with unlimited
> time and development resources, a full federation system could be
> implemented so that anyone could access any module that anyone else
> hosted... but that's almost without question overkill and impractical.
> Just proxying would be cheap to implement and powerful in use.)
> * Anyone could self-host a SWORD server and provide themselves, their
> family, their community, or even the world easier access to the Bible
> and Bible-related resources.
> * Advanced features like fast Lucene search could be provided
> server-side, giving a much faster search experience than almost any
> modern Web-based Bible application I've used.
> * If used alongside a feature like BibleSync, it could be a powerful
> tool for churches and Bible studies to use. People could simply
> connect to the church's SWORD server and enable BibleSync, then be
> able to follow along perfectly with everyone else, with access to the
> same resources that their pastor, study leader, etc. is actively
> using. No prep work needed (beyond having the proper app installed and
> knowing how to point it to the right server).
> 
> In the event this idea is actually worth pursuing, it seems to me like
> there would be four things needed to make it a reality.
> 
> * A SWORD network protocol specification. This would probably be the
> hardest thing to get right since it has to be gotten right the first
> time and then only incrementally updated in the future in a
> backwards-compatible manner for best results.
> * An actual SWORD server implementation. Once the specification
> exists, writing this should theoretically be easy.
> * Server access support in the SWORD library itself. This would enable
> existing SWORD clients to adopt network support with little effort.
> * Adoption of the new feature by SWORD frontends. This of course is up
> to (and at the discretion of) each SWORD frontend developer, but if
> the SWORD library made accessing network resources act almost
> identically to accessing local resources, it would hopefully be easy
> to take advantage of the feature, and thus it would hopefully gain
> traction.
> 
> So there's my brain-dump of all the reasons I think this is worth
> doing and how I think it should be done :P Let me know what you think
> and if you have any advice or feedback. This whole thing popped into
> my head tonight and I just wanted to share it to see if it's worth
> pursuing, or if maybe something similar to this was tried already in
> the past.
> 
> Thanks for reading my wall of text. God bless.
> _______________________________________________
> sword-devel mailing list: sword-devel at crosswire.org
> http://crosswire.org/mailman/listinfo/sword-devel
> Instructions to unsubscribe/change your settings at above page
