[sword-devel] Fwd: Making it easier to import OSIS documents in sword
Arnaud Vié
unas.zole+avie at gmail.com
Tue Jul 2 17:54:07 EDT 2024
Hi all,
Has anyone given any thought to simplifying the import of OSIS documents into
sword?
With my bible-scraper, I'm giving users a way to easily generate OSIS
documents.
The next step is to allow them to easily import the resulting document into
sword... but the current process is quite painful in this regard:
- Using the osis2mod CLI, with its relatively obscure options, and writing
a module conf file by hand are reserved for a "technical elite".
Unless I'm missing something,
*non-technical users have no easy way to import an OSIS document into
sword.*
- Even if I wanted to develop a simpler frontend that hides this complexity,
ideally browser-based, osis2mod is distributed as a native binary, which makes
it *hard to integrate into a portable frontend* that automates the process.
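To make the first point concrete, here is a rough Python sketch of the two things a user has to produce by hand today: a module conf file and an osis2mod invocation. It assumes the simplest case, an uncompressed RawText module, so that no osis2mod options are needed. The function names, the module name and the paths are placeholders invented for this example.

```python
def module_conf(name, lang, description):
    """Build a minimal conf for an uncompressed OSIS bible module.

    Only commonly used keys are emitted; a real module will usually
    need more (versification, licence, version, ...).
    """
    lines = [
        f"[{name}]",
        f"DataPath=./modules/texts/rawtext/{name.lower()}/",
        "ModDrv=RawText",
        "SourceType=OSIS",
        "Encoding=UTF-8",
        f"Lang={lang}",
        f"Description={description}",
    ]
    return "\n".join(lines) + "\n"


def osis2mod_command(sword_root, name, osis_file):
    """Return the argv for `osis2mod <output/path> <osisDoc>`.

    No options are passed, so this relies on osis2mod defaults.
    The command is only built here, not executed.
    """
    out_dir = f"{sword_root}/modules/texts/rawtext/{name.lower()}"
    return ["osis2mod", out_dir, osis_file]
```

The conf text would be written to mods.d/<name>.conf under the same sword root, and the output directory has to exist before osis2mod is run. None of this is hard, but it is exactly the kind of step a non-technical user should not have to know about.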
I have a few ideas to improve this situation, on which I would like your
opinion - as well as historical context where appropriate.
*Strategy 1: Rewrite or recompile osis2mod in a more portable fashion*
For example, it may be possible to represent most of the XML structure
changes done by osis2mod (described here
<https://wiki.crosswire.org/Osis2mod#Transformations>, implemented here
<https://git.cepl.eu/cgit/sword/sword/tree/utilities/osis2mod.cpp#n1428>)
as an XSLT stylesheet or similar. This would make it easy to write portable
osis2mod implementations (in Java, JS...) without duplicating the
maintenance of the transformation logic in each of them.
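As an illustration of the kind of rule that would have to be expressed portably, here is a Python sketch of one structural rewrite: replacing a container element with a pair of empty start/end milestones. The choice of <p>, the "x-p" type and the "gen" identifiers are assumptions made for this example, and namespaces are ignored; the authoritative list of rules is the wiki page linked above.

```python
import itertools
import xml.etree.ElementTree as ET


def milestone_paragraphs(parent, ids=None):
    """Rewrite every <p> below `parent` into sID/eID milestone pairs, in place."""
    ids = ids if ids is not None else itertools.count(1)
    new_children = []
    for child in list(parent):
        milestone_paragraphs(child, ids)  # handle nested containers first
        if child.tag == "p":
            marker = f"gen{next(ids)}"
            start = ET.Element("div", {"type": "x-p", "sID": marker})
            start.tail = child.text       # text that opened the paragraph
            end = ET.Element("div", {"type": "x-p", "eID": marker})
            end.tail = child.tail         # text that followed the paragraph
            new_children.extend([start, *child, end])
        else:
            new_children.append(child)
    # Swap the old child list for the flattened one.
    for child in list(parent):
        parent.remove(child)
    parent.extend(new_children)
```

The text content is untouched; only the element structure changes. A rule this small translates almost line for line into an XSLT template, which is what makes a shared stylesheet attractive.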
A lower-impact variant would be to keep the osis2mod code mostly
unchanged, but compile it with Emscripten into a WASM module that web
browsers can execute directly. I have yet to try this, though.
*Strategy 2: Allow libsword/jsword to consume OSIS documents directly*
OSIS is a well-documented, mostly well-specified and readable open format,
whereas "sword modules" are much more tied to one specific implementation
(osis2mod).
By accepting OSIS documents as input, instead of only sword modules, we
would be moving from a mostly closed environment to a truly open one.
I understand that the transformations/normalisations/indexes computed by
osis2mod exist to improve the runtime efficiency of accessing the
bibles (not having to decompress and load a full bible into RAM every time,
etc.), so I'm not suggesting we get rid of them completely.
However, they could be taken care of at "module installation" time by the
lib itself.
The lowest-impact change for libsword would be:
- Embed the osis2mod logic into the libsword core
- Update InstallMgr::installModule
<https://git.cepl.eu/cgit/sword/sword/tree/src/mgr/installmgr.cpp#n640>
to no longer require a "mods.d", but also accept archives containing a
single OSIS XML document.
In that case, call the osis2mod logic to process the OSIS
document and generate the actual module.
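The dispatch step could look roughly like the following. This is a Python sketch of the decision only, not libsword code: `classify_archive` and `root_tag` are hypothetical names, and a zip archive is assumed as the container format.

```python
import zipfile
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath


def root_tag(fileobj):
    """Return the local name of the XML root element, or None if not XML."""
    try:
        for _, elem in ET.iterparse(fileobj, events=("start",)):
            return elem.tag.rsplit("}", 1)[-1]  # drop any "{namespace}" prefix
    except ET.ParseError:
        return None
    return None


def classify_archive(zf):
    """Decide how an installer should treat an opened zipfile.ZipFile."""
    files = [n for n in zf.namelist() if not n.endswith("/")]
    # Regular module: a conf file somewhere under a mods.d directory.
    if any("mods.d" in PurePosixPath(n).parts[:-1] and n.endswith(".conf")
           for n in files):
        return "sword-module"
    # Proposed new case: exactly one file, whose root element is <osis>.
    if len(files) == 1:
        with zf.open(files[0]) as fh:
            if root_tag(fh) == "osis":
                return "osis-document"
    return "unknown"
```

An installer would follow its existing path for "sword-module" and run the embedded osis2mod logic for "osis-document". Only the root element is inspected, so the check stays cheap even for a full bible.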
With this, the installation of such an OSIS module would take a few more
seconds than for the usual modules, but in exchange it would make the whole
ecosystem easier to interact with.
The problem here, of course, is that we'd have to duplicate that logic in
jsword - unless we also make it more portable as per Strategy 1.
*What are your thoughts on these two strategies?*
I'm also interested in *any historical insight on this sword module format*,
which at first glance seems much more complex than it needs to be.
For example:
- What is the purpose of offering multiple compression formats? (Half
of them are not supported in the Debian libsword builds, by the way.)
- Why does osis2mod force bibles to fit into a versification (squashing
all remaining text into the last verse of a chapter) instead of building a
specific index that accurately represents the contents of the original OSIS
document?
- Why are contents always split by testament (ot/nt.bzs/v/z)? It seems a
bit arbitrary, especially since OSIS allows any kind of bookGroups.
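On the second question, this is the sort of document-derived index I have in mind, as a minimal Python sketch. It only lists verse identifiers in document order and says nothing about how a real module would store them; `verse_index` is a name made up for the example.

```python
import xml.etree.ElementTree as ET

OSIS_NS = "http://www.bibletechnologies.net/2003/OSIS/namespace"


def verse_index(osis_xml):
    """List every verse osisID of an OSIS document, in document order."""
    root = ET.fromstring(osis_xml)
    index = []
    for verse in root.iter(f"{{{OSIS_NS}}}verse"):
        osis_id = verse.get("osisID")
        if osis_id:  # end milestones carry only an eID and are skipped
            # A combined verse lists several IDs separated by spaces.
            index.extend(osis_id.split())
    return index
```

Such an index contains exactly the verses present in the document, whatever versification the translation follows, so nothing has to be squashed into a neighbouring verse.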
Thanks, and sorry for yet another very long email!
Cheers,
Arnaud