<div dir="ltr"><div class="gmail_quote"><div dir="ltr"><div dir="auto">Hi all,<div dir="auto"><br></div>Has anyone given any thought to simplifying the import of OSIS documents into sword?</div><div dir="auto"><br></div><div dir="auto"><div>With my bible-scraper, I'm giving users a way to easily generate OSIS documents.<br>The next step is to let them easily import the resulting document into sword, but the current process is quite painful in this regard:<br></div><div dir="auto"><ul><li>Using the osis2mod CLI, with its relatively obscure options, and writing a module conf file by hand is reserved for a "technical elite".<br>Unless I'm missing something, <b>non-technical users have no easy way to import an OSIS document into sword.<br></b></li><li>Even if I wanted to develop a simpler frontend that hides this complexity, ideally browser-based, the fact that osis2mod is distributed as a binary makes it <b>hard to integrate into a portable frontend</b> to automate the process.</li></ul></div>I have a few ideas to improve this situation, and I would like your opinion on them, as well as historical context where appropriate.<br><div dir="auto"><br></div><div dir="auto"><font size="2"><b>Strategy 1: Rewrite or recompile osis2mod in a more portable fashion</b></font></div><div dir="auto"><br></div><div dir="auto">For example, it may be possible to express most of the XML structure changes made by osis2mod (<a href="https://wiki.crosswire.org/Osis2mod#Transformations" target="_blank">described here</a>, <a href="https://git.cepl.eu/cgit/sword/sword/tree/utilities/osis2mod.cpp#n1428" target="_blank">implemented here</a>) as an XSLT stylesheet or similar. This would make it easy to write portable osis2mod implementations (in Java, JS...) 
without duplicating the maintenance of this transformation logic.</div><div dir="auto"><br></div><div>A lower-impact variant would be to keep the osis2mod code mostly unchanged, but compile it with Emscripten into a WASM module that web browsers could execute directly. I have yet to try this, though.<br></div><div dir="auto"><br></div><div dir="auto"><font size="2"><b>Strategy 2: Allow libsword/jsword to consume OSIS documents directly</b><br></font></div><div dir="auto"><br></div><div>OSIS is a well-documented, mostly well-specified and readable open format, whereas "sword modules" are much more tied to one specific implementation (osis2mod).</div><div>By accepting OSIS documents as input, instead of only sword modules, we would be moving from a mostly closed environment to a truly open one.<br></div><div dir="auto"> </div><div>I understand that the transformations, normalisations and indexes computed by osis2mod exist to improve the runtime efficiency of accessing the bibles (avoiding decompressing a full bible and loading it into RAM every time, etc.), so I'm not suggesting we get rid of them completely.<br></div><div>However, they could be taken care of at "module installation" time by the lib itself.</div><div dir="auto"><br></div><div>The lowest-impact change for libsword would be:</div><div><ul><li>Embed the osis2mod logic into the libsword core<br></li><li>Update <a href="https://git.cepl.eu/cgit/sword/sword/tree/src/mgr/installmgr.cpp#n640" target="_blank">InstallMgr::installModule</a> so that it no longer requires a "mods.d", and also accepts archives containing a single OSIS XML document.<br>In that case, call the osis2mod logic to process the OSIS document and generate the actual modules.</li></ul></div><div>With this, installing such an OSIS module would take a few more seconds than installing the usual modules, but in exchange the whole ecosystem would become easier to interact with.</div><br><div>The problem here, of course, is that we'd have to 
duplicate that logic in jsword, unless we also make it more portable as per strategy 1.<br></div><div dir="auto"><br></div><div><b>What are your thoughts on these two strategies?</b><br></div></div><div><br></div><div>I'm also interested in <b>any historical insight on this sword module format</b>, which at first glance seems much more complex than it needs to be.<br></div><div>For example:</div><div><ul><li>What is the purpose of offering multiple compression formats? (Half of them are not supported in the Debian libsword builds, by the way.)</li><li>Why does osis2mod force bibles to fit into a versification (squashing all remaining text into the last verse of a chapter) instead of building a specific index that accurately represents the contents of the original OSIS document?<br></li><li>Why are contents always split by testament (ot/nt.bzs/v/z)? This seems a bit arbitrary, especially since OSIS allows any kind of bookGroup.<br></li></ul></div><div><br></div><div>Thanks, and sorry for yet another very long email!</div><div><br></div><div>Cheers,</div><div><br></div><div>Arnaud<br></div></div>
</div></div>