[sword-devel] module driver reorganization proposal

DM Smith dmsmith at crosswire.org
Tue Mar 18 07:58:05 MST 2014


Your suggestion is very similar to JSword's implementation, where that approach has simplified code maintenance.

There are three types of module files: index, compression index and data files. It may be worth handling these separately.
The index consists of fixed-size entries, each made up of several fields. For a raw module these are offset and size. For a compressed module they are block, offset and size.
The block and offset are always 32 bits. Only the size varies in width: today it is either 2 or 4 bytes.

So I'd suggest two more classes: RawIndex and a sub-class ZIndex. (Maybe four, adding a struct or class for each entry type: RawIndexEntry and ZIndexEntry.)
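
To make the entry types concrete, here is a rough sketch of what they might look like. The struct names come from the suggestion above; the helper names (readLE, decodeRaw, decodeZ), the little-endian byte order and the on-disk field order (block, offset, size) are my assumptions for illustration, not the existing Sword or JSword code.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical entry types: in memory every field is 32 bits wide.
struct RawIndexEntry {
    uint32_t offset = 0;  // position of the entry in the data file
    uint32_t size = 0;    // stored on disk as 2 or 4 bytes
};

struct ZIndexEntry : RawIndexEntry {
    uint32_t block = 0;   // compressed block that holds the entry
};

// Read `width` bytes as a little-endian unsigned integer (byte order assumed).
inline uint32_t readLE(const unsigned char *p, std::size_t width) {
    uint32_t v = 0;
    for (std::size_t i = 0; i < width; ++i)
        v |= static_cast<uint32_t>(p[i]) << (8 * i);
    return v;
}

// Only the two size widths in use today are accepted.
inline void checkSizeWidth(std::size_t sizeWidth) {
    if (sizeWidth != 2 && sizeWidth != 4)
        throw std::invalid_argument("size width must be 2 or 4 bytes");
}

// Decode one raw row: 4-byte offset, then a size of sizeWidth bytes.
inline RawIndexEntry decodeRaw(const unsigned char *row, std::size_t sizeWidth) {
    checkSizeWidth(sizeWidth);
    RawIndexEntry e;
    e.offset = readLE(row, 4);
    e.size = readLE(row + 4, sizeWidth);
    return e;
}

// Decode one compressed row: 4-byte block, 4-byte offset, then the size.
inline ZIndexEntry decodeZ(const unsigned char *row, std::size_t sizeWidth) {
    checkSizeWidth(sizeWidth);
    ZIndexEntry e;
    e.block = readLE(row, 4);
    e.offset = readLE(row + 4, 4);
    e.size = readLE(row + 8, sizeWidth);
    return e;
}
```

With something like this, the size width becomes a parameter of the index rather than a reason for a separate driver class.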

A couple of observations. A row in the file is of fixed width. The size of the file divided by the width of the row gives the number of entries. Finding the i-th entry is simple and obvious.
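
The arithmetic behind those observations could be sketched as below. The function names are made up for illustration, and rejecting a file whose size is not a whole number of rows is my assumption about how a truncated index should be treated.

```cpp
#include <cstdint>
#include <stdexcept>

// Width of one index row in bytes: a 4-byte offset plus the size field,
// with an extra 4-byte block number for compressed modules.
inline uint32_t rowWidth(bool compressed, uint32_t sizeWidth) {
    return (compressed ? 8u : 4u) + sizeWidth;
}

// Number of entries = file size / row width.
// Assumption: a file that is not a whole number of rows is an error.
inline uint64_t entryCount(uint64_t fileSize, uint32_t width) {
    if (width == 0 || fileSize % width != 0)
        throw std::invalid_argument("index size is not a multiple of the row width");
    return fileSize / width;
}

// Byte position of the i-th entry (counting from 0) in the index file.
inline uint64_t entryPosition(uint64_t i, uint32_t width) {
    return i * width;
}
```

For example, a raw index with 2-byte sizes has 6-byte rows, so a 600-byte index holds 100 entries and entry 10 starts at byte 60.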

We've started on the above, but still have code duplication because the index code lives in more than one module driver.

Also, I don't see the point of the 3-byte entry size. The only thing it affects is the size of the index file. In memory it will be 32 bits. For a Bible, using a 3-byte rather than a 4-byte size would save about 65K. Instead, I'd suggest that from now on our module-making tools only make 4-byte index files. For a Bible, this would add about 128K to the module size.

In Him,

On Mar 18, 2014, at 2:43 AM, Chris Little <chrislit at crosswire.org> wrote:

> We've got quite a few classes in Sword that essentially duplicate code found elsewhere in Sword, with minor changes. The module drivers are a prime example.
> Specific examples include RawText & RawText4, RawCom & RawCom4, zText & zText4 (new as of today), zCom & zCom4 (new as of today), and RawLD & RawLD4, each pair of which differs in that one member uses a 16-bit value to store entry size and the other member uses a 32-bit value. (The 16-bit sizes permit entries up to 64KiB; the 32-bit sizes permit entries up to 4GiB.)
> There are also the pairs RawText & RawCom, RawText4 & RawCom4, zText & zCom, and zText4 & zCom4, each pair of which differs very little.
> My proposal is to collapse the above classes into three classes:
> RawText, zText, and RawLD
> Each of these classes would support entry sizes of 2, 3, or 4 bytes (16-bit = 64KiB entries, 24-bit = 16MiB entries, 32-bit = 4GiB entries). Internally, the classes would always store sizes as a uint32_t, but would serialize as 2, 3, or 4 byte size integers, depending on the parameters passed to the constructor. This will necessitate changing many of the class method signatures to accept uint32_ts instead of shorts & longs.
> Similarly, the classes zVerse & zVerse4, RawVerse & RawVerse4, and RawLD & RawLD4 would be condensed into zVerse, RawVerse, & RawLD capable of reading files with 2, 3, or 4-byte entry sizes.
> This would not require changes to existing modules. A RawLD4 module will still work, but we'll use the RawLD driver to read it and parse the '4' from the end of the driver name to determine that we will read 4-byte entry sizes.
> RawCom, zCom, & SWCom classes would then be derived from RawText, zText, & SWText respectively. Maybe we can even eliminate the *Com classes and simply add a member variable to indicate whether to act like a commentary or a Bible.
> Advantages of this proposal include all of the things that come with reduced code duplication:
> Less code, reduced API complexity, smaller library size, etc.
> Greater consistency, without having to page through half a dozen distinct classes to keep code consistent.
> Bugs only need to be fixed in one location instead of many.
> Whatever else makes DRY practices better than WET.
> The method described also makes it trivial for us to add the 3-byte entry size drivers, which should be enough for anything practical (up to 16MiB per entry). And down the road, we could add 5-byte entry size support with ease for entry sizes up to 1TiB. (No, I'm not suggesting that.)
> If you're wondering why RawGenBook & zLD are left out of the proposal, it's because they both use 4-byte entry sizes already and no 2-byte versions exist.
> --Chris
