[sword-devel] task

Helmer Krämer sword-devel@crosswire.org
Sun, 9 Sep 2001 17:52:57 +0200


On Fri, 7 Sep 2001 02:31:06 -0700
"Chris Little" <chrislit@chiasma.org> wrote:

Since I haven't committed any code to this project yet, feel free to scold
me if I'm doing anything wrong ;)

> Ciphering of LD texts--
> This shouldn't be too difficult since you'd be mimicking functionality
> in the RawText/RawCom classes in RawLD/RawLD4
One would have to write a tool to cipher the texts and make
{RawLD,RawLD4} call rawFilter() on each entry they read, right?

> Ciphering + compression (on LD, Bibles, & commentaries)--
> Ultimately I'd like ALL texts to be compressed on the site, though I
> guess we could do a client-side util to uncompress for people who really
> think the speed improvement is that much more important than disk space
> lost.
> 
> Implement SCSU (de)compression drivers--
> SCSU is the Standard Compression Scheme for Unicode
> (http://www.unicode.org/unicode/reports/tr6/), which compresses Unicode
> streams by using the fact that most characters in a string come from the
> same code pages and therefore repeat a lot of information.  Basically,
> if you use SCSU and then ZIP the result, you'll get something smaller
> than either of the compression schemes alone would produce.  I'll have
> SCSU (along with UTF-8/16/32) code from ICU in CVS sometime pretty soon,
> but it'll still need to be worked into the library.

I'm willing to do it, so here's my idea (just for discussion):
AFAIK, deciphering is done via rawFilter(). What about doing
{zlib,SCSU} decompression the same way? We could add a config parameter
that describes the type of compression of the module data and addRawFilter()
the relevant filters. Thus we'd have {zlib,SCSU} compression available for
all types of modules at once.
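
To make the idea concrete, here is a rough sketch of a config-driven
filter chain. It is written in Python rather than against the real SWORD
classes so that it stays self-contained. The config keys "CipherKey" and
"CompressType", the Module class, and the XOR cipher are all assumptions
made for illustration; only zlib is a real library here, and an SCSU
filter would be registered the same way once that code exists.

```python
import zlib


def xor_filter(key: bytes):
    """Toy symmetric cipher, a stand-in for the real deciphering filter."""
    def apply(data: bytes) -> bytes:
        return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))
    return apply


# Hypothetical registry: value of the assumed config parameter -> raw filter.
DECOMPRESSORS = {
    "zlib": zlib.decompress,
    # "scsu": scsu_decompress,  # placeholder, not implemented in this sketch
}


class Module:
    """Minimal stand-in for a module driver that owns a list of raw filters."""

    def __init__(self, config: dict):
        self.raw_filters = []
        key = config.get("CipherKey")          # assumed config key
        if key:
            self.add_raw_filter(xor_filter(key.encode()))
        compress = config.get("CompressType")  # assumed config key
        if compress:
            self.add_raw_filter(DECOMPRESSORS[compress.lower()])

    def add_raw_filter(self, raw_filter) -> None:
        self.raw_filters.append(raw_filter)

    def raw_filter(self, data: bytes) -> bytes:
        # Apply every registered filter, in order, to one entry as read from disk.
        for f in self.raw_filters:
            data = f(data)
        return data


# An entry that was compressed first and ciphered second when it was stored.
text = "In the beginning".encode("utf-8")
stored = xor_filter(b"secret")(zlib.compress(text))
module = Module({"CipherKey": "secret", "CompressType": "zlib"})
restored = module.raw_filter(stored)
```

The sketch deciphers before it decompresses, which assumes the text was
compressed before it was ciphered. That order seems preferable because
ciphered data usually compresses poorly.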

This would of course imply that we can only compress one entry at a time,
which would probably result in bigger compressed files than compressing a
whole module at once.
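
The size penalty is easy to check with a small experiment. The sketch
below uses made-up sample entries (not real module data) and compares the
total size of entries compressed one at a time against the same entries
compressed as a single stream with zlib.

```python
import zlib

# Made-up sample data: many short entries that share a lot of vocabulary.
entries = [
    f"Entry {i}: the quick brown fox jumps over the lazy dog.".encode("utf-8")
    for i in range(200)
]


def per_entry_size(items) -> int:
    """Total bytes when every entry is compressed on its own."""
    return sum(len(zlib.compress(item)) for item in items)


def whole_size(items) -> int:
    """Bytes when all entries are compressed together as one stream."""
    return len(zlib.compress(b"".join(items)))


separate = per_entry_size(entries)
together = whole_size(entries)
print(f"per entry: {separate} bytes, whole stream: {together} bytes")
```

Per-entry compression pays the zlib header and checksum for every entry
and cannot reuse repeated text from earlier entries, so short entries are
hit hardest. In exchange, any single entry can be read without
decompressing its neighbours.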

Helmer