org.crosswire.jsword.book.sword
Class GZIPBackend

java.lang.Object
  extended by org.crosswire.jsword.book.sword.Backend
      extended by org.crosswire.jsword.book.sword.GZIPBackend
All Implemented Interfaces:
Activatable

public class GZIPBackend
extends Backend

A backend to read GZIPped data files. Although the text file contains data compressed with GZIP, it cannot be uncompressed with a stand-alone utility such as WinZip or gzip, because the data file is a concatenation of blocks of compressed data.

A block can be "b", book (aka testament); "c", chapter; or "v", verse. The choice is a matter of trade-offs. The program needs to uncompress a whole block into memory. Blocking at the book level is very memory expensive. Blocking at the verse level is very disk expensive, but takes the least amount of memory. Chapter is the most common choice.

In order to find the data in the text file, we need to find the block. The first index (comp) is used for this. Each verse is indexed to a tuple (block number, verse start, verse size). This data allows us to find the correct block, and to extract the verse from the uncompressed block, but it does not help us uncompress the block.

Once the block is known, the next index (idx) gives the location of the compressed block, its compressed size and its uncompressed size.

There are 3 files for each testament: 2 (comp and idx) are indexes into the third (text), which contains the data. The key into each index is the verse index within that testament, which is determined by the book, chapter and verse of that key.

All numbers are stored in two's complement, little endian.
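
As an illustration of the byte order, the following is a minimal sketch of decoding little-endian values from a byte array. The class and method names (LittleEndianDemo, decode32, decode16) are hypothetical and are not taken from JSword. Reading the 16-bit value as unsigned is also an assumption of this sketch.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class LittleEndianDemo {
    /** Decode a signed 32-bit little-endian value starting at offset. */
    static int decode32(byte[] data, int offset) {
        return (data[offset] & 0xFF)
            | (data[offset + 1] & 0xFF) << 8
            | (data[offset + 2] & 0xFF) << 16
            | (data[offset + 3] & 0xFF) << 24;
    }

    /** Decode a 16-bit little-endian value, treated here as unsigned (an assumption). */
    static int decode16(byte[] data, int offset) {
        return (data[offset] & 0xFF) | (data[offset + 1] & 0xFF) << 8;
    }

    public static void main(String[] args) {
        byte[] raw = {0x0A, 0x00, 0x00, 0x00, (byte) 0xFE, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF};
        System.out.println(decode32(raw, 0)); // prints 10
        System.out.println(decode32(raw, 4)); // prints -2
        System.out.println(decode16(raw, 4)); // prints 65534
        // java.nio.ByteBuffer gives the same 32-bit results without manual shifting.
        System.out.println(ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN).getInt(4)); // prints -2
    }
}
```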

To read a verse, proceed as follows, at all times working on the set of files for the testament in question:

 in the comp file, seek to verse index * 10
 read 10 bytes
 the block-index is the first 4 bytes (32-bit number)
 the remaining 6 bytes are the verse's offset and length within the uncompressed block
 
 in the idx file seek to block-index * 12
 read 12 bytes
 the text-block-index is the first 4 bytes
 the data-size is the next 4 bytes
 the uncompressed-size is the next 4 bytes
 
 in the text file seek to the text-block-index
 read data-size bytes
 //decipher them if they are encrypted
 unGZIP them into a byte array of uncompressed-size
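
The steps above can be sketched as follows. This is an illustration under stated assumptions, not the JSword implementation. The names (BlockLookupSketch, readVerse, buildDemoFiles, deflate) are hypothetical. The three files are modelled as in-memory byte arrays. The 6 bytes after the block-index are assumed to split into a 4-byte offset and a 2-byte length. Each block is assumed to be a zlib stream that java.util.zip.Inflater can read. The decipher step is skipped.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class BlockLookupSketch {
    static final int COMP_ENTRY_SIZE = 10;
    static final int IDX_ENTRY_SIZE = 12;

    /** Follow comp -> idx -> text for one verse index and return that verse's bytes. */
    static byte[] readVerse(byte[] comp, byte[] idx, byte[] text, int verseIndex) {
        ByteBuffer c = ByteBuffer.wrap(comp).order(ByteOrder.LITTLE_ENDIAN);
        int compPos = verseIndex * COMP_ENTRY_SIZE;
        int blockNum = c.getInt(compPos);
        int verseStart = c.getInt(compPos + 4);            // assumed: 4-byte offset
        int verseSize = c.getShort(compPos + 8) & 0xFFFF;  // assumed: 2-byte length

        ByteBuffer i = ByteBuffer.wrap(idx).order(ByteOrder.LITTLE_ENDIAN);
        int idxPos = blockNum * IDX_ENTRY_SIZE;
        int blockStart = i.getInt(idxPos);
        int dataSize = i.getInt(idxPos + 4);
        int uncompressedSize = i.getInt(idxPos + 8);

        // A real reader would decipher the compressed bytes here if the book is encrypted.
        byte[] block = new byte[uncompressedSize];
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(text, blockStart, dataSize);
            int filled = 0;
            while (filled < block.length && !inflater.finished()) {
                int got = inflater.inflate(block, filled, block.length - filled);
                if (got == 0) {
                    break; // no progress: truncated or unusable input
                }
                filled += got;
            }
            if (filled != block.length) {
                throw new IllegalStateException("Short block " + blockNum);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("Corrupt block " + blockNum, e);
        } finally {
            inflater.end();
        }
        return Arrays.copyOfRange(block, verseStart, verseStart + verseSize);
    }

    /** Compress one block; the buffer is generous enough for the tiny demo inputs. */
    static byte[] deflate(byte[] plain) {
        Deflater deflater = new Deflater();
        deflater.setInput(plain);
        deflater.finish();
        byte[] buf = new byte[plain.length + 64];
        int n = deflater.deflate(buf);
        deflater.end();
        return Arrays.copyOf(buf, n);
    }

    /** Build a toy {comp, idx, text} set: block 0 holds "alpha" and "beta", block 1 holds "gamma". */
    static byte[][] buildDemoFiles() {
        byte[] z0 = deflate("alphabeta".getBytes(StandardCharsets.US_ASCII));
        byte[] z1 = deflate("gamma".getBytes(StandardCharsets.US_ASCII));
        ByteBuffer text = ByteBuffer.allocate(z0.length + z1.length).put(z0).put(z1);
        ByteBuffer idx = ByteBuffer.allocate(2 * IDX_ENTRY_SIZE).order(ByteOrder.LITTLE_ENDIAN)
            .putInt(0).putInt(z0.length).putInt(9)
            .putInt(z0.length).putInt(z1.length).putInt(5);
        ByteBuffer comp = ByteBuffer.allocate(3 * COMP_ENTRY_SIZE).order(ByteOrder.LITTLE_ENDIAN)
            .putInt(0).putInt(0).putShort((short) 5)
            .putInt(0).putInt(5).putShort((short) 4)
            .putInt(1).putInt(0).putShort((short) 5);
        return new byte[][] {comp.array(), idx.array(), text.array()};
    }

    public static void main(String[] args) {
        byte[][] f = buildDemoFiles();
        for (int v = 0; v < 3; v++) {
            System.out.println(new String(readVerse(f[0], f[1], f[2], v), StandardCharsets.US_ASCII));
        }
    }
}
```

The toy text array is two separately compressed blocks placed back to back, so the idx entry is the only way to find where the second block starts and how many bytes to inflate.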
 
TODO(DM): Testament 0 is used to index a README file for the Bible. At this time it is ignored.

Distribution Licence:
JSword is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License, version 2 as published by the Free Software Foundation.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
The License is available on the internet here, or by writing to: Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA
The copyright to this program is held by its authors.

Version:
$Id: GZIPBackend.java,v 1.30 2005/03/18 15:43:51 joe Exp $
Author:
Joe Walker [joe at eireneh dot com]
See Also:
Licence

Field Summary
private  boolean active
          Are we active
private static int COMP_ENTRY_SIZE
          The number of bytes in each comp index entry
private  File[] compFile
          The array of comp index files
private  RandomAccessFile[] compRaf
          The array of compressed random access files?
private static int IDX_ENTRY_SIZE
          The number of bytes in each idx index entry
private  File[] idxFile
          The array of index files
private  RandomAccessFile[] idxRaf
          The array of index random access files
private  int lastBlockNum
           
private  int lastTestament
           
private  byte[] lastUncompressed
           
private static Logger log
          The log stream
private static String SUFFIX_COMP
           
private static String SUFFIX_INDEX
           
private static String SUFFIX_PART1
           
private static String SUFFIX_TEXT
           
private  File[] textFile
          The array of data files
private  RandomAccessFile[] textRaf
          The array of data random access files
 
Constructor Summary
GZIPBackend(SwordBookMetaData sbmd, File rootPath, BlockType blockType)
          Simple ctor
 
Method Summary
 void activate(Lock lock)
          Called to indicate that the Book should initialize itself, and consume whatever system resources it needs to be able to respond to other queries.
protected  void checkActive()
          Helper method so we can quickly activate ourselves on access
 void deactivate(Lock lock)
          Called to indicate that the Book should release whatever system resources it can to make way for other uses.
 String getRawText(Key key)
          Get the bytes allotted for the given verse
 boolean isSupported()
          Returns whether this Backend is implemented.
 Key readIndex()
          Initialise a Backend before use.
 
Methods inherited from class org.crosswire.jsword.book.sword.Backend
decipher, getBookMetaData, getRootPath
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

SUFFIX_COMP

private static final String SUFFIX_COMP
See Also:
Constant Field Values

SUFFIX_INDEX

private static final String SUFFIX_INDEX
See Also:
Constant Field Values

SUFFIX_PART1

private static final String SUFFIX_PART1
See Also:
Constant Field Values

SUFFIX_TEXT

private static final String SUFFIX_TEXT
See Also:
Constant Field Values

lastTestament

private int lastTestament

lastBlockNum

private int lastBlockNum

lastUncompressed

private byte[] lastUncompressed

active

private boolean active
Are we active


log

private static final Logger log
The log stream


idxRaf

private RandomAccessFile[] idxRaf
The array of index random access files


textRaf

private RandomAccessFile[] textRaf
The array of data random access files


compRaf

private RandomAccessFile[] compRaf
The array of compressed random access files?


idxFile

private File[] idxFile
The array of index files


textFile

private File[] textFile
The array of data files


compFile

private File[] compFile
The array of comp index files


COMP_ENTRY_SIZE

private static final int COMP_ENTRY_SIZE
The number of bytes in each comp index entry

See Also:
Constant Field Values

IDX_ENTRY_SIZE

private static final int IDX_ENTRY_SIZE
The number of bytes in each idx index entry

See Also:
Constant Field Values
Constructor Detail

GZIPBackend

public GZIPBackend(SwordBookMetaData sbmd,
                   File rootPath,
                   BlockType blockType)
            throws BookException
Simple ctor

Throws:
BookException
Method Detail

activate

public final void activate(Lock lock)
Description copied from interface: Activatable
Called to indicate that the Book should initialize itself, and consume whatever system resources it needs to be able to respond to other queries.

Parameters:
lock - An attempt to ensure that only the Activator calls this method

deactivate

public final void deactivate(Lock lock)
Description copied from interface: Activatable
Called to indicate that the Book should release whatever system resources it can to make way for other uses.

Parameters:
lock - An attempt to ensure that only the Activator calls this method

getRawText

public String getRawText(Key key)
                  throws BookException
Description copied from class: Backend
Get the bytes allotted for the given verse

Specified by:
getRawText in class Backend
Parameters:
key - The key to fetch
Returns:
String The data for the verse in question
Throws:
BookException - If the data cannot be read.

readIndex

public Key readIndex()
Description copied from class: Backend
Initialise a Backend before use. This method needs to call addKey() a number of times on SwordDictionary

Specified by:
readIndex in class Backend

isSupported

public boolean isSupported()
Description copied from class: Backend
Returns whether this Backend is implemented.

Specified by:
isSupported in class Backend
Returns:
true if this Backend is implemented.

checkActive

protected final void checkActive()
Helper method so we can quickly activate ourselves on access


Copyright © 2003-2004