<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv=Content-Type content="text/html; charset=us-ascii">
<meta name=Generator content="Microsoft Word 12 (filtered medium)">
<style>
<!--
/* Font Definitions */
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
span.EmailStyle17
        {mso-style-type:personal-compose;
        font-family:"Calibri","sans-serif";
        color:windowtext;}
.MsoChpDefault
        {mso-style-type:export-only;}
@page Section1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.Section1
        {page:Section1;}
-->
</style>
<!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang=EN-US link=blue vlink=purple>
<div class=Section1>
<p class=MsoNormal>I have looked, but I cannot seem to find any information regarding
the structure of the ztext file format.<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>I understand that there are 3 files. For
example, looking at the KJV New Testament, there are:<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>nt.bzs<o:p></o:p></p>
<p class=MsoNormal>nt.bzv<o:p></o:p></p>
<p class=MsoNormal>nt.bzz<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>I can only look at so much C++ code before I get lost in what
the code is doing.<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>My goal is to create an open source C# reader from scratch
if possible. I just need to understand the file format, and how it
is compressed, and how is it indexed. <o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>Perhaps, it is all explained in some header file that I
missed, but I hadn’t found it.<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>I understand that the file is compressed in a block size of
either verse, chapter, or book. How is that determined?<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>Also, when you do a search for, lets say, the word, “God”,
does the search uncompress each block and search the uncompressed block?
Or is there an index that is created for each word? The
reason I ask is because the search is reasonably fast, and I am curious how it can
be fast with the constant decompression of each block. I know
that in my own tests, the time to uncompress a raw text bible could take 3-5
seconds depending on the speed of the computer, not including the time to
search for the text which would take another 2 or 3 seconds. The
SWORD lib could search for “God” in a compressed Bible and return
results nearly instantly.<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>Any help would be appreciated.<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>Wade<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
</div>
</body>
</html>