[sword-devel] intro + testing programs

Thu, 5 Jul 2001 14:30:02 +0100


  >Hi Chris,

  >welcome on the team!

  whoa, that was easy.

  >I am not responsible for assigning tasks to you, there are some
  >things that need to be done, but I am not sure when and how.
  >Troy or Chris, could you say something?

  that's cool.  just let me know, and i'll see what i can do...

  >A good compression algorithm is definitely a very interesting
  >issue. Could you give some statistics on it, how much it
  >compresses various kinds of data compared to the common
  >algorithms (zip, bzip, rar etc.). A good and fast (contradiction?)
  >compression will be necessary because Chris will be adding HUGE
  >modules soon... (At least theoretically up to 4GB ;)

  it's a good compression algorithm, and fast, as it decompresses
  "on the fly", so both compressing and decompressing are efficient.
  i'll put together a benchmark against zip, rar, and a few others...
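
  A benchmark like that could start from a harness along these lines.
  This is only a sketch: the algorithm mentioned above is not described
  in this message, so it is not included. The harness times the
  compressors in the Python standard library instead (zlib for
  zip-style deflate, bz2 for bzip2, lzma), and rar is left out because
  the standard library has no rar module. The function name `benchmark`
  and the sample text are made up for illustration.

```python
import bz2
import lzma
import time
import zlib

# Compressor name -> (compress, decompress). All three are stdlib modules.
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def benchmark(data):
    """Return size ratio and timings for each codec on one bytes object."""
    results = {}
    for name, (compress, decompress) in CODECS.items():
        t0 = time.perf_counter()
        packed = compress(data)
        t1 = time.perf_counter()
        restored = decompress(packed)
        t2 = time.perf_counter()
        if restored != data:
            raise ValueError(f"{name} did not round-trip the data")
        results[name] = {
            "ratio": len(packed) / len(data),  # smaller is better
            "compress_s": t1 - t0,
            "decompress_s": t2 - t1,
        }
    return results


if __name__ == "__main__":
    # Hypothetical sample input; a real run would read module files.
    sample = b"In the beginning God created the heaven and the earth. " * 200
    for name, row in benchmark(sample).items():
        print(name, row)
```

  A new algorithm would be added as one more entry in `CODECS`, and the
  sample would be replaced with real module data of various kinds.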

  >It should be possible to use sword (with the most important
  >modules) on handheld devices which have very low space.

  in theory, yes, it is possible.  i'm working on getting a bible
  program onto my psion V.  it's not sword, which causes problems
  (it's written for one particular processor, doh!, and it's a
  commercial application), but it's well on the way.  i could take a
  look at the possibilities of it working on psions if you like.  let
  me know...

  >Another important issue is searching, indexing etc.

  as in word searching and indexing?  build a simple index of the
  words, in alphabetical order, compress it, and then write a lexical
  search engine.  it'll cut out all the bother of going through the
  entire module.  e.g.:

  searching for the word Jesus

  it would take the J, and say, right.  i know that the J's start at
  index number 10,000 and end at 55,000.  ignore the rest.

  then take that, and search for the letter e in second place.

  from 15,000 to 21,000

  then take the letter s in third position

  from 19,000 to 20,000

  then take the u in fourth position

  from 19,500 to 19,750

  then the letter s in fifth position

  19,518
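
  The narrowing walked through above can be sketched roughly as
  follows. This assumes the index is simply a sorted list of lowercase
  words held in memory; the function names and the tiny word list are
  made up for illustration. Inside the current range every word shares
  the letters matched so far, so the letters at the next position are
  already in order and a binary search finds the first and last match.

```python
def _first_index(words, lo, hi, pos, ch, strict):
    """First index in [lo, hi) whose letter at `pos` is >= ch (> ch if strict)."""
    while lo < hi:
        mid = (lo + hi) // 2
        letter = words[mid][pos:pos + 1]  # "" when the word is too short
        if letter < ch or (strict and letter == ch):
            lo = mid + 1
        else:
            hi = mid
    return lo


def narrow_by_letter(words, query):
    """Shrink the candidate range [lo, hi) one query letter at a time.

    `words` must be sorted. Returns (lo, hi, steps), where `steps` records
    the range left after each letter and words[lo:hi] all start with `query`.
    """
    lo, hi = 0, len(words)
    steps = []
    for pos, ch in enumerate(query):
        new_lo = _first_index(words, lo, hi, pos, ch, strict=False)
        new_hi = _first_index(words, new_lo, hi, pos, ch, strict=True)
        lo, hi = new_lo, new_hi
        steps.append((lo, hi))
        if lo == hi:  # nothing left: the word is not in the index
            break
    return lo, hi, steps


# Hypothetical miniature index, standing in for a real module's word list.
WORDS = sorted([
    "abraham", "jacob", "james", "jericho", "jerusalem", "jesse",
    "jesus", "jew", "john", "joseph", "moses",
])

if __name__ == "__main__":
    lo, hi, steps = narrow_by_letter(WORDS, "jesus")
    for letter, (a, b) in zip("jesus", steps):
        print(letter, a, b)
    print(WORDS[lo:hi])
```

  Each letter costs two binary searches over a range that keeps
  shrinking, so the whole module is never scanned.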

  i think it's called subtractive index referencing, and i'll try it
  out on a lexicon, and see how quick it is.  the theory is good, as
  you only need to search for one letter at a time, and find the first
  and last occurrence of it.  could build an index of all the
  three-letter starting combinations (26^3 = 17,576 references,
  counting start and end as one ref).  not realtime, tho.  could work
  on that.
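
  The three-letter table could look something like the sketch below.
  It assumes a sorted list of lowercase a-z words, stores one
  (start, end) pair per prefix, and only keeps prefixes that actually
  occur, so 17,576 is the upper bound rather than the real table size.
  Words shorter than three letters are skipped here, which is a
  simplification. The names are made up for illustration.

```python
MAX_ENTRIES = 26 ** 3  # every possible three-letter start, a-z only


def build_prefix_table(words, width=3):
    """Map each `width`-letter prefix to its [start, end) range in sorted `words`."""
    table = {}
    for i, word in enumerate(words):
        if len(word) < width:
            continue  # simplification: too-short words get no entry
        prefix = word[:width]
        start, _ = table.get(prefix, (i, i))
        table[prefix] = (start, i + 1)  # extend the range to include this word
    return table


def candidates(words, table, query, width=3):
    """Return the slice of `words` that shares the query's first letters."""
    lo, hi = table.get(query[:width], (0, 0))
    return words[lo:hi]


# Hypothetical miniature index, standing in for a real module's word list.
WORDS = sorted([
    "abraham", "jacob", "james", "jericho", "jerusalem", "jesse",
    "jesus", "jew", "john", "joseph", "moses",
])

if __name__ == "__main__":
    table = build_prefix_table(WORDS)
    print(MAX_ENTRIES, len(table))
    print(candidates(WORDS, table, "jesus"))
```

  One table lookup replaces the first three narrowing steps; the
  remaining letters can then be narrowed within the returned range.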

  >Martin

  yours in christ,

  Christopher Miller

  ps, does your c++ compiler support assembly sub-routines?  some don't.
