[jsword-devel] Indexing issues

DM Smith dmsmith555 at yahoo.com
Wed Jun 30 07:23:04 MST 2004


Before 0.9.7 there was a problem with the KJV with Strong's numbers: 
its XML is not valid (though it seems to be well formed), and the 
validating parser choked on the input (i.e. on the "resp" element in 
the NT).
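
To illustrate the valid vs. well-formed distinction, here is a minimal 
sketch using the JDK's own JAXP parser. The tiny document and its DTD 
are made up for illustration (they are not the real KJV markup), and 
the class and method names are hypothetical. The same input is parsed 
twice, and only the validating pass should report errors for the 
undeclared "resp" element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

final class ValidationDemo {
    // Hypothetical miniature document: well formed, but <resp> is not
    // declared in the DTD, so the document is not valid.
    static final String XML =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE verse [ <!ELEMENT verse (#PCDATA)> ]>\n"
        + "<verse>In the beginning<resp/></verse>";

    /** Parses XML and returns how many validity errors were reported. */
    static int countValidityErrors(boolean validating) {
        final int[] errors = {0};
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(validating);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignored */ }
                public void error(SAXParseException e) { errors[0]++; }
                public void fatalError(SAXParseException e) throws SAXParseException {
                    throw e; // not well formed: give up
                }
            });
            builder.parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return errors[0];
    }

    public static void main(String[] args) {
        System.out.println("non-validating errors: " + countValidityErrors(false));
        System.out.println("validating errors: " + countValidityErrors(true));
    }
}
```

A parser that treats validity errors as fatal would stop at the first 
one, which is presumably what "choked" looked like in practice.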

I have had problems in the past where the program would run out of 
memory. The few searches that I have tried did not work as I expected. 
As a result of this and since there was enough to do elsewhere in the 
program, I have not spent any time in the past with search (including 
testing it).

So I tried the following test on the KJV (1769) with Strong's:
1. Remove the index directory, ~/.jsword/sword-KJV.
2. Display Rev 21 via passage lookup.
3. Search for "chrysoprasus" (found only in Rev 21:20). I had to 
   repeat the search.
4. Try another search, restricted to "Rev", for "prophecy".
5. Try to delete the index directory, ~/.jsword/sword-KJV, and re-index.

In Eclipse, with the latest code:
The indexing proceeds, showing steadily higher percentages and later 
verses. The last percentage that I see is 60% and the verse is 
Rev 22:21. So visually the progress bar suggests that indexing did not 
complete, but quit at 60%.
The result of the search is that the BibleViewPane is emptied and the 
"View:" box is emptied of the passage that I was viewing.
I repeated the search and it found the verse.
I tried another search ("prophecy") and it found it right away.
After the search I tried to delete ~/.jsword/sword-KJV but I got an 
"in use" error dialog.

In version 0.9.7:
The indexing is really slow, with a noticeable hit on machine 
performance. It also got to 60% and Rev 22:21, but it hung around there 
for a long time with the machine's disk activity light on. Finally, it 
showed 99% and "Saving index". I guess I did not see that in the other 
test because it was so fast.
The rest of the test had the same result, except that the search was a 
bit slower.

Analysis, Guess and Thoughts:
I think that the speed up was because of the BitSet changes that I did a 
bit ago. I think that the indexing makes JSword v0.9.7 look like it is 
hanging. My machine has 512M of ram and a fairly fast processor so it 
may be more intolerable on another machine.

I have a coding practice I call the "Principle of Least Surprise." I 
found it surprising that the verse I was viewing was replaced with the 
search result. I would have expected it to open another tab. In my 
opinion this would be more useful as each search would have its own tab.

The message that was presented when the work was being indexed said 
that the results would be more accurate (I forget the exact wording). I 
think that the message is a bit misleading. Could we put up a dialog box asking 
permission to index, stating that it may take a few minutes and then 
block until it is done (still showing the progress meter) and once done, 
the search would be performed?
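
The flow I have in mind could be sketched roughly as below. This is 
not JSword code: the class name, the method and its parameters are all 
hypothetical stand-ins, with the dialog, the indexer and the search 
passed in as callbacks so the order of operations is the only thing 
shown.

```java
import java.util.Optional;
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

final class IndexThenSearch {
    /**
     * Runs a search, first asking permission to build the index when
     * none exists. Returns empty if the user declines to index.
     */
    static <R> Optional<R> search(boolean indexExists,
                                  BooleanSupplier askPermission, // e.g. a yes/no dialog
                                  Runnable buildIndex,           // blocks until done
                                  Supplier<R> runSearch) {
        if (!indexExists) {
            if (!askPermission.getAsBoolean()) {
                return Optional.empty(); // user said no: do not search
            }
            buildIndex.run(); // progress meter would be driven from here
        }
        return Optional.ofNullable(runSearch.get());
    }

    public static void main(String[] args) {
        Optional<String> hit = search(false,
                () -> true,
                () -> System.out.println("indexing..."),
                () -> "Rev 21:20");
        System.out.println(hit.orElse("search cancelled"));
    }
}
```

The point is only that the search runs once, after the index is 
complete, rather than against a half-built index.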

The metering could be more accurate with each verse being a step along 
the way. Since the number of verses is known in advance the meter would 
be more useful.
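
As a rough sketch of what I mean by per-verse metering (the class is 
hypothetical, not something in JSword, and it ignores the separate 
"Saving index" phase):

```java
final class VerseProgress {
    private final int totalVerses;
    private int done;

    VerseProgress(int totalVerses) {
        if (totalVerses <= 0) {
            throw new IllegalArgumentException("totalVerses must be positive");
        }
        this.totalVerses = totalVerses;
    }

    /** Records one indexed verse and returns the percent complete. */
    int step() {
        if (done < totalVerses) {
            done++;
        }
        return percent();
    }

    /** Percent complete, rounded down, in the range 0..100. */
    int percent() {
        // long arithmetic avoids overflow for very large verse counts
        return (int) ((long) done * 100 / totalVerses);
    }

    public static void main(String[] args) {
        VerseProgress progress = new VerseProgress(4);
        for (int i = 0; i < 4; i++) {
            System.out.println(progress.step() + "%");
        }
    }
}
```

With the total known up front, the last verse lands on 100% instead of 
stalling at some arbitrary figure.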

With regard to the "in use" error that I had under WinXP, I am not sure 
what would be best. Under UNIX it is no problem, as a file can be 
deleted even while another process has it open. The directory entry in 
the file system is removed, but the file itself is not deleted until 
the reference count on it reaches zero. Windows, on the other hand, is 
much stricter in its handling of open files: a file generally cannot be 
deleted while it is in use. The question that comes to mind is: "Should 
the program hold open file handles for indexes?" If I am searching a 
bunch of Bibles I may hit a resource limit.
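
One possible answer is to open the index only for the duration of a 
search. The sketch below assumes a made-up one-line-per-entry text 
file rather than the real index format, and the class and method names 
are hypothetical. The handle is closed by try-with-resources before the 
method returns, so a later delete is not blocked by this code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

final class ShortLivedIndexHandle {
    /** Returns true if any line of the (hypothetical) index file contains the term. */
    static boolean contains(Path indexFile, String term) {
        // try-with-resources closes the reader on every exit path
        try (BufferedReader reader =
                 Files.newBufferedReader(indexFile, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.contains(term)) {
                    return true;
                }
            }
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes a throwaway index, searches it, then deletes it. */
    static boolean demo() {
        try {
            Path index = Files.createTempFile("sword-KJV", ".idx");
            Files.write(index, Arrays.asList("chrysoprasus Rev.21.20"),
                        StandardCharsets.UTF_8);
            boolean found = contains(index, "chrysoprasus");
            Files.delete(index); // no handle is held open at this point
            return found && !Files.exists(index);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("found and deleted: " + demo());
    }
}
```

The trade-off is that reopening the index for every search costs time, 
so this is a question of where to draw the line, not a clear win.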


Joe Walker wrote:
> Robert Berndt wrote:
> 
>> I'm having problems with indexing a few of the sword modules.
>> In particular, anything with Strong's numbers never completely 
>> finishes the indexing process.
>>
>> Has anyone else had this problem?
> 
> 
> I've not had any problems - I did download a new KJV but didn't 
> reproduce anything.
> Can you tell me what version of JSword you are using and the verse it 
> gets stuck on (from the status bar)


