[jsword-devel] Dictionary problems (and solutions, maybe...)

Brian Fernandes infernalproteus at gmail.com
Sat Jan 31 16:30:50 MST 2009


Today I looked into a couple of issues I've noticed with dictionaries:

1) Linked entries

Load the Chinese/English Dictionary (ZhHanzi) and scroll down to the 
penultimate entry. If you select this you will see a "reading error" 
dialog box. There are several other entries in this dictionary which 
exhibit the same problem (like the entry just before this one).

These turned out to be "linked" entries and JSword was not properly 
determining the target of the link.

Here is some code from the DataEntry class:

    public String getLinkTarget() {
        // 6 represents the length of "@LINK" + 1 to skip the last separator.
        return SwordUtil.decode(name, data, getKeyEnd() + 6,
                data.length - (getLinkEnd() + 1), charset).trim();
    }

The length calculation (the 4th parameter) does not make much sense: it 
yields the number of bytes remaining after the link end, whereas it 
should cover the span from the start to the end of the link target.

I made the following change to fix this, and linked entries now work. 
I'm not familiar with linked entries (or any entries for that matter), 
so please review:

    public String getLinkTarget()
    {
        // 6 represents the length of "@LINK" + 1 to skip the last separator.
        int linkStart = getKeyEnd() + 6;
        int len = getLinkEnd() - linkStart + 1;
        return SwordUtil.decode(name, data, linkStart, len, charset).trim();
    }
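
To make the arithmetic concrete, here is a minimal standalone sketch of 
the old and new length calculations. The entry layout (key, newline, 
"@LINK", space, target, newline), the sample bytes, and every helper 
name below are assumptions for illustration only; this is not the 
actual DataEntry or SwordUtil code.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only: layout and helper names are assumed, not JSword's.
class LinkTargetSketch {
    // Index of the first occurrence of b at or after 'from', or -1.
    static int indexOf(byte[] data, byte b, int from) {
        for (int i = from; i < data.length; i++) {
            if (data[i] == b) {
                return i;
            }
        }
        return -1;
    }

    // Assumed: the key ends at the first newline.
    static int keyEnd(byte[] data) {
        return indexOf(data, (byte) '\n', 0);
    }

    // Assumed: the link target ends at the next newline after the key.
    static int linkEnd(byte[] data) {
        return indexOf(data, (byte) '\n', keyEnd(data) + 1);
    }

    static String decode(byte[] data, int offset, int len) {
        return new String(data, offset, len, StandardCharsets.UTF_8);
    }

    // Old calculation: the length is the byte count *after* the link end.
    static String oldLinkTarget(byte[] data) {
        int len = data.length - (linkEnd(data) + 1);
        return decode(data, keyEnd(data) + 6, len).trim();
    }

    // New calculation: the length spans from linkStart to linkEnd inclusive.
    static String newLinkTarget(byte[] data) {
        int linkStart = keyEnd(data) + 6;
        int len = linkEnd(data) - linkStart + 1;
        return decode(data, linkStart, len).trim();
    }

    public static void main(String[] args) {
        byte[] data = "ALEPH\n@LINK BET\n".getBytes(StandardCharsets.UTF_8);
        System.out.println("old: [" + oldLinkTarget(data) + "]");
        System.out.println("new: [" + newLinkTarget(data) + "]");
    }
}
```

With this sample buffer nothing follows the link end, so the old 
calculation decodes zero bytes and returns an empty string, while the 
new one returns "BET". In a real data file the old length would depend 
on whatever happens to trail the entry.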


2) Out-of-order title entries

Load the "Hebrew to Greek Dictionary of Septuagint Words" 
(HebrewGreek) and select the 2nd entry, "00001". You get an "error 
reading 00001" message. Spot checking indicates that other entries work 
fine.

The problem here is that the binary search used in 
RawLDBackend#search assumes that the entries are in ascending 
alphabetical order. While most of the list is alphabetical, the first 
entry (the title) is out of order. This throws the binary search off in 
*specific* situations.

a) This happens only for entry 00001 - the 2nd entry.
b) This does not happen for the 2nd entry of every dictionary, even 
among those containing out-of-order titles (and many dictionaries have 
no title entry). It depends on the number of entries in the dictionary 
and thus on the points at which the list is halved in each iteration of 
the search. I have about 5 dictionaries with titles, and only 
HebrewGreek exhibits this problem.

A proposed fix is to omit the first entry from the search, whether it is 
a title or not. In RawLDBackend, line #345, we change "low" from -1 to 
0. This fixes our problem with 00001, but will also prevent the binary 
search from finding a match for the first entry in any dictionary.

However, the conditional following the search (line #378) explicitly 
looks for matches against the first entry (written specifically to 
handle out of order titles), and will return 0 as expected, whether the 
first entry is an out of order title, or not.
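
To show why the failure depends on the entry count, and why starting 
"low" at 0 avoids it, here is a small standalone sketch. It is not the 
RawLDBackend code: the loop shape, the method name, and the sample keys 
are assumptions, modelled only on the description above (a binary 
search followed by an explicit check of the first entry).

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: not the actual RawLDBackend#search implementation.
class TitleSearchSketch {
    // Binary search over keys with exclusive bounds (low, high), followed by
    // an explicit check of entry 0. Returns the index found, or -1.
    static int search(List<String> keys, String target, int initialLow) {
        int low = initialLow;
        int high = keys.size();
        while (high - low > 1) {
            int mid = (low + high) >>> 1;
            int cmp = keys.get(mid).compareTo(target);
            if (cmp < 0) {
                low = mid;
            } else if (cmp > 0) {
                high = mid;
            } else {
                return mid;
            }
        }
        // Explicit check for the (possibly out-of-order) first entry.
        if (!keys.isEmpty() && keys.get(0).equals(target)) {
            return 0;
        }
        return -1;
    }

    public static void main(String[] args) {
        // "Title" sorts after the numeric keys, so entry 0 is out of order.
        List<String> five = Arrays.asList("Title", "00001", "00002", "00003", "00004");
        List<String> four = Arrays.asList("Title", "00001", "00002", "00003");

        System.out.println(search(five, "00001", -1)); // probes index 2, then index 0
        System.out.println(search(four, "00001", -1)); // probes index 1 first
        System.out.println(search(five, "00001", 0));  // index 0 is never probed
        System.out.println(search(five, "Title", 0));  // found by the explicit check
    }
}
```

With five entries and low = -1, the search probes index 2, then index 
0; the title compares greater than "00001", so the range collapses and 
the lookup fails. With four entries the first probe lands on index 1 
and the lookup succeeds, which matches the observation that only some 
dictionaries are affected. With low = 0 the title is never probed, and 
the trailing check still returns 0 for the title itself.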

Thoughts?

Brian.


