[jsword-devel] Dictionary problems (and solutions, maybe...)
Brian Fernandes
infernalproteus at gmail.com
Sat Jan 31 16:30:50 MST 2009
Today I looked into a couple of issues I've noticed with dictionaries:
1) Linked entries
Load the Chinese/English Dictionary (ZhHanzi) and scroll down to the
penultimate entry. If you select this you will see a "reading error"
dialog box. There are several other entries in this dictionary which
exhibit the same problem (like the entry just before this one).
These turned out to be "linked" entries and JSword was not properly
determining the target of the link.
Here is some code from the DataEntry class:
public String getLinkTarget() {
    // 6 represents the length of "@LINK" + 1 to skip the last separator.
    return SwordUtil.decode(name, data, getKeyEnd() + 6,
            data.length - (getLinkEnd() + 1), charset).trim();
}
The length calculation (4th parameter) does not make sense: it yields the
number of bytes that follow the link, whereas the decode should cover the
span from the start to the end of the link target.
I made the following change to fix this, and linked entries now work. I'm
not familiar with linked entries (or any entries for that matter), so
please review:
public String getLinkTarget()
{
    // 6 represents the length of "@LINK" + 1 to skip the last separator.
    int linkStart = getKeyEnd() + 6;
    int len = getLinkEnd() - linkStart + 1;
    return SwordUtil.decode(name, data, linkStart, len, charset).trim();
}
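To make the offset arithmetic concrete, here is a minimal standalone sketch. It does not use JSword: the class LinkTargetSketch, its methods, and the toy byte layout (key, newline, "@LINK", space, target, newline) are assumptions made for illustration only, and keyEnd / linkEnd are passed in by hand rather than computed by getKeyEnd() / getLinkEnd().

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the DataEntry logic; not JSword code.
final class LinkTargetSketch {

    // Length as computed before the change: bytes remaining after linkEnd.
    static int oldLength(byte[] data, int linkEnd) {
        return data.length - (linkEnd + 1);
    }

    // Length as computed after the change: bytes from linkStart to linkEnd inclusive.
    static int newLength(int linkStart, int linkEnd) {
        return linkEnd - linkStart + 1;
    }

    // Decodes the link target using the corrected length.
    // keyEnd is assumed to be the index of the separator after the key,
    // linkEnd the index of the byte that ends the link line.
    static String linkTarget(byte[] data, int keyEnd, int linkEnd) {
        int linkStart = keyEnd + 6; // length of "@LINK" + 1
        int len = newLength(linkStart, linkEnd);
        return new String(data, linkStart, len, StandardCharsets.UTF_8).trim();
    }

    public static void main(String[] args) {
        // Toy entry: key "ALPHA" linking to "BETA".
        byte[] data = "ALPHA\n@LINK BETA\n".getBytes(StandardCharsets.UTF_8);
        int keyEnd = 5;   // index of the first '\n'
        int linkEnd = 16; // index of the final '\n'
        System.out.println("old length: " + oldLength(data, linkEnd));
        System.out.println("new target: " + linkTarget(data, keyEnd, linkEnd));
    }
}
```

In this toy layout the old expression produces a length of 0 because nothing follows the link, while the corrected expression covers the target itself. The real failure mode in JSword may differ depending on what follows the link in the data.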
2) Load the "Hebrew to Greek Dictionary of Septuagint Words"
(HebrewGreek). Select the 2nd entry, "00001", and you get an "error reading
00001" message. Spot checking indicates that other entries work fine.
The problem here is the fact that the binary search used in
RawLDBackend#search assumes that the entries are in alphabetical /
ascending order. While most of the list is alphabetical, the first entry
(the title) is out of order. This causes the binary search to be thrown
out of gear in *specific* situations.
a) This happens only for entry 00001 - the 2nd entry.
b) This will not happen for the 2nd entry of all dictionaries, even
those containing out of order titles (and many dictionaries have no
title entry). It depends on the number of entries in the dictionary and
thus the points at which the list is halved for each iteration in the
search. I have about 5 dictionaries with titles, but only HebrewGreek
exhibits this problem.
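The size dependence can be reproduced with a textbook binary search. The sketch below is not the RawLDBackend#search code (which is not shown here); the class OutOfOrderTitleSketch, the key "Title", and the five-digit keys are assumptions chosen so that the first entry sorts after everything else.

```java
import java.util.Locale;

// Hypothetical illustration; not JSword code.
final class OutOfOrderTitleSketch {

    // Plain binary search with inclusive bounds; returns the index or -1.
    static int binarySearch(String[] keys, String target) {
        int low = 0;
        int high = keys.length - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;
            int cmp = keys[mid].compareTo(target);
            if (cmp < 0) {
                low = mid + 1;
            } else if (cmp > 0) {
                high = mid - 1;
            } else {
                return mid;
            }
        }
        return -1;
    }

    // Builds an out-of-order title at index 0 followed by sorted keys 00001, 00002, ...
    static String[] keysWithTitle(int size) {
        String[] keys = new String[size];
        keys[0] = "Title";
        for (int i = 1; i < size; i++) {
            keys[i] = String.format(Locale.ROOT, "%05d", i);
        }
        return keys;
    }

    public static void main(String[] args) {
        // Same target, different list sizes: the result depends on where the halving lands.
        for (int size = 3; size <= 11; size++) {
            int found = binarySearch(keysWithTitle(size), "00001");
            System.out.println("size " + size + " -> index " + found);
        }
    }
}
```

With 3 entries the first midpoint is index 1 and the key is found. With 5 entries the search narrows to indices 0..1, probes the title at index 0, decides the target must lie to its left, and gives up. Only the second entry is at risk, because it is the only one whose search can end by probing index 0 first.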
A proposed fix is to omit the first entry from the search, whether it is
a title or not. In RawLDBackend, line #345, we change "low" from -1 to
0. This fixes our problem with 00001, but will also prevent the binary
search from finding a match for the first entry in any dictionary.
However, the conditional following the search (line #378) explicitly
looks for matches against the first entry (written specifically to
handle out of order titles), and will return 0 as expected, whether the
first entry is an out of order title, or not.
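A rough sketch of the proposed approach follows. It is not the RawLDBackend code: the class SkipFirstEntrySketch is hypothetical, and it uses inclusive bounds, so the search starts at index 1. That corresponds to changing "low" from -1 to 0 only under the assumption that the original loop treats "low" as an exclusive bound.

```java
// Hypothetical illustration of "skip the first entry, then check it explicitly".
final class SkipFirstEntrySketch {

    static int search(String[] keys, String target) {
        // Binary search over indices 1..n-1 only, which are assumed to be sorted.
        int low = 1;
        int high = keys.length - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;
            int cmp = keys[mid].compareTo(target);
            if (cmp < 0) {
                low = mid + 1;
            } else if (cmp > 0) {
                high = mid - 1;
            } else {
                return mid;
            }
        }
        // Explicit check of the first entry, whether or not it is an out-of-order title.
        if (keys.length > 0 && keys[0].equals(target)) {
            return 0;
        }
        return -1;
    }

    public static void main(String[] args) {
        String[] withTitle = {"Title", "00001", "00002", "00003", "00004"};
        String[] noTitle = {"00001", "00002", "00003"};
        System.out.println(search(withTitle, "00001"));
        System.out.println(search(withTitle, "Title"));
        System.out.println(search(noTitle, "00001"));
    }
}
```

Because index 0 is never probed inside the loop, an out-of-order title can no longer mislead the search, and the trailing equality check still returns 0 for dictionaries whose first entry is an ordinary, in-order key.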
Thoughts?
Brian.