[sword-devel] Question for Optimization / speed experts
Christian Renz
sword-devel@crosswire.org
Mon, 6 Jan 2003 19:18:24 -0300
>BibleTime supports displaying different lexicons side by side, it offers only
>the keys which are available in all of the chosen lexicons.
Some ideas for this:
- You don't have to build the whole list at once. Build e.g. a list of
20 or 30 items, then calculate more as the user scrolls up/down. You
might even calculate the list using an idle thread.
- Personally, I would find a list of *all* keys more desirable (that
is, list any key that is found in any of the chosen lexicons). I
took a look at Bibleworks when visiting my brother (he is teaching
OT) and noticed it has a feature where instead of having a window
per dictionary, you have one window displaying the definitions in a
list of selected dictionaries, like this:
x Dictionaries -----------*
| Yaddayadda |
| ... |
| |
| BDB |
| ... |
| |
| Websters |
| ... |
| |
* ------------------------*
It is a nice feature I would like to see in sword as well.
- Your algorithm is O(m * n), which explains the runtime. The
algorithm proposed by Glen looks good, especially as it runs in
linear time (ie, looks at each entry exactly once). However, it can
still be optimized: If an element is > the currently active item,
you use that one to continue your search. Also, you can step the
marker more often. I implemented the algorithm in perl (just for
fun), but tried to not use perl-specific optimizations. I put some
code into subroutines so the main algorithm reads better.
--------------- 8< --------------- 8< ---------------
# Calculate intersection of sets. 20030106 crenz@web42.com
use strict;
use warnings;
my @sets =
([ 0, 1, 2, 3, 7, 8, 9, 10, 11, 12, 13, 14, 15, ],
[ 0, 3, 7, 8, 10, 11, 13, 14, 16, ],
[ 0, 1, 2, 3, 5, 7, 8, 10, 11, 12, 13, 14, 16, ],
[ 2, 3, 4, 6, 7, 8, 10, 11, 12, 14, 16, ],
[ 0, 1, 2, 3, 7, 8, 9, 10, 11, 13, 14, 15, ]);
my $numSets = scalar @sets; # $numsets = 5;
my @intersection = ();
my $currentSet = 0;
my @marker = (0, 0, 0, 0, 0);
my $currentItem;
# helper functions
sub stepMarker {
$marker[$currentSet]++;
if ($marker[$currentSet] >= scalar @{$sets[$currentSet]}) {
# we are at the end of the current set
# => abort, there cannot be any more common items
last MAIN;
}
}
sub nextSet {
$currentSet++;
$currentSet %= $numSets;
}
sub getCurrentItem {
return $sets[$currentSet]->[$marker[$currentSet]];
}
# main intersection algorithm
my $activeItem = getCurrentItem;
my $isPresent = 1;
MAIN: while (1) {
stepMarker();
nextSet;
while (($currentItem = getCurrentItem()) < $activeItem) {
stepMarker();
}
if ($currentItem > $activeItem) {
$activeItem = $currentItem;
$isPresent = 1;
} else {
$isPresent++;
if ($isPresent == $numSets) {
# item has been found present in all sets
push @intersection, $activeItem;
stepMarker();
$activeItem = getCurrentItem;
$isPresent = 1;
}
}
}
print "Common items in the $numSets sets: @intersection\n";
--------------- 8< --------------- 8< ---------------
If you want to combine all items present in any list, you could use
Mergesort. It's only O(n * log n) for two lists.
Greetings,
Christian
--
crenz@web42.com - http://www.web42.com/crenz/ - http://www.web42.com/
"Few are those who can see with their own eyes and hear with their own
hearts." -- Albert Einstein