[bt-devel] Fwd: Re: clucene crash when searching

Martin Gruner mg.pub at gmx.net
Wed Nov 26 12:20:26 MST 2008


Hi Eeli

a summary note: If you are happy with an additional pane offering Sword's
unindexed regexp search in addition to what we have with clucene, marked
as ADVANCED search, that's ok with me. And if you do it fast, it can be in 1.7.
But the main focus should be on improving our existing clucene-based solution,
as long as we do not favor a _replacement_ with another search engine. Ok?

mg

On Wednesday 19 November 2008 21:17:41 Eeli Kaikkonen wrote:
> On Tue, 18 Nov 2008, Martin Gruner wrote:
> > I do not see that we would gain much by adding support of Sword's
> > non-indexed search engines, except for the ability to search for phrases.
> >
> > Searching in BT should be simple and consistent. That means that we
> > should not, in my opinion, offer different search syntaxes to the user.
> > Maybe one exception: a regexp-based search for power users, but all
> > "normal" users should have one single search to work with (from a user's
> > point of view).
>
> Actually that is almost the same thing I was pointing at. If we add a
> tab for regexp search it wouldn't add any complexity for users - they
> can always choose not to open the tab. The current search launchers
> would open the current search UI, only the tab text would be different.
> Regexp search would give all possible power to power users, including
> phrase search.
>
> > My suggestion would be to talk about the search engine we do use,
> > clucene. I just checked - they released a bugfix 0.9.21 version recently,
> > and 0.9.23, which is a beta-quality preview release of their next
> > development branch, which is supposed to improve Lucene
> > compatibility/feature coverage. Ben also told me that he was going to
> > implement the wildcard operator at the beginning of words (like
> > "*minded").
> > But nobody can say how long this will take. So we may want to use another
> > open source search engine which suits our needs better.
>
> Good to hear clucene is going forward. I'm pessimistic about finding a
> better alternative, but it's always good to look around. I'm even more
> pessimistic about the idea of using something other than what we and Sword
> already use. Even though we don't depend on Sword for this, we may still
> benefit each other (and other frontends) by helping (c)lucene.
>
> A technical note about wildcard search: allowing a prepended wildcard may
> lead to very slow performance. It might be as slow as going through the
> whole module, because every word in the index must be tested. Creating
> another index would help, but I don't know if it's realistic. Even now,
> searching for very common words (e.g. "and", "an") may take more than 5
> seconds on a slow machine. Therefore a threaded search may be
> necessary later even if we stick with indexed search only. (On the
> other hand, the slowness I noticed may come from the graphical
> UI, not from the engine - this should be researched further.)
>
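
On the "another index" idea: a minimal Python sketch, assuming the second
index simply stores every term reversed, so that a query like "*minded"
turns into a prefix lookup on the reversed terms instead of a test of every
word. This is only an illustration of the idea, not how clucene works, and
all function names here are made up.

```python
import bisect


def build_reversed_index(terms):
    """Return a sorted list of reversed terms (the hypothetical second index)."""
    return sorted(term[::-1] for term in set(terms))


def leading_wildcard_scan(terms, suffix):
    """Baseline: test every term against the suffix, as described above."""
    return sorted(t for t in set(terms) if t.endswith(suffix))


def leading_wildcard_lookup(reversed_index, suffix):
    """Answer "*suffix" as a prefix range query on the reversed terms."""
    prefix = suffix[::-1]
    # Binary search for the first reversed term that could start with prefix.
    start = bisect.bisect_left(reversed_index, prefix)
    matches = []
    for rev in reversed_index[start:]:
        if not rev.startswith(prefix):
            break  # sorted order: no later entry can match
        matches.append(rev[::-1])
    return sorted(matches)


if __name__ == "__main__":
    words = ["minded", "reminded", "mind", "likeminded", "and", "an"]
    index = build_reversed_index(words)
    print(leading_wildcard_lookup(index, "minded"))
```

The lookup only touches the matching terms plus one binary search, at the
cost of building and storing a second term list for each module.
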
> > We could start a wiki page listing the specific problems that we see with
> > clucene, and investigate if they can be solved. At the same time we can
> > collect information about other search engines in a matrix of
> > features/properties that we do need. Maybe we come up with something
> > better, more stable and feature-rich than clucene?
>
> I'm extremely pessimistic about this (creating a new engine). It's not
> impossible, but given the people and time we have, it's better to leave it
> to others who have already done it. This may of course change if we gain
> more interest and developers from the Windows community later. But even then
> it may be better to help (c)lucene.
>
> > A major problem that I see: What about our release roadmap? We should not
> > start changing the search engine in the 1.7.x branch/release cycle. I'm
> > unhappy with the status quo, we cannot stay in beta state for a long time
> > and continue changing the internals of our software. We should release
> > FIRST, and THEN start making major changes.
>
> I wasn't and am not going to start anything ATM. A wiki page might
> really be a good idea. Time spent thinking about this is not wasted,
> even if nothing ever comes of it. A wiki page could help the Sword
> engine, too.
>
>   Yours,
> 	Eeli Kaikkonen (Mr.), Oulu, Finland
> 	e-mail: eekaikko at mailx.studentx.oulux.fix (with no x)
>
> _______________________________________________
> bt-devel mailing list
> bt-devel at crosswire.org
> http://www.crosswire.org/mailman/listinfo/bt-devel



