[sword-devel] Coming soon: new improved sword searching

Chris Little sword-devel@crosswire.org
Sun, 8 Sep 2002 13:32:12 -0700 (MST)


On Mon, 9 Sep 2002, Leon Brooks wrote:

> On Sun, 8 Sep 2002 13:42, Joel Mawhorter wrote:
> > If any of you can think of an example of something that you do
> > with the current regular expression searching that won't be possible with
> > what I described above, please let me know.
> 
> All verses containing two or more of God, Good or Greed: (g[ore]*d){2,}
> 
> It's true that most people can't be bothered learning regex, but I think 
> losing it would be a Bad Idea. If you're after speed improvements, do a 
> plaintext/indexed search up to the first special character before handing 
> over to regex proper. Be aware that regex libraries are sufficiently efficient 
> these days that you might not see any improvement, depending on how much 
> indexing you get to use (by way of reducing the search space) on the way in.
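
A minimal sketch of the prefilter idea quoted above, in Python rather than
SWORD's own C++. The names literal_prefix and search are hypothetical, a plain
substring test stands in for a real index lookup, and matching is assumed to be
case-sensitive. Patterns containing alternation get no prefix at all, which is
deliberately conservative.

```python
import re

# Characters treated as "special" for the purpose of finding the literal prefix.
SPECIAL = set(".^$*+?{}[]()|\\")


def literal_prefix(pattern: str) -> str:
    """Return leading ordinary characters that every match must contain."""
    if "|" in pattern:
        return ""  # alternation could bypass the prefix, so give up
    for i, ch in enumerate(pattern):
        if ch in SPECIAL:
            # "*", "?" and "{" can allow zero repetitions, which makes the
            # character just before them optional, so drop it as well.
            if ch in "*?{" and i > 0:
                return pattern[:i - 1]
            return pattern[:i]
    return pattern  # no special characters: the whole pattern is literal


def search(verses: dict, pattern: str) -> list:
    """Narrow by plain-text prefix first, then run the regex on survivors."""
    prefix = literal_prefix(pattern)
    regex = re.compile(pattern)
    # Stand-in for the indexed search: keep verses containing the prefix.
    candidates = [ref for ref, text in verses.items() if prefix in text]
    return [ref for ref in candidates if regex.search(verses[ref])]
```

Note that a pattern like g[ore]*d yields only the one-letter prefix "g", so the
prefilter removes very little of the search space, which is the caveat raised
in the quoted message.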

Good points.  It's even more important in some languages that have unusual
inflectional morphology, like infixation (think of marking -ed for past
tense in the middle of the word) and suprafixation (e.g. tone differences
marking tense or person).  You probably COULD do these with &'s and |'s,
but it would be more intuitive and simpler to construct a regex (at least
for anyone who has ever seen a regex).
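
For illustration, here is a rough Python sketch of Leon's "two or more of God,
Good or Greed" query. One caveat: as literally written, (g[ore]*d){2,} only
matches when the words are directly adjacent (e.g. "godgood"), so the sketch
also counts whole-word hits of g[ore]*d instead. The word boundaries, the
case-insensitive flag, and the function names are assumptions about the intent,
not anything SWORD provides.

```python
import re

# The pattern exactly as quoted, case-insensitive (an assumption).
ORIGINAL = re.compile(r"(g[ore]*d){2,}", re.IGNORECASE)

# Assumed variant: one whole word at a time, so hits can be counted.
WORD = re.compile(r"\bg[ore]*d\b", re.IGNORECASE)


def count_hits(verse: str) -> int:
    """Count whole words in the verse that match g[ore]*d."""
    return len(WORD.findall(verse))


def has_two_or_more(verse: str) -> bool:
    """True when the verse contains at least two matching words."""
    return count_hits(verse) >= 2
```

The &/| equivalent would need every pairing spelled out, (God & Good) |
(God & Greed) | (Good & Greed), and still would not catch the same word
appearing twice.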

The REAL reason to keep it is geek appeal.  What kind of free 
software project would we be if we didn't support regex?  :)  And isn't 
there some unwritten rule about requiring Linux programs to use regex?

--Chris