[sword-devel] [jsword-devel] Stem searching

Troy A. Griffitts scribe at crosswire.org
Wed Jul 11 07:24:50 MST 2012


Sorry for all the typos last night. Was running on coffee.

So, I spent a little time this afternoon adding in the index field 
discussed in this message.  It's committed in SVN.  You'll need to 
rebuild your indexes on a module to take advantage of it.  Nothing 
should be disturbed if you don't, but you won't have the new field 
available.

The field is called 'morph'.  You can see it in action on swordweb:

http://crosswire.org/study/parallelstudy.jsp

At the top, under 'Presets' click 'NT Scholar'

Click on any word in KJV, TR, or Tregelles.

You'll notice a few new options now.  You can search for the lemma with
any morphology, which is how things have worked for a long time.
Or you can now search for the lemma with the same morphology as the word
you clicked, which uses the new morph index field.

Once you've done a search, you can see the search string that was built 
and can play around with ? and * on the morph, or any of the other 
lucene functions you find here:

http://lucene.apache.org/core/3_6_0/queryparsersyntax.html

One caveat: '-' is common in Robinson's morph codes and needs to be
escaped with a backslash (e.g. N\-NSM), because it means 'without' in
Lucene search syntax.  Bummer.
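
A minimal sketch of that escaping, in Python purely for illustration (it
is not SWORD or JSword code).  The helper name escape_morph and the sample
codes are made up for this sketch; the list of special characters is taken
from the Lucene query parser syntax page linked above.  '?' and '*' are
deliberately left alone so they still work as wildcards:

```python
import re

# Characters with special meaning in Lucene's classic query syntax.
# '?' and '*' are intentionally omitted so they remain usable as wildcards.
_LUCENE_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~:\\&|])')


def escape_morph(code):
    """Backslash-escape Lucene syntax characters in a morph code.

    Hypothetical helper for illustration only.
    """
    return _LUCENE_SPECIAL.sub(r'\\\1', code)


if __name__ == "__main__":
    for code in ["N-NSM", "V-PAI-2?", "V-*"]:
        print(code, "->", escape_morph(code))
```

In Java, Lucene's own QueryParser.escape() does a similar job, but it
also escapes '?' and '*', so it is only suitable when no wildcards are
wanted in the morph.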

Have fun, let me know what you think.

Troy


On 07/11/2012 03:17 AM, Daniel Owens wrote:
> Please let me know if there is anything non- or semi-technical I can 
> do (testing, etc.) to help with this. I was surprised to learn that BW 
> (and Logos 4, I would add) uses such arcane syntax, but it really is 
> powerful once you learn it. I wonder how many users simply do not make 
> use of the powerful tools at their fingertips because they are 
> intimidated by the syntax. The only person I know who uses this to 
> great effect is a PHP programmer with a computer science degree...
>
> Daniel
>
> On 07/10/2012 05:09 PM, Troy A. Griffitts wrote:
>> Chris,
>>
>> We've toyed around with the best way to add lemma+morph searching in 
>> SWORD but haven't finalized anything yet.
>>
>> Indexing morphology codes on their own won't help.  That would give
>> you two separate fields, and the two need to be used together on the
>> same word.
>>
>> For example, if you wish to find λογος only in the nominative within 
>> 3 words of any present, active, indicative, 2nd person singular or 
>> plural verb, you could not satisfy your search.
>>
>> Believe it or not, end users of tools like BibleWorks seem quite 
>> happy to learn odd syntax like:
>>
>>
>> "λογος@* *@PAI2?"~3
>>
>>
>> Of course GUI tools to help build that syntax for them are also
>> desirable.
>>
>> This is the direction we're heading, but it would require the lemma
>> encoding to be changed from Strong's numbers to lexical form.
>>
>> Presently we could nearly obtain this by building an index as (from 
>> the start of John 1.1):
>>
>> G1722@PREP G746@N-DSF G2258@V-IXI-3S
>>
>> But this would require users to know Strong's numbers rather than 
>> lexical forms, so they would almost certainly need a GUI to help them 
>> build the search syntax.
>>
>> Hope this helps,
>>
>> Troy
>>
>>
>>
>>
>> On 07/10/2012 11:41 PM, Chris Burrell wrote:
>>> Hello
>>>
>>> Does anyone know of, or has anyone tried, some kind of stem search
>>> with JSword? Is it implemented? Or would we need to do a bit more
>>> work there?
>>>
>>> Chris
>>>
>>>
>>>
>>> _______________________________________________
>>> jsword-devel mailing list
>>> jsword-devel at crosswire.org
>>> http://www.crosswire.org/mailman/listinfo/jsword-devel
>>
>>
>>
>>
>> _______________________________________________
>> sword-devel mailing list: sword-devel at crosswire.org
>> http://www.crosswire.org/mailman/listinfo/sword-devel
>> Instructions to unsubscribe/change your settings at above page
>
>




