[jsword-devel] MorphGreek Search
DM Smith
dmsmith555 at yahoo.com
Fri Feb 8 19:47:31 MST 2008
Steve,
Here is an outline of what needs to be done for the change:
1) Add a public static method to o.c.j.book.OSISUtil that given an
element will return a String representing the content that needs to be
indexed. See: getStrongsNumbers for an example.
2) In o.c.j.index.lucene.LuceneIndex add a condition to call your
method if the module has lemmas. This block will need to define a
field for the lemma, and I suggest "lemma" as a catch-all for all
lemmas other than Strong's numbers. I suggest using "morph" for the
morph field.
You will note that in OSIS the values of the lemma and morph
attributes are of the form "label:value". The labels in the case below
are lemma.Strong and robinson. But there is little consistency
regarding the labels across the modules, much less across OSIS
documents.
3) Determine which analyzer is appropriate. By default, if you do
nothing, SimpleAnalyzer will be used, which should index each "word".
To change it, take a look at o.c.j.index.lucene.LuceneAnalyzer for a
model of what to do. Basically, you'd create an analyzer which is
pretty much just a transformational filter. Then bind it to the field
in o.c.j.i.l.LuceneAnalyzer.
That's it.
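
As a rough sketch of step 1 and of the "label:value" handling mentioned
above (this is not the actual JSword code): the class and method names
here are hypothetical, and it uses the JDK's org.w3c.dom classes only so
that it stands alone, whereas the real method would take whatever
element type OSISUtil already uses. It also assumes that an attribute
may carry several space-separated entries and that the label is
everything before the first colon.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only; not part of JSword.
final class LemmaSketch {

    /** Strip an optional "label:" prefix, e.g. "robinson:N-DSF" becomes "N-DSF". */
    static String stripLabel(String value) {
        int colon = value.indexOf(':');
        return colon < 0 ? value : value.substring(colon + 1);
    }

    /**
     * Collect the values of one attribute ("lemma" or "morph") from every
     * <w> element below root, labels removed, separated by single spaces.
     * Note: only descendants of root are visited, not root itself.
     */
    static String getAttributeText(Element root, String attrName) {
        StringBuilder buf = new StringBuilder();
        NodeList words = root.getElementsByTagName("w");
        for (int i = 0; i < words.getLength(); i++) {
            Element w = (Element) words.item(i);
            String attr = w.getAttribute(attrName).trim();
            if (attr.isEmpty()) {
                continue; // this word has no such attribute
            }
            // Assumption: several "label:value" entries may be space separated.
            for (String entry : attr.split("\\s+")) {
                if (buf.length() > 0) {
                    buf.append(' ');
                }
                buf.append(stripLabel(entry));
            }
        }
        return buf.toString();
    }

    /** Parse a small XML fragment; used here only to try the sketch out. */
    static Element parse(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)))
                          .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }
}
```

Called with "morph" on a verse containing the <w> element from your
example, getAttributeText would return N-DSF, which is the kind of text
that would go into the "morph" field in step 2. Whether to strip the
label here or in the analyzer from step 3 is a design choice; this
sketch simply does it up front.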
If you want to make the lemma visible, that means some XSLT changes
in simple.xsl (which is no longer simple). It would also mean adding
Show/Hide Lemma to the View menu.
Hope this helps and looking forward to your contributions.
In Him,
DM
On Feb 2, 2008, at 9:02 AM, Steven Mullins wrote:
> DM,
>
> That's great news! I have been looking at http://emdros.org/ for
> linguistic analysis of the Greek New Testament. I think that Lucene
> might do all the analysis that I need though, without Emdros. Let me
> know when you have a version working and I'll give it a try. Maybe
> when the command line searches are working well, we can make a GUI
> search builder.
>
> Steve
>
> Steven Mullins wrote:
>> Greetings! I use Bible Desktop with the MorphGreek module.
>> I'd like to be able to search using the lexical and morphological
>> information in addition to the rendered word. For example, the
>> source for the second word in John 1:1, arche, is:
>>
>> <w lemma="lemma.Strong:ἀρχή" morph="robinson:N-DSF">ἀρχῇ</w>
>>
>> It would be great if I could search by the contents of the 'lemma'
>> tag, since it contains the lexical form of the word. It would also
>> be helpful to be able to search/filter by the 'morph' tag as well,
>> since I might only be looking for a word in a certain form. Maybe it
>> is possible to do this kind of search now, but I have no idea how.
>> I'd be willing to help program this functionality in, if it is not
>> there already.
>>
>
> Welcome Steve,
>
> This is a great idea! And it would be pretty easy to do.
>
> I'm in the middle of updating to Lucene 2.3 (it is about to be
> released) and there are some optimizations I am working on. If you
> give me a couple of days I should be done and will point you in the
> right direction.
>
> In His Service,
> DM
>
> _______________________________________________
> jsword-devel mailing list
> jsword-devel at crosswire.org
> http://www.crosswire.org/mailman/listinfo/jsword-devel