[sword-devel] search issue in 2TGreek

Karl Kleinpaste karl at kleinpaste.org
Mon Apr 14 14:59:55 EDT 2025


On 3/29/25 13:30, Fred wrote:
> So I fired up Xiphos, installed the 3 greek modules from Crosswire and 
> did some searches.
> Oddly enough, doing a search for that word, using "exact phrase" says 
> it can be found in Matt 8:17, Mark 13:35, II Cor 6:18, II Tim 1:5, 
> Titus 3:9 and 1 Peter 4:19.
>
> doing the same search in the tischmorph module turns up only the ones 
> in Rev and II Cor.
>
> So, looked up the matthew reference and find that word isn't there!
> ...
> Two oddities here, anybody got any clues?

Sorry for being tardy about this, another instance of "I marked this for 
later, then didn't notice when 'later' came and went..."

The short, useless answer is "I don't have any good explanation."

TischMorph:
When I search lemma:G3841, I get 10 verses:
2Cor 6:18; Rev 1:8; 4:8; 11:17; 15:3; 16:7; 16:14; 19:6; 19:15; 21:22
Of those, Rev 16:14 and 19:15 are included because they are tagged with 
the same Strong's number but appear in the form παντοκρατορος.

When I search παντοκρατωρ, I get:
lucene: nothing?
phrase, regex: The usual 8 verses (above, excluding the 2 alternates).
Certainly, I expect lucene search should cough up the same 8.
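
For anyone who wants to check these result sets mechanically rather than 
by eye, here is a minimal Python sketch. It expands the compact reference 
list above, derives the expected 8 verses, and diffs them against whatever 
an engine returned. The helper names (expand_refs, compare) are made up 
for illustration and are not part of SWORD, Xiphos, or diatheke.

```python
def expand_refs(compact):
    """Expand 'Book c:v; c:v; Book c:v' into full 'Book c:v' strings."""
    refs = []
    book = None
    for item in compact.split(";"):
        parts = item.split()
        if len(parts) == 2:      # e.g. "Rev 1:8": new book plus chapter:verse
            book, cv = parts
        elif len(parts) == 1:    # e.g. "4:8": reuse the most recent book
            cv = parts[0]
        else:
            continue             # skip empty or unexpected fragments
        refs.append(f"{book} {cv}")
    return refs


def compare(expected, actual):
    """Return (missing, extra) between the expected and actual hit lists."""
    missing = [r for r in expected if r not in actual]
    extra = [r for r in actual if r not in expected]
    return missing, extra


# The 10 verses from the lemma:G3841 search quoted above.
lemma_hits = expand_refs(
    "2Cor 6:18; Rev 1:8; 4:8; 11:17; 15:3; 16:7; 16:14; 19:6; 19:15; 21:22"
)
# The two verses that use the alternate form παντοκρατορος.
alternates = {"Rev 16:14", "Rev 19:15"}
expected = [r for r in lemma_hits if r not in alternates]

# A lucene search that returns nothing is missing all 8 expected verses.
missing, extra = compare(expected, [])
print(len(lemma_hits), len(expected), len(missing), len(extra))
```

Feeding the actual hit list from each search type into compare() makes it 
easy to see which verses a given engine drops or adds.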

When I adjust the search term to the more general παντοκρ*, then lucene 
search gives me what I expect. ???

I have no idea how you're getting Matt/Mark/2Tim/Tit/1Pet references. I 
don't see that.

However, when I use diatheke:
    diatheke -b TischMorph -s lucene -k παντοκρατωρ \
      | sed -e 's/8R/8 ; R/' -e 's/II/; II/' | semis | sort
("sed" is to fix some rough output from diatheke)
then I get the usual 8 verses.

In 2TGreek:
lucene: nothing?
regex: The usual 8.
phrase: The usual 8 plus Matt 8:17; Mark 13:35; 2Tim 1:5; Tit 3:9?!? But 
no 1Pet reference.
This is freakish.

Again, when I adjust the search to use παντοκρ*, I get a proper set, 
including the extras from παντοκρατορος (10 verses).
How does the engine distinguish these two cases?

I cannot begin to explain any of this. Surely, lucene search should be 
returning the proper set on a no-wildcard single word. This suggests to 
me that there is something funny about how the lucene index is being 
handled.
"Exact phrase" adding verses that are manifestly not to be included is a 
different sort of problem.

The bottom line for Xiphos is that it turns off Greek Accents and uses 
StripText on search terms (for consistency of "no accents that would 
create false differences"), then executes the search. What Xiphos 
displays is what the engine sends back. Xiphos has no more control than 
that.
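
To make the "no accents" idea concrete, here is a rough Python sketch of 
accent stripping by Unicode decomposition. This is only an illustration of 
the general technique, not the code SWORD or Xiphos actually runs; the 
function name strip_greek_accents is invented for the example.

```python
import unicodedata


def strip_greek_accents(text):
    """Remove combining marks (accents, breathings) from Greek text."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop every character with a non-zero combining class.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Recompose so the result is in the usual NFC form.
    return unicodedata.normalize("NFC", stripped)


print(strip_greek_accents("παντοκράτωρ"))  # prints παντοκρατωρ
```

With both the search term and the text reduced this way, an accented and 
an unaccented spelling of the same word compare equal.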

I'm thoroughly mystified by all of this, especially since I can see that 
I (being in Xiphos' code) have nothing additional I can do about how the 
search behaves. The code just prepares the search terms and hands them 
off for execution.

I rebuilt lucene indices to ensure there was nothing funny from a 
possibly old index generation.

How can παντοκρατωρ and παντοκρ* produce different lucene results?