[sword-devel] Sword phrase search not returning all expected results

David Haslam dfhdfh at protonmail.com
Sat Mar 1 04:31:41 EST 2025


Hi Tobias,

My search results were with xiphos.exe 4.2.1 (gtk2 webkit1)

There are 14 verses in which the whole word 'faith' occurs more than once:

> Acts 3:16: And his name through faith in his name hath made this man strong, whom ye see and know: yea, the faith which is by him hath given him this perfect soundness in the presence of you all.
> Romans 1:17: For therein is the righteousness of God revealed from faith to faith: as it is written, The just shall live by faith.
> Romans 3:30: Seeing it is one God, which shall justify the circumcision by faith, and uncircumcision through faith.
> Romans 4:16: Therefore it is of faith, that it might be by grace; to the end the promise might be sure to all the seed; not to that only which is of the law, but to that also which is of the faith of Abraham; who is the father of us all,
> Romans 14:23: And he that doubteth is damned if he eat, because he eateth not of faith: for whatsoever is not of faith is sin.
> II Corinthians 1:24: Not for that we have dominion over your faith, but are helpers of your joy: for by faith ye stand.
> Galatians 2:16: Knowing that a man is not justified by the works of the law, but by the faith of Jesus Christ, even we have believed in Jesus Christ, that we might be justified by the faith of Christ, and not by the works of the law: for by the works of the law shall no flesh be justified.
> Galatians 3:23: But before faith came, we were kept under the law, shut up unto the faith which should afterwards be revealed.
> Philippians 3:9: And be found in him, not having mine own righteousness, which is of the law, but that which is through the faith of Christ, the righteousness which is of God by faith:
> I Timothy 1:19: Holding faith, and a good conscience; which some having put away concerning faith have made shipwreck:
> Hebrews 11:7: By faith Noah, being warned of God of things not seen as yet, moved with fear, prepared an ark to the saving of his house; by the which he condemned the world, and became heir of the righteousness which is by faith.
> James 2:14: What doth it profit, my brethren, though a man say he hath faith, and have not works? can faith save him?
> James 2:18: Yea, a man may say, Thou hast faith, and I have works: shew me thy faith without thy works, and I will shew thee my faith by my works.
> James 2:22: Seest thou how faith wrought with his works, and by works was faith made perfect?

Best regards,

David

Sent with [Proton Mail](https://proton.me/mail/home) secure email.

On Saturday, March 1st, 2025 at 9:09 AM, Tobias Klein <contact at tklein.info> wrote:

> Hi David,
>
> When I perform a Lucene search for "faith" in the KJV with Xiphos 4.2.1 on Linux, I get 341 results.
> When I perform an exact phrase search in the same environment, I get 338 results (exactly as in Ezra after the bugfix).
>
> Best regards,
> Tobias
>
> On 3/1/25 09:45, David Haslam wrote:
>
>> Hi Tobias,
>>
>> A Lucene search for 'faith' in the KJV module using Xiphos returns 231 locations.
>> Aside: Only 2 of these locations are in the OT !!!
>>
>> I'm sure there are verses that have the word repeated, as the whole word occurs 247 times.
>> (Search results from the plain text file output by diatheke)
>>
>> If one drops the whole word criterion, the total leaps to 362, as words such as 'faithful', 'faithfulness', 'faithless', etc., are then included.
>>
>> Best regards,
>>
>> David
>>
>> On Saturday, March 1st, 2025 at 7:09 AM, Tobias Klein <contact at tklein.info> wrote:
>>
>>> Hi Troy,
>>>
>>> can this be fixed in SWORD?
>>>
>>> This bug impacts the search function quite significantly. I noticed it when my standard test scenario for search started to fail after my adjustments.
>>> The reason was that the number of search results for my test scenario increased significantly, so I had to adjust the expected results.
>>> The test scenario searches for "faith" in the KJV. Previously (before the bugfix) I expected 324 search results.
>>> After the bugfix/change mentioned below there are now 338 search results. So 14 verses were being missed by the search function because of this bug.
>>>
>>> Best regards,
>>> Tobias
>>>
>>> On 2/23/25 18:38, David Haslam wrote:
>>>
>>>> Excellent sleuthing, Tobias !
>>>>
>>>> Best regards,
>>>>
>>>> David
>>>>
>>>> On Sunday, February 23rd, 2025 at 5:17 PM, Tobias Klein <contact at tklein.info> wrote:
>>>>
>>>>> Hi Troy,
>>>>>
>>>>> I have discovered the root cause of this bug.
>>>>>
>>>>> There is the following code in osisplain.cpp.
>>>>> I suspect the uppercasing here has a negative impact on the overall parsing while stripText() is running?
>>>>>
>>>>> else if (!strncmp(token, "/divineName", 11)) {
>>>>>     // Get the end portion of the string, and upper case it
>>>>>     char *end = buf.getRawData();
>>>>>     end += buf.size() - u->lastTextNode.size();
>>>>>     toupperstr(end);
>>>>> }
>>>>>
>>>>> When I comment this portion out, the search bug no longer occurs and I get the correct result; see below.
>>>>>
>>>>> textBuf: For he said, Because the Lord hath sworn that the Lord will have war with Amalek from generation to generation.
>>>>> term: generation to generation
>>>>> Got 11 results!
>>>>> Exod 17:16
>>>>> Isa 13:20
>>>>> Isa 34:10
>>>>> Isa 34:17
>>>>> Isa 51:8
>>>>> Jer 50:39
>>>>> Lam 5:19
>>>>> Dan 4:3
>>>>> Dan 4:34
>>>>> Joel 3:20
>>>>> Luke 1:50
>>>>>
>>>>> So, what the code stumbles over in the specific case of Exodus 17:16 is the <divineName> tag and the parsing / actions related to it.
>>>>> Why is the uppercasing necessary at all in the code above? Shouldn't this be left to the application software in terms of formatting the respective element/tag in uppercase?
>>>>>
>>>>> Best regards,
>>>>> Tobias
>>>>>
>>>>> On 2/22/25 20:32, Tobias Klein wrote:
>>>>>
>>>>>> Hi Troy,
>>>>>>
>>>>>> so I did a little debugging on this.
>>>>>>
>>>>>> The relevant code in swmodule.cpp is shown below. I added some conditional printouts for Exodus 17:16 to see what happens there.
>>>>>>
>>>>>> case SEARCHTYPE_PHRASE: {
>>>>>>     textBuf = stripText();
>>>>>>     if ((flags & REG_ICASE) == REG_ICASE) textBuf.toUpper();
>>>>>>     SWKey *currentKey = getKey();
>>>>>>     std::string referenceKey = "Exod 17:16";
>>>>>>     if (currentKey->getShortText() == referenceKey) {
>>>>>>         std::cout << "textBuf: " << textBuf.c_str() << std::endl;
>>>>>>         std::cout << "term: " << term.c_str() << std::endl;
>>>>>>     }
>>>>>>     // TKL: This is where the actual search per verse happens
>>>>>>     sres = strstr(textBuf.c_str(), term.c_str());
>>>>>>
>>>>>> I get the following output based on my modification above:
>>>>>>
>>>>>> textBuf: For he said, Because the
>>>>>> term: generation to generation
>>>>>>
>>>>>> The full verse content of Exodus 17:16 in KJV is this:
>>>>>> For he said, Because the Lord hath sworn that the Lord will have war with Amalek from generation to generation.
>>>>>>
>>>>>> So ... it seems that the stripText() call strips away too much of the verse content (textBuf).
>>>>>> As a result, the strstr call cannot find the term "generation to generation", because at that point it is no longer part of the string being searched (textBuf).
>>>>>>
>>>>>> Could you do some investigation regarding the behavior of stripText here?
>>>>>>
>>>>>> Best regards,
>>>>>> Tobias
>>>>>>
>>>>>> On 2/22/25 15:45, Tobias Klein wrote:
>>>>>>
>>>>>>> Hi Troy,
>>>>>>>
>>>>>>> an Ezra Bible App user reported that the phrase search is not working as expected.
>>>>>>>
>>>>>>> Here is an example where the results are not as expected.
>>>>>>>
>>>>>>> Module: KJV
>>>>>>>
>>>>>>> Search term: "generation to generation"
>>>>>>>
>>>>>>> I get the following results from the SWORD engine:
>>>>>>> Isa 13:20
>>>>>>> Isa 34:10
>>>>>>> Isa 34:17
>>>>>>> Isa 51:8
>>>>>>> Jer 50:39
>>>>>>> Dan 4:3
>>>>>>> Dan 4:34
>>>>>>> Joel 3:20
>>>>>>> Luke 1:50
>>>>>>>
>>>>>>> However, Exodus 17:16 also contains this phrase but is not in the list of search results.
>>>>>>> Could it be related to how the markup is structured?
>>>>>>>
>>>>>>> In Exodus 17:16 [KJV], the markup of the respective phrase looks like this:
>>>>>>>
>>>>>>> <w class="strong:H01755">from generation</w> <w class="strong:H01755">to generation</w>
>>>>>>>
>>>>>>> This is how I call the search function of the SWORD engine:
>>>>>>> listKey = module->search(searchTerm.c_str(), int(searchType), flags, scope, 0, internalModuleSearchProgressCB);
>>>>>>> see https://github.com/ezra-bible-app/node-sword-interface/blob/master/src/sword_backend/module_search.cpp#L178
>>>>>>>
>>>>>>> Have a nice weekend!
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Tobias
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> sword-devel mailing list: sword-devel at crosswire.org
>>>>>>> http://crosswire.org/mailman/listinfo/sword-devel
>>>>>>> Instructions to unsubscribe/change your settings at above page