[sword-devel] diatheke search type regex and the dot ?
Karl Kleinpaste
karl at kleinpaste.org
Mon Mar 6 12:59:36 MST 2017
On 03/03/2017 09:16 PM, Troy A. Griffitts wrote:
> SWORD supports compiling with a variety of regex engines
I have an interesting result. My previous build of sword used
--with-cxx11regex, and that failed to find Abednego in any circumstance.
Reconfiguring without that option and rebuilding, I now get this result:
$ diatheke -b KJV -s regex -k Abed....nego
Entries containing "Abed....nego"-- none (KJV)
$ diatheke -b KJV -s regex -k Abed...nego
Entries containing "Abed...nego"-- Daniel 1:7 ; Daniel 2:49 ; Daniel 3:12 ;
Daniel 3:13 ; Daniel 3:14 ; Daniel 3:16 ; Daniel 3:19 ; Daniel 3:20 ;
Daniel 3:22 ; Daniel 3:23 ; Daniel 3:26 ; Daniel 3:28 ; Daniel 3:29 ;
Daniel 3:30 ; -- 14 matches total (KJV)
$ diatheke -b KJV -s regex -k Abed..nego
Entries containing "Abed..nego"-- none (KJV)
$ diatheke -b KJV -s regex -k Abed.nego
Entries containing "Abed.nego"-- none (KJV)
What's important here is that the dash in the middle of "Abed-nego" in
KJV appears as (from Dan.3.30, passed through "od -c"):
0000360 d A b e d 342 200 223 n e g o < / w
So diatheke with C++11 regex fails entirely, and diatheke without C++11
regex finds it only when the 3 component bytes of the dash character
(octal 342 200 223, the UTF-8 encoding of U+2013 EN DASH) are each
matched by their own dot, which is to say, the regex engine is unaware
of multibyte encoding at all.
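
The byte-versus-character behaviour can be reproduced outside SWORD. What follows is a minimal Python sketch, not diatheke's or SWORD's code: Python's `re` module stands in for the regex engine, and the names `verse`, `verse_bytes` and `matching_dot_counts` are hypothetical, chosen only for illustration. It assumes the module text holds "Abed", the three bytes shown in the od dump, then "nego".

```python
import re

# The three octal bytes from the "od -c" dump decode to one character.
dash = bytes([0o342, 0o200, 0o223]).decode("utf-8")   # U+2013 EN DASH

verse = "Abed" + dash + "nego"        # text as a sequence of characters
verse_bytes = verse.encode("utf-8")   # same text as a sequence of bytes


def matching_dot_counts(subject, as_bytes):
    """Return the dot counts n (1..4) for which Abed<n dots>nego matches."""
    counts = []
    for n in range(1, 5):
        pattern = "Abed" + "." * n + "nego"
        if as_bytes:
            # A bytes pattern makes each dot match one byte, not one character.
            pattern = pattern.encode("ascii")
        if re.search(pattern, subject):
            counts.append(n)
    return counts


byte_counts = matching_dot_counts(verse_bytes, as_bytes=True)
char_counts = matching_dot_counts(verse, as_bytes=False)
print(byte_counts, char_counts)
```

Run as written, this prints `[3] [1]`: matching over bytes needs exactly three dots, as in the diatheke result above, while matching over decoded characters needs only one.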