<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 03/03/2017 09:16 PM, Troy A.
Griffitts wrote:<br>
</div>
<blockquote
cite="mid:f2044dd2-6f51-0e43-b6b4-9fe54afb3ff5@crosswire.org"
type="cite">SWORD supports compiling with a variety of regex
engines</blockquote>
<p><font face="FreeSerif">I have an interesting result. My previous
build of SWORD used --with-cxx11regex, and that build failed to find
Abednego under any circumstances. After reconfiguring without that
option and rebuilding, I now get this result:</font></p>
<font face="FreeSerif">$ diatheke -b KJV -s regex -k Abed....nego<br>
Entries containing "Abed....nego"-- none (KJV)<br>
$ diatheke -b KJV -s regex -k Abed...nego<br>
Entries containing "Abed...nego"-- Daniel 1:7 ; Daniel 2:49 ; Daniel
3:12 ; Daniel 3:13 ; Daniel 3:14 ; Daniel 3:16 ; Daniel 3:19 ;
Daniel 3:20 ; Daniel 3:22 ; Daniel 3:23 ; Daniel 3:26 ; Daniel
3:28 ; Daniel 3:29 ; Daniel 3:30 ; -- 14 matches total (KJV)<br>
$ diatheke -b KJV -s regex -k Abed..nego<br>
Entries containing "Abed..nego"-- none (KJV)<br>
$ diatheke -b KJV -s regex -k Abed.nego<br>
Entries containing "Abed.nego"-- none (KJV)<br>
<br>
What's important here is that the dash in the middle of
"Abed-nego" in the KJV is not an ASCII hyphen. It appears as
follows (from Dan.3.30, passed through "od -c"):<br>
0000360 d A b e d 342 200 223 n e g o
&lt; / w<br>
</font>
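<p><font face="FreeSerif">To identify the three octal values in the
od -c line above, they can be decoded as UTF-8. This is only a small
Python sketch for checking the bytes; it is not part of SWORD or
diatheke, and the variable names are my own:</font></p>
<pre>
```python
import unicodedata

# The three octal values od -c printed between "Abed" and "nego".
octal_fields = ["342", "200", "223"]

# Convert each octal field to a byte value, then decode the bytes as UTF-8.
raw = bytes(int(field, 8) for field in octal_fields)
dash = raw.decode("utf-8")

print(raw.hex())               # e28093
print(f"U+{ord(dash):04X}")    # U+2013
print(unicodedata.name(dash))  # EN DASH
```
</pre>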
<p><font face="FreeSerif">So diatheke built with C++11 regex fails
to find the name at all, and diatheke built without it finds the
name only when the three bytes of the dash character (octal 342
200 223, the UTF-8 encoding of U+2013 EN DASH) are each matched
by their own ".", which is to say that the regex engine is not
aware of multibyte encoding at all.</font><br>
</p>
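<p><font face="FreeSerif">The pattern of hits and misses above is
what one would expect from a matcher that treats every byte as a
character. Here is a minimal Python sketch of that behaviour,
assuming the verse text is UTF-8. It uses Python's re module, not
whichever engine SWORD was built with, and the sample string and
the helper name dots_matching are made up for illustration:</font></p>
<pre>
```python
import re

# Hypothetical sample text; U+2013 (EN DASH) stands in for the KJV dash.
text = "Shadrach, Meshach, and Abed\u2013nego"
utf8 = text.encode("utf-8")

def dots_matching(haystack, make_pattern):
    """Return the dot counts (1 to 4) for which Abed, dots, nego matches."""
    return [n for n in range(1, 5) if re.search(make_pattern(n), haystack)]

# Byte-oriented matching: each "." consumes one byte, so three are needed.
byte_hits = dots_matching(utf8, lambda n: b"Abed" + b"." * n + b"nego")

# Character-oriented matching: each "." consumes one code point.
char_hits = dots_matching(text, lambda n: "Abed" + "." * n + "nego")

print(byte_hits)  # [3]
print(char_hits)  # [1]
```
</pre>
<p><font face="FreeSerif">The byte-oriented case matches the
diatheke result without --with-cxx11regex. The character-oriented
case is what I would expect from an engine that is aware of the
encoding.</font></p>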
</body>
</html>