[sword-devel] Searching in paragraphs

sword-devel@crosswire.org sword-devel@crosswire.org
Sun, 15 Sep 2002 18:34:47 +0600


This is a multipart MIME message.

--==_Exmh_-2880922760
Content-Type: text/plain

I found that processing a bit vector of all 31102 verses is very fast. On my 
machine, converting a vector of verses to the corresponding vector of 
paragraphs (by ORing the verse bits within each paragraph) takes about 
0.005 sec (g++ 2.95 with -O3 under Linux). I expect similar speed for 
converting bit vectors to other formats; it isn't slow, as someone has said:

*** begin paragraphs.h ***
#ifndef PARAGRAPHS_H
#define PARAGRAPHS_H

#include <vector>

extern const char bible_pars_lengths[]; // lengths of paragraphs in verses
extern const int bible_pars_count;

void versesToParagraphsBitVector(const std::vector<bool> &verses,
				 std::vector<bool> &paragraphs);

#endif
*** end paragraphs.h ***

*** begin paragraphs.cpp ***
#include "paragraphs.h"

typedef std::vector<bool> bitvector;

void versesToParagraphsBitVector(const bitvector &verses,
				 bitvector &paragraphs)
{
    paragraphs.resize(bible_pars_count);
    bitvector::const_iterator verse = verses.begin();
    bitvector::iterator paragraph = paragraphs.begin();
    for(int i=0; i!=bible_pars_count; ++i, ++paragraph) {
	// OR together the bits of all verses in paragraph i.
	// No early break: verse must always advance past the whole
	// paragraph, or every later paragraph would be misaligned.
	bool f = false;
	for(int j=0; j<bible_pars_lengths[i] && verse!=verses.end();
	    ++j, ++verse)
	    if(*verse)
		f = true;
	*paragraph = f;
    }
}
*** end paragraphs.cpp ***
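
To show the ORing on something small enough to check by hand, here is a 
minimal sketch on made-up data. It is only an illustration: the function 
versesToParagraphs, the helper toyExample and the three toy paragraph lengths 
are hypothetical names and values, not part of the files above, and the 
lengths are passed as a parameter instead of being read from 
bible_pars_lengths.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<bool> bitvector;

// Hypothetical variant of versesToParagraphsBitVector: the paragraph
// lengths are passed in rather than taken from a global table.
bitvector versesToParagraphs(const bitvector &verses,
                             const std::vector<int> &lengths)
{
    bitvector paragraphs(lengths.size(), false);
    std::size_t verse = 0;
    for (std::size_t i = 0; i < lengths.size(); ++i) {
        // OR the bits of every verse that belongs to paragraph i.
        for (int j = 0; j < lengths[i]; ++j, ++verse) {
            // The lengths must not add up to more than the verse count.
            assert(verse < verses.size());
            if (verses[verse])
                paragraphs[i] = true;
        }
    }
    return paragraphs;
}

// Made-up toy data: 3 paragraphs covering 2 + 3 + 1 = 6 verses.
bitvector toyExample()
{
    std::vector<int> lengths;
    lengths.push_back(2);
    lengths.push_back(3);
    lengths.push_back(1);

    bitvector verses(6, false);
    verses[1] = true; // second verse, inside paragraph 0
    verses[5] = true; // last verse, inside paragraph 2
    return versesToParagraphs(verses, lengths);
}
```

With this toy input, paragraphs 0 and 2 come out set and paragraph 1 stays 
clear, because none of its three verses matched.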

See the attached parsdata.cpp for the list of Bible paragraph lengths (in 
verses) according to OLB.

So please, when writing the search engine, provide an option to output the 
result as a std::vector<bool> bit vector of matched verses, which this 
converter can then turn into matched paragraphs. Please add searching in 
paragraphs.

Please include these files in Sword.
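
As for converting a bit vector to other formats, one possible target is a 
plain list of the positions of the set bits, for example the numbers of the 
matched verses or paragraphs. The sketch below is only an assumption about 
what such a converter could look like; bitVectorToIndices is a hypothetical 
name, not an existing Sword function, and the positions are 0-based.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: collect the 0-based positions of all set bits.
std::vector<std::size_t> bitVectorToIndices(const std::vector<bool> &bits)
{
    std::vector<std::size_t> indices;
    for (std::size_t i = 0; i < bits.size(); ++i)
        if (bits[i])
            indices.push_back(i);
    return indices;
}
```

This is a single pass over the bits, so for 31102 verses it should cost about 
as little as the paragraph conversion.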


--==_Exmh_-2880922760
Content-Type: application/zip ; name="parsdata.zip"
Content-Description: parsdata.zip
Content-Transfer-Encoding: x-uuencode
Content-Disposition: attachment; filename="parsdata.zip"

begin 644 parsdata.zip
M4$L#!!0``@`(`*J2+RV+\`F$QPP``$U'```,`!4`<&%R<V1A=&$N8W!P550)
M``-`>X0]P7R$/55X!`#H`^@#[5QKBQW'$?WL_17[,09#IJK?&/\28XPM-K'`
M2$%2("3DOT>:>Z?ZG-.]EF)$2$0$L]J=.X_NZGJ<JCI]G_[V[NG-J\<7KU^]
M???XXI>?WCS^_/+G7Y]^_,M/;][^^.O3JS^_^^7M]S\\?O?XCX>O_)N'K]+]
M*.^/^O[(W\SS'_ZW^^\?SG?XK-R/],SU!K_KYQG>>[W/[\^S`A<4N,GE00,>
M5N_G,AS7=0T>GNY_7_?@A,O]F1_.V?6P<T3M?C;)6QI<%!?K,`QDA3)+,L7K
M[WX_ROW_=C_,X()T'RL*?,`\*CS@^M]0B!V&'$).(`4['W?=>\KDN!YTP!O.
M3U`W0@BW'P;77E)O^-3KP_/MY^5VK=KY60R]71<Y+-SYR<`_"H[A$EZYAGU-
M+YYZ^Y%@1<\S,3P:4(7CFE#HZ[5PYZAOLW$5T#F&\P<-_M2PVX\"\SDU\U+9
MFSBO-6L[2QDRQ`ZS;R#W#%-H]_LK*$0#(558V@;O"9DWD,UMC"&9)/<837J`
M/?H!`JS7Y0?,9:"^G9IL^-*!"]Q$%H:V=%L"DRG?UOFX+JXPQH:/CY5,^$E2
MRP@]:&I\(8L*0DQ-IQQ^)U2_BG10#\E==;)5>NKU'+_\YK@F'#>&T2:C!;4.
MGJ+M/,$T^B'ZV.3_\[98"\^H+X876Y'@D-5\74.1A3K%R@RRXYC[.:.IK(<,
M>J#K"&N>@KV\Q$!#'QO=L<O$XI:;)CE:1<SVYCVFKZOH^>8]YRHVB`AG"'&Z
M[3;=(VX.T3O9SRG/RR5Y>)FI\AU4WES\2H+E\@,T*:,E\ZOQ?G#"KNI1R?4E
MDIQAK$SDK0L)-9-OZJ">$2[.IUXA(#?P-`D#[T`O,OU0Q&$$%3?9)UK9Z1P'
MG8_H657#3XGQ*Q<7F@^<$B$%T?Z"?]5K?',!&NKWA6V\@S@:KF#<$R\NLFX8
M/V*]!DZ@2/3IJ&\AXBPQIV.DSFC`4T8I7,L(*SA`+1$9#3B7T=X*#-/AYOS,
M,4!M0OG00K*&JB%(<*X&`J8)W=`T?`=-.IF[>F*[Y.J$BN(I.0LXCLNN@7?P
M;`U,<,!,`Z5FL!Q:\P@<-WW.2P2*H(*Z:`/@+BI90:OK"`\1F"&:OF1NB.4;
MX@5'()YAACWL#$-A`SS@..4NJVRDL!H>0^T(AH3%]K!:0K2!7Y/"KBX>F4$\
M_*CTI^88;8,.AV1$%:/P@#R#_$I'N\7'SI>C*&T%)A7C9M\!GYL1#<4*H24#
ME`#UV(HN6JAAEO2%KHP4;='L##J/RHHQIDO.5LDT3'%37=PUI5,%/2AF>`JB
M.ZI,WAE/A[L'_-[);F<T.="YD:>J.NZ&ZD@@J--"(G#OH!8Q\`-.#!S?A]^O
M(20'54!LXH8X\:8V&3%=I@PQ5`HA7D05#R&F0K<>UQ"Z"-,R.>M.0#I,J>C+
M)M!F->ELQQDQ":9T%E4/2E<.Q#M&"#HS",ND,IS]])VA9/4$KGI;MK#P`&>J
M<F<'F!>]1!>6T351E0-]L+IRSQO/%R&YH:'U2YG:NGXDSK&XWX18O&$8:*B.
M9\#VF4]T=)`6(/\&=*[5OH&@=$6SFU<V5$>+UR^N]WR%NR(RSML"*!\+U.VJ
MZ4E3>"/4DZZQ1ODI`JQGM+ZHNLSDUFQ)Q1UUI"^B]Z40DXS3$5Y.BN<S&$TO
MX8J>&UII1X5+JERQY.S<M5("18KI.1)F9E94\`<-E['+6A+IK%&'5E4JU@ML
M+$'784*I8/;2,)H:)*V0&S$:";R;X'8J\X6*8_"PR(0K+'J1ZDE!A#W$;Z5K
M],FHC-(BLVZ;D#-0!6/@6.8)Y:Q2RL4RM#EB*,-2L0N`K;M(2((^EK2BB4)6
M7>+Y?APAW3LDWZ`Z\%S**I44"_!5%,LU37;"E5"-C8`0A9L)C0Z((@4=`>4%
M!_Z1T/P&#F.&E"9(9$@\[\^)?J#9(`9,"P8Z)!!FS#@2F6Z560Z2BN>-DF%V
MEC9-'$SYL-SAY"4.7"'L"U7XNXAOZ^B^W45ULD86(R"M'1?%T?JH<`]+E;*N
M-:=CT=CPZ)VT.J%3-MLL@?H$A#4=ES>+V60IT6LI?TCTU79*DK)(%^1=I()#
MYHZ2Q3*'#JY+4MFDQS!CJ8M%2'/G<@0)X6E'H>J8-ZGI&:^3#"?@`E:6NOJ$
MII)(FPK3G,W`7!QS)XTF7,/$-HZ62HOFK*:-,M,QA1,+`.BHY4[U/?:2)I6`
MF>1F<0)%W%E?G%.`P0!LV-7%(V_.[0YMZOIO7),_\IS=[Y_ZN<G[73K8VM76
M(WW"^WUS+FW>Z9\HM^?>\7N>]['QNJQM$J>;I;/]'+-`&_14;:*(V#0!K?"X
MI#UIQ(?H$[452IE]19"]BT@51SDK`TD>7/6#CI<7[0E&B89+!!.L16K_.Q?O
M4Y3?/V)P'U/>_Q__^</_QY__N=[K_X5C]2]P7;XD6_D<,BS/^$T,>,CS<(QE
MCE%/'N0;E./0>?,-4OI<2N.?B-;\(TC(_TT4MA/NEZR0._B"B-M=+F@+18$2
MM[0I`A5IQ&9ER[FD(\@@:W!QD@92W62))FF:;[)&D]3VNJ\)BRH)ZTVYGUVS
M]Z(EM88)9J%J0A*,.J0?TK!@FY76%P"Q::L>FZZ.+(PL?1ZLN#0L)E5,K(J2
M/>=87!!WDL2]P](U*4RU30><BH58D@#B)I;P,V643I2DCG@WB](E29*'4C;[
M;Y04.@XC2WDN2TM(Z\Q56#P)M#1M7EHW5;(BW=HNYD0]A`++6*3[784,D\1D
MD";2Y<5M5TBF[E&3<J1VR?#MCF2ER0\ZI&0V>Z)%JJ=TRZQ<9&2<F/9^VY+L
M5*E5#"$79*'CM-VCQX:HLEP\FPU9RT%$N*6>KJ&>*KUCR/DA)>'*313B`G*;
MS)04`20WHR6(1'<MAE7DFE:DT\SH8<C4O?44LA*PB4'DS#C>TZ'S(BKH0=:5
M.^G8]JE*L8N:4E$*3!579$GJSZ:<BBHQ:XH<(YLATV$L[E=;]I/@$`[`=@2G
MM`D%1A3ILI34"XI$["OSCT0W7S/TLO2:FY1#;8@26R6KM:VW7WH8XYDQL6XW
M(I0Z5]V+TB4K=LR)_=NI24.UGTE55N)WV8\I:V,U*<<"(XL?XM&1N-(7_MIL
MQ%9E!H>N.7+);%-A#UTHFUB8Q.<TJ7LI?T&9/LB*R-#8J2B6JHVH@<$N;<!$
M47IP$4PU!.PM=.LFJ+=M=O?DC4)7V9?0-OL%DC1/&B#0IE'4L'^]]`I=JY%M
M`\?R3JA9N4)$;2UR:\(K=PO;-KQ59=$FW3)4A9O:I5,QY#H$1-:$/*6EW:I-
M7=P91G/9-7AT\E6W%BC^&<)M:QNJ)FK]C9O0),I7XHL@]G*"]+S)Q-7E$Z,1
M&>D4;URXQQ6W-Y`2=NR043>Z2J.&R4Y,%72D%04#"'8#-`S27<)661NDC2)"
M(T!QJ.OJVOPV#7\SQ")4=8YI5':O1",H"\F^::\,&&MUG4]>MPH8S8X(1$U4
ME:P9^"_<OF8:5WNNW8!]5S?*K#C(3S)>)7P6U&Y8VKR6#Y*25N'T0>R@(*\%
M(=^:4JB&:'.E'7MD`+%M;S[.B>BEK=$N>E17>N]![#0C#@P1UI(V5.V@+39]
MV63H2C0;*V6]2S`>I'!5>1I352E/9A7J2OF>O#C#5?'$*E")I6(8@I+3N'R[
M"X-8Q!J4G0J7'2-")@P)YK;H&&X<,^*YIPU+1:DE0W+O(0"G2*Z?%TH<HT<J
M@RTM1,L;0G9"MS%PXQGO.\K*.*Q*6+.F?IZW=B5%`7,6M*,M;\G2,YGH&@.+
M-C2]+VQ#SD4L*Z%I:/Y._(RT8;$O]1.3],R&(+_IUAS3YP!\63?=P3)34`G]
M]JPLH"8;&2>MN;+[3<+V2"@?3VKAF&'A/O6B^Q`V[/",TJ'MOUA1,ZE?41Z;
MI91*%1PSV:CA4O)J`NI\`PZ&[FG59G]#&F/>%$1GK:,NB9P50=15<$31C>9,
M6JT`\/.&VN;*!3\$L1?-%$P%1ZS#]DS%C<`K54HJDUUUN6E3.?-WC&B86=I1
M4_.-7)`_YUJ8+$T2Y*"*:-GE^R&&LG(=][NJ'O%N"?1&J&%)T@RBB&%6X;+0
M=2/+(@]4!*VT$5=`[KLMHDVZ)$W.%1)'WI0WE1Y6I2<S-EO6HDEG!!R4<4[)
MGQI@):A4U3LY[?G5;Z#HR_Y6P]QB+)N)(FVT'=]W\S4)AZ)DUXANHGV6I`6R
M(Q]U^4:0(MO/DNQD(\B*/=C^S`VHZ3B0)@T%WX"<(A4`;:,DZ?]6>7]!F5%(
M)%*URT:@)FX9L96Y,D9=[J+->%PWZI*3[[XLI4@!A78EU06,=.*-G^]_^.>W
M#P]/^%4W+U^]PV^Z>?'ZK^]/?/?P^/[?VY=_?WK]IS^LWX/S]1^?_>C[XX>O
MOWWX%U!+`0(7`Q0``@`(`*J2+RV+\`F$QPP``$U'```,``T```````$```"D
M@0````!P87)S9&%T82YC<'!55`4``T![A#U5>```4$L%!@`````!``$`1P``
'``8-````````
`
end

--==_Exmh_-2880922760
Content-Type: text/plain; charset=us-ascii

Victor Porton (porton@ex-code.com)
--==_Exmh_-2880922760--