[sword-devel] diatheke plain output - line breaks missing?
Greg Hellings
greg.hellings at gmail.com
Wed Jan 24 10:09:20 MST 2007
So are the current *plain filters the way we want them to be? It
seems that replacing the newlines with spaces would benefit
searches, as DM is advocating. That seems to be almost the
primary (only?) place where *plain and *console would differ. If
we aren't going to change that, then there is little need to
differentiate *plain and *console at this point; it is
only necessary to complete the expansion of the OSIS tags in osisplain
so that it also handles ones like <q> and <l>, for which my patch adds
the line break.
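
To illustrate why the replacement character matters for phrase searches,
here is a toy sketch in Python. It is not the SWORD filter code: the tag
patterns, the function names strip_osis and phrase_match, and the sample
verse are all assumptions made for illustration, and the substring test
only stands in for the stristr match Troy describes.

```python
import re

# Assumed subset of the OSIS tags named in this thread as producing "\n".
# Attributes are tolerated where the thread says they may occur.
BREAK_TAGS = re.compile(
    r'</?(?:title|lg|p)\b[^>]*>'              # <title>, </title>, <lg>, </lg>, <p>, </p>
    r'|</l>'                                  # end of a poetry line
    r'|<lb\b[^>]*/>'                          # <lb/>
    r'|<milestone\b[^>]*type="line"[^>]*/>'   # <milestone type="line"/>
)
ANY_TAG = re.compile(r'<[^>]+>')


def strip_osis(text, break_with):
    """Replace line-breaking tags with `break_with`, then drop all other tags."""
    text = BREAK_TAGS.sub(break_with, text)
    return ANY_TAG.sub('', text)


def phrase_match(haystack, phrase):
    """Case-insensitive substring test (a stand-in for a stristr-style match)."""
    return phrase.lower() in haystack.lower()


# Hypothetical poetry markup, similar in shape to what appears in the Psalms.
verse = '<lg><l>The LORD is my shepherd;</l><l>I shall not want.</l></lg>'
phrase = 'shepherd; I shall'

print(phrase_match(strip_osis(verse, ' '), phrase))   # True
print(phrase_match(strip_osis(verse, '\n'), phrase))  # False
```

With a space as the replacement, the phrase spanning the two poetry lines
is found; with a newline it is not, because the user's phrase contains a
space where the stripped text contains "\n".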
--Greg
On 1/24/07, Troy A. Griffitts <scribe at crosswire.org> wrote:
> Yes, you are correct. Before entryAttribute searches and before Lucene
> searches, the only way to search for a Strong's number in the text was to
> turn on Strong's numbers and then search for, e.g., "<1234>". Different
> symbols were used for different parts of the text, as you have pointed
> out, which allowed one to be clever with regex. It was inconsistent,
> and not easy for an average end user, I know, but it is how things used
> to work. Punctuation has never been stripped to my knowledge. Thanks
> for the walk down memory lane.
>
> -Troy.
>
>
> DM Smith wrote:
> > On Jan 23, 2007, at 7:54 PM, Troy A. Griffitts wrote:
> >
> >> DM,
> >> To sum up the previous thread:
> >> *plain filters are primarily for use as strip filters.
> >> Strip filters prepare text for searching, NOT display.
> >> A newline would prevent a stristr phrase match in some situations--
> >> primarily seen in the psalms.
> >>
> >> -Troy.
> >
> > If this is the case I think that osisplain has bugs in it.
> >
> > The following are replaced by osisplain with \n:
> > <title>
> > </title>
> > </l>
> > <lg>
> > </lg>
> > <p>
> > </p>
> > <lb/>
> > <milestone type="line"/>
> >
> > Note that some of these allow for attributes.
> >
> > Of the elements of OSIS that naturally produce line breaks, these are
> > the only ones that are handled.
> >
> > I also find the following in osisplain:
> > The text of notes is rendered between (). My guess is that notes may
> > be turned off during searching.
> > The attributes of a <w> element are rendered between <>. Again, I
> > presume Strong's numbers may be filtered out before searching.
> >
> > I did not find any stripping of punctuation in osisplain.
> >
> > It looks like phrase searching in swmodule.cpp at line 649 (that you
> > pointed out) does an exact match of the user's input within the verse
> > after both have been filtered by the same set of filters. I tried to
> > find where punctuation was stripped out, which must be done because
> > phrase searching works and verses may have punctuation, but I didn't
> > find it, having gotten lost at the list of applied filters. I am
> > wondering whether that filter replaces \n with ' '.
> >
> > _______________________________________________
> > sword-devel mailing list: sword-devel at crosswire.org
> > http://www.crosswire.org/mailman/listinfo/sword-devel
> > Instructions to unsubscribe/change your settings at above page