[sword-devel] diatheke plain output - line breaks missing?
DM Smith
dmsmith555 at yahoo.com
Tue Jan 23 16:44:41 MST 2007
There has been much talk about an "ASCII" filter in this thread. I
think that might run into problems.
The ESV and KJV modules are encoded in UTF-8 and have some UTF-8
characters. Additionally, Sword allows numeric entities for
Unicode characters in OSIS. These would need to be converted into
UTF-8 characters.
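
As a rough sketch of what that conversion involves (this is not code from
Sword; decodeNumericEntities and its helpers are names made up for
illustration), a filter would have to turn references such as &#233; or
&#x2019; into their UTF-8 byte sequences:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: value of one digit in the given base, or -1.
static int digitValue(char c, int base) {
    if (c >= '0' && c <= '9') return c - '0';
    if (base == 16) {
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    }
    return -1;
}

// Hypothetical helper: append the UTF-8 encoding of one code point.
// Returns false for surrogates and values above U+10FFFF.
static bool appendUtf8(std::uint32_t cp, std::string &out) {
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return false;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return true;
}

// Replace &#NNN; and &#xHHH; references with UTF-8 bytes.
// Anything malformed or out of range is copied through unchanged.
std::string decodeNumericEntities(const std::string &in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        if (in.compare(i, 2, "&#") == 0) {
            std::size_t j = i + 2;
            int base = 10;
            if (j < in.size() && (in[j] == 'x' || in[j] == 'X')) {
                base = 16;
                ++j;
            }
            std::uint32_t cp = 0;
            std::size_t digits = 0;
            int d = -1;
            while (j < in.size() && (d = digitValue(in[j], base)) >= 0) {
                // Stop accumulating once the value is already out of range,
                // so a very long digit run cannot overflow.
                if (cp <= 0x10FFFF) {
                    cp = cp * static_cast<std::uint32_t>(base)
                         + static_cast<std::uint32_t>(d);
                }
                ++digits;
                ++j;
            }
            std::string encoded;
            if (digits > 0 && j < in.size() && in[j] == ';'
                && appendUtf8(cp, encoded)) {
                out += encoded;
                i = j + 1;
                continue;
            }
        }
        out += in[i];
        ++i;
    }
    return out;
}
```

Named entities such as &amp; are deliberately left alone here; only the
numeric form is handled.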
Also, Sword allows for any module that is not UTF-8 to be encoded in
CP1252 (Microsoft's variant of Latin-1) with non-ASCII characters.
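
To illustrate why a plain Latin-1 assumption is not enough here: CP1252
assigns printable characters (curly quotes, dashes, the euro sign) to bytes
0x80-0x9F, so those bytes need a lookup table before being encoded as UTF-8.
A minimal sketch, assuming C++11; cp1252ToUtf8 is a hypothetical helper, not
a Sword function, and mapping the five unassigned bytes to U+FFFD is an
arbitrary choice:

```cpp
#include <cstdint>
#include <string>

// Code points for CP1252 bytes 0x80..0x9F. The five bytes CP1252 leaves
// unassigned (0x81, 0x8D, 0x8F, 0x90, 0x9D) map to U+FFFD in this sketch.
static const std::uint16_t kCp1252High[32] = {
    0x20AC, 0xFFFD, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0xFFFD, 0x017D, 0xFFFD,
    0xFFFD, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0xFFFD, 0x017E, 0x0178
};

// Convert a CP1252-encoded byte string to UTF-8.
std::string cp1252ToUtf8(const std::string &in) {
    std::string out;
    for (unsigned char b : in) {
        // Outside 0x80..0x9F, CP1252 agrees with Latin-1, where the byte
        // value is the code point.
        std::uint32_t cp = b;
        if (b >= 0x80 && b <= 0x9F) cp = kCp1252High[b - 0x80];

        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            // Every table entry is below U+10000, so three bytes suffice.
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

Reducing such text to pure ASCII would mean dropping or transliterating
every one of these characters.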
On Jan 23, 2007, at 1:54 AM, Greg Hellings wrote:
> Troy,
>
> Yes, the previous email answered the question I had about how to add a
> new filter. It's as straightforward and easy as I suspected. :) And
> your naming scheme of *plain and *console seems pretty good but maybe
> just *plain would be the ones for searching, as they are now, with NO
> formatting, including ASCII newlines etc. And we could just do
> something like osisascii.cpp for ones that still maintain the
> ASCII-supported formatting?
>
> I trust that you'll change the '\n' to ' ' in my patch if you decide
> to use it. ;) It would make sense to at least preserve that part of
> the white space otherwise searches won't work correctly as those two
> words would be interpreted as one word.
>
> Are the new *ascii filters something you would be interested in seeing?
> I'm sure with the current set of *plain filters, I could be sure to
> make excellent progress on any that would be needed. (Indeed, would
> anything other than OSIS be needed? I don't think any of the other
> markup formats that I've seen are complex enough to require a
> different class between *plain and *ascii uses. In instances like
> that perhaps there could be an additional entry to FMT_* that is
> caught by the same case as the plain filter. Perhaps one for
> ThML2ASCII would also be good to take out the <br> and <scripRef>
> elements? Would any others need to be created that would be different
> between plain and ascii formatting?
>
> --Greg
>
> On 1/22/07, Troy A. Griffitts <scribe at crosswire.org> wrote:
>> Greg,
>> We just missed each other :) Hope my previous email answers your
>> questions. Yes, adding a newline in your patch would break the
>> functionality of the primary purpose for the *plain filters. The issue
>> is that we are using them secondarily as output filters. You have
>> noticed this and suggested well that we should have 2 different
>> filter sets.
>>
>> -Troy.
>>
>>
>>
>> Greg Hellings wrote:
>>> Indeed. Troy, in that case is my insertion of the new-line character
>>> going to break the searches? If one searches for a string that spans
>>> a new-line character in the filter, will the search pick up the
>>> white-space and be intelligent about searching for the newline
>>> character also? And what about the fact that DM says that <q> is not
>>> translated to "?
>>>
>>> And if I wanted to take the current osisplain.cpp/.h and translate
>>> them into an output filter that would be more suitable for something
>>> like diatheke, what types of changes should be made to make that
>>> visible by the SWMgr? As was pointed out, if the main purpose of the
>>> current *plain.cpp files is to prepare the output for searching and
>>> not for display, perhaps there should be *plain_search and
>>> *plain_display variants or some other such naming scheme?
>>>
>>> --Greg
>>>
>>> On 1/22/07, benjie <cricketc at gmail.com> wrote:
>>>> On Mon, Jan 22, 2007 at 09:09:14PM -0700, Troy A. Griffitts wrote:
>>>>> Well, kind of. It's a matter of purpose. The purpose for a strip
>>>>> filter is to prepare the buffer for a search, e.g.
>>>>> stristr(StripText(), istr)
>>>>>
>>>>> for example, if one searches for a phrase,
>>>>> "streams of water that yield"
>>>>>
>>>>> It should hit on Psalm 1:3
>>>>>
>>>>> He is like a tree
>>>>> planted by streams of water
>>>>> that yields its fruit in its season,
>>>>> and its leaf does not wither.
>>>>> In all that he does, he prospers.
>>>>>
>>>>> So, in conclusion, filters have different purposes.
>>>>> From: http://crosswire.org/svn/sword/trunk/include/swmodule.h
>>>>>
>>>>> virtual SWModule &AddRenderFilter(SWFilter *newfilter);
>>>>> virtual SWModule &AddEncodingFilter(SWFilter *newfilter);
>>>>> virtual SWModule &AddStripFilter(SWFilter *newfilter);
>>>>> virtual SWModule &AddRawFilter(SWFilter *newfilter);
>>>>> virtual SWModule &AddOptionFilter(SWOptionFilter *newfilter);
>>>> So if we are interested in working with a plain text (ASCII) rendering
>>>> filter, we really need to write a new filter specifically for that. It
>>>> seems like that would be good for diatheke, which defaults to plain
>>>> output anyway. It wouldn't hurt for that output to be formatted a bit
>>>> better.
>>>>
>>>> -Benjie
>>>>
>>>> _______________________________________________
>>>> sword-devel mailing list: sword-devel at crosswire.org
>>>> http://www.crosswire.org/mailman/listinfo/sword-devel
>>>> Instructions to unsubscribe/change your settings at above page
>>>>
>>>