[sword-devel] diatheke plain output - line breaks missing?

Troy A. Griffitts scribe at crosswire.org
Tue Jan 23 18:04:01 MST 2007


DM,
	Again, to summarize the previous thread:  We are not suggesting 
re-encoding the text.  We are suggesting the use of ASCII markup 
characters that an end user can read, as you have also previously suggested:

[supplied word], LORD, {note:}, etc.

	Since STRIP filters (*plain) are for searching and CANNOT have these, 
because they would cause certain search hits to fail, it has been proposed 
to add a new RENDER filter set which is geared for plain text DISPLAY.  I 
proposed the name *console (*ascii seemed to me likely to suggest that 
the filters somehow ENCODE text into ASCII, which it seems may be how 
you read it as well).
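
	To make the strip/render distinction concrete, here is a minimal 
sketch.  It does not use the SWORD API: the function names, and the use 
of "<lb/>" as a stand-in line-break marker, are assumptions made only for 
illustration.  The strip-style pass turns the marker into a space so a 
phrase can match across it; the console-style pass turns it into a real 
newline for display.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Replace every occurrence of `tag` in `text` with `replacement`.
// `tag` must be non-empty.
std::string replaceAll(std::string text, const std::string &tag,
                       const std::string &replacement) {
    for (std::size_t pos = text.find(tag); pos != std::string::npos;
         pos = text.find(tag, pos + replacement.size())) {
        text.replace(pos, tag.size(), replacement);
    }
    return text;
}

// Hypothetical strip-style pass: line breaks become single spaces,
// so the result is one flat, searchable line.
std::string stripForSearch(const std::string &markup) {
    return replaceAll(markup, "<lb/>", " ");
}

// Hypothetical console-style pass: line breaks become real newlines,
// which reads better on a terminal but splits phrases.
std::string renderForConsole(const std::string &markup) {
    return replaceAll(markup, "<lb/>", "\n");
}

// Lower-case copy of `s` (ASCII only), used for case-insensitive search.
std::string toLower(std::string s) {
    for (char &c : s) {
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    return s;
}

// Case-insensitive substring test, in the spirit of stristr().
bool phraseHits(const std::string &buffer, const std::string &phrase) {
    return toLower(buffer).find(toLower(phrase)) != std::string::npos;
}
```

	With the input "planted by streams of water<lb/>that yields its fruit", 
phraseHits() finds "streams of water that yield" in the stripForSearch() 
output but not in the renderForConsole() output, which is why the two 
jobs want separate filter sets.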

	I hope this is clearer, though after re-reading, I'm not sure it is.

	-Troy.


DM Smith wrote:
> There has been much talk about an "ASCII" filter in this thread. I  
> think that might run into problems.
> 
> The ESV and KJV modules are encoded in UTF-8 and have some UTF-8  
> characters. Additionally, Sword allows for numeric entities for  
> unicode characters in OSIS. These would need to be converted into  
> UTF-8 characters.
> 
> Also, Sword allows for any module that is not UTF-8 to be encoded in  
> CP1252 (Microsoft's variant of Latin-1) with non-ASCII characters.
> 
> 
> On Jan 23, 2007, at 1:54 AM, Greg Hellings wrote:
> 
>> Troy,
>>
>> Yes, the previous email answered the question I had about how to add a
>> new filter.  It's as straightforward and easy as I suspected. :)  And
>> your naming scheme of *plain and *console seems pretty good, but maybe
>> just *plain would be the ones for searching, as they are now, with NO
>> formatting, including ASCII newlines etc.  And we could just do
>> something like osisascii.cpp for ones that still maintain the
>> ASCII-supported formatting?
>>
>> I trust that you'll change the '\n' to ' ' in my patch if you decide
>> to use it.  ;)  It would make sense to at least preserve that part of
>> the white space otherwise searches won't work correctly as those two
>> words would be interpreted as one word.
>>
>> Are the new *ascii filters something you would be interested in seeing?
>> With the current set of *plain filters to work from, I'm sure I could
>> make excellent progress on any that would be needed.  Indeed, would
>> anything other than OSIS be needed?  I don't think any of the other
>> markup formats that I've seen are complex enough to require a
>> different class between *plain and *ascii uses.  In instances like
>> that perhaps there could be an additional entry to FMT_* that is
>> caught by the same case as the plain filter.  Perhaps one for
>> ThML2ASCII would also be good to take out the <br> and <scripRef>
>> elements?  Would any others need to be created that would be different
>> between plain and ascii formatting?
>>
>> --Greg
>>
>> On 1/22/07, Troy A. Griffitts <scribe at crosswire.org> wrote:
>>> Greg,
>>>         We just missed each other :)  Hope my previous email answers
>>> your questions.  Yes, adding a newline in your patch would break the
>>> primary purpose of the *plain filters.  The issue is that we are
>>> using them secondarily as output filters.  You have noticed this and
>>> rightly suggested that we should have 2 different filter sets.
>>>
>>>         -Troy.
>>>
>>>
>>>
>>> Greg Hellings wrote:
>>>> Indeed.  Troy, in that case is my insertion of the new-line character
>>>> going to break the searches?  If one searches for a string that spans
>>>> a new-line character in the filter, will the search pick up the
>>>> white-space and be intelligent about searching for the newline
>>>> character also?  And what about the fact that DM says that <q> is not
>>>> translated to "?
>>>>
>>>> And if I wanted to take the current osisplain.cpp/.h and translate
>>>> them into an output filter that would be more suitable for something
>>>> like diatheke, what types of changes should be made to make that
>>>> visible to the SWMgr?  As was pointed out, if the main purpose of the
>>>> current *plain.cpp files is to prepare the output for searching and
>>>> not for display, perhaps there should be *plain_search and
>>>> *plain_display variants or some other such naming scheme?
>>>>
>>>> --Greg
>>>>
>>>> On 1/22/07, benjie <cricketc at gmail.com> wrote:
>>>>> On Mon, Jan 22, 2007 at 09:09:14PM -0700, Troy A. Griffitts wrote:
>>>>>> Well, kind of.  It's a matter of purpose.  The purpose for a strip
>>>>>> filter is to prepare the buffer for a search, e.g.
>>>>>> stristr(StripText(), istr)
>>>>>>
>>>>>> For example, if one searches for a phrase,
>>>>>> "streams of water that yield"
>>>>>>
>>>>>> It should hit on Psalm 1:3
>>>>>>
>>>>>> He is like a tree
>>>>>> planted by streams of water
>>>>>> that yields its fruit in its season,
>>>>>> and its leaf does not wither.
>>>>>> In all that he does, he prospers.
>>>>>>
>>>>>> So, in conclusion, filters have different purposes.
>>>>>> From: http://crosswire.org/svn/sword/trunk/include/swmodule.h
>>>>>>
>>>>>>   virtual SWModule &AddRenderFilter(SWFilter *newfilter);
>>>>>>   virtual SWModule &AddEncodingFilter(SWFilter *newfilter);
>>>>>>   virtual SWModule &AddStripFilter(SWFilter *newfilter);
>>>>>>   virtual SWModule &AddRawFilter(SWFilter *newfilter);
>>>>>>   virtual SWModule &AddOptionFilter(SWOptionFilter *newfilter);
>>>>> So if we are interested in working with a plain text (ASCII)
>>>>> rendering filter, we really need to write a new filter specifically
>>>>> for that. It seems like that would be good for diatheke, which
>>>>> defaults to plain output anyway. It wouldn't hurt for that output to
>>>>> be formatted a bit better.
>>>>>
>>>>> -Benjie
>>>>>
>>>>> _______________________________________________
>>>>> sword-devel mailing list: sword-devel at crosswire.org
>>>>> http://www.crosswire.org/mailman/listinfo/sword-devel
>>>>> Instructions to unsubscribe/change your settings at above page