[sword-devel] French Darby translation is OSIS (beta version)
DM Smith
dmsmith555 at yahoo.com
Sat May 12 19:02:55 MST 2007
On May 12, 2007, at 6:05 PM, Chris Little wrote:
>
>
> DM Smith wrote:
>>
>> On May 12, 2007, at 4:15 PM, Chris Little wrote:
>>
>>>
>>>> -Line-feed and tabulations are not considered as space: if you look at
>>>> Genesis 1:2, it should be "Et l'Esprit de Dieu" and it is displayed as
>>>> "Etl'Esprit de Dieu" (a space is missing).
>>>>
>>>
>>> This looks like a problem with osis2mod, but the OSIS file itself could
>>> use some whitespace cleanup. There is a lot of stray whitespace, for
>>> example at ends of lines, before </p>. The problem in Genesis 1:2 could
>>> be handled by changing the linefeed + tab to a single space.
>>>
>>
>> I think this is rather a "feature". osis2mod is trimming "extraneous"
>> whitespace. I think this was to handle input that is pretty. I'm in
>> favor of retaining all whitespace. My opinion is that an OSIS document
>> should be what is actually wanted. I've got some changes I need to make
>> because of the NASB (osis2mod is not handling stuff between verses
>> well). I can change this too if it is what people want.
>
> It should trim whitespace in favor of smaller, simpler files. But here,
> it sounds like \n and \t are being deleted rather than something like
> s/[\s]+/ /.
>
> I'm surprised we're doing this, but I'm just judging by the reported
> symptoms, rather than looking at the osis2mod code itself.

And I was going by memory, so shame on me. I just went and looked at
the code.

osis2mod does not get rid of any "extraneous" whitespace itself, but it
calls FileMgr::getLine, which trims whitespace from the beginning and
the end of each line. I also think there is a bug in its handling of
line endings: in some places it checks only for 13 (CR), in others
only for 10 (LF), and in yet others for both.
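
To make the symptom concrete, here is a minimal sketch of how per-line trimming can swallow the space in Genesis 1:2, next to the s/\s+/ / style of cleanup suggested in the quoted text. It is not the SWORD code: trimLine, joinTrimmed and collapseWhitespace are hypothetical names, and the sketch assumes the trimmed lines end up concatenated with nothing between them, which is only inferred from the reported output.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical stand-in for the trimming described above: drop
// whitespace from both ends of a single line.
std::string trimLine(const std::string &line) {
    const char *ws = " \t\r\n";
    std::string::size_type first = line.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    std::string::size_type last = line.find_last_not_of(ws);
    return line.substr(first, last - first + 1);
}

// Assumption: trimmed lines are joined with nothing in between, so the
// word break that the linefeed + tab stood for disappears.
std::string joinTrimmed(const std::vector<std::string> &lines) {
    std::string out;
    for (const std::string &l : lines) out += trimLine(l);
    return out;
}

// Alternative: replace every run of whitespace with one space, which
// keeps word boundaries while still shrinking the text.
std::string collapseWhitespace(const std::string &text) {
    std::string out;
    bool inRun = false;
    for (unsigned char c : text) {
        if (std::isspace(c)) {
            if (!inRun) out += ' ';
            inRun = true;
        } else {
            out += static_cast<char>(c);
            inRun = false;
        }
    }
    return out;
}
```

With the two lines "Et" and "\tl'Esprit de Dieu", joinTrimmed gives "Etl'Esprit de Dieu", while collapseWhitespace on "Et\n\tl'Esprit de Dieu" gives "Et l'Esprit de Dieu".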

From what I can determine, FileMgr::getLine is called by swconfig,
osis2mod and imp2gbs.

I think this should be replaced with a call to std::getline, which is
already used by imp2ld, imp2vs, and xml2gbs.

(For completeness, it should be noted that vpl2mod defines its own
readline, which reads one character at a time into a buffer.)