[sword-devel] Rendering issues with Finnish Umlauts in FinPR

Troy A. Griffitts scribe at crosswire.org
Sun Jan 22 14:03:22 EST 2023


Hey guys,

Sorry for not jumping in on this thread more quickly.

Please remember, SWORD has 4 transformation points, each moving from the 
module source (as described in the .conf file) to the client's request:

RenderFilters - markup, e.g., GBF, ThML, OSIS -> XHTML

StripFilters - prep before searching

OptionFilters - turning markup in the text stream on and off based on 
user options, e.g., Strong's Numbers, Words of Christ in Red, etc.

EncodingFilters - e.g., 8859 -> UTF-8
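
To make the encoding step concrete, here is a minimal sketch in Python of what an encoding transformation does conceptually, and of two ways it can go wrong. It is not SWORD's implementation; the function name to_utf8 is hypothetical, and the sample word is taken from the Romans 5:8 text quoted further down. The thread does not establish which byte-level mismatch actually produced the broken characters reported for FinPR.

```python
def to_utf8(raw: bytes, declared_encoding: str) -> bytes:
    """Decode raw bytes using the declared encoding and re-encode as UTF-8.

    Hypothetical helper for illustration only; not part of SWORD.
    """
    return raw.decode(declared_encoding).encode("utf-8")


word = "meit\u00e4"                    # "meitä"
latin1_bytes = word.encode("latin-1")  # b'meit\xe4'
utf8_bytes = word.encode("utf-8")      # b'meit\xc3\xa4'

# Correct declaration: Latin-1 source is converted cleanly to UTF-8.
assert to_utf8(latin1_bytes, "latin-1") == utf8_bytes

# Wrong declaration: UTF-8 source treated as Latin-1 gets double-encoded.
double_encoded = to_utf8(utf8_bytes, "latin-1").decode("utf-8")
print(ascii(double_encoded))  # 'meit\xc3\xa4', displayed as "meitÃ¤"

# Opposite mismatch: Latin-1 bytes read as UTF-8 by a client yield U+FFFD.
replaced = latin1_bytes.decode("utf-8", errors="replace")
print(ascii(replaced))        # 'meit\ufffd'
```

Either failure goes away once the declared encoding matches the bytes actually stored in the module.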


Module team: be sure the module has the correct Encoding value in the 
.conf file (or the default)
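
As a rough way to check this, the following Python sketch reads the Encoding value from conf text and tests whether a sample of module text is valid UTF-8. The helper names and the DataPath value are hypothetical, the Latin-1 fallback follows what Kristof describes below, and the UTF-8 validity test is only a heuristic for spotting a mismatch.

```python
def declared_encoding(conf_text: str, default: str = "latin-1") -> str:
    """Return the Encoding= value from conf text, or the default if absent."""
    for line in conf_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == "encoding":
            return value.strip()
    return default


def is_valid_utf8(raw: bytes) -> bool:
    """Heuristic: True if the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical conf contents, for illustration only.
conf_without = "[FinPR]\nDataPath=./modules/texts/ztext/finpr/\n"
conf_with = conf_without + "Encoding=UTF-8\n"

utf8_sample = "meit\u00e4".encode("utf-8")
latin1_sample = "meit\u00e4".encode("latin-1")

print(declared_encoding(conf_without))  # latin-1
print(declared_encoding(conf_with))     # UTF-8
print(is_valid_utf8(utf8_sample))       # True
print(is_valid_utf8(latin1_sample))     # False
```

If the conf declares (or defaults to) one encoding while the text validates as another, the module and its conf are out of step.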

Tobias, be sure you are creating your SWMgr with the correct 
MarkupFilterMgr to do the transformation you desire, e.g., see:

https://crosswire.org/svn/sword/trunk/examples/cmdline/outrender.cpp

Hope this helps,

Troy


On 1/22/23 10:39, Fr Cyrille wrote:
> Hi David,
> If you send me the file, I can quickly convert it to OSIS. I script the 
> conversion from IMP to USFM and then to OSIS with u2o.py.
>
> On 22/01/2023 at 16:54, David Haslam wrote:
>> Thanks Tobias,
>>
>> The problem is that CrossWire no longer accepts module submissions 
>> that use IMP format for the build process.
>>
>> We’d need to have a script (or equivalent TextPipe filter) to convert 
>> IMP to OSIS (whether directly or indirectly through some 
>> other intermediate file format).
>>
>> I’m not currently in a practical position to work on that kind of task.
>> Is anyone else up to it?
>>
>> Best regards,
>>
>> David
>>
>> Sent from Proton Mail for iOS
>>
>>
>> On Sun, Jan 22, 2023 at 15:39, Tobias Klein <contact at tklein.info> wrote:
>>>
>>> The FinPR module that David sent me works fine without rendering 
>>> issues! (see screenshot below)
>>>
>>> It would be good to upgrade the module in the repo accordingly.
>>>
>>> Best regards,
>>> Tobias
>>>
>>> On 1/22/23 8:31 AM, David Haslam wrote:
>>>> Thanks Kristóf.
>>>>
>>>> The rendering problem could have been fixed a decade ago!!!
>>>>
>>>> Checking through my email archives yesterday, I discovered that I 
>>>> had rebuilt the FinPR module exactly 10 years ago! That rebuild 
>>>> used mod2imp and imp2vs and included a fix to the text encoding 
>>>> (implemented on the IMP text file). The message was sent to the 
>>>> modules address on 2013-01-21 but was presumably never progressed 
>>>> by Chris Little, who was then still supposed to be responsible for 
>>>> module releases and updates. He went permanently AWOL from 
>>>> CrossWire around that time.
>>>>
>>>> Back then we had not narrowed the policy for submitted source text 
>>>> to be OSIS XML only.
>>>>
>>>> I wrote privately to Tobias last night, forwarding the email of 10 
>>>> years ago complete with both attachments. He will examine those today.
>>>>
>>>> Aside: I also replaced <…> by {…} where these had wrapped the ch:vs 
>>>> references that recorded av11n in the original upstream source. In 
>>>> 2012, no suitable av11n was available in SWORD, but we have had 
>>>> one more recently.
>>>>
>>>> mod2osis should not be used, as has already been noted.
>>>> A round trip with mod2osis and osis2mod is not lossless, unlike one 
>>>> with mod2imp and imp2vs.
>>>>
>>>>
>>>> Best regards,
>>>>
>>>> David
>>>>
>>>>
>>>> On Sat, Jan 21, 2023 at 23:15, Kristof Szabo <kristof.szabo at web.de> 
>>>> wrote:
>>>>> I managed to get Ezra running (it was some libicu70 mess), and 
>>>>> yes, the accented characters in this module are broken (other 
>>>>> modules' accented characters are OK, so I assume it is not a font 
>>>>> issue). I tried the conf file change, but it didn't work either.
>>>>>
>>>>> The mitigation was to rebuild the module. mod2osis leaves some 
>>>>> garbage in the OSIS, which would be easy to clean up; in any case, 
>>>>> osis2mod works even with this garbage left in, and ta-da, we have 
>>>>> proper accents.
>>>>>
>>>>> image.png
>>>>>
>>>>> As the module was last updated only 3.5 years ago, I assume the 
>>>>> maintainer is still active, i.e., they can be reached.
>>>>>
>>>>> Or I can have a look too. The challenge is that such a module 
>>>>> rebuild can open Pandora's box: if I run some tests 
>>>>> (https://github.com/krisek/sword-test) or David checks them, then 
>>>>> for sure there will be some issues. I'm happy to fix some of them, 
>>>>> but I definitely do not speak Finnish, so I'm not sure this would 
>>>>> be a responsible action. If Dom gives me the go-ahead, I can fix 
>>>>> syntax & submit, but I don't want to end up in the rabbit hole :) 
>>>>> Best would be to reach out to the original maintainer.
>>>>>
>>>>> Kind regards,
>>>>> k-
>>>>>
>>>>> On Sat, Jan 21, 2023 at 8:26 PM Greg Hellings 
>>>>> <greg.hellings at gmail.com> wrote:
>>>>>
>>>>>     Is Ezra properly setting encoding on the content it renders?
>>>>>     Is it maybe setting a font that doesn't have the proper code
>>>>>     points?
>>>>>
>>>>>     --Greg
>>>>>
>>>>>     On Sat, Jan 21, 2023, 13:12 Tobias Klein <contact at tklein.info>
>>>>>     wrote:
>>>>>
>>>>>         Hi Kristof, David,
>>>>>
>>>>>         Adding Encoding=UTF-8 to the module conf file
>>>>>         ~/.sword/mods.d/finpr.conf does not solve my issue.
>>>>>
>>>>>         The text still looks the same as before ...
>>>>>
>>>>>         What else could I do to further debug this?
>>>>>
>>>>>         Best regards,
>>>>>         Tobias
>>>>>
>>>>>         On 1/21/23 5:18 PM, Kristof Szabo wrote:
>>>>>>         Hi Thomas,
>>>>>>
>>>>>>         I suppose the problem is that finpr.conf contains no
>>>>>>         encoding information (check the Hun* modules for
>>>>>>         reference), and if nothing is specified, Latin-1 is
>>>>>>         the default. mod2osis (shouldn't be used !! :)) shows
>>>>>>         that the module is in UTF-8, so there is a mismatch.
>>>>>>
>>>>>>         https://wiki.crosswire.org/DevTools:conf_Files#:~:text=Plaintext-,Encoding,-UTF%2D8%0AUTF
>>>>>>
>>>>>>         Kind regards,
>>>>>>         Kristof
>>>>>>
>>>>>>         On Sat, Jan 21, 2023 at 4:49 PM David Haslam
>>>>>>         <dfhdfh at protonmail.com> wrote:
>>>>>>
>>>>>>             Hi Thomas,
>>>>>>
>>>>>>             What about other Finnish modules?
>>>>>>             eg. FinPR92, FinRK, FinSTLK2017
>>>>>>
>>>>>>             Presumably you already tested (eg) German modules and
>>>>>>             found that umlauts and eszett are both rendered aright?
>>>>>>
>>>>>>             Btw. FinPR renders aright in PocketSword (iOS/iPadOS).
>>>>>>
>>>>>>             David
>>>>>>
>>>>>>
>>>>>>             On Sat, Jan 21, 2023 at 15:25, Tobias Klein
>>>>>>             <contact at tklein.info> wrote:
>>>>>>>
>>>>>>>             Hi,
>>>>>>>
>>>>>>>             When retrieving the text of the FinPR module I am
>>>>>>>             getting some rendering issues with the Finnish
>>>>>>>             Umlauts. This is based on a user's problem report.
>>>>>>>
>>>>>>>
>>>>>>>             Romans 5:8 returns like this in node-sword-interface
>>>>>>>             / Ezra:
>>>>>>>
>>>>>>>             Mutta Jumala osoittaa rakkautensa meit� kohtaan
>>>>>>>             siin�, ett� Kristus, kun me viel� olimme syntisi�,
>>>>>>>             kuoli meid�n edest�mme.
>>>>>>>
>>>>>>>
>>>>>>>             While it should look like this (rendered text copied
>>>>>>>             from Xiphos):
>>>>>>>
>>>>>>>             Mutta Jumala osoittaa rakkautensa meitä kohtaan
>>>>>>>             siinä, että Kristus, kun me vielä olimme syntisiä,
>>>>>>>             kuoli meidän edestämme.
>>>>>>>
>>>>>>>
>>>>>>>             This occurs both on Linux and macOS (have not tested
>>>>>>>             on Windows yet).
>>>>>>>
>>>>>>>             Any pointers as to what could be the root cause? I
>>>>>>>             have generally not observed rendering issues with
>>>>>>>             other modules.
>>>>>>>
>>>>>>>
>>>>>>>             Best regards,
>>>>>>>             Tobias
>>>>>>>
>>>>>>             _______________________________________________
>>>>>>             sword-devel mailing list: sword-devel at crosswire.org
>>>>>>             http://crosswire.org/mailman/listinfo/sword-devel
>>>>>>             Instructions to unsubscribe/change your settings at
>>>>>>             above page
-------------- next part --------------
A non-text attachment was scrubbed...
Name: inccpcpbnhmlapdi.png
Type: image/png
Size: 53294 bytes
Desc: not available
URL: <http://crosswire.org/pipermail/sword-devel/attachments/20230122/5e20f7d3/attachment-0002.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 191581 bytes
Desc: not available
URL: <http://crosswire.org/pipermail/sword-devel/attachments/20230122/5e20f7d3/attachment-0003.png>

