[sword-devel] Rendering issues with Finnish Umlauts in FinPR

Troy A. Griffitts scribe at crosswire.org
Thu Apr 6 16:22:47 EDT 2023


Hi Tobias,

Great that the encoding issue is solved. If you really want to handle OSIS yourself (which is a lot of work), I would recommend requesting OSIS markup (sword::FMT_OSIS rather than sword::FMT_UNKNOWN). Remember, we support various markup formats; not everything is OSIS. If you ask for OSIS, you should generally get the same markup you received previously from newer modules, which are almost exclusively OSIS, so most markup will just be passed through. If a user installs a GBF, ThML, or other non-OSIS module, we will try our best to convert it to OSIS on the fly for you.

I wouldn't recommend this path long-term, though. Quite a bit of time goes into the various render filters to handle all the nuances of our supported markup formats. I would recommend eventually using one of our render filter sets (preferably WEBIF, XHTML, or HTMLHREF) and letting us give you nice HTML output with styles you can adjust as you see fit. This puts the burden of supporting new markup on the engine instead of your frontend. I don't know of any other frontend team that tries to handle OSIS themselves.

If you feel you might lose control over how the output looks, we have various means to inject modifications if you don't like what we give you. But honestly, if you have a good objection, we're likely to adjust the output for you. We try our best to emit just classed containers (divs, spans, etc.), and there is an API call that returns our default styles, which give sane styling for those classes. Typically, the strategy is to include these styles first in your output and then override anything you don't like afterward. This way, if something new is added in the engine, you will at least get some sane style applied for it. The call is:

SWModule::getRenderHeader()
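
A minimal sketch of that ordering, not engine code: buildPage is a hypothetical helper, and it assumes the string returned by getRenderHeader() is plain CSS that belongs inside a <style> element. The header is passed in as an ordinary string so the sketch compiles without the SWORD library.

```cpp
#include <string>

// Hypothetical helper (not part of SWORD): assembles an HTML page so that
// the engine's default styles come first and the frontend's overrides last.
// Assumption: engineHeader is plain CSS text, as one would obtain from
// SWModule::getRenderHeader() in a real frontend.
std::string buildPage(const std::string &engineHeader,
                      const std::string &frontendOverrides,
                      const std::string &renderedText) {
    std::string page;
    page += "<html><head>\n";
    // Engine defaults first: classes added by the engine later still get
    // some styling even if the frontend knows nothing about them.
    page += "<style>\n" + engineHeader + "\n</style>\n";
    // Frontend overrides second: emitted later, so equal selectors win.
    page += "<style>\n" + frontendOverrides + "\n</style>\n";
    page += "</head><body>\n" + renderedText + "\n</body></html>\n";
    return page;
}
```

In a frontend, the first argument would come from the module being rendered, something like module->getRenderHeader(), and the third from its rendered text. Because the override block is emitted after the defaults, a rule with the same selector in it wins the cascade, while classes you never mention keep the engine's styling.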

Hope this helps a bit. Thanks for letting us know your progress. I am excited about your app!

On April 6, 2023 12:36:41 PM MST, Tobias Klein <contact at tklein.info> wrote:
>Hi Troy,
>
>Thanks for your help! I have just tried using this parameter for the construction of SWMgr and it looks good.
>I had another look at the example for the German Rieger commentary where I observed the encoding issues before and they are gone now!
>
>Well, I guess I could have asked one or the other question before ... but as long as a bit of code reading and experimenting results in a solution, I'm usually fine.
>Thank you! I do appreciate all your efforts.
>
>I do have another question regarding the construction of MarkupFilterMgr.
>If I want to apply the encoding filter for UTF8, but do not need the markup filter manager for XHTML, is it OK to perform the construction like this?
>
>new MarkupFilterMgr(sword::FMT_UNKNOWN, sword::ENC_UTF8)
>
>I checked the code in markupfiltmgr.cpp and found that the implementation of MarkupFilterMgr::createFilters does not consider the case FMT_UNKNOWN, so in this case it would simply not add any specific markup filter, right? Since I haven't used any markup filters so far and my code already depends on the standard output generated by the SWORD engine I did not want to add a markup filter for the time being.
>
>Is there another way to apply the UTF8 encoding without using MarkupFilterMgr? (It just looks a bit weird when I look at the construction now)
>
>Best regards,
>Tobias
>
>PS:
>I think one thing we could do one of these days is check together with other frontend developers whether some helper functions created in various frontends could also be moved into the SWORD library.
>
>Consider some of the helper functions implemented here:
>https://github.com/ezra-bible-app/node-sword-interface/blob/master/src/sword_backend/module_helper.cpp
>https://github.com/ezra-bible-app/node-sword-interface/blob/master/src/sword_backend/repository_interface.cpp
>
>
>On 4/3/23 9:27 PM, Troy A. Griffitts wrote:
>> Hi Tobias,
>> 
>> Yes, our documentation certainly needs much improvement. I am surprised how far you've gone with so few questions. You have a great talent for figuring things out. I wouldn't worry about the guts of the details in the encoding filters. All you should need to do is specify your desired output on the SWMgr you use for rendering with something like:
>> 
>> SWMgr mgr(new MarkupFilterMgr(sword::FMT_XHTML, sword::ENC_UTF8));
>> 
>> Let me know if you'd like help,
>> 
>> Troy
>> 
>> 
>> On April 3, 2023 11:18:29 AM MST, Tobias Klein <contact at tklein.info> wrote:
>> 
>>     Thanks Troy!
>> 
>>     I'll have a look at the EncodingFilters.
>> 
>>     I think this is something not fully clear from the SWORD
>>     documentation/examples.
>> 
>>     Maybe these transformation points you had mentioned in the thread
>>     below should be described somewhere in the developer wiki?
>> 
>>     Best regards,
>>     Tobias
>> 
>>     On 4/3/23 6:45 PM, Troy A. Griffitts wrote:
>>>     Dear Tobias,
>>> 
>>>     Please be sure to note my comment to you below in this thread. It
>>>     is likely the cause of your rendering issues, while other apps
>>>     have no problems.
>>> 
>>>     In brief, it says that I haven't seen anywhere that you tell
>>>     SWORD what markup and encoding you want from the engine. If this
>>>     is the case you will get whatever the modules are encoded /
>>>     marked up as, which might be various things.
>>> 
>>>     Hope this helps,
>>> 
>>>     Troy
>>> 
>>>     On January 22, 2023 12:03:22 PM MST, "Troy A. Griffitts"
>>>     <scribe at crosswire.org> wrote:
>>> 
>>>         Hey guys,
>>> 
>>>         Sorry for not jumping in on this thread more quickly.
>>> 
>>>         Please remember, SWORD has 4 transformation points, each
>>>         moving from the module source (as described in the .conf
>>>         file) to the client's request:
>>> 
>>>         RenderFilters - markup, e.g., GBF, ThML, OSIS -> XHTML
>>> 
>>>         StripFilters - prep before searching
>>> 
>>>         OptionFilters - turning markup in the text stream on and off
>>>         based on user options, e.g., Strong's Numbers, Words of Christ
>>>         in Red, etc.
>>> 
>>>         EncodingFilters - e.g., 8859 -> UTF-8
>>> 
>>> 
>>>         Module team: be sure the module has the correct Encoding
>>>         value in the .conf file (or the default)
>>> 
>>>         Tobias, be sure you are creating your SWMgr with the correct
>>>         MarkupFilterMgr to do the transformation you desire, e.g., see:
>>> 
>>>         https://crosswire.org/svn/sword/trunk/examples/cmdline/outrender.cpp
>>> 
>>>         Hope this helps,
>>> 
>>>         Troy
>>> 
>>> 
>>>         On 1/22/23 10:39, Fr Cyrille wrote:
>>>>         Hi David,
>>>>         If you send me the file, I can quickly convert it to OSIS. I
>>>>         script it from IMP to USFM, and then use u2o.py.
>>>> 
>>>>         On 22/01/2023 at 16:54, David Haslam wrote:
>>>>>         Thanks Tobias,
>>>>> 
>>>>>         The problem is that CrossWire no longer accepts module
>>>>>         submissions that use IMP format for the build process.
>>>>> 
>>>>>         We’d need to have a script (or equivalent TextPipe filter)
>>>>>         to convert IMP to OSIS (whether directly or indirectly
>>>>>         through some other intermediate file format).
>>>>> 
>>>>>         I’m not currently in a practical position to work on that
>>>>>         kind of task.
>>>>>         Is anyone else up to it?
>>>>> 
>>>>>         Best regards,
>>>>> 
>>>>>         David
>>>>> 
>>>>>         Sent from Proton Mail for iOS
>>>>> 
>>>>> 
>>>>>         On Sun, Jan 22, 2023 at 15:39, Tobias Klein
>>>>>         <contact at tklein.info> wrote:
>>>>>> 
>>>>>>         The FinPR module that David sent me works fine without
>>>>>>         rendering issues! (see screenshot below)
>>>>>> 
>>>>>>         It would be good to upgrade the module in the repo
>>>>>>         accordingly.
>>>>>> 
>>>>>>         Best regards,
>>>>>>         Tobias
>>>>>> 
>>>>>>         On 1/22/23 8:31 AM, David Haslam wrote:
>>>>>>>         Thanks Kristóf.
>>>>>>> 
>>>>>>>         The rendering problem could have been fixed a decade ago!!!
>>>>>>> 
>>>>>>>         Checking through my email archives yesterday, I
>>>>>>>         discovered that I had rebuilt the FinPR module exactly 10
>>>>>>>         years ago! That rebuild used mod2imp and imp2vs and
>>>>>>>         included a fix to the text encoding (implemented on the
>>>>>>>         IMP text file). The message was sent to the modules
>>>>>>>         address on 2013-01-21 but was presumably never progressed
>>>>>>>         by Chris Little, who was then still supposed to be
>>>>>>>         responsible for module releases and updates. He went
>>>>>>>         permanently AWOL from CrossWire around that time.
>>>>>>> 
>>>>>>>         Back then we had not narrowed the policy for submitted
>>>>>>>         source text to be OSIS XML only.
>>>>>>> 
>>>>>>>         I wrote privately to Tobias last night, forwarding the
>>>>>>>         email of 10 years ago complete with both attachments. He
>>>>>>>         will examine those today.
>>>>>>> 
>>>>>>>         Aside: I also replaced <…> by {…} where these had wrapped
>>>>>>>         the ch:vs references that recorded av11n in the original
>>>>>>>         upstream source. In 2012, no suitable av11n was available
>>>>>>>         in SWORD, though we have had one more recently.
>>>>>>> 
>>>>>>>         mod2osis should not be used, as has already been noted.
>>>>>>>         A round trip with mod2osis and osis2mod is not lossless,
>>>>>>>         unlike one with mod2imp and imp2vs.
>>>>>>> 
>>>>>>> 
>>>>>>>         Best regards,
>>>>>>> 
>>>>>>>         David
>>>>>>> 
>>>>>>> 
>>>>>>>         On Sat, Jan 21, 2023 at 23:15, Kristof Szabo
>>>>>>>         <kristof.szabo at web.de> wrote:
>>>>>>>>         I managed to get Ezra running (it was some libicu70
>>>>>>>>         mess), and yes, the accented characters in this module
>>>>>>>>         are broken. As other modules' accented characters are OK,
>>>>>>>>         I assume it is not a font issue. I tried the conf file
>>>>>>>>         change, but it didn't work either.
>>>>>>>> 
>>>>>>>>         The mitigation was to rebuild the module. mod2osis
>>>>>>>>         leaves some garbage in the OSIS, but that would be easy
>>>>>>>>         to clean up; in any case, osis2mod still works with this
>>>>>>>>         garbage left in, and ta-da, we have proper accents.
>>>>>>>> 
>>>>>>>>         image.png
>>>>>>>> 
>>>>>>>>         As the module was last updated only 3.5 years ago, I assume
>>>>>>>>         the maintainer is still active, i.e., they can be reached.
>>>>>>>> 
>>>>>>>>         Or I can have a look too, the challenge is, that such a
>>>>>>>>         module rebuild can open pandora's box, if I run some
>>>>>>>>         tests (https://github.com/krisek/sword-test) or David
>>>>>>>>         checks them, then for sure there will be some issues.
>>>>>>>>         I'm happy to fix some of them, but I definitely do not
>>>>>>>>         speak Finnish, so I'm not sure this would be a
>>>>>>>>         responsible action. If Dom gives me the go I can fix
>>>>>>>>         syntax & submit, but I don't want to end up in the
>>>>>>>>         rabbit hole :) Best would be to reach out to the
>>>>>>>>         original maintainer.
>>>>>>>> 
>>>>>>>>         Kind regards,
>>>>>>>>         k-
>>>>>>>> 
>>>>>>>>         On Sat, Jan 21, 2023 at 8:26 PM Greg Hellings
>>>>>>>>         <greg.hellings at gmail.com> wrote:
>>>>>>>> 
>>>>>>>>             Is Ezra properly setting encoding on the content it
>>>>>>>>             renders? Is it maybe setting a font that doesn't
>>>>>>>>             have the proper code points?
>>>>>>>> 
>>>>>>>>             --Greg
>>>>>>>> 
>>>>>>>>             On Sat, Jan 21, 2023, 13:12 Tobias Klein
>>>>>>>>             <contact at tklein.info> wrote:
>>>>>>>> 
>>>>>>>>                 Hi Kristof, David,
>>>>>>>> 
>>>>>>>>                 Adding Encoding=UTF-8 to the module conf file
>>>>>>>>                 ~/.sword/mods.d/finpr.conf does not solve my issue.
>>>>>>>> 
>>>>>>>>                 The text still looks the same as before ...
>>>>>>>> 
>>>>>>>>                 What else could I do to further debug this?
>>>>>>>> 
>>>>>>>>                 Best regards,
>>>>>>>>                 Tobias
>>>>>>>> 
>>>>>>>>                 On 1/21/23 5:18 PM, Kristof Szabo wrote:
>>>>>>>>>                 Hi Tobias,
>>>>>>>>> 
>>>>>>>>>                 I suppose the problem is that finpr.conf
>>>>>>>>>                 contains no encoding information (check the
>>>>>>>>>                 Hun* modules for reference), and if there is
>>>>>>>>>                 nothing specified Latin-1 is the default.
>>>>>>>>>                 mod2osis (shouldn't be used !! :)) shows that
>>>>>>>>>                 the module is in UTF-8, so there is a misalignment.
>>>>>>>>> 
>>>>>>>>>                 https://wiki.crosswire.org/DevTools:conf_Files#:~:text=Plaintext-,Encoding,-UTF%2D8%0AUTF
>>>>>>>>> 
>>>>>>>>>                 Kind regards,
>>>>>>>>>                 Kristof
>>>>>>>>> 
>>>>>>>>>                 On Sat, Jan 21, 2023 at 4:49 PM David Haslam
>>>>>>>>>                 <dfhdfh at protonmail.com> wrote:
>>>>>>>>> 
>>>>>>>>>                     Hi Tobias,
>>>>>>>>> 
>>>>>>>>>                     What about other Finnish modules?
>>>>>>>>>                     eg. FinPR92, FinRK, FinSTLK2017
>>>>>>>>> 
>>>>>>>>>                     Presumably you already tested (eg) German
>>>>>>>>>                     modules and found that umlauts and eszett
>>>>>>>>>                     are both rendered aright?
>>>>>>>>> 
>>>>>>>>>                     Btw. FinPR renders aright in PocketSword
>>>>>>>>>                     (iOS/iPadOS).
>>>>>>>>> 
>>>>>>>>>                     David
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>                     On Sat, Jan 21, 2023 at 15:25, Tobias Klein
>>>>>>>>>                     <contact at tklein.info> wrote:
>>>>>>>>>> 
>>>>>>>>>>                     Hi,
>>>>>>>>>> 
>>>>>>>>>>                     When retrieving the text of the FinPR
>>>>>>>>>>                     module I am getting some rendering issues
>>>>>>>>>>                     with the Finnish Umlauts. This is based on
>>>>>>>>>>                     a user's problem report.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>                     Romans 5:8 returns like this in
>>>>>>>>>>                     node-sword-interface / Ezra:
>>>>>>>>>> 
>>>>>>>>>>                     Mutta Jumala osoittaa rakkautensa meit�
>>>>>>>>>>                     kohtaan siin�, ett� Kristus, kun me viel�
>>>>>>>>>>                     olimme syntisi�, kuoli meid�n edest�mme.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>                     While it should look like this (rendered
>>>>>>>>>>                     text copied from Xiphos):
>>>>>>>>>> 
>>>>>>>>>>                     Mutta Jumala osoittaa rakkautensa meitä
>>>>>>>>>>                     kohtaan siinä, että Kristus, kun me vielä
>>>>>>>>>>                     olimme syntisiä, kuoli meidän edestämme.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>                     This occurs both on Linux and macOS (have
>>>>>>>>>>                     not tested on Windows yet).
>>>>>>>>>> 
>>>>>>>>>>                     Any pointers what could be the root cause?
>>>>>>>>>>                     I generally have not observed rendering
>>>>>>>>>>                     issues with other modules.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>                     Best regards,
>>>>>>>>>>                     Tobias
>>>>>>>>>> 
>>>>>>>>>                     _______________________________________________
>>>>>>>>>                     sword-devel mailing list:
>>>>>>>>>                     sword-devel at crosswire.org
>>>>>>>>>                     http://crosswire.org/mailman/listinfo/sword-devel
>>>>>>>>>                     Instructions to unsubscribe/change your
>>>>>>>>>                     settings at above page
>>>>>>>>> 
>>>     --     Sent from my Android device with K-9 Mail. Please excuse my brevity.
