[sword-devel] Rendering issues with Finnish Umlauts in FinPR

Tobias Klein contact at tklein.info
Fri Apr 7 04:29:24 EDT 2023


Thank you, Troy! With these explanations I understand the big picture of 
markup transformation and filters a bit better!

For now, I'll go forward with the FMT_OSIS option for the 
MarkupFilterMgr in node-sword-interface. I understand from your 
explanations that this produces results similar to those I got when 
initializing SWMgr without any markup filters.

When I started out using SWORD in node-sword-interface I did not set any 
filters (as a newbie, I did not really know about them). Since 
then I have somehow handled the OSIS markup myself and I am not unhappy 
with the formatting/styling in the frontend at this point. It could 
probably be better and I have certainly not explored all variants yet ... 
so I will keep the filter possibilities in mind for future adaptations!

Have a blessed Easter celebration :)

Tobias

On 4/6/23 10:22 PM, Troy A. Griffitts wrote:
> Hi Tobias,
>
> Great that the encoding issue is solved. If you really want to handle 
> OSIS yourself (which is a lot of work) I would recommend requesting 
> OSIS markup. Remember, we support various markup formats. Not 
> everything is OSIS. If you ask for OSIS then you should get generally 
> the same markup you received previously, from newer modules which are 
> almost exclusively OSIS-- most markup will just be passthru. If a user 
> installs a GBF or ThML or some other markup module, we will try our 
> best to convert to OSIS on the fly for you.
>
> Though I wouldn't recommend this path long-term. Quite a bit of time 
> goes into the various render filters to handle all the nuances of our 
> supported markup formats. I would recommend using one of our render 
> filter sets eventually (preferably WEBIF, XHTML, or HTMLHREF) and let 
> us give you nice HTML output with styles you can adjust as you see 
> fit. This will focus that burden of support for new markup on the 
> engine instead of your frontend.  I don't know of any other frontend 
> team that tries to handle OSIS themselves.
>
> If you feel you might lose control of how you want output, we have 
> various means to inject modifications, if you don't like what we give 
> you, but honestly, if you have a good objection, we're likely to 
> adjust the output for you. We try our best to just give classed 
> containers: divs and spans, etc., and there is an API call to give you 
> our default styles which give a sane styling for classes. Typically, 
> the strategy is to include these styles first in your output, and then 
> override anything you don't like afterward. This way, if something new 
> is added in the engine, at least you will get some sane style applied 
> for it.  The call is:
>
> SWModule::getRenderHeader()
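
A minimal sketch of the ordering strategy described above (defaults first, overrides afterward), assuming the text returned by getRenderHeader() is plain CSS. The buildStyleBlock helper and its parameter names are hypothetical, and the sketch does not call SWORD itself; the first argument only stands in for whatever the engine returns.

```cpp
#include <string>

// Hypothetical helper: wrap two CSS fragments in a single <style> block.
// engineDefaults stands in for the text returned by getRenderHeader();
// frontendOverrides is the application's own CSS.
// The defaults are emitted first so that later rules with the same
// selector override them by CSS source order.
std::string buildStyleBlock(const std::string &engineDefaults,
                            const std::string &frontendOverrides) {
    std::string out = "<style>\n";
    out += engineDefaults;
    out += "\n";
    out += frontendOverrides;
    out += "\n</style>\n";
    return out;
}
```

With this ordering, a class the frontend never mentions still picks up the engine's default rule, while any class the frontend restyles uses the frontend's rule.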
>
> Hope this helps a bit. Thanks for letting us know your progress. I am 
> excited about your app!
>
> On April 6, 2023 12:36:41 PM MST, Tobias Klein <contact at tklein.info> 
> wrote:
>
>     Hi Troy,
>
>     Thanks for your help! I have just tried using this parameter for
>     the construction of SWMgr and it looks good.
>     I had another look at the example for the German Rieger commentary
>     where I observed the encoding issues before and they are gone now!
>
>     Well, I guess I could have asked one or the other question before
>     ... but as long as a bit of code reading and experimenting results
>     in a solution, I'm usually fine.
>     Thank you! I do appreciate all your efforts.
>
>     I do have another question regarding the construction of
>     MarkupFilterMgr.
>     If I want to apply the encoding filter for UTF8, but do not need
>     the markup filter manager for XHTML is it ok to perform the
>     construction like this?
>
>     new MarkupFilterMgr(sword::FMT_UNKNOWN, sword::ENC_UTF8)
>
>     I checked the code in markupfiltmgr.cpp and found that the
>     implementation of MarkupFilterMgr::createFilters does not consider
>     the case FMT_UNKNOWN, so in this case it would simply not add any
>     specific markup filter, right? Since I haven't used any markup
>     filters so far and my code already depends on the standard output
>     generated by the SWORD engine I did not want to add a markup
>     filter for the time being.
>
>     Is there another way to apply the UTF8 encoding without using
>     MarkupFilterMgr? (It just looks a bit weird when I look at the
>     construction now)
>
>     Best regards,
>     Tobias
>
>     PS:
>     I think one thing we could do one of these days is check together
>     with other frontend developers whether some helper functions
>     created in various frontends could also be moved into the SWORD
>     library.
>
>     Consider some of the helper functions implemented here:
>     https://github.com/ezra-bible-app/node-sword-interface/blob/master/src/sword_backend/module_helper.cpp
>     https://github.com/ezra-bible-app/node-sword-interface/blob/master/src/sword_backend/repository_interface.cpp
>
>
>     On 4/3/23 9:27 PM, Troy A. Griffitts wrote:
>>     Hi Tobias,
>>
>>     Yes, our documentation certainly needs much improvement. I am
>>     surprised how far you've gone with so few questions. You have a
>>     great talent for figuring things out. I wouldn't worry about the
>>     guts of the details in the encoding filters. All you should need
>>     to do is specify your desired output on the SWMgr you use for
>>     rendering with something like:
>>
>>     SWMgr mgr(new MarkupFilterMgr(sword::FMT_XHTML, sword::ENC_UTF8));
>>
>>     Let me know if you'd like help,
>>
>>     Troy
>>
>>
>>     On April 3, 2023 11:18:29 AM MST, Tobias Klein
>>     <contact at tklein.info> wrote:
>>
>>         Thanks Troy!
>>
>>         I'll have a look at the EncodingFilters.
>>
>>         I think this is something not fully clear from the SWORD
>>         documentation/examples.
>>
>>         Maybe these transformation points you had mentioned in the
>>         thread below should be described somewhere in the developer wiki?
>>
>>         Best regards,
>>         Tobias
>>
>>         On 4/3/23 6:45 PM, Troy A. Griffitts wrote:
>>>         Dear Tobias,
>>>
>>>         Please be sure to note my comment to you below in this
>>>         thread. It is likely the cause of your rendering issues,
>>>         while other apps have no problems.
>>>
>>>         In brief, it says that I haven't seen anywhere that you tell
>>>         SWORD what markup and encoding you want from the engine. If
>>>         this is the case you will get whatever the modules are
>>>         encoded / marked up as, which might be various things.
>>>
>>>         Hope this helps,
>>>
>>>         Troy
>>>
>>>         On January 22, 2023 12:03:22 PM MST, "Troy A. Griffitts"
>>>         <scribe at crosswire.org> wrote:
>>>
>>>             Hey guys,
>>>
>>>             Sorry for not jumping in on this thread more quickly.
>>>
>>>             Please remember, SWORD has 4 transformation points, each
>>>             moving from the module source (as described in the .conf
>>>             file) to the client's request:
>>>
>>>             RenderFilters - markup, e.g., GBF, ThML, OSIS -> XHTML
>>>
>>>             StripFilters - prep before searching
>>>
>>>             OptionFilters - turning on and off markup in the text
>>>             stream based on user options, e.g., Strongs Number,
>>>             Words of Christ in Red, etc.
>>>
>>>             EncodingFilters - e.g., 8859 -> UTF-8
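
To illustrate the last of these points, here is a minimal, self-contained sketch of what an 8859 -> UTF-8 conversion does at the byte level. It is not SWORD's own filter code: the latin1ToUtf8 name is made up for this example, and it assumes the input is ISO-8859-1 specifically.

```cpp
#include <string>

// Convert ISO-8859-1 (Latin-1) bytes to UTF-8.
// Every Latin-1 byte has the same value as its Unicode code point, so
// bytes below 0x80 are copied unchanged and bytes 0x80..0xFF are
// expanded into a two-byte UTF-8 sequence.
std::string latin1ToUtf8(const std::string &in) {
    std::string out;
    out.reserve(in.size() * 2);
    for (unsigned char c : in) {
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));    // lead byte
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));  // continuation
        }
    }
    return out;
}
```

For example, the single Latin-1 byte 0xE4 (ä) becomes the two UTF-8 bytes 0xC3 0xA4. If that step is skipped and the raw byte reaches a UTF-8 frontend, the byte is not valid UTF-8 on its own and is typically shown as a replacement character.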
>>>
>>>
>>>             Module team: be sure the module has the correct Encoding
>>>             value in the .conf file (or the default)
>>>
>>>             Tobias, be sure you are creating your SWMgr with the
>>>             correct MarkupFilterMgr to do the transformation you
>>>             desire, e.g., see:
>>>
>>>             https://crosswire.org/svn/sword/trunk/examples/cmdline/outrender.cpp
>>>
>>>             Hope this helps,
>>>
>>>             Troy
>>>
>>>
>>>             On 1/22/23 10:39, Fr Cyrille wrote:
>>>>             Hi David,
>>>>             If you send me the file, I can quickly convert it to
>>>>             OSIS. I script it from IMP to USFM and then to OSIS
>>>>             with u2o.py.
>>>>
>>>>             Le 22/01/2023 à 16:54, David Haslam a écrit :
>>>>>             Thanks Tobias,
>>>>>
>>>>>             The problem is that CrossWire no longer accepts module
>>>>>             submissions that use IMP format for the build process.
>>>>>
>>>>>             We’d need to have a script (or equivalent TextPipe
>>>>>             filter) to convert IMP to OSIS (whether directly or
>>>>>             indirectly through some other intermediate file format).
>>>>>
>>>>>             I’m not currently in a practical position to work on
>>>>>             that kind of task.
>>>>>             Is anyone else up to it?
>>>>>
>>>>>             Best regards,
>>>>>
>>>>>             David
>>>>>
>>>>>             Sent from Proton Mail for iOS
>>>>>
>>>>>
>>>>>             On Sun, Jan 22, 2023 at 15:39, Tobias Klein
>>>>>             <contact at tklein.info> wrote:
>>>>>>
>>>>>>             The FinPR module that David sent me works fine
>>>>>>             without rendering issues! (see screenshot below)
>>>>>>
>>>>>>             It would be good to upgrade the module in the repo
>>>>>>             accordingly.
>>>>>>
>>>>>>             Best regards,
>>>>>>             Tobias
>>>>>>
>>>>>>             On 1/22/23 8:31 AM, David Haslam wrote:
>>>>>>>             Thanks Kristóf.
>>>>>>>
>>>>>>>             The rendering problem could have been fixed a decade
>>>>>>>             ago!!!
>>>>>>>
>>>>>>>             Checking through my email archives yesterday, I
>>>>>>>             discovered that I had rebuilt the FinPR module
>>>>>>>             exactly 10 years ago! That rebuild used mod2imp and
>>>>>>>             imp2vs (and included a fix to the text encoding,
>>>>>>>             implemented on the IMP text file). The message was
>>>>>>>             sent to the modules address on 2013-01-21 but was
>>>>>>>             presumably never progressed by Chris Little, who was
>>>>>>>             then still supposed to be responsible for module
>>>>>>>             releases and updates. He went permanently AWOL from
>>>>>>>             CrossWire around that time.
>>>>>>>
>>>>>>>             Back then we had not narrowed the policy for
>>>>>>>             submitted source text to be OSIS XML only.
>>>>>>>
>>>>>>>             I wrote privately to Tobias last night, forwarding
>>>>>>>             the email of 10 years ago complete with both
>>>>>>>             attachments. He will examine those today.
>>>>>>>
>>>>>>>             Aside: I also replaced <…> by {…} where these had
>>>>>>>             wrapped the ch:vs references that recorded av11n in
>>>>>>>             the original upstream source. In 2012, no suitable
>>>>>>>             av11n was available in SWORD, though we have had one
>>>>>>>             more recently.
>>>>>>>
>>>>>>>             mod2osis should not be used, as has already been noted.
>>>>>>>             A round trip with mod2osis and osis2mod is not
>>>>>>>             lossless, unlike one with mod2imp and imp2vs.
>>>>>>>
>>>>>>>
>>>>>>>             Best regards,
>>>>>>>
>>>>>>>             David
>>>>>>>
>>>>>>>
>>>>>>>             On Sat, Jan 21, 2023 at 23:15, Kristof Szabo
>>>>>>>             <kristof.szabo at web.de> wrote:
>>>>>>>>             I managed to get Ezra running (it was some libicu70
>>>>>>>>             mess), and yes, the accented characters in this
>>>>>>>>             module are broken (since other modules' accented
>>>>>>>>             characters are OK, I assume it is not a font
>>>>>>>>             issue). I tried the conf file change, but it didn't
>>>>>>>>             work either.
>>>>>>>>
>>>>>>>>             The mitigation was to rebuild the module. mod2osis
>>>>>>>>             leaves some garbage in the OSIS, but that would be
>>>>>>>>             easy to clean. Anyway, osis2mod works even with
>>>>>>>>             this garbage left in, and ta-da, we have proper accents.
>>>>>>>>
>>>>>>>>             As the module was last updated only 3.5 years ago, I
>>>>>>>>             assume the maintainer is still active, i.e. they can
>>>>>>>>             be reached.
>>>>>>>>
>>>>>>>>             Or I can have a look too. The challenge is that
>>>>>>>>             such a module rebuild can open Pandora's box: if I
>>>>>>>>             run some tests
>>>>>>>>             (https://github.com/krisek/sword-test) or David
>>>>>>>>             checks them, then for sure there will be some
>>>>>>>>             issues. I'm happy to fix some of them, but I
>>>>>>>>             definitely do not speak Finnish, so I'm not sure
>>>>>>>>             this would be a responsible action. If Dom gives me
>>>>>>>>             the go I can fix syntax & submit, but I don't want
>>>>>>>>             to end up in the rabbit hole :) Best would be to
>>>>>>>>             reach out to the original maintainer.
>>>>>>>>
>>>>>>>>             Kind regards,
>>>>>>>>             k-
>>>>>>>>
>>>>>>>>             On Sat, Jan 21, 2023 at 8:26 PM Greg Hellings
>>>>>>>>             <greg.hellings at gmail.com> wrote:
>>>>>>>>
>>>>>>>>                 Is Ezra properly setting encoding on the
>>>>>>>>                 content it renders? Is it maybe setting a font
>>>>>>>>                 that doesn't have the proper code points?
>>>>>>>>
>>>>>>>>                 --Greg
>>>>>>>>
>>>>>>>>                 On Sat, Jan 21, 2023, 13:12 Tobias Klein
>>>>>>>>                 <contact at tklein.info> wrote:
>>>>>>>>
>>>>>>>>                     Hi Kristof, David,
>>>>>>>>
>>>>>>>>                     Adding Encoding=UTF-8 to the module conf
>>>>>>>>                     file ~/.sword/mods.d/finpr.conf does not
>>>>>>>>                     solve my issue.
>>>>>>>>
>>>>>>>>                     The text still looks the same as before ...
>>>>>>>>
>>>>>>>>                     What else could I do to further debug this?
>>>>>>>>
>>>>>>>>                     Best regards,
>>>>>>>>                     Tobias
>>>>>>>>
>>>>>>>>                     On 1/21/23 5:18 PM, Kristof Szabo wrote:
>>>>>>>>>                     Hi Thomas,
>>>>>>>>>
>>>>>>>>>                     I suppose the problem is that finpr.conf
>>>>>>>>>                     contains no encoding information (check
>>>>>>>>>                     the Hun* modules for reference), and if
>>>>>>>>>                     there is nothing specified Latin-1 is the
>>>>>>>>>                     default. mod2osis (shouldn't be used !!
>>>>>>>>>                     :)) shows that the module is in UTF-8, so
>>>>>>>>>                     there is a misalignment.
>>>>>>>>>
>>>>>>>>>                     https://wiki.crosswire.org/DevTools:conf_Files#:~:text=Plaintext-,Encoding,-UTF%2D8%0AUTF
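
One way to test this kind of hypothesis is to check whether the raw bytes of a verse are well-formed UTF-8, since Latin-1 text containing accented letters almost never is. The following is a minimal sketch that is independent of SWORD; the isValidUtf8 name is hypothetical, and the input would have to be the unfiltered bytes of the module text.

```cpp
#include <cstddef>
#include <string>

// Returns true if s is well-formed UTF-8. Rejects stray continuation
// bytes, truncated sequences, overlong encodings, UTF-16 surrogates and
// code points above U+10FFFF.
bool isValidUtf8(const std::string &s) {
    static const unsigned long minCp[] = {0, 0, 0x80, 0x800, 0x10000};
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t len = 0;
        unsigned long cp = 0;
        if (c < 0x80)                { len = 1; cp = c; }
        else if ((c & 0xE0) == 0xC0) { len = 2; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; }
        else return false;               // stray continuation or invalid lead byte
        if (i + len > n) return false;   // sequence cut off at end of input
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (len > 1 && cp < minCp[len]) return false;     // overlong encoding
        if (cp > 0x10FFFF) return false;                  // beyond Unicode range
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;   // surrogate
        i += len;
    }
    return true;
}
```

A lone 0xE4 byte followed by ASCII (a Latin-1 "ä") fails the check, while the pair 0xC3 0xA4 passes. Pure-ASCII verses pass either way, so the check is only informative on a verse that contains accented letters.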
>>>>>>>>>
>>>>>>>>>                     Kind regards,
>>>>>>>>>                     Kristof
>>>>>>>>>
>>>>>>>>>                     On Sat, Jan 21, 2023 at 4:49 PM David
>>>>>>>>>                     Haslam <dfhdfh at protonmail.com> wrote:
>>>>>>>>>
>>>>>>>>>                         Hi Thomas,
>>>>>>>>>
>>>>>>>>>                         What about other Finnish modules?
>>>>>>>>>                         eg. FinPR92, FinRK, FinSTLK2017
>>>>>>>>>
>>>>>>>>>                         Presumably you already tested (eg)
>>>>>>>>>                         German modules and found that umlauts
>>>>>>>>>                         and eszett are both rendered aright?
>>>>>>>>>
>>>>>>>>>                         Btw. FinPR renders aright in
>>>>>>>>>                         PocketSword (iOS/iPadOS).
>>>>>>>>>
>>>>>>>>>                         David
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                         On Sat, Jan 21, 2023 at 15:25, Tobias
>>>>>>>>>                         Klein <contact at tklein.info> wrote:
>>>>>>>>>>
>>>>>>>>>>                         Hi,
>>>>>>>>>>
>>>>>>>>>>                         When retrieving the text of the FinPR
>>>>>>>>>>                         module I am getting some rendering
>>>>>>>>>>                         issues with the Finnish Umlauts. This
>>>>>>>>>>                         is based on a user's problem report.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>                         Romans 5:8 returns like this in
>>>>>>>>>>                         node-sword-interface / Ezra:
>>>>>>>>>>
>>>>>>>>>>                         Mutta Jumala osoittaa rakkautensa
>>>>>>>>>>                         meit� kohtaan siin�, ett� Kristus,
>>>>>>>>>>                         kun me viel� olimme syntisi�, kuoli
>>>>>>>>>>                         meid�n edest�mme.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>                         While it should look like this
>>>>>>>>>>                         (rendered text copied from Xiphos):
>>>>>>>>>>
>>>>>>>>>>                         Mutta Jumala osoittaa rakkautensa
>>>>>>>>>>                         meitä kohtaan siinä, että Kristus,
>>>>>>>>>>                         kun me vielä olimme syntisiä, kuoli
>>>>>>>>>>                         meidän edestämme.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>                         This occurs both on Linux and macOS
>>>>>>>>>>                         (have not tested on Windows yet).
>>>>>>>>>>
>>>>>>>>>>                         Any pointers what could be the root
>>>>>>>>>>                         cause? I generally have not observed
>>>>>>>>>>                         rendering issues with other modules.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>                         Best regards,
>>>>>>>>>>                         Tobias
>>>>>>>>>>
>>>>>>>>>                         _______________________________________________
>>>>>>>>>                         sword-devel mailing list:
>>>>>>>>>                         sword-devel at crosswire.org
>>>>>>>>>                         http://crosswire.org/mailman/listinfo/sword-devel
>>>>>>>>>                         Instructions to unsubscribe/change
>>>>>>>>>                         your settings at above page
>>>         -- 
>>>         Sent from my Android device with K-9 Mail. Please excuse my
>>>         brevity.