[sword-devel] Rendering issues with Finnish Umlauts in FinPR
Tobias Klein
contact at tklein.info
Fri Apr 7 04:29:24 EDT 2023
Thank you, Troy! With these explanations I understand the big picture of
markup transformation and filters a bit better!
For now, I'll go forward with the FMT_OSIS option for the
MarkupFilterMgr in node-sword-interface. I understand from your
explanations that this produces results similar to those from initializing
SWMgr without any markup filters.
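The encoding half of that setup is presumably what resolved the broken umlauts discussed earlier in this thread. As a rough illustration only (a standalone sketch, not SWORD's actual encoding filter code), this is what an ISO-8859-1 to UTF-8 conversion does at the byte level, assuming the source text is Latin-1, which is the default mentioned below for modules whose .conf has no Encoding entry. The helper name latin1ToUtf8 is made up for this example.

```cpp
#include <string>

// Hypothetical helper, not part of the SWORD API: converts a string of
// ISO-8859-1 (Latin-1) bytes into the equivalent UTF-8 byte sequence.
std::string latin1ToUtf8(const std::string &in) {
    std::string out;
    out.reserve(in.size() * 2);  // worst case: every byte becomes two bytes
    for (unsigned char c : in) {
        if (c < 0x80) {
            // ASCII range is identical in Latin-1 and UTF-8.
            out.push_back(static_cast<char>(c));
        } else {
            // Latin-1 bytes 0x80..0xFF map to code points U+0080..U+00FF,
            // which UTF-8 encodes as two bytes: 110xxxxx 10xxxxxx.
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

With this, the single Latin-1 byte 0xE4 (ä) becomes the two bytes 0xC3 0xA4. A lone 0xE4 handed to a UTF-8 renderer is an incomplete sequence, which is consistent with the replacement characters shown in the original report.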
When I started out using SWORD in node-sword-interface I did not set any
filters at the time (not really knowing about it as a newbie). Since
then I have somehow handled the OSIS markup and I am not unhappy with
the formatting/styling in the frontend at this point. It could probably
be better and I have certainly not explored all variants, yet ... so, I
will keep the filter possibilities on my mind for future adaptations!
Have a blessed Easter celebration :)
Tobias
On 4/6/23 10:22 PM, Troy A. Griffitts wrote:
> Hi Tobias,
>
> Great that the encoding issue is solved. If you really want to handle
> OSIS yourself (which is a lot of work) I would recommend requesting
> OSIS markup. Remember, we support various markup formats. Not
> everything is OSIS. If you ask for OSIS then you should get generally
> the same markup you received previously, from newer modules which are
> almost exclusively OSIS-- most markup will just be passthru. If a user
> installs a GBF or ThML or some other markup module, we will try our
> best to convert to OSIS on the fly for you.
>
> Though I wouldn't recommend this path long-term. Quite a bit of time
> goes into the various render filters to handle all the nuances of our
> supported markup formats. I would recommend using one of our render
> filter sets eventually (preferably WEBIF, XHTML, or HTMLHREF) and let
> us give you nice HTML output with styles you can adjust as you see
> fit. This will focus that burden of support for new markup on the
> engine instead of your frontend. I don't know of any other frontend
> team that tries to handle OSIS themselves.
>
> If you feel you might lose control of how you want output, we have
> various means to inject modifications, if you don't like what we give
> you, but honestly, if you have a good objection, we're likely to
> adjust the output for you. We try our best to just give classed
> containers: divs and spans, etc., and there is an API call to give you
> our default styles which give a sane styling for classes. Typically,
> the strategy is to include these styles first in your output, and then
> override anything you don't like afterward. This way, if something new
> is added in the engine, at least you will get some sane style applied
> for it. The call is:
>
> SWModule::getRenderHeader()
>
> Hope this helps a bit. Thanks for letting us know your progress. I am
> excited about your app!
>
> On April 6, 2023 12:36:41 PM MST, Tobias Klein <contact at tklein.info>
> wrote:
>
> Hi Troy,
>
> Thanks for your help! I have just tried using this parameter for
> the construction of SWMgr and it looks good.
> I had another look at the example for the German Rieger commentary
> where I observed the encoding issues before and they are gone now!
>
> Well, I guess I could have asked one or the other question before
> ... but as long as a bit of code reading and experimenting results
> in a solution, I'm usually fine.
> Thank you! I do appreciate all your efforts.
>
> I do have another question regarding the construction of
> MarkupFilterMgr.
> If I want to apply the encoding filter for UTF8, but do not need
> the markup filter manager for XHTML is it ok to perform the
> construction like this?
>
> new MarkupFilterMgr(sword::FMT_UNKNOWN, sword::ENC_UTF8)
>
> I checked the code in markupfiltmgr.cpp and found that the
> implementation of MarkupFilterMgr::createFilters does not consider
> the case FMT_UNKNOWN, so in this case it would simply not add any
> specific markup filter, right? Since I haven't used any markup
> filters so far and my code already depends on the standard output
> generated by the SWORD engine I did not want to add a markup
> filter for the time being.
>
> Is there another way to apply the UTF8 encoding without using
> MarkupFilterMgr? (It just looks a bit weird when I look at the
> construction now)
>
> Best regards,
> Tobias
>
> PS:
> I think one thing we could do one of these days is check together
> with other frontend developers whether some helper functions
> created in various frontends could also be moved into the SWORD
> library.
>
> Consider some of the helper functions implemented here:
> https://github.com/ezra-bible-app/node-sword-interface/blob/master/src/sword_backend/module_helper.cpp
> https://github.com/ezra-bible-app/node-sword-interface/blob/master/src/sword_backend/repository_interface.cpp
>
>
> On 4/3/23 9:27 PM, Troy A. Griffitts wrote:
>> Hi Tobias,
>>
>> Yes, our documentation certainly needs much improvement. I am
>> surprised how far you've gone with so few questions. You have a
>> great talent for figuring things out. I wouldn't worry about the
>> guts of the details in the encoding filters. All you should need
>> to do is specify your desired output on the SWMgr you use for
>> rendering with something like:
>>
>> SWMgr mgr(new MarkupFilterMgr(sword::FMT_XHTML, sword::ENC_UTF8));
>>
>> Let me know if you'd like help,
>>
>> Troy
>>
>>
>> On April 3, 2023 11:18:29 AM MST, Tobias Klein
>> <contact at tklein.info> wrote:
>>
>> Thanks Troy!
>>
>> I'll have a look at the EncodingFilters.
>>
>> I think this is something not fully clear from the SWORD
>> documentation/examples.
>>
>> Maybe these transformation points you had mentioned in the
>> thread below should be described somewhere in the developer wiki?
>>
>> Best regards,
>> Tobias
>>
>> On 4/3/23 6:45 PM, Troy A. Griffitts wrote:
>>> Dear Tobias,
>>>
>>> Please be sure to note my comment to you below in this
>>> thread. It is likely the cause of your rendering issues,
>>> while other apps have no problems.
>>>
>>> In brief, it says that I haven't seen anywhere that you tell
>>> SWORD what markup and encoding you want from the engine. If
>>> this is the case you will get whatever the modules are
>>> encoded / marked up as, which might be various things.
>>>
>>> Hope this helps,
>>>
>>> Troy
>>>
>>> On January 22, 2023 12:03:22 PM MST, "Troy A. Griffitts"
>>> <scribe at crosswire.org> wrote:
>>>
>>> Hey guys,
>>>
>>> Sorry for not jumping in on this thread more quickly.
>>>
>>> Please remember, SWORD has 4 transformation points, each
>>> moving from the module source (as described in the .conf
>>> file) to the client's request:
>>>
>>> RenderFilters - markup, e.g., GBF, ThML, OSIS -> XHTML
>>>
>>> StripFilters - prep before searching
>>>
>>> OptionFilters - turning on and off markup in the text
>>> stream based on user options, e.g., Strongs Number,
>>> Words of Christ in Red, etc.
>>>
>>> EncodingFilters - e.g., 8859 -> UTF-8
>>>
>>>
>>> Module team: be sure the module has the correct Encoding
>>> value in the .conf file (or the default)
>>>
>>> Tobias, be sure you are creating your SWMgr with the
>>> correct MarkupFilterMgr to do the transformation you
>>> desire, e.g., see:
>>>
>>> https://crosswire.org/svn/sword/trunk/examples/cmdline/outrender.cpp
>>>
>>> Hope this helps,
>>>
>>> Troy
>>>
>>>
>>> On 1/22/23 10:39, Fr Cyrille wrote:
>>>> Hi David,
>>>> If you send me the file, I can convert it quickly to
>>>> OSIS. I script it from IMP to USFM and then with u2o.py.
>>>>
>>>> Le 22/01/2023 à 16:54, David Haslam a écrit :
>>>>> Thanks Tobias,
>>>>>
>>>>> The problem is that CrossWire no longer accepts module
>>>>> submissions that use IMP format for the build process.
>>>>>
>>>>> We’d need to have a script (or equivalent TextPipe
>>>>> filter) to convert IMP to OSIS (whether directly or
>>>>> indirectly through some other intermediate file format).
>>>>>
>>>>> I’m not currently in a practical position to work on
>>>>> that kind of task.
>>>>> Is anyone else up to it?
>>>>>
>>>>> Best regards,
>>>>>
>>>>> David
>>>>>
>>>>> On Sun, Jan 22, 2023 at 15:39, Tobias Klein
>>>>> <contact at tklein.info> wrote:
>>>>>>
>>>>>> The FinPR module that David sent me works fine
>>>>>> without rendering issues! (see screenshot below)
>>>>>>
>>>>>> It would be good to upgrade the module in the repo
>>>>>> accordingly.
>>>>>>
>>>>>> Best regards,
>>>>>> Tobias
>>>>>>
>>>>>> On 1/22/23 8:31 AM, David Haslam wrote:
>>>>>>> Thanks Kristóf.
>>>>>>>
>>>>>>> The rendering problem could have been fixed a decade
>>>>>>> ago!!!
>>>>>>>
>>>>>>> Checking through my email archives yesterday, I
>>>>>>> discovered that I had rebuilt the FinPR module
>>>>>>> exactly 10 years ago! That rebuild used mod2imp and
>>>>>>> imp2vs (and included a fix to the text encoding
>>>>>>> implemented on the IMP textfile). The message was
>>>>>>> sent to the modules address on 2013-01-21 but was
>>>>>>> presumably never progressed by Chris Little who was
>>>>>>> then still supposed to be responsible for module
>>>>>>> releases and updates. He went permanently AWOL from
>>>>>>> CrossWire around that time.
>>>>>>>
>>>>>>> Back then we had not narrowed the policy for
>>>>>>> submitted source text to be OSIS XML only.
>>>>>>>
>>>>>>> I wrote privately to Tobias last night, forwarding
>>>>>>> the email of 10 years ago complete with both
>>>>>>> attachments. He will examine those today.
>>>>>>>
>>>>>>> Aside: I also replaced <…> by {…} where these had
>>>>>>> wrapped the ch:vs references that recorded av11n in
>>>>>>> the original upstream source. In 2012, no suitable
>>>>>>> av11n was available in SWORD, though we do have one
>>>>>>> more recently.
>>>>>>>
>>>>>>> mod2osis should not be used, as has already been noted.
>>>>>>> A round trip with mod2osis and osis2mod is not
>>>>>>> lossless, unlike one with mod2imp and imp2vs.
>>>>>>>
>>>>>>>
>>>>>>> Best regards,
>>>>>>>
>>>>>>> David
>>>>>>>
>>>>>>> On Sat, Jan 21, 2023 at 23:15, Kristof Szabo
>>>>>>> <kristof.szabo at web.de> wrote:
>>>>>>>> I managed to get Ezra running (it was some libicu70
>>>>>>>> mess), and yes, the accented characters in this
>>>>>>>> module are broken (other modules' accented
>>>>>>>> characters are OK, so I assume it is not a font
>>>>>>>> issue). I tried the conf file change, but it didn't
>>>>>>>> work either.
>>>>>>>>
>>>>>>>> The mitigation was to rebuild the module, mod2osis
>>>>>>>> leaves some garbage in the OSIS, but that would be
>>>>>>>> easy to clean; anyway, osis2mod is possible with
>>>>>>>> this garbage left in, and tada, we have proper accents.
>>>>>>>>
>>>>>>>> As the module was updated last only 3,5 yrs ago I
>>>>>>>> assume the maintainer is still active, ie. they can
>>>>>>>> be reached.
>>>>>>>>
>>>>>>>> Or I can have a look too, the challenge is, that
>>>>>>>> such a module rebuild can open pandora's box, if I
>>>>>>>> run some tests
>>>>>>>> (https://github.com/krisek/sword-test) or David
>>>>>>>> checks them, then for sure there will be some
>>>>>>>> issues. I'm happy to fix some of them, but I
>>>>>>>> definitely do not speak Finnish, so I'm not sure
>>>>>>>> this would be a responsible action. If Dom gives me
>>>>>>>> the go I can fix syntax & submit, but I don't want
>>>>>>>> to end up in the rabbit hole :) Best would be to
>>>>>>>> reach out to the original maintainer.
>>>>>>>>
>>>>>>>> Kind regards,
>>>>>>>> k-
>>>>>>>>
>>>>>>>> On Sat, Jan 21, 2023 at 8:26 PM Greg Hellings
>>>>>>>> <greg.hellings at gmail.com> wrote:
>>>>>>>>
>>>>>>>> Is Ezra properly setting encoding on the
>>>>>>>> content it renders? Is it maybe setting a font
>>>>>>>> that doesn't have the proper code points?
>>>>>>>>
>>>>>>>> --Greg
>>>>>>>>
>>>>>>>> On Sat, Jan 21, 2023, 13:12 Tobias Klein
>>>>>>>> <contact at tklein.info> wrote:
>>>>>>>>
>>>>>>>> Hi Kristof, David,
>>>>>>>>
>>>>>>>> Adding Encoding=UTF-8 to the module conf
>>>>>>>> file ~/.sword/mods.d/finpr.conf does not
>>>>>>>> solve my issue.
>>>>>>>>
>>>>>>>> The text still looks the same as before ...
>>>>>>>>
>>>>>>>> What else could I do to further debug this?
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Tobias
>>>>>>>>
>>>>>>>> On 1/21/23 5:18 PM, Kristof Szabo wrote:
>>>>>>>>> Hi Thomas,
>>>>>>>>>
>>>>>>>>> I suppose the problem is that finpr.conf
>>>>>>>>> contains no encoding information (check
>>>>>>>>> the Hun* modules for reference), and if
>>>>>>>>> there is nothing specified Latin-1 is the
>>>>>>>>> default. mod2osis (shouldn't be used !!
>>>>>>>>> :)) shows that the module is in UTF-8, so
>>>>>>>>> there is a misalignment.
>>>>>>>>>
>>>>>>>>> https://wiki.crosswire.org/DevTools:conf_Files#:~:text=Plaintext-,Encoding,-UTF%2D8%0AUTF
>>>>>>>>>
>>>>>>>>> Kind regards,
>>>>>>>>> Kristof
>>>>>>>>>
>>>>>>>>> On Sat, Jan 21, 2023 at 4:49 PM David
>>>>>>>>> Haslam <dfhdfh at protonmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Hi Thomas,
>>>>>>>>>
>>>>>>>>> What about other Finnish modules?
>>>>>>>>> eg. FinPR92, FinRK, FinSTLK2017
>>>>>>>>>
>>>>>>>>> Presumably you already tested (eg)
>>>>>>>>> German modules and found that umlauts
>>>>>>>>> and eszett are both rendered aright?
>>>>>>>>>
>>>>>>>>> Btw. FinPR renders aright in
>>>>>>>>> PocketSword (iOS/iPadOS).
>>>>>>>>>
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>> On Sat, Jan 21, 2023 at 15:25, Tobias
>>>>>>>>> Klein <contact at tklein.info> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> When retrieving the text of the FinPR
>>>>>>>>>> module I am getting some rendering
>>>>>>>>>> issues with the Finnish Umlauts. This
>>>>>>>>>> is based on a user's problem report.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Romans 5:8 returns like this in
>>>>>>>>>> node-sword-interface / Ezra:
>>>>>>>>>>
>>>>>>>>>> Mutta Jumala osoittaa rakkautensa
>>>>>>>>>> meit� kohtaan siin�, ett� Kristus,
>>>>>>>>>> kun me viel� olimme syntisi�, kuoli
>>>>>>>>>> meid�n edest�mme.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> While it should look like this
>>>>>>>>>> (rendered text copied from Xiphos):
>>>>>>>>>>
>>>>>>>>>> Mutta Jumala osoittaa rakkautensa
>>>>>>>>>> meitä kohtaan siinä, että Kristus,
>>>>>>>>>> kun me vielä olimme syntisiä, kuoli
>>>>>>>>>> meidän edestämme.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> This occurs both on Linux and macOS
>>>>>>>>>> (have not tested on Windows yet).
>>>>>>>>>>
>>>>>>>>>> Any pointers what could be the root
>>>>>>>>>> cause? I generally have not observed
>>>>>>>>>> rendering issues with other modules.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>> Tobias
>>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> sword-devel mailing list:
>>>>>>>>> sword-devel at crosswire.org
>>>>>>>>> http://crosswire.org/mailman/listinfo/sword-devel
>>>>>>>>> Instructions to unsubscribe/change
>>>>>>>>> your settings at above page