[sword-devel] multiple languages in modules
Daniel Owens
dcowens76 at gmail.com
Fri Oct 12 05:44:33 MST 2012
<foreign> is the xml way of indicating a language other than the
language of the document. So you surround Hebrew text with <foreign
xml:lang="heb">. Judging from Ben's more recent email, even BPBible does
not support it. Regardless of the method, the effect is great.
I use Linux Libertine all the time for everything but Hebrew, because
its vowel points do not display correctly. Free Serif is a passable
alternative because it gets the vowels right, but the Hebrew glyphs seem
anemic to me.
I am also working with David Troidl on BDB, which has many more
languages, including Arabic, Ethiopic, Syriac, and transliterated
Akkadian. It is not realistic to expect any one font to handle all of
those in addition to Greek and Hebrew. My main point is that applying
fonts based on language of the text rather than language of the module
is something worth working on.
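
To make the idea concrete, here is a rough sketch in Python (not Sword
code, and not how any front-end does it today) of choosing a font per
run of text from xml:lang rather than from the module's language. The
font table, the function names, and the "entry" wrapper element are all
made up for illustration; only <foreign xml:lang="..."> comes from the
markup discussed here.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical font table; the names are only examples from this thread.
FONTS_BY_LANG = {"heb": "FreeSerif"}
DEFAULT_FONT = "Linux Libertine"


def font_for(lang):
    """Return the font assumed for a language code, else the default."""
    return FONTS_BY_LANG.get(lang, DEFAULT_FONT)


def runs_with_fonts(xml_fragment, module_lang="en"):
    """Split a fragment into (text, lang, font) runs.

    xml:lang is inherited by child elements; text that follows a child
    element (its tail) goes back to the parent's language.
    """
    runs = []

    def walk(elem, inherited_lang):
        lang = elem.get(XML_LANG, inherited_lang)
        if elem.text:
            runs.append((elem.text, lang, font_for(lang)))
        for child in elem:
            walk(child, lang)
            if child.tail:
                runs.append((child.tail, lang, font_for(lang)))

    walk(ET.fromstring(xml_fragment), module_lang)
    return runs
```

Given an English entry with one Hebrew word wrapped in <foreign>, this
yields three runs: English text, the Hebrew word with the Hebrew font,
then English text again.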
More below.
On 10/11/2012 11:11 PM, Karl Kleinpaste wrote:
> I know nothing of <foreign>, but can only suppose that, if supported, it
> must pass through the engine with an appropriate (HTML) indication.
>
> As a general rule, I suggest either Free Serif or Linux Libertine, with
> a slight preference for Free Serif. Both have good coverage across
> every Latin alphabet variant, and pretty display of both Hebrew and
> Greek. In modules of mine that have Latin, Greek, and Hebrew alphabets,
> they all show quite well. We include both of these fonts in Xiphos'
> Win32 installers.
>
> You might find the UDHR module useful, from Crosswire Experimental, as a
> font demonstration module.
>
> (Linux Libertine is not Linux-specific. It was just developed in an
> open source environment.)
>
>> Is the <foreign> element passed through the engine? If so, do I need
>> to file bugs with front-ends to encourage support of <foreign>?
> Having just looked, the string "foreign" does not appear in Sword's
> source tree in src/modules/filters/*.cpp. So it's not supported right
> now after all. I don't know how BPBible supports it; I had understood
> that BPBible uses the regular filter sets. Does BPBible actually
> subclass the filters and extend them for <foreign>?
>
>> Second, when RtoL text is mixed with LtoR text you can get some
>> strange display problems. Punctuation and numbers can work for both
>> types of languages.
> This is often an artifact of how toolkits handle LtoR. Today, Xiphos
> uses GTK and WebKit, but I don't know how these reflect your example
> case. Our former use of gtkhtml3 -vs- gtkmozembed -vs- xulrunner -vs-
> today's WebKit always led to some strange realizations for how LtoR
> would show up in Xiphos. gtkhtml3 wants to right-justify any text
> containing (or perhaps it was "that leads off with") Hebrew. That
> peculiarity led to certain unexpected choices for how I created
> StrongsRealHebrew.
>
I love Unicode, but mixed language directions are one problem
that did not exist with legacy fonts. As far as I can tell, all web
browsers and word processors do the same thing: when you have some Hebrew
text they assume that anything that follows, such as numerals or
punctuation (until you get some Latin text, for example), is Hebrew. When
marking up XML you get a false sense of security about text rendering
because the tags use Latin characters. But when they are rendered by a
browser, even text outside the <foreign xml:lang="heb"> is assumed to be
Hebrew until you get some Latin text. I think that is why HTML has
<bdo>, which helps solve the problem.
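
For anyone who wants to see which characters are affected, here is a
small Python sketch. It does not implement the Unicode bidirectional
algorithm; it only uses the standard unicodedata module to list the
characters that have no strong direction of their own and that follow
Hebrew (or Arabic) text, which are the ones a renderer has to guess
about. The function name is my own invention.

```python
import unicodedata

# Bidi classes with a strong direction of their own.
STRONG = {"L", "R", "AL"}
STRONG_RTL = {"R", "AL"}


def ambiguous_after_rtl(text):
    """List (char, bidi_class) pairs for characters that lack a strong
    direction and whose nearest preceding strong character is RTL."""
    found = []
    last_strong = None
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls in STRONG:
            last_strong = cls
        elif last_strong in STRONG_RTL:
            found.append((ch, cls))
    return found


# Two Hebrew letters, then a numeral and a full stop, then Latin text.
# Escapes are used so the source itself is not reordered on display.
sample = "\u05d0\u05d1 12. abc"
print(ambiguous_after_rtl(sample))
```

The space, the digits, and the full stop after the Hebrew letters are
all reported, while nothing after "abc" would be, since Latin letters
are strongly left-to-right.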
Daniel