[sword-devel] osis2mod very happily generates modules with no text visible

Matěj Cepl mcepl at redhat.com
Wed Dec 28 17:59:22 MST 2011


On 28.12.2011 20:40, David Haslam wrote:
> I suspect that the following codepoints may need converting from a legacy
> font.
>
> 002045	⁅	LEFT SQUARE BRACKET WITH QUILL		1,648
> 002046	⁆	RIGHT SQUARE BRACKET WITH QUILL	1,648
> 002308	⌈	LEFT CEILING	6,663
> 002309	⌉	RIGHT CEILING	6,635
>
> If that's not the case, their use in the Czech language requires explaining.

They are not relevant to Czech; they were my attempt to somehow
preserve the apparatus of variants in the Czech text ...

... hmm, looking at the original text once more, this seems to be my
bug. See for example note 1 in 62_Mk.xml (the original source file of
Mark’s gospel):

<defpozn n="v1">Řec. slovo <italic>euangelion</italic> se nachází už 
v&#x00A0;Homérově
Odyseji (14, 152.166), a&#x00A0;to ve významu <bczuv/>odměna za dobrou 
zprávu<eczuv/>.
Zároveň tento výraz označoval <bczuv/>dobrou zprávu<eczuv/> samu. Základem
užití tohoto slova v&#x00A0;NS je Stará smlouva (LXX). Velmi důležité 
jsou texty, kde
se vyskytuje příslušné odvozené sloveso <italic>euangelizesthai</italic>
(<bczuv/>zvěstovat / hlásat dobrou zprávu<eczuv/>). Viz zvl. 
Iz&#x00A0;52,7 a&#x00A0;61,1<pomlcka/>3;
srv. Sk&#x00A0;5,42p</defpozn>

Here ⁅ and ⁆ are my renderings of <bczuv/> and <eczuv/> respectively,
which are just the beginning and end of Czech úvozovky (quotation
marks), so I guess they should be translated to plain Unicode „ and “
(which is how we write double quotes). I have to check once more why I
didn't translate them to double quotes in the first place. (The other
tags here should be simple: <defpozn> is “definition of poznámka”
(note), <italic> is obvious, and <pomlcka/> stands for an em-dash;
preserving the non-breaking space is important for following the Czech
rules about breaking lines.)

<italic> and <bczuv> are just two examples of many cases of mixing 
semantic and typographic markup in the text.

I haven't really looked at the code for the last couple of months, so I
have to recheck why I didn't go the obvious way and make <bczuv/> and
<eczuv/> into proper double quotes.
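
For illustration only, the obvious conversion could be sketched in
Python roughly as below. This is not the actual conversion code (which I
still have to recheck); the function name and the mapping table are
hypothetical, only the tag names and the target characters come from the
text above. Entities such as &#x00A0; are deliberately left untouched.

```python
import re

# Hypothetical mapping: tag names are taken from the source file quoted
# above, the replacement characters are the ones proposed in this message.
REPLACEMENTS = {
    "<bczuv/>": "\u201e",    # „ opening Czech double quote
    "<eczuv/>": "\u201c",    # “ closing Czech double quote
    "<pomlcka/>": "\u2014",  # em-dash
}

# One alternation over all known tags, escaped so "/" and "<" are literal.
_PATTERN = re.compile("|".join(re.escape(tag) for tag in REPLACEMENTS))


def replace_typographic_tags(text):
    """Replace the empty typographic tags with plain Unicode characters."""
    return _PATTERN.sub(lambda m: REPLACEMENTS[m.group(0)], text)


sample = "ve významu <bczuv/>odměna za dobrou zprávu<eczuv/>."
print(replace_typographic_tags(sample))
```

Plain string replacement is enough here only because these tags are
empty milestones with no attributes; anything with content or attributes
would need a real XML parser.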

---------------------------
⌈ and ⌉ are more complicated. They were originally <bkzavorka/> and
<ekzavorka/>, and they delimit the text to which the corresponding
translator's note applies. See again the first two verses of Mark’s
gospel:

<kap n="1"/>
<vers n="1"/>Počátek<odkazo n="o1"/> evangelia<odkaz n="v1"/> Ježíše Krista,
<hzavorka>Syna Božího</hzavorka><odkazo n="o2"/> <bkzavorka/>.
<vers n="2"/>Jak<ekzavorka/><odkaz n="t2"/> je <perf>napsáno</perf>
<bkzavorka/>v&#x00A0;proroku Izaiášovi<ekzavorka/>:<odkaz n="t3"/> 
<czap>Hle,
<hzavorka>já</hzavorka> posílám svého posla<odkaz n="t4"/> před tvou tváří,
který <fut>upraví</fut> tvou cestu <italic>[před
tebou]</italic>.<odkazo n="o3"/>

Here <odkaz n="t2"/> (“odkaz” is Czech for “a reference”) refers to the
text spanning the verse boundary, “. Jak” (IMHO this is a mistake in the
original markup of the text, but that is how it is printed in the paper
version of the Bible). Other codes, just to give more examples:

- <kap/> is a chapter milestone
- <vers/> is a verse milestone
- <odkazo> is an “o-type” reference (a reference to a footnote
containing a cross-reference to another verse of the Bible; the other
kind is <odkaz/>, for a footnote with the actual text of a translator’s
note).
- <hzavorka> is markup for text found in less important manuscripts
(although John 7:53-8:11 is not marked up at all and there is only an
<odkaz> at Jn 7:53; the same goes for Mk 16:9-, which also has only a
text note explaining that the rest of the chapter is not present in many
important manuscripts), so they probably used it only for small portions
of text.
- <perf> is an element marking a verb in the Greek perfect tense, which
is difficult to translate correctly into Czech, as we don't have the
perfect as a separate verb form.
- <czap> is an element for “text quoted by Czech apostrophe”, and I am 
not sure whether in all cases it means a quotation of the Old Testament 
text in the New Testament (as in this case).
- <fut> is another case of marking up a Greek verb which cannot be
translated well into Czech; here it is the other way around ... Czech
future tenses are richer than the Greek ones (there are grammatical
aspects, http://en.wikipedia.org/wiki/Grammatical_aspect, in Czech which
are not in Greek, so a Czech translation could make a distinction which
isn’t in the original text).
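
To make the <bkzavorka/>/<ekzavorka/> problem from Mk 1:1-2 above more
concrete, here is a small Python sketch that lists the spans which start
in one verse and end in another. It is only an illustration, not part of
my converter: the function name is hypothetical, and it assumes that the
spans never nest and that <vers n="..."/> always carries a plain number.

```python
import re

# Matches the verse milestone and the two bracket milestones; the tag
# names come from the source sample, everything else is an assumption.
TOKEN = re.compile(r'<vers n="(\d+)"/>|<(bkzavorka|ekzavorka)/>')


def find_cross_verse_spans(chapter_text):
    """Return (start_verse, end_verse) for every bkzavorka/ekzavorka
    pair whose two milestones fall in different verses."""
    current_verse = None
    open_verse = None    # verse in which the pending <bkzavorka/> appeared
    has_open = False
    crossing = []
    for match in TOKEN.finditer(chapter_text):
        verse, bracket = match.group(1), match.group(2)
        if verse is not None:
            current_verse = int(verse)
        elif bracket == "bkzavorka":
            open_verse, has_open = current_verse, True
        elif has_open:   # <ekzavorka/> closing the pending span
            if open_verse != current_verse:
                crossing.append((open_verse, current_verse))
            has_open = False
    return crossing


sample = (
    '<vers n="1"/>Počátek evangelia Ježíše Krista, <bkzavorka/>.\n'
    '<vers n="2"/>Jak<ekzavorka/><odkaz n="t2"/> je napsáno '
    '<bkzavorka/>v proroku Izaiášovi<ekzavorka/>:<odkaz n="t3"/>'
)
print(find_cross_verse_spans(sample))
```

Spans reported this way are the ones that cannot be turned into a
simple container element inside a single verse and would need milestone
pairs (or manual review) instead.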

I hope I've shown you what kind of battle I am fighting.

Guys, thank you so so much for helping with this!

Blessings,

Matěj


