[sword-devel] Updating Clarke commentary to become readable
Karl Kleinpaste
karl at charcoal.com
Sun Sep 24 14:10:54 MST 2006
The nasty little script below takes the current Clarke content and
strips the extraneous <br /> elements out in a coherent fashion. This
makes the Clarke content actually readable, as opposed to its current
state, which (unless you allow for a very wide commentary subwindow)
is thoroughly unreadable.
Along the way, it also converts his (excessive) use of "&c." into
"etc.", which makes some sections work that do not work under the
current Clarke incarnation. Cf. James 5:20, about ¾ of the way down, at
the paragraph beginning "1. I have already conjectured...", and observe
the odd paragraph break and grammatical failure. The Sword libs are not
preserving "&" properly: the content is present in the module, but it
is simply not handled properly. See also Gen 1:11, for which Clarke
displays nothing at all in WinSword/BibleCS, even though there is content.
(GnomeSword displays Clarke's Gen 1:11 content, but only incompletely.)
#!/bin/sh -x
# Dump Clarke in imp format, strip the forced <br /> breaks with sed,
# and rebuild the module files (nt, ot, *.vss) in the current directory.
mod2imp Clarke |
sed -e 's|&c\.|etc.|g' \
-e 's|\([A-Za-z0-9-ÿ),.?!:;"]\)<br /> \([A-Za-z0-9-ÿ(,.?!:;"]\)|\1 \2|g' \
-e 's|</i><br /> \+<i>| |g' \
-e 's|\([A-Za-z0-9-ÿ),.?!:;"]\) \?<br /> <\([is]\)|\1 <\2|g' \
-e 's|\([fi]\)><br /> \([A-Za-z0-9-ÿ(,.?!:;"]\)|\1> \2|g' \
-e 's|]<br /> |] |g' \
-e 's|<br /> \[| [|g' |
imp2vs /dev/stdin . 2>&1 | grep -E -v '^from file: |^adding entry: '
chmod go+r nt nt.vss ot ot.vss
exit 0
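For anyone who wants to see what the substitutions do without a Sword installation, here is a minimal sketch that runs three of them over a single line. The sample text and the strip_breaks function name are made up for illustration, and the character classes are cut down to ASCII only, so this is not a drop-in replacement for the script above.

```shell
#!/bin/sh
# Hypothetical sample line; it is NOT taken from the Clarke module.
sample='The earth brought forth grass,<br /> herbs, trees, &c. [See the note]<br /> and so on.'

# strip_breaks (illustrative name): ASCII-only simplification of three
# of the sed expressions used in the script.
strip_breaks() {
    sed -e 's|&c\.|etc.|g' \
        -e 's|\([A-Za-z0-9),.?!:;"]\)<br /> \([A-Za-z0-9(,.?!:;"]\)|\1 \2|g' \
        -e 's|]<br /> |] |g'
}

result=$(printf '%s\n' "$sample" | strip_breaks)
printf '%s\n' "$result"
# prints: The earth brought forth grass, herbs, trees, etc. [See the note] and so on.
```

The first expression rewrites "&c.", the second joins text that was split by a forced break, and the third handles a break that follows a closing bracket, which the second expression skips because "]" is not in its character class.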
The modified clarkenobr.conf I'm using:
[ClarkeNoBr]
DataPath=./modules/comments/rawcom/clarke-nobr/
ModDrv=rawCom
Lang=en
Encoding=UTF-8
SourceType=ThML
Description=Adam Clarke's Commentary on the Bible (without forced line breaks)
About=Adam Clarke's 1810/1825 commentary and critical notes on the Bible, with forced line breaks removed.
LCSH=Bible. Commentaries.
DistributionLicense=Public Domain
--karl