[sword-devel] Updating Clarke commentary to become readable

Greg Hellings greg.hellings at gmail.com
Sun Sep 24 14:16:52 MST 2006


Karl,

That is an astounding script.  Amazingly done!  I haven't tested it, as I
don't have Clarke's installed, but it seems that if the Sword lib is
mishandling the & character and the <br /> tag, then the problem really lies
within Sword and should be fixed there, ASAP.  Excellent sed-ing, though!
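
For anyone else without the module installed, here is a minimal sketch of how one might dry-run three of the simpler expressions from the script. It assumes a POSIX sh and sed; the sample line is made up for illustration and is not actual Clarke content.

```shell
#!/bin/sh
# Hypothetical sample line, invented for this dry run (not from the Clarke module).
sample='See Matt. 5:1, &c.]<br /> and the notes<br /> [on Luke 6:20.'

# Three expressions from Karl's script: "&c." becomes "etc.",
# and a <br /> directly after "]" or before "[" collapses to a space.
cleaned=$(printf '%s\n' "$sample" |
  sed -e 's|&c\.|etc.|g' \
      -e 's|]<br /> |] |g' \
      -e 's|<br /> \[| [|g')

# prints: See Matt. 5:1, etc.] and the notes [on Luke 6:20.
printf '%s\n' "$cleaned"
```

This only exercises the substitutions on a single string; it says nothing about how mod2imp or imp2vs behave on the real module.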

--Greg

On 9/24/06, Karl Kleinpaste <karl at charcoal.com> wrote:
>
> The nasty little script below takes the current Clarke content and
> strips the extraneous <br /> elements out in a coherent fashion.  This
> makes the Clarke content actually readable, as opposed to its current
> state, which (unless you allow for a very wide commentary subwindow)
> is thoroughly unreadable.
>
> Along the way, it also converts his (excessive) use of "&c." into
> "etc.", which makes some sections work that do not work under the
> current Clarke incarnation.  Cf. James 5:20, ¾ down, a paragraph
> beginning, "1. I have already conjectured...", and observe the odd
> paragraph break and grammatical failure -- the Sword libs are not
> preserving `&' correctly; the content is present, but it is simply
> not handled properly.  See also Gen 1:11, for which Clarke displays
> nothing at all in WinSword/BibleCS, even though there is content.
> (GnomeSword displays Clarke's Gen 1:11 content, but incompletely so.)
>
> #!/bin/sh -x
> mod2imp Clarke |
> sed -e 's|&c\.|etc.|g' \
>     -e 's|\([A-Za-z0-9€-ÿ),.?!:;"]\)<br /> \([A-Za-z0-9€-ÿ(,.?!:;"]\)|\1 \2|g' \
>     -e 's|</i><br /> \+<i>| |g' \
>     -e 's|\([A-Za-z0-9€-ÿ),.?!:;"]\) \?<br /> <\([is]\)|\1 <\2|g' \
>     -e 's|\([fi]\)><br /> \([A-Za-z0-9€-ÿ(,.?!:;"]\)|\1> \2|g' \
>     -e 's|]<br /> |] |g' \
>     -e 's|<br /> \[| [|g' |
> imp2vs /dev/stdin . 2>&1 | egrep -v '^from file: |^adding entry: '
> chmod go+r nt nt.vss ot ot.vss
> exit 0
>
> The modified clarkenobr.conf I'm using:
>
> [ClarkeNoBr]
> DataPath=./modules/comments/rawcom/clarke-nobr/
> ModDrv=rawCom
> Lang=en
> Encoding=UTF-8
> SourceType=ThML
> Description=Adam Clarke's Commentary on the Bible (without forced line breaks)
> About=Adam Clarke's 1810/1825 commentary and critical notes on the Bible, with forced line breaks removed.
> LCSH=Bible. Commentaries.
> DistributionLicense=Public Domain
>
> --karl
>
> _______________________________________________
> sword-devel mailing list: sword-devel at crosswire.org
> http://www.crosswire.org/mailman/listinfo/sword-devel
> Instructions to unsubscribe/change your settings at above page
>