[sword-devel] parsing data for the web
Karl Kleinpaste
karl at kleinpaste.org
Fri Feb 2 11:05:50 MST 2007
DavidTroidl at aol.com writes:
> You can find Strong's Greek Dictionary in XML at
> http://morphgnt.org/projects/strongs-dictionary
Thank you very much.
Whee... I sicced another of my sed scripts on this text. You can do
it yourself with the script below, or you can just get my version:
If your Sword UI has an install manager, define a new remote source
for yourself: ftp.kleinpaste.org, /pub/sword.
(GnomeSword users: Terry and I were somewhat annoyed to discover a bug
in GS's module manager half an hour ago: a remote source does not work
if it lacks any Bible texts, as this one does. *sigh* The fix is
a tweak to src/gnome2/mod_mgr.c. If you build out of SVN, then "svn
up" and rebuild/re-install, and all will be sweetness and light. :-) )
Otherwise, for anyone using any UI, you can get *.zip in
/pub/sword/zip. I have these available for public consumption:
gnomesword.zip - GnomeSword Manual, v2.2.1.
clarkenobr.zip - Clarke's commentary w/forced line breaks removed.
strongsrealgreek.zip - The updated Strong's Greek, with actual Greek charset.
Note that directory read permissions are clamped down pretty hard;
don't be surprised that you can't "dir" in /.
The module is ThML and contains sword:// linkages around all Strong's
refs, so moving from one to another should be click-easy for those UIs
content to support sword:// hrefs. I am aware that WinSword does not.
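
As a rough illustration of what those sword:// linkages look like, the
sketch below applies the Greek cross-reference substitution from the
script in the PS to a single line. The sample line and the link_greek
function name are made up for the example, and the pattern uses
[0-9][0-9]* in place of \+ so it does not depend on GNU sed. The full
script additionally zero-pads the number in a later pass.

```shell
#!/bin/sh
# Hypothetical sample line; real input comes from strongsgreek.xml.
sample='from <strongsref language="GREEK" strongs="25"/>'

# Rewrite a Greek <strongsref/> element as a ThML sword:// link.
# link_greek is an illustrative name, not part of the original script.
link_greek() {
    sed -e 's;<strongsref language="GREEK" strongs="\([0-9][0-9]*\)"/>;<a href="sword://StrongsRealGreek/\1">\1</a>;g'
}

result=$(printf '%s\n' "$sample" | link_greek)
printf '%s\n' "$result"
# prints: from <a href="sword://StrongsRealGreek/25">25</a>
```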
--karl
PS- Script to slice up the new Strong's Greek *.xml:
#!/bin/sh
#
# Construct new Sword module from Ulrik
# Petersen's re-Greek-ified text, v1.1.
# Extract strongsgreek.xml from the *.zip first.
#
f=strongsrealgreek.imp
#
sed -e 's/><entry strongs/>\
<entry strongs/' < strongsgreek.xml | tr '\r' '\n' > $f
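mv $f $f.1
# Keep only the entry bodies: drop everything through <entries>,
# everything from </entries> onward, and the </entry> closing tags.
sed -e '1,/<entries>/d' -e '/<\/entries>/,$d' -e '/<\/entry>/d' < $f.1 > $f
mv $f $f.2
# Convert to imp format: a $$$key line per entry, then ThML content.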
sed -e 's;<entry strongs="\([0-9]\+\)">;$$$\1;g' < $f.2 \
-e 's;<strongs>\([0-9]\+\)</strongs>;<a name="\1"><b>\1</b></a>;g' \
-e 's;<strongsref language="GREEK" strongs="\([0-9]\+\)"/>;<a href="sword://StrongsRealGreek/\1">\1</a>;g' \
-e 's;<strongsref language="HEBREW" strongs="\([0-9]\+\)"/>;<a href="sword://StrongsHebrew/\1">\1</a>;g' \
-e 's;<greek BETA="\([^"]\+\)" unicode="\([^"]\+\)"/>;\2 [\1];g' \
-e 's;<pronunciation>\([^<]\+\)</pronunciation>;\\<i>\1</i>\\<br />;g' \
-e 's;<see language="GREEK" strongs="\([0-9]\+\)"/>;<br />See Greek <a href="sword://StrongsRealGreek/\1">\1</a>.;g' \
-e 's;<see language="HEBREW" strongs="\([0-9]\+\)"/>;<br />See Hebrew <a href="sword://StrongsHebrew/\1">\1</a>.;g' \
-e 's;\(href="sword://Strongs\(RealGreek\|Hebrew\)/\|name="\)\([0-9]\)";\10\3";g' \
-e 's;\(href="sword://Strongs\(RealGreek\|Hebrew\)/\|name="\)\([0-9][0-9]\)";\10\3";g' \
-e 's;\(href="sword://Strongs\(RealGreek\|Hebrew\)/\|name="\)\([0-9][0-9][0-9]\)";\10\3";g' \
-e 's;\(href="sword://Strongs\(RealGreek\|Hebrew\)/\|name="\)\([0-9][0-9][0-9][0-9]\)";\10\3";g' \
-e 's;^ *;;g' \
-e '/^ *$/d' \
-e 's;$; ;' \
-e '/^\$\$\$/s; $;;' > $f
imp2ld strongsrealgreek.imp
chmod go+r strongsrealgreek.dat strongsrealgreek.idx
exit 0
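
The four near-identical substitutions in the script pad each Strong's
number in a link to five digits by adding one leading zero per pass.
The sketch below isolates that idea on a simplified pattern (hrefs
only, no name= anchors, no \| alternation); pad5 is a hypothetical
helper name, and the sample inputs are invented for the demonstration.

```shell
#!/bin/sh
# Each pass matches a number of exactly N digits between "/" and the
# closing quote and prepends one zero, so a 1-digit number is caught
# by all four passes and a 4-digit number only by the last.
# In the replacement, \10 is back-reference \1 followed by a literal 0.
pad5() {
    sed -e 's;\(sword://[A-Za-z]*/\)\([0-9]\)";\10\2";g' \
        -e 's;\(sword://[A-Za-z]*/\)\([0-9][0-9]\)";\10\2";g' \
        -e 's;\(sword://[A-Za-z]*/\)\([0-9][0-9][0-9]\)";\10\2";g' \
        -e 's;\(sword://[A-Za-z]*/\)\([0-9][0-9][0-9][0-9]\)";\10\2";g'
}

printf '%s\n' 'href="sword://StrongsRealGreek/7"' | pad5
# prints: href="sword://StrongsRealGreek/00007"
printf '%s\n' 'href="sword://StrongsHebrew/5624"' | pad5
# prints: href="sword://StrongsHebrew/05624"
```

Running the passes in ascending order of width matters: a number padded
by one pass is then the right width to be picked up by the next.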