<div dir="ltr">The reason it doesn&#39;t work on Genesis 1:2 is that it doesn&#39;t find a word with enough similarity, so it ends up in an infinite loop within a TODO block :)<br><br>Also, the ESV English-Greek Reverse Interlinear New Testament (<a href="http://www.crossway.org/product/158134628X">http://www.crossway.org/product/158134628X</a>) already exists, which is the same sort of thing (NT only, obviously).<br>

<br clear="all">God Bless,<br>Ben<br>-------------------------------------------------------------------------------------------<br>The Lord is not slow to fulfill his promise as some count slowness,<br>but is patient toward you, not wishing that any should perish,<br>

but that all should reach repentance.<br>2 Peter 3:9 (ESV)<br>

<br><br><div class="gmail_quote">On Fri, Sep 12, 2008 at 10:56 AM, Greg Hellings <span dir="ltr">&lt;<a href="mailto:greg.hellings@gmail.com">greg.hellings@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

Sorry, I attached a version of the tarball that had the executable in<br>

it and the list moderation caught it. &nbsp;Here&#39;s the cleaned version.<br>

See the detailed summary below.<br>

<br>

--Greg<br>

<br>

On Thu, Sep 11, 2008 at 7:52 PM, Greg Hellings &lt;<a href="mailto:greg.hellings@gmail.com">greg.hellings@gmail.com</a>&gt; wrote:<br>

&gt; Troy,<br>

&gt;<br>

&gt; The task that I&#39;m currently working on as research for my dissertation<br>

&gt; can possibly be leveraged. &nbsp;We are attempting to sort out image<br>

&gt; annotations (in an effort to learn how to automatically create them).<br>

&gt; As such, we are given a list of terms which annotate the contents of<br>

&gt; an image - but we want to know how similar the semantics of some of<br>

&gt; the terms are. &nbsp;Here is where I think parallels can be drawn:<br>

&gt;<br>

&gt; We use established semantic relatedness measurement techniques (see<br>

&gt; <a href="http://wn-similarity.sourceforge.net" target="_blank">wn-similarity.sourceforge.net</a> for some of the best tools currently<br>

&gt; available for that) to construct a graph connecting each term with all<br>

&gt; the other annotating terms, where the edge weight of the graph is the<br>

&gt; value of the average over all of the semantic measures that the<br>

&gt; WordNet Similarity measure returns (in time we will take a weighted<br>

&gt; average with all the values normalized between [0..1], since some<br>

&gt; measures only scale from [0..1/2] and others can take values up to<br>

&gt; 16,000 and more). &nbsp;We then do some strange graph partitioning tricks,<br>

&gt; etc -- that&#39;s someone else&#39;s domain.<br>
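
<br>A minimal Python sketch of the edge weighting described above, assuming every measure has a known upper bound. The measure names and bounds below are hypothetical placeholders (the bounds simply mirror the ranges mentioned in the message), and score_fn stands in for whatever returns the raw scores for a pair of terms.<br>

```python
from itertools import combinations

# Hypothetical measure names and upper bounds, for illustration only.
MEASURE_MAX = {"half_scale": 0.5, "large_scale": 16000.0}


def normalize(measure, value):
    """Scale a raw score into [0, 1] using the measure's assumed upper bound."""
    return min(max(value / MEASURE_MAX[measure], 0.0), 1.0)


def edge_weight(scores, weights=None):
    """Weighted average of the normalized scores for one pair of terms.

    scores maps a measure name to its raw value; weights defaults to 1.0 each.
    """
    if weights is None:
        weights = {measure: 1.0 for measure in scores}
    total = sum(weights[measure] for measure in scores)
    return sum(weights[m] * normalize(m, v) for m, v in scores.items()) / total


def build_graph(terms, score_fn):
    """Return {(a, b): weight} for every unordered pair of annotation terms."""
    return {(a, b): edge_weight(score_fn(a, b)) for a, b in combinations(terms, 2)}
```

Dividing by a fixed bound is only one way to normalize; a measure without a hard maximum would need clipping or rescaling against observed values instead.<br>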

&gt;<br>

&gt; However, you could possibly utilize the following modification of the<br>

&gt; technique. &nbsp;For each term in the ESV, find the similarity between it<br>

&gt; and every term in the KJV. &nbsp;If they are identical, set the value to 1,<br>

&gt; otherwise, use the WordNet::Similarity tools to produce a value. &nbsp;Then<br>

&gt; weight the value of the link by their relative positions in the text<br>

&gt; (that way two occurrences of the same term can be differentiated), for<br>

&gt; example, divide by abs(position(ESV) - position(KJV)) or something<br>

&gt; similar. &nbsp;Then assign the value for each term based on the word that<br>

&gt; it most closely resembles.<br>
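
<br>The proposal above could be sketched in Python roughly as follows. This is not the code in esvtag.cpp: the function names are hypothetical, semantic_fn stands in for a WordNet::Similarity lookup, the comparison is case-insensitive by assumption, and the divisor is 1 + abs(i - j) rather than the bare distance so that words at the same position do not divide by zero.<br>

```python
def similarity(esv_word, kjv_word, semantic_fn):
    """Return 1.0 for identical words (ignoring case), else the semantic score."""
    if esv_word.lower() == kjv_word.lower():
        return 1.0
    return semantic_fn(esv_word, kjv_word)


def match_words(esv_words, kjv_words, semantic_fn, threshold=0.0):
    """Map each ESV word index to the index of its best KJV match, or None.

    A candidate must score strictly above threshold, so a word with no
    sufficiently similar partner stays unmatched and the loop still ends.
    """
    matches = {}
    for i, esv_word in enumerate(esv_words):
        best_j, best_score = None, threshold
        for j, kjv_word in enumerate(kjv_words):
            # Penalize candidates that sit far away in the verse.
            score = similarity(esv_word, kjv_word, semantic_fn) / (1 + abs(i - j))
            if score > best_score:
                best_j, best_score = j, score
        matches[i] = best_j
    return matches
```

Because every ESV word is visited exactly once and unmatched words are recorded as None, this version always terminates, at the cost of leaving some words for a human to resolve.<br>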

&gt;<br>

&gt; This is very similar to what you&#39;re already doing, but not identical.<br>

&gt; I have modified the esvtag.cpp to use the included similarity.py to<br>

&gt; get the semantic distance from a few of the metrics that<br>

&gt; WordNet::Similarity uses (however, it scrapes a webpage to do so - you<br>

&gt; will do better, if you decide to use this system, to install the local<br>

&gt; Perl data and run the system locally) whenever the terms are not<br>

&gt; identical. &nbsp;It continues to work for Gen 1:1, but the program pegs out my<br>

&gt; processor and does not appear to have any intention of completing Gen<br>

&gt; 1:2 -- I don&#39;t know where the fault for that lies, but it does that<br>

&gt; both in your original version and in this version. &nbsp;Obviously, the<br>

&gt; weighting I proposed would work best when the version being used<br>

&gt; maintains very similar phrase ordering and structuring to the KJV, but<br>

&gt; I suppose any metric we use will require human supervision anyway.<br>

&gt;<br>

&gt; As a bonus, I also have it sticking contiguous terms which are part of<br>

&gt; the same source -- &quot;In the beginning&quot; -- into the same &lt;w&gt; tag.<br>
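
<br>Grouping contiguous words that share a source could look something like the Python sketch below. It assumes each word already carries an identifier for its source term; the identifiers in the usage note are placeholders, and the function returns plain groups rather than writing any markup.<br>

```python
from itertools import groupby


def group_by_source(tagged_words):
    """Merge runs of adjacent (word, source_id) pairs that share a source_id.

    Returns a list of (source_id, phrase) tuples in text order. Only adjacent
    words are merged; a later, separate occurrence of the same source_id
    starts a new group.
    """
    groups = []
    for source_id, run in groupby(tagged_words, key=lambda pair: pair[1]):
        groups.append((source_id, " ".join(word for word, _ in run)))
    return groups
```

For example, the pairs ("In", "s1"), ("the", "s1"), ("beginning", "s1"), ("God", "s2") collapse into one "In the beginning" group followed by a "God" group, and each group can then be wrapped in a single tag.<br>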

&gt;<br>

&gt; --Greg<br>

&gt; P.S. The attached tarball will clobber any current esvtag directory<br>

&gt; that&#39;s a child of where you unpack it - so be careful about that.<br>

<div><div></div><div class="Wj3C7c">&gt;<br>

&gt; On Thu, Sep 11, 2008 at 4:02 PM, Troy A. Griffitts &lt;<a href="mailto:scribe@crosswire.org">scribe@crosswire.org</a>&gt; wrote:<br>

&gt;&gt; Hey guys. &nbsp;I have a fun and useful challenge for anyone wishing to show off<br>

&gt;&gt; their prowess at problem solving and basic world domination.<br>

&gt;&gt;<br>

&gt;&gt;<br>

&gt;&gt; We have morphological data for the KJV. &nbsp;Lots of work by many people went<br>

&gt;&gt; into this data, to mark up each English word in the Bible text to the<br>

&gt;&gt; corresponding Hebrew or Greek word in the original text.<br>

&gt;&gt;<br>

&gt;&gt;<br>

&gt;&gt; We have many other Bibles with /similar/ wording to the KJV which are not<br>

&gt;&gt; yet marked up.<br>

&gt;&gt;<br>

&gt;&gt;<br>

&gt;&gt; Lane Dennis from Crossway (ESV publishers) is here at Tyndale House visiting<br>

&gt;&gt; and we&#39;ve talked in the past about helping them mark up their ESV text to the<br>

&gt;&gt; original.<br>

&gt;&gt;<br>

&gt;&gt;<br>

&gt;&gt; I have done most all of the grunt work for you!<br>

&gt;&gt;<br>

&gt;&gt; Attached is source for a program which attempts to insert &lt;w&gt; markup into<br>

&gt;&gt; the ESV markup using the KJV data.<br>

&gt;&gt;<br>

&gt;&gt; It is HEAVILY commented, requires latest SVN of the SWORD engine INSTALLED<br>

&gt;&gt; on your system, both the KJV and ESV modules INSTALLED, and has a nice<br>

&gt;&gt; little method:<br>

&gt;&gt;<br>

&gt;&gt; void matchWords(...)<br>

&gt;&gt;<br>

&gt;&gt; where you&#39;re given:<br>

&gt;&gt; a word list from ESV<br>

&gt;&gt; a word list from KJV<br>

&gt;&gt; a map from KJV word to an XMLTag &quot;&lt;w...&gt;&quot;<br>

&gt;&gt;<br>

&gt;&gt; and all you have to do is fill out the equivalent:<br>

&gt;&gt; map from ESV word to an XMLTag.<br>
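
<br>To make the shape of the task concrete, here is a small Python sketch of the inputs and output listed above. It is not the sample algorithm shipped with the program: the function name is hypothetical, the maps are keyed by word position (an assumption, so that repeated words stay distinct), and it only copies tags between identical words.<br>

```python
def match_exact(esv_words, kjv_words, kjv_tags):
    """Build the ESV map by copying tags from identical KJV words.

    esv_words, kjv_words: lists of words in verse order.
    kjv_tags: dict from KJV word index to its tag string.
    Returns a dict from ESV word index to a tag string; ESV words with no
    identical, unused KJV counterpart are simply left out.
    """
    esv_tags = {}
    used = set()
    for i, word in enumerate(esv_words):
        for j, candidate in enumerate(kjv_words):
            if j not in used and candidate == word and j in kjv_tags:
                esv_tags[i] = kjv_tags[j]
                used.add(j)  # each KJV word lends its tag at most once
                break
    return esv_tags
```

Anything this baseline leaves unmapped is where the real matching logic would have to take over.<br>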

&gt;&gt;<br>

&gt;&gt;<br>

&gt;&gt; As a sample, it currently has a really silly algorithm that actually works for<br>

&gt;&gt; Gen.1.1, so you have an example of the work you need to do.<br>

&gt;&gt;<br>

&gt;&gt; All you have to do is add the real magic that figures out which words in the<br>

&gt;&gt; ESV map to which words in the KJV (well, you get the idea).<br>

&gt;&gt;<br>

&gt;&gt; Have fun! &nbsp;And I&#39;m sure you can see where this is going and how useful it<br>

&gt;&gt; can be for future work!<br>

&gt;&gt;<br>

&gt;&gt;<br>

&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp;-Troy.<br>

&gt;&gt;<br>

&gt;&gt;<br>

&gt;&gt;<br>

&gt;&gt;<br>

</div></div>&gt;&gt; _______________________________________________<br>

&gt;&gt; sword-devel mailing list: <a href="mailto:sword-devel@crosswire.org">sword-devel@crosswire.org</a><br>

&gt;&gt; <a href="http://www.crosswire.org/mailman/listinfo/sword-devel" target="_blank">http://www.crosswire.org/mailman/listinfo/sword-devel</a><br>

&gt;&gt; Instructions to unsubscribe/change your settings at above page<br>

&gt;&gt;<br>

&gt;<br>

</blockquote></div><br></div>