[jsword-devel] Comparing texts

Troy A. Griffitts scribe at crosswire.org
Wed Aug 29 10:15:06 MST 2012

You might consider using CollateX, which does token-level (word or 
other) collation and does a pretty good job of detecting things like 
transpositions.  Here is how we use it at the INTF:


Our web service for this is here (with example parameters following):


On 08/29/2012 06:50 PM, Chris Burrell wrote:
> Hi all
> The current diffing produces some fairly strange results from time to 
> time. I was wondering how much work it would be to make it do a 
> word-by-word diff, rather than letter-by-letter. I've had a quick scan 
> through the diffing engine, but it looks fairly complicated and I can't 
> figure out how much of it is a copy of 
> http://code.google.com/p/google-diff-match-patch and how much has changed.
> In the example below,
>            "And God saw th_at th_e light *, that it was good : and God 
> divid*_was good. And God separat_ed the light from the darkness         "
> The new diff would hopefully not chop "that" and "the" in the first 
> occurrence above. It would not chop "divid" off either, but would rather 
> keep whole words, which would in turn make things slightly more readable.
> (bold indicates strike through)
> Chris
> _______________________________________________
> jsword-devel mailing list
> jsword-devel at crosswire.org
> http://www.crosswire.org/mailman/listinfo/jsword-devel
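
For illustration, the word-by-word idea discussed in this thread can be sketched roughly as below. This is not the JSword diff engine, not CollateX, and not the INTF web service; it only uses Python's standard difflib module on word tokens. The names tokenize and word_diff are hypothetical, and the two sample sentences are an assumed reconstruction of the two renderings being compared in the quoted example.

```python
import difflib
import re


def tokenize(text):
    # Split into whole words and single punctuation marks,
    # so a word can never be cut in half by the diff.
    return re.findall(r"\w+|[^\w\s]", text)


def word_diff(old, new):
    """Return a list of (op, tokens) pairs; op is "equal", "delete" or "insert"."""
    a, b = tokenize(old), tokenize(new)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    result = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            result.append(("equal", a[i1:i2]))
        else:
            # A "replace" opcode is reported as a deletion plus an insertion.
            if i1 < i2:
                result.append(("delete", a[i1:i2]))
            if j1 < j2:
                result.append(("insert", b[j1:j2]))
    return result


# Assumed sample texts (two renderings of the same verse).
old = "And God saw the light, that it was good: and God divided the light from the darkness."
new = "And God saw that the light was good. And God separated the light from the darkness."

for op, tokens in word_diff(old, new):
    marker = {"equal": " ", "delete": "-", "insert": "+"}[op]
    print(marker, " ".join(tokens))
```

Because the comparison runs over tokens rather than characters, a word such as "divided" is either kept or struck through as a whole. Unlike CollateX, this sketch makes no attempt to detect transpositions.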
