[jsword-devel] Comparing texts

DM Smith dmsmith at crosswire.org
Wed Aug 29 11:00:56 MST 2012


It was based upon an earlier version of diff-match-patch, which was written in JavaScript, not Java. My selection criterion was that it had to have a license compatible with JSword. When the original author was hired by Google, the code changed to a license that was incompatible for porting. Since then it has been ported to Java 5.

I ported the earlier version to Java 1.4, but I broke it out into multiple classes. (We might be able to eliminate our version and use the Google version directly.)

I think there is a way to have it do a word-based match, but it would require code changes:
http://code.google.com/p/google-diff-match-patch/wiki/LineOrWordDiffs
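
As I understand that page, the idea is to encode each distinct word as a single character, run the ordinary character diff on the encoded strings, and then decode the result back into words. The following is only a minimal sketch of that idea, not JSword's code and not the diff-match-patch API: the class and method names (WordDiffSketch, wordsToChars, diffWords) are hypothetical, and a plain longest-common-subsequence pass stands in for the real diff engine.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in JSword
// or in diff-match-patch.
final class WordDiffSketch {

    // Encode every whitespace-separated word as one char. The dictionary is
    // shared between both texts so equal words get equal codes.
    // Assumption: fewer than 65536 distinct words in total.
    static String wordsToChars(String text, List<String> words, Map<String, Character> index) {
        StringBuilder encoded = new StringBuilder();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return "";
        }
        for (String word : trimmed.split("\\s+")) {
            Character code = index.get(word);
            if (code == null) {
                code = Character.valueOf((char) words.size());
                words.add(word);
                index.put(word, code);
            }
            encoded.append(code.charValue());
        }
        return encoded.toString();
    }

    // Returns one entry per word: "= word" (unchanged), "- word" (deleted)
    // or "+ word" (inserted).
    static List<String> diffWords(String oldText, String newText) {
        List<String> words = new ArrayList<>();
        Map<String, Character> index = new HashMap<>();
        String a = wordsToChars(oldText, words, index);
        String b = wordsToChars(newText, words, index);

        // lcs[i][j] = length of the longest common subsequence of
        // a.substring(i) and b.substring(j). This stands in for the
        // character-level diff engine.
        int[][] lcs = new int[a.length() + 1][b.length() + 1];
        for (int i = a.length() - 1; i >= 0; i--) {
            for (int j = b.length() - 1; j >= 0; j--) {
                lcs[i][j] = a.charAt(i) == b.charAt(j)
                        ? lcs[i + 1][j + 1] + 1
                        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }

        // Walk the table and decode each char back into its word.
        List<String> result = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < a.length() && j < b.length()) {
            if (a.charAt(i) == b.charAt(j)) {
                result.add("= " + words.get(a.charAt(i)));
                i++;
                j++;
            } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
                result.add("- " + words.get(a.charAt(i++)));
            } else {
                result.add("+ " + words.get(b.charAt(j++)));
            }
        }
        while (i < a.length()) {
            result.add("- " + words.get(a.charAt(i++)));
        }
        while (j < b.length()) {
            result.add("+ " + words.get(b.charAt(j++)));
        }
        return result;
    }
}
```

With this sketch, diffWords("God divided the light", "God separated the light") yields "= God", "- divided", "+ separated", "= the", "= light", so a word is only ever kept, deleted or inserted whole. Splitting on whitespace is the simplest choice; punctuation stays attached to its neighbouring word unless the tokenizer is made smarter.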


On Aug 29, 2012, at 12:50 PM, Chris Burrell <chris at burrell.me.uk> wrote:

> Hi all
> 
> The current diffing produces some fairly strange results from time to time. I was wondering how much work it would be to make it do a word-by-word diff, rather than letter by letter. I've had a quick scan through the diffing engine, but it looks fairly complicated and I can't figure out how much of it is a copy of http://code.google.com/p/google-diff-match-patch and how much has changed.
> 
> In the example below, 
> 
>            "And God saw that the light , that it was good : and God dividwas good. And God separated the light from the darkness          "
> 
> The new diff would hopefully not chop "that" and "the" in the first occurrence above. It would not chop "divid" off either, but would instead work with longer, whole words, which would in turn make things slightly more readable.
> 
> (bold indicates strike through)
> 
> Chris
