[jsword-devel] Comparing texts

Troy A. Griffitts scribe at crosswire.org
Wed Aug 29 10:36:15 MST 2012


Sorry, I forgot to make this first link "guest viewable".  If you tried 
it earlier and it asked you to log in, please try again.

On 08/29/2012 07:15 PM, Troy A. Griffitts wrote:
> You might consider using CollateX, which does token-level (word or 
> other) collation and does a pretty good job of detecting things like 
> transpositions.  Here is how we use it at the INTF:
>
> http://ntvmr.uni-muenster.de/web/test/collation?key=Jn.3.16&collate=graph
>
> Our web service for this is here (with example parameters following):
>
> http://ntvmr.uni-muenster.de/community/vmr/api/collate/
> http://ntvmr.uni-muenster.de/community/vmr/api/collate/?w1=Hello+world&l1=x&w2=Hello+cruel+world&format=svg
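For anyone who wants to call the service from Java, here is a minimal sketch that rebuilds the example request URL. The parameter names (w1, l1, w2, format) are taken only from the example URL above; what each one means is not spelled out here, and the class and method names (CollateUrl, build) are hypothetical, not part of any published client.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;

final class CollateUrl {
    // Endpoint exactly as given in the message above.
    static final String ENDPOINT =
            "http://ntvmr.uni-muenster.de/community/vmr/api/collate/";

    /**
     * Appends the given parameters to the endpoint as a query string.
     * Pass an ordered map (e.g. LinkedHashMap) to keep parameter order stable.
     */
    static String build(Map<String, String> params) {
        StringBuilder url = new StringBuilder(ENDPOINT);
        char separator = '?';
        for (Map.Entry<String, String> entry : params.entrySet()) {
            url.append(separator)
               .append(encode(entry.getKey()))
               .append('=')
               .append(encode(entry.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    // Form-encodes a value; spaces become '+', as in the example URL.
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required of every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

Filling a LinkedHashMap with w1="Hello world", l1="x", w2="Hello cruel world" and format="svg" should give back the example URL above.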
>
> On 08/29/2012 06:50 PM, Chris Burrell wrote:
>> Hi all
>>
>> The current diffing produces some fairly strange results from time to 
>> time. I was wondering how much work it would be to make it diff word 
>> by word, rather than letter by letter. I've had a quick scan through 
>> the diffing engine, but it looks fairly complicated and I can't figure 
>> out how much of it is a copy of 
>> http://code.google.com/p/google-diff-match-patch and how much has 
>> changed.
>>
>> In the example below,
>>
>>            "And God saw th_at th_e light *, that it was good : and 
>> God divid*_was good. And God separat_ed the light from the darkness  
>>        "
>>
>> The new diff would hopefully not chop "that" and "the" in the first 
>> occurrence above. It would not chop off "divid" either, but would 
>> rather work with whole words, which would in turn make things slightly 
>> more readable.
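To make the word-by-word idea concrete, here is a rough sketch of a diff that treats whole words as the unit of comparison, using a plain longest-common-subsequence table. This is not the JSword diff engine and not google-diff-match-patch; the class name WordDiff and the "+ "/"- " output markers are made up for illustration, and splitting on whitespace is an assumption (punctuation stays attached to its word).

```java
import java.util.ArrayList;
import java.util.List;

final class WordDiff {
    /**
     * Returns one entry per word: "  word" (unchanged),
     * "- word" (only in oldText) or "+ word" (only in newText).
     */
    static List<String> diff(String oldText, String newText) {
        String[] a = tokens(oldText);
        String[] b = tokens(newText);

        // lcs[i][j] = length of the longest common subsequence
        // of a[i..] and b[j..], filled from the end backwards.
        int[][] lcs = new int[a.length + 1][b.length + 1];
        for (int i = a.length - 1; i >= 0; i--) {
            for (int j = b.length - 1; j >= 0; j--) {
                lcs[i][j] = a[i].equals(b[j])
                        ? lcs[i + 1][j + 1] + 1
                        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }

        // Walk the table from the start, emitting whole words only.
        List<String> out = new ArrayList<String>();
        int i = 0;
        int j = 0;
        while (i < a.length && j < b.length) {
            if (a[i].equals(b[j])) {
                out.add("  " + a[i]);
                i++;
                j++;
            } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
                out.add("- " + a[i++]);
            } else {
                out.add("+ " + b[j++]);
            }
        }
        while (i < a.length) {
            out.add("- " + a[i++]);
        }
        while (j < b.length) {
            out.add("+ " + b[j++]);
        }
        return out;
    }

    // Splits on runs of whitespace; an empty or blank text has no words.
    private static String[] tokens(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }
}
```

For example, WordDiff.diff("Hello world", "Hello cruel world") yields "  Hello", "+ cruel", "  world", so a changed word is always reported whole rather than being split mid-word. The table is quadratic in the number of words, which should be fine at verse length but is worth keeping in mind for longer passages.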
>>
>> (bold indicates strike through)
>>
>> Chris
>>
>>
>>
>> _______________________________________________
>> jsword-devel mailing list
>> jsword-devel at crosswire.org
>> http://www.crosswire.org/mailman/listinfo/jsword-devel
