Posted by ChuckMcKnight at Feb 20, 2010 5:55:56 PM
Re: cross-comparing parallel Bible tool idea
[Post continued here.]

Move to the second word in both versions and repeat the above process.

- Continue moving through the verse in this manner until all words have been compared.

Once the first two versions have been compared and matched, add the third version.

- Repeat a process similar to the one above, but compare the third version against both of the first two versions, using their newly spaced-out lists.

- If a word in the third version is equivalent to the word in either the first or the second version, it still counts as a match.

Continue repeating this process until all the versions are compared and matched.
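
As a rough illustration of the "match against either earlier version" rule, here is a minimal Python sketch. The name matches_column, the idea of storing each aligned position as a list of words (with None marking a gap), and the default case-insensitive comparison are all my own assumptions for the example, not something settled in this thread.

```python
def matches_column(word, column, equivalent=None):
    """Return True if `word` matches any word already aligned at this position.

    `column` is a hypothetical representation: one entry per version already
    processed, with None where that version has a gap at this position.
    `equivalent` is any two-argument function deciding word equivalence.
    """
    if equivalent is None:
        # Assumed fallback: plain case-insensitive equality.
        equivalent = lambda a, b: a.lower() == b.lower()
    return any(other is not None and equivalent(word, other) for other in column)


# The third version's word matches the second version's word, so it counts.
print(matches_column("thee", ["you", "thee"]))
# A gap in the second version cannot produce a match.
print(matches_column("thee", ["you", None]))
```

A smarter equivalence test (for example one backed by a list of equivalent words) could be passed in as the third argument without changing the rest.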

With all the versions compared and matched, look for differences to highlight (change to the color selected for that version).

- If all versions use the exact same word with the same punctuation and capitalization, leave the word black (or whatever the default color is).

- When words, punctuation, or capitalization do not all match, check if any of the versions do match for that word.

- - If there is a majority for one word, punctuation, or capitalization, leave that type black and highlight all other types.

- - If there are no matches or if there is an equal amount of matches so that there is no one majority, highlight the difference in all versions.

- Check separately for words, punctuation or capitalization.

- - Where the difference is the whole word, highlight the whole word, but the punctuation might stay black if it is the same across the versions or in a majority.

- - If the word itself is the same but punctuation is different, highlight the punctuation itself that is not in the majority.

- - If the word itself is the same but capitalization is different, highlight the first letter of words that are not in the majority.
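
The highlighting rules above could be sketched roughly like this in Python. Everything named here (split_token, minority_flags, highlight_column) is hypothetical, and I am making a few assumptions: "majority" is read as the single most common value shared by at least two versions, punctuation means trailing punctuation only, and capitalization means the first letter only (so "LORD" versus "Lord" would not be caught by this sketch).

```python
import string
from collections import Counter


def split_token(token):
    """Split a token into (core word, trailing punctuation)."""
    core = token.rstrip(string.punctuation)
    return core, token[len(core):]


def minority_flags(values):
    """Return one bool per version; True means 'highlight this version'.

    Nothing is highlighted when all values agree. When one value is the
    clear most common one, only the others are highlighted. With no
    matches at all, or a tie for most common, everything is highlighted.
    """
    counts = Counter(values).most_common()
    top_value, top_count = counts[0]
    if top_count == len(values):
        return [False] * len(values)
    tied = len(counts) > 1 and counts[1][1] == top_count
    if top_count < 2 or tied:
        return [True] * len(values)
    return [value != top_value for value in values]


def highlight_column(tokens):
    """Decide what to highlight for one aligned position across versions.

    Words, punctuation, and capitalization are checked separately.
    """
    parts = [split_token(token) for token in tokens]
    word_flags = minority_flags([word.lower() for word, _ in parts])
    punct_flags = minority_flags([punct for _, punct in parts])
    cap_flags = minority_flags([word[:1].isupper() for word, _ in parts])
    return [
        {
            "word": word_flags[i],
            # Only the first letter is marked when the word itself is not
            # already highlighted as a whole.
            "first_letter": cap_flags[i] and not word_flags[i],
            "punctuation": punct_flags[i],
        }
        for i in range(len(tokens))
    ]


for flags in highlight_column(["Lord,", "Lord,", "lord;"]):
    print(flags)
```

In that example only the third version gets its first letter and its punctuation marked, since the word itself is the same in all three. This sketch compares words by exact spelling; equivalent words would need to be normalized first.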

*Determining equivalence where the word is not the same will probably be the more difficult part of the project.

The most effective way I can think to do it would be to have a thesaurus of possible equivalent words. For example: {you, thou, thee, ye, ...}, {LORD, Jehovah, Yahweh, ...}, {servant, bond-servant, bond-slave, slave, ...}, ...
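
To make the idea concrete, here is a minimal Python sketch of such a thesaurus, using only the example sets given above. The names EQUIVALENT_SETS, build_index, and are_equivalent are made up for illustration, and comparing case-insensitively is my own assumption.

```python
# Each set groups words treated as equivalent (lowercased for lookup).
EQUIVALENT_SETS = [
    {"you", "thou", "thee", "ye"},
    {"lord", "jehovah", "yahweh"},
    {"servant", "bond-servant", "bond-slave", "slave"},
]


def build_index(sets):
    """Map each word to the number of the set it belongs to."""
    index = {}
    for group_id, group in enumerate(sets):
        for word in group:
            # Limitation of this sketch: a word listed in two sets keeps
            # only the last set it appears in.
            index[word.lower()] = group_id
    return index


INDEX = build_index(EQUIVALENT_SETS)


def are_equivalent(a, b, index=INDEX):
    """True if the words are identical (ignoring case) or share a set."""
    a, b = a.lower(), b.lower()
    if a == b:
        return True
    group = index.get(a)
    return group is not None and group == index.get(b)


print(are_equivalent("thou", "You"))
print(are_equivalent("LORD", "Yahweh"))
print(are_equivalent("servant", "lord"))
```

Because the index is just a dictionary built from a list of sets, new equivalents can be added one set at a time as people find them.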

The problem is that this could probably take quite a while to compile. If such a tool already exists somewhere and could be borrowed for this project, that would be great. Otherwise it might just have to be a slowly growing list that is compiled as people find more words that are equivalent. Any suggestions for how to solve this problem more efficiently would be greatly appreciated.