[sword-devel] KJV2006 Project
Troy A. Griffitts
scribe at crosswire.org
Tue Feb 14 07:43:13 MST 2006
DM,
I am really excited about your desire to update the KJV2003 text. Much
work went into that project, and a special thank you goes to all the
volunteers who helped tag the NT!
http://crosswire.org/sword/kjv2003/status/
To find the most recent raw data, all things KJV2003 live on our server
at: ~sword/html/kjv2003
It might be a mess in there. All my utilities for scanning the text
and computing which verses are complete, etc. are in there. I think I
also have utilities there to convert the data to a sword module; their
output is probably the latest KJV module data we publish.
What I, personally, would like to see done:
Fix invalid markup: <note/><note/> is the most glaring.
OT: The KJV2003 project was primarily aimed at a NT. An OT with
Strongs was freely distributable already. The data we used for the OT
is also in that directory. I believe it was downloaded from MPJ's site
ebible.org and was likely originally from a text from the Bible Foundation.
OT: Our current module has dropped almost all "'s". I think it's just
a problem with my conversion scripts, which try to place OSIS <w> tags
AROUND each word/phrase, rather than AFTER a word/phrase, as the
original markup has it.
OT: It personally frustrates me that the body of the text we use for
the KJV OT has lowercased all personal pronouns for God. Can we find a
better text? I'm fairly certain the KJV in most of its printed
incarnations had these uppercase.
NT: Articles: All simple definite articles are left as empty tags in
the verses. The logic was that in English we have both an indefinite
and definite article, whereas Greek only has a definite article:
a house OIKOS
the house hO OIKOS
So, for consistency, English nouns were tagged the same whether they
had a definite article in Greek, or were anarthrous. The desired
output would be something like:
TR: <w src="1">hO</w><w src="2">OIKOS</w>
KJV: <w src="1 2">the house</w>
Currently it is:
<w src="1"></w> <w src="2">the house</w>
I think the correcting script logic is something like:
Do I have an empty tag with Strong's 3588 (article)?
Does the morphology of <w src="[mysource]+1"> begin with "N-" (noun) and
equal my morphology in other respects?
If so, combine the src numbers and drop the empty tag.
The scholars who worked on tagging this text were told that this would
be the logic applied, so they tagged accordingly.
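
A rough sketch of that correcting logic in Python, in case it helps as a
starting point. It is not one of the existing utilities. It assumes each
<w> carries lemma="strong:G####" and morph="robinson:..." attributes next
to src; those attribute names and the function name merge_articles are my
assumptions for illustration, so adjust them to whatever the raw data
really uses. Only the src numbers are combined, as described above.

```python
import re

# Assumed attribute layout (not confirmed by the raw data):
#   <w lemma="strong:G3588" morph="robinson:T-NSM" src="1"></w>
W_TAG = re.compile(r'<w\s+([^>]*)>(.*?)</w>', re.S)
ATTR = re.compile(r'(\w+)="([^"]*)"')


def _morph(attrs):
    """Return the bare morphology code, e.g. 'T-NSM' from 'robinson:T-NSM'."""
    return attrs.get("morph", "").split(":")[-1]


def _is_article(attrs):
    """True if any lemma value is Strong's G3588 (the Greek article)."""
    return any(v.split(":")[-1] == "G3588"
               for v in attrs.get("lemma", "").split())


def merge_articles(verse):
    """Fold empty article tags into the noun tagged with the next src number."""
    words = [{"attrs": dict(ATTR.findall(m.group(1))),
              "text": m.group(2), "span": m.span(), "drop": False}
             for m in W_TAG.finditer(verse)]

    # Index every element by each src number it covers.
    by_src = {}
    for w in words:
        for n in w["attrs"].get("src", "").split():
            if n.isdigit():
                by_src[int(n)] = w

    for w in words:
        src = w["attrs"].get("src", "")
        if w["text"].strip() or not _is_article(w["attrs"]) or not src.isdigit():
            continue
        noun = by_src.get(int(src) + 1)
        if noun is None or noun is w:
            continue
        art_tail = _morph(w["attrs"]).partition("-")[2]    # e.g. 'NSM'
        noun_code = _morph(noun["attrs"])
        # Noun must be "N-..." and agree with the article in the rest.
        if noun_code.startswith("N-") and noun_code.partition("-")[2] == art_tail:
            noun["attrs"]["src"] = src + " " + noun["attrs"]["src"]
            w["drop"] = True

    # Rebuild the verse, skipping dropped tags and the spaces after them.
    out, pos = [], 0
    for w in words:
        start, end = w["span"]
        out.append(verse[pos:start])
        pos = end
        if w["drop"]:
            while pos < len(verse) and verse[pos] == " ":
                pos += 1
        else:
            attrs = " ".join('%s="%s"' % kv for kv in w["attrs"].items())
            out.append("<w %s>%s</w>" % (attrs, w["text"]))
    out.append(verse[pos:])
    return "".join(out)
```

Run on the "the house" example above (with the assumed lemma and morph
attributes filled in), this should leave a single <w src="1 2"> element.
An empty article tag whose following word is not a matching noun is left
untouched so it can be reviewed by hand.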
A better starting point than the raw data of the NT from
~sword/html/kjv2003 is probably a modified mod2osis output of our
current module. You can apply the attached patch to ensure that no
filters are working on the text and you get raw data output.
Thank you again for your willingness to help. This is a very much needed
effort!
-Troy.
DM Smith wrote:
> The KJV Bible is the most downloaded Sword module at CrossWire.
> It is often the first impression that people get when looking at all the
> different Sword front-ends.
>
> There are some problems with the KJV that have been reported and need to
> be fixed.
>
> Anyone else interested in working on upgrading the KJV2003?
>
> As I am just finishing the installer for Sword, I would like to start
> this effort.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mod2osis.kjv2003.patch
Type: text/x-patch
Size: 1490 bytes
Desc: not available
Url : http://www.crosswire.org/pipermail/sword-devel/attachments/20060214/96b13f6b/mod2osis.kjv2003.bin