[sword-devel] KJV2006 - 4th Beta

DM Smith dmsmith555 at yahoo.com
Sat Mar 25 16:12:44 MST 2006



Joachim Ansorg wrote:
> Hi,
> thank you for your continued work on the KJV module.
>
> I tested beta4 in BibleTime.
> I fixed BibleTime to support milestones with type x-p. The paragraph marker is 
> now shown correctly in the text. Should that milestone also start a newline 
> or a new paragraph or just insert the marker?
>   
In the traditional printing of the KJV, every verse begins on a new line.
The paragraph mark appears after the verse number.

Since our user interfaces can provide a richer experience, we can do 
with it as we please.

What I would suggest is that each paragraph start on a new line and 
perhaps have a blank line above it. You may wish to surround the 
paragraph marker with space to make the text more readable.
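
As a rough sketch of that suggestion (not BibleTime's or the SWORD engine's actual rendering code), a front-end could group verses into paragraphs like this. The function name, the tuple layout and the starts_paragraph flag are made up for illustration; the flag is assumed to be set by whatever filter sees the x-p milestone.

```python
PILCROW = "\u00b6"  # the paragraph mark


def render_paragraphs(verses):
    """Render (number, text, starts_paragraph) tuples as plain text.

    Each paragraph starts on a new line with a blank line above it, and
    the paragraph mark is surrounded by spaces after the verse number.
    """
    paragraphs = []
    for number, text, starts_paragraph in verses:
        if starts_paragraph or not paragraphs:
            # Open a new paragraph; only marked verses get the pilcrow.
            marker = f"{PILCROW} " if starts_paragraph else ""
            paragraphs.append([f"{number} {marker}{text}"])
        else:
            # Continue the current paragraph.
            paragraphs[-1].append(f"{number} {text}")
    # A blank line between paragraphs, verses run together inside one.
    return "\n\n".join(" ".join(p) for p in paragraphs)


sample = [
    (1, "First verse.", False),
    (2, "Second verse.", False),
    (3, "Third verse.", True),
]
print(render_paragraphs(sample))
```

Running the verses of a paragraph together is just one choice; keeping every verse on its own line, as in the traditional printing, only needs a different join.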

I still have to validate the paragraphing, as a cursory glance shows that 
we have many more paragraph markers than the 1769 version had.
I will probably preserve the extra ones via markup that front-ends should 
ignore (e.g. a different custom milestone).
> Multiple strongs for one word (e.g. created in Gen.1.1) are split by space. I 
> think the old module (other modules?) use | as split marker. Is there any 
> official guideline for the split marker?
>   

Yes. OSIS does not allow the | in lemma or morph. It does allow spaces.
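
For a front-end that has to cope with both conventions, something like the following might do. It is only a sketch: split_codes is a hypothetical helper, not part of any SWORD API, and the Strong's numbers are placeholders rather than the real values for Gen.1.1.

```python
def split_codes(value):
    """Split a lemma or morph attribute value into its individual codes."""
    # Treat the legacy "|" separator as whitespace, then split on any
    # run of whitespace, which is what the new module uses.
    return value.replace("|", " ").split()


# Placeholder Strong's numbers, chosen only for illustration.
print(split_codes("strong:H0001 strong:H0002"))
print(split_codes("strong:H0001|strong:H0002"))
```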
> The NT contains the morph prefix robinson. I always thought this should be used 
> as a module name; at least we do in BT =)
>   
We had been using x-Strongs and x-Robinsons, but OSIS does not allow these.

The prefix is an osis work prefix that is defined in the header element 
of the osisText element. It is not a module name. OSIS lets it be 
anything at all. But since the osis2mod process strips out the header 
element, we have to have a convention. I am using strong and robinson at 
Troy's suggestion.

I think the sword engine already recognizes these.
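
To illustrate the convention, here is a minimal sketch of how a front-end might take a code apart. split_work_prefix is a hypothetical name, and lower-casing the prefix is my assumption for tolerating both the older capitalised prefixes and the new lower-case ones; it is not something the sword engine is known to do.

```python
def split_work_prefix(code):
    """Return (prefix, identifier) for a code such as 'strong:G1234'."""
    prefix, sep, ident = code.partition(":")
    if not sep:
        # No work prefix at all: return the code unchanged.
        return "", code
    # Assumption: compare prefixes case-insensitively.
    return prefix.lower(), ident


print(split_work_prefix("strong:G1234"))
print(split_work_prefix("robinson:T-ASM"))
```

The caller can then map "strong" and "robinson" to whichever lexicon or morphology module it has installed.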

> Since it's lower case (the old one was Robinson I think) this doesn't work.
> What is the right handling of a morph prefix? Are there any official types 
> defined (can't remember if there are some in the specs right now).
>   

So far the only ones that have any real value are strong and robinson. 
There is strongsMorph, but Chris is working on a module to replace that. 
When he gets it done, we may change the prefix to something more 
constructive, maybe "chris" ;)
>
> Thank you for your help and work,
> Joachim
>
>   
>> L.Allan-pbio wrote:
>>     
>>> Thanks for working on this.
>>>
>>> I would add a vote for providing KjvLite (with all or most of the
>>> embedded tags removed.)
>>>       
>> Here you go: http://www.crosswire.org/~dmsmith/kjv2006/kjvlite.zip (If
>> it is not there, check back later)
>>
>> I created it by running the following xsl. I left divineName,
>> transChange, notes and q.
>>
>> If you don't want these, it is a trivial change from match="osis:w" to
>> match="osis:w|osis:q" (or whatever you don't want.)
>>
>> <?xml version="1.0" encoding="utf-8"?>
>> <xsl:stylesheet
>>   version="1.0"
>>   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>>   xmlns:osis="http://www.bibletechnologies.net/2003/OSIS/namespace">
>>
>>   <xsl:output method="xml" indent="no"/>
>>
>>   <xsl:template match="osis:w"><xsl:apply-templates /></xsl:template>
>>
>>   <!-- ignore markup notes -->
>>   <xsl:template match="osis:note[@type = 'x-strongsMarkup'] | osis:milestone[@type = 'x-strongsMarkup']"/>
>>
>>   <!-- Copy all remaining nodes -->
>>   <xsl:template match="@*|node()">
>>     <xsl:copy>
>>       <xsl:apply-templates select="@*"/>
>>       <xsl:apply-templates/>
>>     </xsl:copy>
>>   </xsl:template>
>>
>> </xsl:stylesheet>
>>
>>     
>>> LcdBible does a one-time detagging of verses with a one-pass
>>> state-machine defilter, and puts the entire contents in a 4 meg buffer
>>> .... and "brute-force" strstr searches are about 10x - 50x faster.
>>> I've been working on Boyer-Moore-Horspool searching, which can be
>>> quite a bit faster once the search word(s) are four characters or longer.
>>>
>>>
>>> ----- Original Message ----- From: "DM Smith" <dmsmith555 at yahoo.com>
>>> To: "SWORD Developers' Collaboration Forum" <sword-devel at crosswire.org>
>>> Sent: Saturday, March 25, 2006 8:59 AM
>>> Subject: Re: [sword-devel] KJV2006 - 4th Beta
>>>
>>>       
>>>> L.Allan-pbio wrote:
>>>>         
>>>>> DM,
>>>>>
>>>>> I tried out the "raw" KJV2006-Beta-4 with sword.exe rc1 ....
>>>>>
>>>>> drum roll, please <g>
>>>>>
>>>>> Well, nothing shows up. Am I providing "dummy checking" or otherwise
>>>>> doing something wrong?
>>>>>           
>>>> I guess you are providing "dummy checking". (I'm the *dummy* for not
>>>> actually testing this module! ;)
>>>>
>>>>         
>>>>> The KJV2006-ztext shows up ok, but not the raw vpl.
>>>>>
>>>>> When I try to look at nt and ot with a text editor, I'm getting a
>>>>> message:
>>>>> "nt" contains characters that do not exist in code page 1252 (ANSI -
>>>>> Latin I). They will be converted to the system default character, if
>>>>> you click ok."
>>>>>
>>>>> After clicking ok, I'm still not seeing anything.
>>>>>
>>>>> I looked at kjvraw.conf, and see that it is trying to use ztext:
>>>>> [KJVraw]
>>>>> DataPath=./modules/texts/rawtext/kjvraw/
>>>>> ModDrv=zText
>>>>> BlockType=BOOK
>>>>> CompressType=ZIP
>>>>>
>>>>> I changed that to WEB settings, and it shows up ok.
>>>>>           
>>>> It was a cut and paste error in the conf. I fixed the conf and
>>>> re-zipped the file. It should work now (still haven't tested it! :)
>>>>
>>>>         
>>>>> Mark 1:9 seems to be a reasonable length (was over 15000 before).
>>>>>
>>>>> Line 7972 and 8275 are about 3800 characters long. There are a
>>>>> number of verses that are nearly 3000 characters long.
>>>>>           
>>>> I took a look at the verse at line 7972 in the raw NT module and it
>>>> seems reasonable. It is just the by-product of deep, rich markup.
>>>>
>>>> The nature of the strong's markup for the NT portion of the KJV2003
>>>> module and preserved here is that every word in the TR is represented
>>>> with a <w> tag.
>>>> The form of the w tag supplies the following attributes:
>>>>    src="n" where n contains the position of the Greek word in the TR.
>>>>    lemma="strong:G1234" where it provides the strong's number that
>>>> can be looked up in Strongs.
>>>>    morph="robinson:T-ASM" where it provides the morphology code that
>>>> can be looked up in Robinsons
>>>> If the word is translated into a phrase and some of the words in the
>>>> phrase are not from it then the <w> tag is split into several
>>>> non-adjacent parts and each of these has extra attributes:
>>>>    type="x-split"
>>>>    subType="x-n" where n is the "split" number. (Not sure what it
>>>> means or how it was derived, or how it is used, if at all)
>>>> For Greek words that are not translated, the <w> element is fully
>>>> attributed but empty. Its position in the verse may be anywhere.
>>>> Additionally, in the KJV words that are in italics in the print
>>>> version are represented with <transChange type="added">added
>>>> words</transChange>
>>>> And if the verse contains a quote from Jesus, it is marked up as well.
>>>> Most verses in the NT part of the module also have elements
>>>> representing an audit trail.
>>>> There may be other markup.
>>>>
>>>> There are over 50 words from the Greek being represented in the verse.
>>>> There are a lot of italic words.
>>>> The verse contains a quote of Jesus.
>>>>
>>>> So yes, the length seems long, but it is very reasonable.
>>>>
>>>>         
>>>>> HTH,
>>>>>
>>>>> ----- Original Message ----- From: "DM Smith" <dmsmith555 at yahoo.com>
>>>>> To: "SWORD Developers' Collaboration Forum" <sword-devel at crosswire.org>
>>>>> Sent: Saturday, March 25, 2006 7:22 AM
>>>>> Subject: [sword-devel] KJV2006 - 4th Beta
>>>>>
>>>>>           
>>>>>> YAB (yet another beta) (downloadable from the links at the bottom
>>>>>> of http://www.crosswire.org/~dmsmith/kjv2006)
>>>>>>
>>>>>> Again, I really value your input and the time you take to evaluate
>>>>>> these betas.
>>>>>>             
>>>>> _______________________________________________
>>>>> sword-devel mailing list: sword-devel at crosswire.org
>>>>> http://www.crosswire.org/mailman/listinfo/sword-devel
>>>>> Instructions to unsubscribe/change your settings at above page
>>>>>           
>
>   