[jsword-devel] Patching one versification to another

Chris Burrell chris at burrell.me.uk
Fri Mar 15 01:31:34 MST 2013


Agreed on all your points below around sword and files and flexibility.

Splits are only used to refer to the master system. They are not necessary
but they add resolution to the mapping lookups. Here's the reason.  If a
and b split verses in a similar way we would like to be able to go from a
to b without ending up with a bigger chunk of text.

Gen.1.1=Gen.1.1a
Gen.1.2=Gen.1.1b

If both versifications define the above, we can go from Gen.1.1 in one
versification to the equivalent in the target. Without the split, the
intermediate master resolves to Gen.1.1, which in turn resolves to both
Gen.1.1 and Gen.1.2 in the target.
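
To make that concrete, here is a minimal Java sketch of the lookup through
the master. All class and method names are hypothetical (nothing here is
existing JSword API), and it assumes each versification is held as a simple
map from its own verse ids to master ids, with a missing entry meaning "same
id as the master".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch only: none of these names exist in JSword.
class SplitLookupSketch {

    // Build a "verse id -> master ids" table from alternating key/value arguments.
    static Map<String, List<String>> toMaster(String... pairs) {
        Map<String, List<String>> table = new LinkedHashMap<>();
        for (int i = 0; i < pairs.length; i += 2) {
            table.computeIfAbsent(pairs[i], k -> new ArrayList<>()).add(pairs[i + 1]);
        }
        return table;
    }

    // Drop a trailing split letter, e.g. "Gen.1.1a" -> "Gen.1.1".
    static String wholeVerse(String masterId) {
        return masterId.replaceAll("(?<=\\d)[a-z]$", "");
    }

    // Convert one verse id from the source versification to the target via the master.
    static Set<String> convert(String verse,
                               Map<String, List<String>> source,
                               Map<String, List<String>> target) {
        // Assumption: no entry means the id is identical in the master.
        List<String> masterIds = source.getOrDefault(verse, Collections.singletonList(verse));
        Set<String> result = new LinkedHashSet<>();
        for (String masterId : masterIds) {
            boolean mapped = false;
            // Reverse lookup: every target verse that refers to this master id.
            for (Map.Entry<String, List<String>> entry : target.entrySet()) {
                if (entry.getValue().contains(masterId)) {
                    result.add(entry.getKey());
                    mapped = true;
                }
            }
            if (!mapped) {
                // Unmapped in the target: round a split part up to its whole verse.
                result.add(wholeVerse(masterId));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> plain = toMaster("Gen.1.1", "Gen.1.1", "Gen.1.2", "Gen.1.1");
        Map<String, List<String>> split = toMaster("Gen.1.1", "Gen.1.1a", "Gen.1.2", "Gen.1.1b");
        System.out.println(convert("Gen.1.1", plain, plain)); // [Gen.1.1, Gen.1.2]
        System.out.println(convert("Gen.1.1", split, split)); // [Gen.1.1]
    }
}
```

With the plain mapping the result widens to two verses; with the splits it
stays at one, which is the extra resolution I'm after.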

In your KJV example (chapter 4), I'd like to introduce new ids instead of
extending the KJV verses, e.g. Dan.4e.5, where e means extra. The extra ids
can sit in the middle or at the beginning, so long as we adopt a convention
for naming them. The id could even be completely different, say
songofchildren.1, if we're referring to extra bits in Daniel.

People know the kjv so when working on a mapping I think it's important to
have the ids refer to a well known system. It also means that you can use
the normal ids and look it up straight away to see what content belongs to
it in the kjv. The extra ids and splits can be documented. The benefit of
this system is that the master reference only needs to be documented, not
coded, although it could be coded if we wanted to. The other problem with
modifying the kjv is that, because one versification is slightly odd, you
affect all the other ones, as they then have to cope with the shift of 7
verses. Introducing a new versification becomes harder. Splits and new ids
cater for this.
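
As a rough illustration, here is a Java sketch of your chapter 4 example
(quoted below) using extra ids. The book (Daniel), the "Dan.4e.N" naming and
the class name are all assumptions for illustration only; the point is that
the KJV ids Dan.4.1 to Dan.4.20 keep their meaning and only versification B
carries the offset.

```java
// Hypothetical sketch only: the id convention "Dan.4e.N" is an assumption.
class ExtraIdSketch {

    // Map verse v of chapter 4 in versification B to a master id.
    // B has verses 6-11 that the KJV lacks; B 12 onwards equals KJV 6 onwards.
    static String toMaster(int v) {
        if (v <= 5) {
            return "Dan.4." + v;        // identical to the KJV
        }
        if (v <= 11) {
            return "Dan.4e." + (v - 5); // extra ids, outside the KJV numbering
        }
        return "Dan.4." + (v - 6);      // lines up with KJV 6..20 again
    }

    public static void main(String[] args) {
        System.out.println(toMaster(6));  // Dan.4e.1
        System.out.println(toMaster(12)); // Dan.4.6
    }
}
```

No other versification's mapping has to change when B is added.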

For absent verses we could define an absent section in the file, maybe
as an array of properties:
absent=gen.1.1
absent=gen.*
...
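
A minimal Java sketch of reading such a section, assuming a plain
key=value file and a trailing * as the only wildcard (both are assumptions,
as are the names). One thing to watch: java.util.Properties keeps only the
last value for a repeated key, so the repeated absent= lines would need to be
read by hand, as here, or given distinct keys.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch only: file layout and wildcard rule are assumptions.
class AbsentSectionSketch {

    // Collect the value of every "absent=" line, keeping duplicates of the key.
    static List<String> readAbsent(String text) {
        List<String> patterns = new ArrayList<>();
        for (String raw : text.split("\\R")) {
            String line = raw.trim();
            if (line.startsWith("absent=")) {
                patterns.add(line.substring("absent=".length()).trim());
            }
        }
        return patterns;
    }

    // True if the id equals a pattern, or starts with the prefix before a trailing '*'.
    static boolean isAbsent(String osisId, List<String> patterns) {
        String id = osisId.toLowerCase(Locale.ROOT);
        for (String p : patterns) {
            String pattern = p.toLowerCase(Locale.ROOT);
            if (pattern.endsWith("*")) {
                if (id.startsWith(pattern.substring(0, pattern.length() - 1))) {
                    return true;
                }
            } else if (id.equals(pattern)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> patterns = readAbsent("absent=gen.1.1\nabsent=gen.*\n");
        System.out.println(isAbsent("Gen.3.5", patterns));  // true
        System.out.println(isAbsent("Exod.1.1", patterns)); // false
    }
}
```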

Just a few thoughts.
Chris
On 15 Mar 2013 01:10, "DM Smith" <dmsmith at crosswire.org> wrote:

> When searching the archives use the word "mapping", there are some
> pertinent discussions.
>
> If you come up with a mapping that works that's great! If our data
> structure is very clear and can be migrated to C++ (by someone else) that's
> even better. If we can externalize it as a file that can be loaded by
> JSword, then SWORD can use it directly.
>
> BTW, I hope to be able to externalize the V11N so that a module can supply
> one. With mapping (patching) another file would need to be supplied.
>
> The representation of a mapping in Java is a lot easier (as you have
> described) than getting the data correct.
>
> Some embedded comments below:
>
> On Mar 14, 2013, at 7:03 PM, Chris Burrell <chris at burrell.me.uk> wrote:
>
> I will wait to see if anyone responds, but I thought I'd emailed a few
> months ago and got no response (although I can't find my original email).
> While I understand that support for versification as a whole might be
> tricky, I'm not sure I agree with the fact that patching from one to
> another is tricky as well.
>
> If you allow your master versification to have split verses,
> then everything becomes rather trivial.
>
> *2 verse in the original, 1 verse in the KJV-based master*
> Gen.1.1=Gen.1.1
> Gen.1.2=Gen.1.1
>
>
> Right.
>
>
> *1 verse in the original, 2 verses in the master*
> Gen.1.1=Gen.1.1-Gen.1.2
>
> Is there a good representation that the rest of the verses are off by one?
>
>
> *Using splits, so that 2 separate versifications can refer to the same
> parts of verses*
> Gen.1.1=Gen.1.1a
> Gen.1.2=Gen.1.1b
> Gen.1.3=Gen.1.1c
>
>
> I'm not sure that splits should be represented. The modules certainly
> wouldn't mark the division between a, b, c, ....
>
>
> Then another versification can also refer to the same parts. It would
> become necessary to keep track of what the parts are, such that they can be
> easily re-used when appropriate. For example, a second versification might
> be mapped as follows:
>
> Gen.1.1=Gen.1.1a-Gen.1.1b
> Gen.1.2=Gen.1.1c
>
> *No mapping*
> Osis IDs mean the same thing.
>
> yes.
>
> I think the above covers, "Split verse", "Merged verse", Different verse
> boundaries.
>
> *Chapter boundaries can be mapped equally the same:*
> Gen.1.1=Gen.1.1
> Gen.1.2=Gen.2.1
>
>
> Having a compact representation for the resulting shift would be good.
>
>
> indicates that Genesis 1:2 can be found in Gen 2:1 in the master.
>
> *Extra verses*
> We simply introduce some identified ids (osis ids, or other form of unique
> identifier) in the master to identify the content of these verses
>
> Additional verses within a chapter can be represented using the above.
>
>
> So, the KJV has a chapter 4 with 20 verses. A v11n B differs from the KJV
> in that B has 6-11 that are not in the KJV and 12 to the end of the chapter
> in B is the same as 6-20 in the KJV.
>
> So then the KJV is not the base, but a modified representation of the KJV
> that has 7 more verses in chapter 4. This modified representation is what I
> was calling the "rosetta stone".
>
> I think a mapping has to be bi-directional.
>
> Missing verses
> A v11n might leave out something in the middle of a KJV chapter. (The ESV
> has this, but shrewdly has a note in those verses explaining that the verse
> is not Biblical).
>
> You said that if a mapping is not present then the verses are the same
> between the two. How do you represent a hole?
>
>
> *Psalm headings:*
> Ps.53.1=Ps.53.1 (not required, or required if map it to 'nothing' or '0')
> Ps.53.2=Ps.53.1
> Ps.53.3=Ps.53.2
>
> We can reduce this if we want to introduce a less verbose way of
> mapping things:
> Ps.53.2-7-=Ps.53.1-+1 (where minus simply indicates that there is an
> offset of 1 compared to the master)
>
> I hadn't noticed this when I asked about a compact representation.
>
>
> I don't think we need to introduce splits on the left hand-side. The
> reason being, you can't do anything with an OSIS id of Gen.1.1a, since
> you're going to retrieve a whole verse anyway, so we can keep things with
> splits only-ever on the right hand side.
>
>
> I don't think we need splits on either side.
>
> For verse ranges, we can expand out to its list of verses contained in the
> source versification first. Then there needs to be a choice by the
> user/software of whether we're attempting to compare a contiguous section
> in the target versification, or verses of the same content..
>
> ---
>
> I saw some posts on the archives, but not a lot. One by Greg H, whom I
> agree with, in that the KJV versification should be the master + extended
> bits from the apocrypha. That makes it easier for people to write mappings.
>
> I agree with your last point, that coming up with the mappings can be
> hard, and it will take someone to understand where the verses really
> differ in content. But I think the above system is pretty simple.
>
> I disagree with the idea of having a master versification being "the
> rosetta stone", if by that we mean inflexible to change.
>
>
> I didn't mean it should be inflexible to change. It may be resistant to
> change.
>
> There are bound to be changes (new/custom versifications, bug fixes,
> oversights, etc.) We can easily provide a tool that inserts splits and
> rewrites all the mappings safely for that.
>
>
> Yes.
>
>
> I'm happy to work with others if they want to work on this, but I've found
> most of the posts on the lists about this are rather old. I'll wait and see
> what comes of my post, but on the other hand, STEP needs this rather
> quickly (doing text comparisons, parallel texts and interlinears). I have
> the mappings for the Masoretic Text which would solve most of STEP's
> interlinear issues when using 1 or more ancient texts. And there's nothing
> to say that we don't put in the mappings one by one, especially if we
> provide a tool that can do that.
>
>
> Go for it.
>
>
> To be clear, I'm not trying to solve your last point of "In each of these
> it is important to determine what will be represented by a v11n".
>
>
> My point was that the granularity of a split would never be represented in
> the data structures of a v11n or in a module, so it can safely be ignored.
>
> This someone else can do when they come up with a different v11n. The
> problem of mapping them is distinctively separate and the one I'm trying to
> address.
>
> Maybe I'm missing something?
>
>
> Be ready for surprises. I don't know anything that's missing.
>
> Chris
>
>
>
> On 13 March 2013 22:46, DM Smith <dmsmith at crosswire.org> wrote:
>
>> The work has started. And it is very hard. Not trying to discourage you,
>> but it'd be better to work with others. Give it some time. Perhaps search
>> the sword-devel archives for discussions regarding the difficulties and the
>> work that needs to go into it.
>> Harry Plantinga (over at CCEL) has done some work on this, too. (Hope I
>> have his name right.)
>>
>> Here are some of the issues of comparing one version to another:
>>
>>    - Split verse. This causes an increase in the number of verses in one
>>    chapter.
>>    - Merged verses. This causes a decrease in the number of verses in
>>    one chapter.
>>    - Different verse boundaries. This causes no change in the number of
>>    verses in one chapter. (E.g. Some Greek NT have the end of John 1:3 as the
>>    start of John 1:4)
>>    - Split chapters. This causes the reduction of the number of verses
>>    in one chapter and the increase in the number of chapters in a book. (The
>>    German tradition splits the last chapter of Malachi)
>>    - Different chapter boundaries. This causes the number of verses in
>>    one chapter to increase(decrease) and in the next to decrease(increase).
>>    (The Greek NT's sometimes put the last few verses of one chapter into the
>>    beginning of the other. or the first few verses of one chapter at the end
>>    of the prior).
>>    - Additional or fewer verses at the end of a chapter. (Mark 16:9-20
>>    which some regard as an apocryphal addition to the NT.)
>>    - Additional or fewer verses in the middle of a chapter. (The
>>    placement of the samaritan woman at the well differs by tradition.)
>>    - Additional or fewer chapters in the middle of a book.
>>    - Psalms having the canonical introduction to the Psalm being verse 1.
>>    - The Apocrypha is a mess of inconsistencies. In some versifications,
>>    the apocrypha is inserted into canonical books of the Bible. E.g. Esther,
>>    Daniel.
>>
>>
>> In each of these it is important to determine what will be represented by
>> a v11n. For example, verses that have different verse boundaries would not
>> be represented by different v11ns.
>>
>> I think Chris Little was working on a new v11n that basically took all
>> (many?) of the different traditions and merged them together. Then this
>> would be the rosetta stone. I don't know how far along he is on that. Or
>> whether it is still in ideation.
>>
>> With the "super" v11n, it would be used transitively to line up the
>> different verses.
>>
>> Some v11ns are traditions by language. E.g. Germans and Russians each
>> have their own particular v11ns. It'd be reasonable for a bi-lingual user
>> to view books in parallel from different v11ns. I wonder whether native
>> speakers will be needed to help figure out the mapping.
>>
>> While I have studied English, French, Italian, Spanish, German, Greek and
>> Hebrew and have a Masters of Divinity, I don't feel competent to tackle
>> this. (code wise yes, but not the actual mapping) Also, there are other
>> things that more readily occupy my frontal lobes.
>>
>> -- DM
>>
>>
>> On Mar 13, 2013, at 5:54 PM, Chris Burrell <chris at burrell.me.uk> wrote:
>>
>> Hi DM
>>
>> I'm wondering about doing some of the work to convert from one v11n to
>> another, especially if I don't hear back on the sword-devel.
>>
>> I'm thinking of doing the following :
>> - create a converter that goes through to a "master" versification as you
>> suggested. We define the mappings for those OsisIDs that don't match. The
>> master version is based on the KJV versification (for ease of being able to
>> create new mappings) + the books that aren't in the KJV.
>>
>> - mappings include just the bits that don't match up.
>>
>> - mappings can be defined for sets of verses with an offset, e.g. Psalm
>> 51:1-5=Psalm 51:1+1 (where +1 indicates an offset of 1 to be applied to the
>> source).
>>
>> - Have the concept of split verses (i.e. Rom 1:1a, Rom 1:1b, etc.)
>>
>> - A tool to create a split in the master versification, which rewrites
>> all the current versifications, such that when someone writes a mapping and
>> needs to split a verse, they can introduce the split without adversely
>> affecting every other mapping.
>>
>> - the lookup process takes a String (OSIS ID) with a source and target
>> versification. (also overload that with Key/Verse). It then queries the
>> master mapping, which returns 1 or more entries (an entry being a whole
>> verse, or set of split verses). Then goes to reverse mappings for the
>> target versification and does the same. For incomplete verses on one side
>> or the other, we round up the closest verse.
>>
>> - The reason for basing the versification on KJV (or some other English
>> versification), is that it's easy to work against. The alternative would be
>> to go for the most split versification ever found, but that becomes painful
>> in the future if someone decides to split another verse in 2 parts.
>>
>> - When introducing a split, we would want to record what the split
>> actually refers to (i.e. what content). This wouldn't be used by the
>> library, but instead be useful for people coming along and writing new
>> mappings.
>>
>> Those are some of my thoughts so far.
>> Chris
>>
>>
>>
>
>