[sword-devel] av11n mappings
Chris Little
chrislit at crosswire.org
Sun Oct 3 21:28:50 MST 2010
On 10/3/2010 6:19 PM, Robert Hunt wrote:
> Dear all,
>
> I've spent the last two weeks investigating the creation of a small
> open repository under the OpenScriptures banner for storing and
> maintaining (and even documenting) XML lists of versification schemes
> and international booknames, versification mappings, USFM and OSIS
> booknames and abbreviations, etc. I've already received a positive
> response from the author of Bibledit about starting with some of his
> lists. I realise that any particular format will never please everyone,
> but I'm interested in your comments on its potential usefulness. I
> found that I needed such things personally for a project so rather than
> reinventing them yet again, I figured that every Bible program must need
> them so why not make them available (if they're not already).
>
> I guess my questions are:
>
> 1/ Are these sorts of lists already freely available in a suitable
> place (preferably independent of a particular program)? If so, no
> need for me to proceed.
There's quite a lot of data currently available. Little of it is in XML
or any other human-friendly format, but you're welcome to mine our data
and put it in a more presentable form. However, if you work with real
data (extracting v11n data from actual Bibles), you'll quickly discover
that the number of v11n systems is nearly equal to the number of
different translations (excluding those that use the KJV v11n exactly,
since that system actually is quite common).
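
To illustrate the point about extracting v11n data from actual Bibles, here is a minimal sketch in Python. It assumes a Bible is available as a list of (book, chapter, verse) references, treats a v11n system as the last verse number of each chapter, and compares two such systems. The function names and the sample verse counts are made up for illustration; this is not the format or logic used by v11nsystem.pl.

```python
from collections import defaultdict


def extract_v11n(refs):
    """Build {book: [last verse of ch. 1, last verse of ch. 2, ...]}
    from an iterable of (book, chapter, verse) tuples."""
    maxima = defaultdict(dict)
    for book, chapter, verse in refs:
        maxima[book][chapter] = max(verse, maxima[book].get(chapter, 0))
    # Chapters missing from the input get a verse count of 0.
    return {
        book: [chapters.get(c, 0) for c in range(1, max(chapters) + 1)]
        for book, chapters in maxima.items()
    }


def diff_v11n(a, b):
    """Return (book, chapter, count_in_a, count_in_b) for every chapter
    where the two systems disagree; None marks a missing chapter."""
    diffs = []
    for book in sorted(set(a) | set(b)):
        ca, cb = a.get(book, []), b.get(book, [])
        for i in range(max(len(ca), len(cb))):
            va = ca[i] if i < len(ca) else None
            vb = cb[i] if i < len(cb) else None
            if va != vb:
                diffs.append((book, i + 1, va, vb))
    return diffs


# Illustrative input only: two hypothetical translations that split the
# end of the same one-chapter book differently.
translation_a = [("3John", 1, v) for v in range(1, 15)]
translation_b = [("3John", 1, v) for v in range(1, 16)]

v11n_a = extract_v11n(translation_a)
v11n_b = extract_v11n(translation_b)
differences = diff_v11n(v11n_a, v11n_b)
print(differences)
```

A single split or merged verse anywhere in the text is enough to make two systems differ, which is why real data tends to yield close to one v11n per translation.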
Most of our data is at
https://crosswire.org/svn/sword-tools/trunk/versification/ (including
the basicv11ns subdirectory). The v11nsystem.pl script converts a variety
of input formats into a v11n definition file in the format that we use.
An explanation of our canon definition format, found in the XML files at
the above address, is at
http://www.crosswire.org/wiki/Alternate_Versification/Canon_Definition_Format
CCEL has data (v11n & mapping) that they received from Wycliffe,
presented in an XML (OSIS-like) format:
http://www.ccel.org/refsys/refsys.html. However, their data is extremely
inaccurate. (I don't know who is to blame for the inaccuracy & errors.)
There's also v11n & mapping data available as part of the STEP spec:
http://www.crosswire.org/bsisg/download.htm
As for localized book names, Logos has a ton of this data, which they
collected through community contributions back around 2000, when they
were gearing up to release Logos Series X. They may or may not be
amenable to sharing this.
> Assuming they're not:
> 2/ Where might the information be gleaned from (with suitable
> permissions)?
> 3/ Apart from the above (versification schemes & mappings, USFM/OSIS
> booknames/filename/abbreviation standards, international
> booknames/abbreviations), what other lists do you suggest might be
> useful?
> 4/ Would your program be interested in taking advantage of such XML
> lists?
> 5/ If not, would another format be helpful?
XML is fine; we can make converters. Our interest in using this kind of
data would be dependent on its utility and its accuracy.
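
As a rough idea of how small such a converter could be, the sketch below reads a versification list from XML into a plain table of verse counts per chapter. The element and attribute names (versification, book, chapter, n, verses) are hypothetical, chosen only for this example; they are not the proposed repository's schema or the Canon Definition Format.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical input format, invented for this sketch.
SAMPLE = """<versification name="Example">
  <book id="Ruth"><chapter n="1" verses="22"/><chapter n="2" verses="23"/></book>
  <book id="3John"><chapter n="1" verses="14"/></book>
</versification>"""


def xml_to_table(xml_text):
    """Return (system name, {book id: [verse count per chapter]})."""
    root = ET.fromstring(xml_text)
    table = {}
    for book in root.findall("book"):
        # Sort by chapter number so the list index matches chapter order.
        chapters = sorted(book.findall("chapter"), key=lambda c: int(c.get("n")))
        table[book.get("id")] = [int(c.get("verses")) for c in chapters]
    return root.get("name"), table


name, table = xml_to_table(SAMPLE)
print(json.dumps({name: table}, indent=2))
```

From a table like this, emitting whatever format a given program needs is a matter of a short formatting loop.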
--Chris