[sword-devel] GenBook osisID and URIs
DM Smith
dmsmith555 at yahoo.com
Tue May 13 17:44:17 MST 2008
On May 13, 2008, at 8:14 PM, Chris Little wrote:
> DM Smith wrote:
>> Chris Little wrote:
>>> I think simply
>>>
>>> sword://Josephus/The War of the Jews/Book 1/Chapter 2/Section 3
>>>
>>> should work, or
>>>
>>> sword://Josephus/The%20War%20of%20the%20Jews/Book%201/Chapter%202/Section%203
>>>
>>> encoded as an URL.
>>>
>> This does not answer the osisID question. If one had to encode the
>> GenBook key into an osisID for an OSIS-encoded GenBook, how would it be
>> represented, given that spaces are not allowed and periods have
>> reserved meaning?
>>
>> Based upon the answer to that, what would the URL be?
>
> Ok, I understand now, but I don't really have an answer. The way we
> would want this GenBook key to be represented as an osisID is something
> like "Josephus:Wars.1.2.3". (The title is wrong in the module. Wars is
> supposed to be plural.)
>
> If I were to redo the module (which I probably will), I would do the
> osisIDs like that, and we would end up with a TreeKey of /Wars/1/2/3.
>
> At Perseus, the URL is
> http://www.perseus.tufts.edu/hopper/text.jsp?doc=Perseus%3Atext%3A1999.01.0148%3Abook%3D1%3Awhiston+chapter%3D2%3Awhiston+section%3D3
>
> So their hierarchy is: "book 1":"whiston chapter 2":"whiston section 3".
> (Whiston is the translator, and his divisions of the works of Josephus
> represent one of two significant systems that I know of.)
>
> But they also let you look up "J. BJ 1.2.3" (J = Josephus, BJ = De
> Bello Judaico) to get that passage.
>
> In their TEI source, they have:
> <body>
> <div1 type="Book" n="1" org="uniform" sample="complete">
> ...
> <milestone n="2" unit="Whiston chapter" />
> ...
> <milestone n="3" unit="Whiston section" />
> <milestone n="54" unit="section" />
>
>
>
> Since we want a solution here and now, we need to find a way to
> encode our TreeKeys as osisIDs and figure out how to pass those back
> and forth as URIs.
>
> osisIDs can use Letters and Numbers (in the document's encoding, UTF-8
> for example). All other characters are to be escaped. I _believe_, but
> would like to confirm this somehow, that '_' is the replacement
> character for space, and all other characters can be expressed via
> \{character} escapes. (So if you want '_' you have to encode it as
> '\_'.)
>
> That said, I think our osisID for the CURRENT version of Josephus
> would be the rather ugly:
>
> Josephus:The_War_of_the_Jews/Book_1/Chapter_2/Section_3
>
> We can either encode that directly as an URI:
>
> sword://Josephus/The_War_of_the_Jews/Book_1/Chapter_2/Section_3
>
> or decode it and re-encode as:
>
> sword://Josephus/The%20War%20of%20the%20Jews/Book%201/Chapter%202/Section%203
>
> (ignoring any changes we might decide to make to my previous proposal
> regarding embedded '/'.)
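To make the escaping rule quoted above concrete, here is a minimal sketch in Python. It assumes the rule exactly as Chris describes it (letters and digits pass through, '_' stands for a space, everything else gets a backslash escape), which he himself has not confirmed. It also assumes '/' is left as the separator between TreeKey segments, as in his example. The function names are made up for illustration and are not part of the SWORD or JSword API.

```python
from urllib.parse import quote


def escape_segment(segment: str) -> str:
    """Escape one TreeKey segment using the (unconfirmed) osisID rule."""
    out = []
    for ch in segment:
        if ch.isalnum():        # letters and digits, including non-ASCII
            out.append(ch)
        elif ch == " ":         # assumed: '_' replaces a space
            out.append("_")
        else:                   # assumed: anything else becomes \{character}
            out.append("\\" + ch)
    return "".join(out)


def unescape_segment(escaped: str) -> str:
    """Reverse escape_segment so the key can be re-encoded another way."""
    out = []
    chars = iter(escaped)
    for ch in chars:
        if ch == "\\":
            out.append(next(chars, ""))   # take the escaped character literally
        elif ch == "_":
            out.append(" ")
        else:
            out.append(ch)
    return "".join(out)


def to_osis_id(module: str, path: list[str]) -> str:
    """Build 'Module:seg/seg/...' with '/' kept as the segment separator."""
    return module + ":" + "/".join(escape_segment(s) for s in path)


def to_sword_uri(module: str, path: list[str]) -> str:
    """Build the percent-encoded form of the proposed sword:// URI."""
    encoded = "/".join(quote(s, safe="") for s in path)
    return "sword://" + quote(module, safe="") + "/" + encoded


path = ["The War of the Jews", "Book 1", "Chapter 2", "Section 3"]
print(to_osis_id("Josephus", path))
print(to_sword_uri("Josephus", path))
```

Under those assumptions, the two print calls reproduce the osisID and the percent-encoded URI given in the quoted message.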
One of the things that I have been thinking about is a module having
two keys per entry: an internal key and an external key, just like the
Perseus example.

Both would be valid for lookup. The external keys for a genbook would
be the tree text as we see it today. The internal keys would be like the
examples you gave above.

I've done this for several projects, and it is pretty simple to layer on
our existing genbook module.

I'm thinking about doing it synthetically in BibleDesktop for TEI
Strong's dictionaries, digging out the <orth> value and allowing the
user to choose which to show/search.
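
As a rough illustration of the two-keys-per-entry idea, here is a small Python sketch. It is not how the genbook module or BibleDesktop is implemented. The class name, the method names and the placeholder entry text are all hypothetical, and the internal key shown is just one of the forms suggested earlier in the thread.

```python
class TwoKeyIndex:
    """Hypothetical index where each entry has an internal and an external key."""

    def __init__(self):
        self._entries = []   # entry texts, in insertion order
        self._by_key = {}    # either kind of key -> position in _entries

    def add(self, internal_key: str, external_key: str, text: str) -> None:
        position = len(self._entries)
        self._entries.append(text)
        # Both keys point at the same entry, so both are valid for lookup.
        self._by_key[internal_key] = position
        self._by_key[external_key] = position

    def lookup(self, key: str) -> str:
        return self._entries[self._by_key[key]]


index = TwoKeyIndex()
index.add(
    internal_key="Wars.1.2.3",
    external_key="/The War of the Jews/Book 1/Chapter 2/Section 3",
    text="(entry text placeholder)",
)

# Either key reaches the same entry.
same = index.lookup("Wars.1.2.3") == index.lookup(
    "/The War of the Jews/Book 1/Chapter 2/Section 3"
)
```

A front end could then show or search on whichever key the user prefers while storing only one copy of each entry.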
-- DM