[sword-devel] GenBook osisID and URIs
Chris Little
chrislit at crosswire.org
Tue May 13 17:14:15 MST 2008
DM Smith wrote:
> Chris Little wrote:
>> I think simply
>>
>> sword://Josephus/The War of the Jews/Book 1/Chapter 2/Section 3
>>
>> should work, or
>>
>> sword://Josephus/The%20War%20of%20the%20Jews/Book%201/Chapter%202/Section%203
>>
>> encoded as a URL.
>>
> This does not answer the osisID question. If one had to encode the
> GenBook key into an osisID for an OSIS encoded GenBook how would it be
> represented, given that spaces are not allowed and periods have reserved
> meaning?
>
> Based upon the answer to that, how would the URL be?
Ok, I understand now, but I don't really have an answer. The way we
would want this GenBook key to be represented as an osisID is something
like "Josephus:Wars.1.2.3". (The title is wrong in the module. Wars is
supposed to be plural.)
If I were to redo the module (which I probably will), I would do the
osisIDs like that, and we would end up with a TreeKey of /Wars/1/2/3.
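As a rough illustration of that mapping, here is a minimal Python sketch. The function name osis_id_to_treekey is made up for this example and is not part of the SWORD API. It assumes the osisID has the form "work:a.b.c" with no escaped characters in it.

```python
def osis_id_to_treekey(osis_id):
    """Split a 'work:a.b.c' osisID into (work, '/a/b/c').

    Hypothetical helper: assumes one ':' after the work prefix and
    '.' as the only hierarchy separator, with no escapes present.
    """
    work, _, ref = osis_id.partition(":")
    # Each '.'-separated part becomes one level of the TreeKey path.
    return work, "/" + "/".join(ref.split("."))


print(osis_id_to_treekey("Josephus:Wars.1.2.3"))
```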
At Perseus, the URL is
http://www.perseus.tufts.edu/hopper/text.jsp?doc=Perseus%3Atext%3A1999.01.0148%3Abook%3D1%3Awhiston+chapter%3D2%3Awhiston+section%3D3
So their hierarchy is: "book 1":"whiston chapter 2":"whiston section 3".
(Whiston is the translator, and his divisions of the works of Josephus
represent one of two significant systems that I know of.)
But they also let you do lookup of "J. BJ 1.2.3" (J = Josephus, BJ = De
Bello Judaico) to get that passage.
In their TEI source, they have:
<body>
<div1 type="Book" n="1" org="uniform" sample="complete">
...
<milestone n="2" unit="Whiston chapter" />
...
<milestone n="3" unit="Whiston section" />
<milestone n="54" unit="section" />
Since we want a solution here and now, we need to find a way to
encode our TreeKeys as osisIDs and figure out how to pass those back and
forth as URIs.
osisIDs can use Letters and Numbers (in the document's encoding, UTF-8
for example). All other characters are to be escaped. I _believe_, but
would like to confirm this somehow, that '_' is the replacement
character for space, and all other characters can be expressed via
\{character} escapes. (So if you want '_' you have to encode it as '\_'.)
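To make those rules concrete, here is a small Python sketch of the escaping as I have described it, which is still unconfirmed against the OSIS documentation. The function names are invented for illustration. It works on one TreeKey segment at a time, so the '/' between segments is handled separately.

```python
def escape_osis_segment(segment):
    """Escape one TreeKey segment using the rules described above.

    Assumed rules (not verified against the OSIS spec): letters and
    digits pass through, space becomes '_', and every other character
    is written as a backslash followed by that character.
    """
    out = []
    for ch in segment:
        if ch.isalnum():
            out.append(ch)
        elif ch == " ":
            out.append("_")
        else:
            out.append("\\" + ch)
    return "".join(out)


def unescape_osis_segment(escaped):
    """Reverse escape_osis_segment under the same assumed rules."""
    out = []
    chars = iter(escaped)
    for ch in chars:
        if ch == "\\":
            # The character after a backslash is taken literally.
            out.append(next(chars, ""))
        elif ch == "_":
            out.append(" ")
        else:
            out.append(ch)
    return "".join(out)


print(escape_osis_segment("The War of the Jews"))
print(escape_osis_segment("snake_case"))
```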
That said, I think our osisID for the CURRENT version of Josephus would
be the rather ugly:
Josephus:The_War_of_the_Jews/Book_1/Chapter_2/Section_3
We can either encode that directly as a URI:
sword://Josephus/The_War_of_the_Jews/Book_1/Chapter_2/Section_3
or decode it and re-encode as:
sword://Josephus/The%20War%20of%20the%20Jews/Book%201/Chapter%202/Section%203
(ignoring any changes we might decide to make to my previous proposal
regarding embedded '/'.)
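The two URI forms above could be produced along these lines. This is only a sketch: treekey_to_uris is a hypothetical name, the sword:// layout is the one proposed in this thread rather than anything implemented, and the space-to-underscore step is a simplified stand-in for full osisID escaping.

```python
from urllib.parse import quote


def treekey_to_uris(module, segments):
    """Return (underscore_form, percent_encoded_form) for a TreeKey.

    Hypothetical helper. 'segments' holds the unescaped TreeKey levels.
    Embedded '/' inside a segment is percent-encoded here because
    safe="" is passed to quote(); how it should really be handled is
    still an open question.
    """
    # Form 1: osisID-style segments (space -> '_') placed in the URI.
    osis_style = "/".join(quote(s.replace(" ", "_"), safe="") for s in segments)
    # Form 2: the decoded key, percent-encoded segment by segment.
    percent_style = "/".join(quote(s, safe="") for s in segments)
    prefix = "sword://" + quote(module, safe="") + "/"
    return prefix + osis_style, prefix + percent_style


key = ["The War of the Jews", "Book 1", "Chapter 2", "Section 3"]
for uri in treekey_to_uris("Josephus", key):
    print(uri)
```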
--Chris