[osis-core] Re: Thoughts and Questions on compare file
Jim_Albright at wycliffe.org
Wed Oct 27 09:28:29 MST 2004
Thanks for looking at my problem domain.
>>>>>>>>see comments below
Jim Albright
704 843-0582
Wycliffe Bible Translators
Patrick Durusau <Patrick.Durusau at sbl-site.org>
10/27/2004 08:54 AM
To: Jim_Albright at wycliffe.org
cc: Jeff_Gayle at sil.org, osis-core at bibletechnologieswg.org
Subject: Thoughts and Questions on compare file
Jim,
Some thoughts and questions on the latest compare file.
At first blush I did not see anything that we can't handle in OSIS now
or with minor modification.
>>>>>>>>great
Decided to copy the osis-core list so we can get comments from others as
well.
Will be checking today on actually producing a more restricted schema,
that is one that takes out the x- capability and leaves only enumerated
values. I take it you are aware that your software interface does not
have to display the possibility of an x- value for attributes? That is
to say you could only display enumerated values and not leave the user
any mechanism for departing from the list?
>>>>>>>>>>>our interface will be a flat file structure Translation Editor,
like Word with paragraph and character style names
Side note to osis-core on the restricted schema: What I am envisioning
is a very small schema that imports osis-core and redefines/restricts
attribute values to the enumerated lists. Since such an instance would
be a conforming OSIS document (the greater always includes the lesser) I
don't see any compatibility problems.
>>>>>> exactly what I want
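A minimal sketch of what "enumerated values only" means in practice, written in Python rather than as the restricted schema itself. It assumes the allowed list is the osisTitles enumeration quoted later in this message, and it uses a namespace-free sample for brevity (real OSIS documents carry a namespace). The function name and sample markup are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the allowed list is the osisTitles enumeration quoted in this message.
ALLOWED_TITLE_TYPES = {"acrostic", "continued", "main", "parallel", "psalm", "sub"}

def disallowed_title_types(xml_text):
    """Return every title type value that is outside the enumerated list."""
    root = ET.fromstring(xml_text)
    bad = []
    for title in root.iter("title"):
        value = title.get("type")
        if value is not None and value not in ALLOWED_TITLE_TYPES:
            bad.append(value)
    return bad

# Hypothetical sample: one enumerated value and one x- extension value.
SAMPLE = """<div>
  <title type="main">Genesis</title>
  <title type="x-chapterHead">Chapter Two</title>
</div>"""

print(disallowed_title_types(SAMPLE))
```

A document that passes this kind of check uses only enumerated values, so it would still be acceptable to the unrestricted schema.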
Specific comments follow:
Alluded_Text: Posting a note today to add this as enumerated value on seg.
>>>good
Attribution: Should this be on lg or l? Thinking that it is more likely
on l as part of a lg. On the other hand, don't you have the same case
where you would not be using lg or l?
>>>either will work ... there is a better case for line group as some
closures have two lines.
>>>you may want to add attributes to closer
Book_ID: I am not sure what you are asking for here? Isn't this already
contained in the osisID?
>>>> think we can skip okay
Chapter_Head, Chapter_Label, Chapter_Number: Aren't these variants of
title? That is to say that a chapter has a title and these are
'additional' titles of some particular type?
>>>>toss out chapter label
>>>>Chapter Number is <chapter osisID="Genesis.1">; we already have that in osisID
>>>>Chapter Head is like Chapter Two
My first reaction is to have:
Chapter_Head = title type="main"
>>>>>I wouldn't be able to round trip this then
Chapter_Label = title type="sub"
>>>>toss out
Chapter_Number = title type="sub"
>>>> no
Granted that osisTitles (for type on title) only enumerates
<xs:enumeration value="acrostic"/>
<xs:enumeration value="continued"/>
<xs:enumeration value="main"/>
<xs:enumeration value="parallel"/>
<xs:enumeration value="psalm"/>
<xs:enumeration value="sub"/>
>>>>>>>>>> I think value="chapter" would work
Citation_Line1 (etc): Question, you list lg but I assume l?
>>>>> <citation> would be preferred
BTW, we have otPassage on seg. So, would <l><seg>...</seg></l>, meet the
need here?
Err, actually just looked at the John the Baptist example and so you
would want something like otPassage on lg. Hmmm, then you could do the
line1, line2, etc. with XSLT. Actually would reduce the amount of markup
you would need since if I am in a lg type=otPassage, then lines 1, 2,
etc. fall out from the structure. I think that works for me if it works
for you.
>>>> yes that is where I would want it .... but citation is more generic
>>>> I have cases where a key verse from the following book is cited in the
>>>> introduction ... thus citation rather than otPassage.
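A rough sketch of how line1, line2, etc. can fall out of the structure. The suggestion above was XSLT; this uses Python's ElementTree instead, purely for illustration. The style names (Citation_Line1 and so on) follow the compare file, the type value is the one discussed above, and the sample text is a placeholder.

```python
import xml.etree.ElementTree as ET

# Placeholder sample; namespace omitted for brevity.
SAMPLE = """<lg type="otPassage">
  <l>first line</l>
  <l>second line</l>
  <l>third line</l>
</lg>"""

def line_styles(lg):
    """Pair each l in an lg with a style name derived from its position."""
    return [
        (f"Citation_Line{position}", line.text)
        for position, line in enumerate(lg.findall("l"), start=1)
    ]

for style, text in line_styles(ET.fromstring(SAMPLE)):
    print(style, "->", text)
```

Because the line number comes from position, no per-line markup is needed in the source; the same approach would apply if the container were a more generic citation element.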
Citation_Paragraph: Looks like a block quote that contains a paragraph,
which as you know can contain a reference element. I think this is what
we used to call <cit> in TEI, which had a <q> followed by a <ref> (don't
hold me to the names) element.
>>>>> Citation Paragraph and Citation Line1 are related ... if we can put
>>>>> a <cit> around them then we only need to put in p, l, ref ....
Not sure what would be different about using a block quote that contains
a reference element.
>>>>>>block quote is descriptive
>>>>>> citation is meaning based
Citation_Reference: Why isn't this simply a reference with a type? That
is to say all references are citation_references in some sense of the
term. Since we have an element for marking all references, why not use
that and add a type if necessary?
>>>>>>if we have <cit> or <citation> then just <ref> works ... the type is
inferred by context
Closing: What is the problem with closer here?
>>>>>> two types of closers are needed ... one for end of book or end of preface,
and another for "says the Lord" in prophecies.
Congregational_Response: I will post this to the list. Suggestion for
attribute value? response? congregation? on lg.
>>>>>>>either is okay with me
Copyright_Statement: Covered under rights in the header. This is the
standard location in Dublin Core. Don't think we gain anything by adding
another potential location for the information.
>>>>>>>> I see the need for three groups of things on the Copyright Page
>>>>>>>> this is still under development in TE
>>>>>>>> But for formatting the copyright page the three units are
>>>>>>>> Credits, Rights, and Copyright Statement ... with a possible code
for internal control (SIL adds the job code here)
>>>>> see note on Rights below
Unless you mean to say that the copyright page as an artifact needs to
be encoded. I suspect we could add a type to div but to be honest, I
don't see the point. Just make it a div and if you need copyright
information, get it from the header.
>>>> Yes I want div type="copyrightPage"
Credits: Same here, I could counsel just paragraphs with the usual
sub-elements. Don't gain anything if you have properly prepared the
header which has the very long enumeration of roles for credit, etc. I
guess in part I don't see any reason to privilege a poor retelling of
information already presented in a useful fashion in the header.
>>>>>> credits require the page number for each use ... so David C. Cook
lets us use pictures found on page x,x,x,x,x .... which may need to be
added by hand ... also sometimes the info in header is in English but in
Credits it will be in national language.
Not saying people should not enter it, but it is just an artifact of
printing and not something you will need to identify later for
processing. For those purposes, use the header information.
>>>>>Printing is our main goal so it is much more than an artifact
Doxology: Let me check on that one. I know we have discussed, probably
should be added.
>>>>> Yes .... found at end of each book of Psalms .... usually formatted
centered text
Embedded_text (all entries): We have discussed, posting to list for
adding type to q.
>>>>>> q type="embeddedText" is great
>>>>>> div type="embeddedText" works too
Emphasis: ??? Sorry, why isn't this covered by hi?
>>>>>hi says what it LOOKS like
>>>>>In a text to speech conversion how do you say italic text? How do you
say superscript?
Gloss: Hmmm, would require annotateRef so you could link to the word or
phrase being glossed. Suggest same places and content model as hi?
>>>>>>>> yes it is similar to emphasis, hand, ..... hi is okay as long as
there is <hi type="gloss">
or probably better <seg type="gloss">
Hand: Perhaps it is just confusion with the use of 'hand' to mean in
transcription circles the scribal hand and not references to hand in the
text, but I don't see this as a different element. Fair enough that the
text says: 'in my own hand' but that does not seem to me to be a
separate element in the structure of the text. We should discuss this
one. I would suggest hi or seg at first blush.
>>>>>>>>>>>but 'in my own hand' is formatted differently very often so I
need to distinguish it
>>>>>>>>> <seg type="hand">
Inscription_Paragraph: Why doesn't the inscription element work here?
>>>>> inscription/p should work fine... I wasn't thinking
Interlude: Selah is enumerated under osisLine. Suggest same for Interlude?
>>>> good Interlude and Selah are interchangeable
Intro (all): Suggest introduction type on div?
>>>>> yes <div type="introduction"> already there
Line1-*: I assume from your notes you are handling these with XPath
expressions?
Name_of_God: enumerate types, I assume you don't have any to add to the
list I posted?
>>>>> this is for YHWH
Paragraph_Continuation: As we discussed, this is handled automatically
in tree representations.
>>> yep
Parallel_Passage_Ref: Yes, handled by reference element
>>>>>>>>> yes
Quoted_Text: Why isn't this handled by q? As opposed to seg type =
quoted text?
>>>>> quoted text is an OT quote in the NT ... as opposed to direct speech
Refrain: add type to lg?
>>>> yes
Rights: In terms of accessible information, handled by Dublin Core
element in header. Is there some reason to duplicate here?
>>>>> may be enough here for content but
>>>>> <p type="credits">
>>>>> <p type="rights">
>>>>> <p type="copyrightStatement">
>>>>> would really help for formatting
Section_Head_List: Do you mean a list within a list? If so, note that
list contains list and all lists have head.
>>>>>> I would prefer <div type="list"> to allow for the A, B, C in Hebrew
for PSA 119
>>>>>> and the 1, 2, 3 in PRO 22
See_in_Glossary: ?? The reference element is not empty. Reference is not
limited to simply being Bible references but can contain a reference to
another part of the work, such as a glossary, perhaps a map, etc.
>>>> the need is to be able to put a star in the printed text, and
hyperlink in HTML
So_Called: Actually these are examples of the mentioned element.
>>>>> maybe a second look would help here..... I do have problems
distinguishing them but
>>>>> I believe there are two categories "mentioned" and "so called"
BTW, the example in the help file is incorrect. The last occurrence of
'sinners' is not a mentioned or so_called element. The quoted statement
of the Pharisees was *using* the term. The preceding uses are
examples of mentioned, that is, the speaker of those occurrences was not
*using* the term.
>>>>>> thanks for catching that
>>>> see note above
>>>> it looks like the error is in the NIV as so called is formatted with
quotes ....
>>>> Jesus uses sinners without quotes in the last line
>>>> I think the Pharisees' comment would mean true sinners but the NIV
put it in 'sinners'
>>>> so I have marked it correctly for the formatting of the text but NIV
should change
>>>> NIV is the only English text so far that I find the "sinners" used.
Speech_lines (all): I assume you are going to handle these with XPath?
>>>>> ??
Stanza_Break: add type to lg?
>>>>yes <lg type="stanza">
>>>>> just a few additions go a long way towards resolving my problems ...
like <lg type="stanza">
Title_Main and Title_Tertiary: I assume the type on title works for
main. Since sub titles should be inside of title, is there a need for
another type here?
>>>>>>three levels exist: main, secondary, tertiary
>>>>>> I can only find : main, sub in osis so would like one more
>>>>>> : main, sub, sub (nested sub should work okay ... tertiary is very
rare)
Thinking <title type="main"> blah, blah <title type="sub">blah, blah
<title type="sub">blah, blah</title>(closes tertiary title) </title>
(closes secondary title) </title> closes main title, and allow you to do
uniform XPath expressions for all cases.
If you are going to use styles, etc., so no one sees the XML, suggest
the embedding method for more reliable XPath processing.
>>>>>> ?? please elaborate
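One way to read the embedding suggestion: if sub titles are nested inside their parent title, the level of any title is simply its nesting depth, so a single rule covers main, secondary, and tertiary. In XPath 1.0 that depth would be count(ancestor-or-self::title). The sketch below shows the same idea in Python; ElementTree has no ancestor axis, so it walks the nesting recursively. The sample text and function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Placeholder sample using the nested form described above; namespace omitted.
SAMPLE = (
    '<title type="main">Main'
    '<title type="sub">Secondary'
    '<title type="sub">Tertiary</title>'
    '</title>'
    '</title>'
)

def title_levels(title, depth=1):
    """Yield (depth, leading text) for a title and each title nested inside it."""
    yield depth, (title.text or "").strip()
    for child in title.findall("title"):
        yield from title_levels(child, depth + 1)

levels = list(title_levels(ET.fromstring(SAMPLE)))
print(levels)
```

With sibling titles instead of nested ones, a tertiary title and a secondary title would both be type="sub" at the same depth, and the processing would need some other cue to tell them apart.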
Untranslated_Word: ??? Sorry, you have me on that one and I could not
find an example in the help file. Wouldn't this be foreign?
>>>>> foreign should work as it is in back translation and untranslated
Variant_Section_Paragraph/Head/Tail: Not sure what is being requested.
Look at rdg. By definition, rdg is a variant so everything inside is
about a variant. Perhaps if you could say a bit more about this one.
>>>>>> all of some endings to MRK are in italic showing it is a variant
>>>>> so <div type="variant"> would work well here
Verse_Number_Alternate: Hmmm, why not have more than one osisID, with
the alternative prefixed by a work? Display is of course up to the
application.
>>>>> so that would be <verse osisID="Genesis.32.1"><verse
osisID="xxx:Genesis.32.2">
>>>>> that would work ... and also on Chapter Number Alternate
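A small sketch of how an application might separate the primary verse number from an alternate one. It assumes the identifiers are whitespace-separated and that a work prefix is joined to the reference with a colon, as in the xxx: example above; the function name is hypothetical and "xxx" is the placeholder work name from this thread.

```python
def split_osis_ids(attr_value):
    """Split an osisID value into (work, reference) pairs; work is None if absent."""
    pairs = []
    for token in attr_value.split():
        work, sep, reference = token.partition(":")
        if sep:
            pairs.append((work, reference))
        else:
            pairs.append((None, token))
    return pairs

print(split_osis_ids("Genesis.32.1 xxx:Genesis.32.2"))
```

The display decision (for example, printing the alternate number in brackets) then stays with the application, keyed on whether a work prefix is present.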
Words_of_Christ: I take it that the who attribute works for you?
>>>> who works
>>>>>>>>> <div type="tableOfContents"> also requested
Hope you are having a great day!
Patrick
--
Patrick Durusau
Director of Research and Development
Society of Biblical Literature
Patrick.Durusau at sbl-site.org
Chair, V1 - Text Processing: Office and Publishing Systems Interface
Co-Editor, ISO 13250, Topic Maps -- Reference Model
Topic Maps: Human, not artificial, intelligence at work!