[osis-core] Re: [osis-user] class vs type
Patrick Durusau
patrick at durusau.net
Sat Mar 11 15:40:03 MST 2006
DM,
Interested in your suggestion of quarterly releases. Mostly my fault but
we have never really followed a traditional development sort of cycle,
public roadmap, etc.
Perhaps it is time for us to consider being a little less ad hoc and a
bit more formal.
I can't speak for everyone in the core group but the spirit is willing. ;-)
It would certainly put us in a position of better communication with the
user community and keep us moving forward. (There are some really cool
things I would like to see in 4.0 and later. ;-) Topic maps like stuff
and that sort of thing.)
I will make the rounds by phone this next week and see if there is
general agreement on adopting a more formalized development process with
all the transparency of an open source project.
My situation may be about to change to the point where I could allocate
time for a more regular process.
See what you can provoke by posting to the OSIS list!
Hope you are having a great weekend!
Patrick
DM Smith wrote:
> Patrick, Troy,
> To respond to both.
>
> Patrick Durusau wrote:
>
>> Troy,
>>
>> Troy A. Griffitts wrote:
>>
>>> Patrick and DM,
>>> Very good point that it forces discussion about unresolved
>>> problems. But I'm not convinced we're not suggesting addition of the
>>> same thing: multiple x- attribute values on the type attribute to
>>> solve the problem.
>>>
>>> It seems the same to me:
>>> <tag type="x-multivalue:a:b:c">
>>> and
>>> <tag private="rend:a rend:b rend:c">
>>>
>>> I think I'd prefer the syntax of the latter, because it preserves
>>> 'type' to allow something OSIS-valid for tag.
>>
> I am suggesting that OSIS be changed to allow the following for type:
> (osisValue|x-attributeExtension)(\s+(osisValue|x-attributeExtension))*
> So this would result in
> <tag type="x-a x-b x-c">
> If private were defined at this point in time I would use it as Troy
> suggested
> <tag private="rend:a rend:b rend:c">
> But since private is not defined, I am going to use:
> <tag type="x-multivalue:a:b:c">
> Until appropriate support can be added.
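
The proposed pattern and the interim workaround can be sketched in Python. This is only an illustration, not part of any OSIS tooling: the function names `is_valid_type` and `from_workaround` are hypothetical, the set of OSIS-defined values is passed in by the caller rather than taken from the schema, and an extension token is assumed to be "x-" followed by any non-space characters.

```python
import re

# Assumption: an extension token is "x-" plus one or more non-space
# characters; the real OSIS schema may be stricter than this.
EXTENSION = re.compile(r"x-\S+")

def is_valid_type(value, osis_values=frozenset()):
    """Check a value against (osisValue|x-ext)(\\s+(osisValue|x-ext))*."""
    if value != value.strip():
        return False  # the proposed pattern has no leading/trailing whitespace
    tokens = value.split()
    return bool(tokens) and all(
        t in osis_values or EXTENSION.fullmatch(t) for t in tokens
    )

def from_workaround(value):
    """Turn the interim 'x-multivalue:a:b:c' form into 'x-a x-b x-c'."""
    prefix = "x-multivalue:"
    if not value.startswith(prefix):
        return value
    return " ".join("x-" + part for part in value[len(prefix):].split(":"))

print(from_workaround("x-multivalue:a:b:c"))
print(is_valid_type("x-a x-b x-c"))
```

Keeping the workaround mechanically convertible like this is what would make a later move to space-separated values cheap.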
>
>>>
>>> The more fundamental question arises... do we have any tags where it
>>> would be logical to allow multiple OSIS-valid types? or "Can a tag
>>> be of multiple types simultaneously?" And if so, what does subType
>>> mean in that context?
>>
>
> I'll give one: In the DTD I am converting there is a <bi> tag which is
> bold, italic. I need to convert this to:
> <hi type="italic"><hi type="bold">...</hi></hi>
> This would be more natural as
> <hi type="bold italic">...</hi>
> Allowing this condensation results in simpler xml and that is less
> error prone.
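
A minimal Python sketch of that condensation, assuming a multi-valued type such as "bold italic" were allowed (it is a proposal in this thread, not current OSIS). The helper name `condense_hi` is hypothetical, element names are assumed to carry no namespace, and only an hi element whose sole content is another hi element is collapsed.

```python
import xml.etree.ElementTree as ET

def condense_hi(elem):
    """Collapse <hi type="a"><hi type="b">...</hi></hi> into <hi type="a b">."""
    for child in list(elem):
        condense_hi(child)  # handle the innermost nesting first
    children = list(elem)
    only_wraps_hi = (
        elem.tag == "hi"
        and len(children) == 1
        and children[0].tag == "hi"
        and not (elem.text or "").strip()          # no text before the inner hi
        and not (children[0].tail or "").strip()   # no text after the inner hi
    )
    if only_wraps_hi:
        inner = children[0]
        types = [elem.get("type", ""), inner.get("type", "")]
        elem.set("type", " ".join(t for t in types if t))
        elem.text = inner.text
        elem.remove(inner)
        elem.extend(list(inner))  # keep any children of the inner hi
    return elem

nested = ET.fromstring('<hi type="italic"><hi type="bold">text</hi></hi>')
print(ET.tostring(condense_hi(nested), encoding="unicode"))
```

The token order follows the nesting order (outer first); with a space-separated list the order carries no meaning.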
>
>>>
>>> I still agree that DM's point is good. Currently, it's hard to
>>> store private data in an osis doc, and that might be a good thing.
>>>
>> Err, we will have to wait for DM to respond but I thought his point
>> was that it is better to *not* store private data in an osis doc.
>>
>> In other words, he wants a way to transform private data into a
>> public format, that is to keep the information but in an OSIS form.
>>
>> Is that close DM?
>
>
> Actually, I was arguing both. Having a private attribute allows me to
> work ahead of the current OSIS standard, which can be good, if I am
> working with the OSIS committee, following their guidance. But it also
> allows for decreased portability/increased proprietary documents.
> These have to be balanced. And for this reason, I urge caution.
>
> If the OSIS spec were to be updated quarterly on the basis of
> demonstrated existing need and also upon committee decision, then
> there would not be a big need to have private as a work ahead.
>
> However, with the KJV project that Troy did and which I am updating,
> there is a need to bury programmatic authoring decisions in the
> document so that revisions can be readily done. IMHO, if private is
> used, it should be "for internal use only" and should not be anything
> that an external processor must/should look at when rendering the
> document. I'm sure Troy sees more need for it than this.
>
> I don't think that two attributes are needed but it is as if there are
> two distinctly different purposes: osisFuture="..." and private="..."
> where osisFuture="..." represents a possible future for OSIS that has
> been discussed but not finalized and private="..." is truly "for
> internal use only".
>
>>
>> This may not be a type question although it started off talking about
>> the type attribute.
>
> No it is not a type question.
>
> I framed it as a type question only because in the set of global
> attributes, that seemed the only place such a behavior could be defined.
> I would suggest a different global attribute, say rend, style, class,
> role, or dohickey. (By the way the OSIS manual has rend as an
> attribute on the hi element! Clearly an error.)
>
>>
>> It seems that what DM needs to record is display behavior. I am not
>> sure I want to attempt to define even a fairly extensive range of
>> display behavior but that is without really looking to see what that
>> would take. It might be a fairly good sized list but any one
>> application would only need a few of those.
>
>
> Don't do it! Just open Word or OpenOffice Writer, go to the page to
> format a paragraph or to define a style and take a look at everything
> that can be blended. The vocabulary is finite but very large! Just
> provide a place to hang a symbolic constant that an external system
> can define (e.g. class values in HTML are interpreted by CSS).
>
>>
>> Sorta like type on milestone. There are any number of uses for
>> milestones but with very few values we caught the vast number of
>> usual cases where it would be used.
>
>
> One that is missing is paragraph break to record where in the
> "original" the paragraph broke. My point is not to add or argue this
> one, but that there is always "one more" that could be argued.
>
>>
>> Since I haven't looked I have no feel for whether such a list could
>> be created for display-behaviors, but suspect there must be some set
>> of usual and customary display behaviors, due to the limitations of
>> browsers if nothing else. All sorts of things are "possible" but
>> increasingly unlikely towards the margins.
>
>
> The class attribute in HTML is the attachment point for display
> behavior. It allows for CDATA but in practice is limited to a
> whitespace separated list of keys that can be used in a CSS
> stylesheet. The range of values is infinite, and in practice rather
> large and cannot be broken down into a well defined vocabulary, even
> for a problem domain, e.g. Bibles.
>
> And I don't think it is a problem domain in which OSIS should be
> wanting to define the range of display behaviors. Just provide a
> mechanism for that to happen and let people be creative in its use.
> The only caveat is that a document should be "accessible" when such
> styling is not applied. The providers of the document should also
> provide a CSS stylesheet or some kind of documentation as to the
> meaning of all "class" values. At least all open source documents.
>
>
>
>>
>> Hope you are having a great day!
>>
>> Patrick
>>
>>
>>> -Troy.
>>>
>>>
>>>
>>> Patrick Durusau wrote:
>>>
>>>> DM,
>>>>
>>>> I don't think that Troy was implying that he wasn't taking the
>>>> problem seriously.
>>>>
>>>> I do think you have a good point about avoiding, to the extent
>>>> possible, proprietary extensions that would decrease portability.
>>>>
>>>> Ultimately that is to no small degree a question of judging the
>>>> tradeoffs.
>>>>
>>>> Troy: What do you think about DM's comment in terms of embedding
>>>> arbitrary data that may not be documented or standardized?
>>>>
>>>> Hope you are having a great day!
>>>>
>>>> Patrick
>>>>
>>>> DM Smith wrote:
>>>>
>>>>> I think private would be good as a convenient place for a
>>>>> work-ahead, but I'd be concerned that it be used for a work around
>>>>> without meaningful discussion here. And I'd be concerned if it
>>>>> became an easy way out. As it stands, the problems I face and have
>>>>> posted here are real and have been taken seriously. I think in
>>>>> part because there is no (good) way to do it in OSIS. I've really
>>>>> appreciated the progress each version of OSIS has made. I'd like
>>>>> to see that continue.
>>>>>
>>>>> I have worked with a few DTDs now and I am impressed with OSIS.
>>>>> Most of the other DTDs allow for arbitrary markup that in essence
>>>>> makes a document proprietary as processing it would require custom
>>>>> routines. The way OSIS is written right now, only the
>>>>> attributeExtensions are proprietary.
>>>>>
>>>>> Troy A. Griffitts wrote:
>>>>>
>>>>>> Patrick,
>>>>>> A while back, we had briefly discussed adding a global
>>>>>> 'private' attribute to the schema. Basically, a place for
>>>>>> organizations to place private use information on any tag. Not
>>>>>> sure where people fell on the sides of that issue, but I would be
>>>>>> in favor of such an attribute.
>>>>>>
>>>>>> o It would allow me to have a basic valid OSIS document while
>>>>>> we debate how to move all the private data into best practice OSIS.
>>>>>> o It would allow helpful runtime information to be stored by
>>>>>> our engine and still allow schema validation against the document.
>>>>>> o It would allow any data to be stored (like DM's example)
>>>>>> which don't directly map to OSIS.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Patrick Durusau wrote:
>>>>>>
>>>>>>> DM,
>>>>>>>
>>>>>>> I take it that your requirement is to be able to have NMTOKENS
>>>>>>> as the data type for the type attribute?
>>>>>>>
>>>>>>> I don't have any strong objections to space delimited data types
>>>>>>> so I will pass it along to the core group and see if we can get
>>>>>>> a consensus on that.
>>>>>>>
>>>>>>> Our next release will be OSIS 2.5 next Fall. I am hopeful we
>>>>>>> will see some more tools and stylesheets posted in the meantime.
>>>>>>>
>>>>>>> Hope you are having a great day!
>>>>>>>
>>>>>>> Patrick
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> DM Smith wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Patrick Durusau wrote:
>>>>>>>>
>>>>>>>>> DM,
>>>>>>>>>
>>>>>>>>> Are you saying that the DTD has a rend attribute that can have
>>>>>>>>> several possible values at the same time?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Yes. Rend is defined as NMTOKENS. Which allows a space
>>>>>>>> separated list of NMTOKEN.
>>>>>>>> The global attributes on elements in this system are:
>>>>>>>>     id    ID        #IMPLIED
>>>>>>>>     lang  IDREF     #IMPLIED
>>>>>>>>     n     CDATA     #IMPLIED
>>>>>>>>     rend  NMTOKENS  #REQUIRED
>>>>>>>>     type  NMTOKEN   #IMPLIED
>>>>>>>>
>>>>>>>> In some cases rend is not required.
>>>>>>>>
>>>>>>>> I have mapped
>>>>>>>>     this DTD's   OSIS's
>>>>>>>>     id           id
>>>>>>>>     lang         xml:lang
>>>>>>>>     n            n
>>>>>>>>     rend         subType
>>>>>>>>     type         type
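
The mapping above can be written down as a small Python sketch. The names `ATTRIBUTE_MAP` and `map_attributes` are hypothetical, values are carried over unchanged, and dropping attributes that are not in the table is an assumption made here for brevity.

```python
# Source-DTD attribute name -> OSIS attribute name, as listed above.
ATTRIBUTE_MAP = {
    "id": "id",
    "lang": "xml:lang",
    "n": "n",
    "rend": "subType",
    "type": "type",
}

def map_attributes(attrs):
    """Rename legacy attributes to their OSIS counterparts, values unchanged."""
    return {
        ATTRIBUTE_MAP[name]: value
        for name, value in attrs.items()
        if name in ATTRIBUTE_MAP  # assumption: unmapped attributes are dropped
    }

legacy = {"id": "p1", "rend": "a b c", "type": "x"}
print(map_attributes(legacy))
```

Note that a multi-token rend value such as "a b c" lands in subType as-is, which is exactly where the single-value restriction becomes a problem.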
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> While I don't doubt that is possible, I am not sure why anyone
>>>>>>>>> would want to do it.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> It does not quite matter why. I am working with a legacy
>>>>>>>> document that has it and I need to preserve it.
>>>>>>>> It is needed to express that an element belongs to different
>>>>>>>> classes of presentation simultaneously. See below for more.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Sounds like a hack to allow poorly written XSLT stylesheets.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> It has nothing to do with XSLT stylesheets. It's for CSS
>>>>>>>> stylesheets. XSL was originally intended to allow the
>>>>>>>> transformation and the styling of a document. XSLT only
>>>>>>>> implemented the transformation aspect. IIRC, it was felt that
>>>>>>>> CSS would do the job of presentation. I haven't looked at it
>>>>>>>> yet but it looks like xsl-fo is intended to style a document.
>>>>>>>>
>>>>>>>> The value of the HTML class attribute and this DTD's rend
>>>>>>>> attribute is that it allows for the separation of presentation
>>>>>>>> and content. Prior to it, one embedded the presentation
>>>>>>>> directly into the document.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> For example, if I have <q type = "emphasis">, even though I
>>>>>>>>> only have one value for type, nothing prevents me from having
>>>>>>>>> different renderings of the contents of <hi> based upon its
>>>>>>>>> position in the markup tree, for example <hi type="emphasis">
>>>>>>>>> being rendered differently when it is a child of <title> from
>>>>>>>>> when it is a child of <p> versus when it is a child of <q>.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> This is absolutely true and has no bearing on whether rend has
>>>>>>>> one value or multiple ones. Or whether having multiple values is
>>>>>>>> of any value (pun intended).
>>>>>>>>
>>>>>>>> Allowing multiple values allows the expression of different
>>>>>>>> kinds of roles/dimensions to be used at the same time.
>>>>>>>>
>>>>>>>> For example, we may want an ordered list or an unordered list.
>>>>>>>> And because nesting is allowed, we may have lists of lists. And
>>>>>>>> any list in the tree can be either. But what if we want to have
>>>>>>>> different kinds of ordered lists and unordered lists.
>>>>>>>> Say
>>>>>>>>   revealed - all the children of a node are shown with the parent
>>>>>>>>   initially-hidden - all the children of a node are initially
>>>>>>>>     hidden but, once shown, stay shown until hidden again
>>>>>>>>   popup - shown for a time when a user expresses interest in
>>>>>>>>     them.
>>>>>>>> And, as a processing optimization, it needs to be known
>>>>>>>> whether the list of children is not to wrap, or is to wrap in a
>>>>>>>> narrow presentation or a wide presentation.
>>>>>>>>
>>>>>>>> And in this example, it is both possible and reasonable to have
>>>>>>>> a list of children that have different behaviors.
>>>>>>>> (This is a simplification of a real-world example of a system I
>>>>>>>> wrote using CSS. There were other dimensions as well, such as
>>>>>>>> data source: synthetic, program generated, user input. The HTML
>>>>>>>> documents were simple lists with <ul> and <ol> having multiple
>>>>>>>> class values, with each class value representing a different
>>>>>>>> concept.)
>>>>>>>>
>>>>>>>> To do this with a single value, I would need one for each
>>>>>>>> possible combination; in this case a set of 2x3x3=18 different
>>>>>>>> values. (Well actually 9, because HTML has ul and ol)
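
The arithmetic can be checked with a short Python sketch. The token names for the wrapping dimension ("nowrap", "narrow", "wide") are labels chosen here for illustration; the other names follow the list above.

```python
from itertools import product

list_kinds = ["ordered", "unordered"]
visibility = ["revealed", "initially-hidden", "popup"]
wrapping = ["nowrap", "narrow", "wide"]  # hypothetical labels

# One token per combination: needed if the attribute holds a single value.
single_values = ["-".join(c) for c in product(list_kinds, visibility, wrapping)]

# One token per concept: enough if the attribute holds a space-separated list.
multi_tokens = list_kinds + visibility + wrapping

print(len(single_values))  # 2 x 3 x 3 combinations
print(len(multi_tokens))   # 2 + 3 + 3 tokens

# With <ul>/<ol> carrying the list kind, only the last two dimensions remain.
print(len(list(product(visibility, wrapping))))
```

The single-value vocabulary grows as the product of the dimensions, while the multi-value vocabulary grows only as their sum, so each added dimension widens the gap.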
>>>>>>>>
>>>>>>>> In the case at hand, the element is a paragraph tag. It is not
>>>>>>>> clear what the different values are, but let me suppose that
>>>>>>>> they deal with justification, first-line indentation,
>>>>>>>> subsequent-line indentation, line spacing, handling of first
>>>>>>>> letter, etc. The application of these behaviors is entirely
>>>>>>>> unpredictable in the document, so creating a general purpose
>>>>>>>> stylesheet is out of the question and a specific one would end
>>>>>>>> up as a complex program that has too great a knowledge of the
>>>>>>>> document.
>>>>>>>>
>>>>>>>> Since I am dealing with transforming a legacy document into
>>>>>>>> OSIS, I need a way to preserve the values. I am wondering how.
>>>>>>>> (As a programmer, I can figure out many work arounds, such as
>>>>>>>> littering the document with hi elements or replacing spaces
>>>>>>>> with a character that is not allowed in NMTOKEN, like ':'
>>>>>>>> type="x-multivalue:a:b:c") And if this ability will be added to
>>>>>>>> a later version of osis, I would like to pick a hack that would
>>>>>>>> allow a good path to it tomorrow.
>>>>>>>>
>>>>>>>> In order to separate presentation from structured content,
>>>>>>>> there needs to be a semantic for deterministically attaching
>>>>>>>> presentation to content. This is what the class and rend
>>>>>>>> attributes provide.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Does that catch the gist of the problem or have I
>>>>>>>>> misunderstood the issue? (It is 5 AM local time so the latter
>>>>>>>>> is entirely possible.)
>>>>>>>>>
>>>>>>>>> Hope you are having a great day!
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> And I hope you are having a good and full night of sleep.
>>>>>>>>
>>>>>>>> --DM
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Patrick
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> DM Smith wrote:
>>>>>>>>>
>>>>>>>>>> In HTML, elements define a class attribute which indicates
>>>>>>>>>> that a particular element in its context belongs to a class
>>>>>>>>>> of that element. The primary use of this is to indicate where
>>>>>>>>>> styles can be attached. It appears that type can be used for
>>>>>>>>>> the same purpose, but not quite. In HTML, class is defined as
>>>>>>>>>> "class space separated list of classes". Its type is
>>>>>>>>>> CDATA, but in spirit is NMTOKENS. What this allows is for an
>>>>>>>>>> element to be cross-classified. That is more than one class
>>>>>>>>>> can apply.
>>>>>>>>>>
>>>>>>>>>> However in OSIS there is only one type "word" that can be used.
>>>>>>>>>>
>>>>>>>>>> I am working to convert an xml document to OSIS. This
>>>>>>>>>> document's DTD defines an attribute, rend, in much the same
>>>>>>>>>> way as class, in that it is a "role" to which style should be
>>>>>>>>>> applied, with the possibility of several "roles".
>>>>>>>>>>
>>>>>>>>>> What is the proper way to do this?
>>>>>>>>>>
>>>>>>>>>> I can figure out several ways to do this but none seem quite
>>>>>>>>>> right. For example, artificially, I can nest <hi> elements to
>>>>>>>>>> achieve a similar, but not quite the same result.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>> _______________________________________________
>>> osis-core mailing list
>>> osis-core at bibletechnologieswg.org
>>> http://www.bibletechnologieswg.org/mailman/listinfo/osis-core
>>>
>>>
>>>
>>
>
>
>
--
Patrick Durusau
Patrick at Durusau.net
Chair, V1 - Text Processing: Office and Publishing Systems Interface
Co-Editor, ISO 13250, Topic Maps -- Reference Model
Member, Text Encoding Initiative Board of Directors, 2003-2005
Topic Maps: Human, not artificial, intelligence at work!