[osis-users] Topic Maps
Troy A. Griffitts
scribe at crosswire.org
Fri Jan 15 18:38:38 MST 2010
Thanks Patrick. So had we planned a subjectIdentifier attribute on
either <w> or <name> (which, as Peter pointed out, we likely added for
proper name indication)?
Steve, do you remember our discussion when we added the marker attribute
to the <q> element, when we talked about a generalized defaulting
mechanism which would allow the header to contain things like:
<default>//q[@level="1"]/@marker='"'</default>
<default>//q[@level="2"]/@marker="'"</default>
<default>//w[@lemma="([^:]*)"]/@lemma="strong:\1"</default>
Anyway, I was just wondering what happened to this idea? I'm not sure
I'd want to implement a full-blown XQuery parser like the one that would
be required by my example above, but some basic defaulting mechanism
would still be nice.
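To make "basic defaulting mechanism" concrete, here is a minimal sketch in Python that handles the three rules above without any XPath or XQuery engine. Everything in it is an assumption for illustration: the RULES table is a hypothetical stand-in for the <default> header entries, apply_defaults is a made-up name, and element names are matched without namespaces (a real OSIS document would need namespace handling).

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical rule table standing in for the <default> header entries:
# (element, required attributes, target attribute, pattern, replacement).
# pattern=None means "fill in the attribute only when it is absent".
RULES = [
    ("q", {"level": "1"}, "marker", None, '"'),
    ("q", {"level": "2"}, "marker", None, "'"),
    ("w", {}, "lemma", r"([^:]*)", r"strong:\1"),
]

def apply_defaults(root, rules=RULES):
    """Apply each rule to every matching element under root, in place."""
    for elem in root.iter():
        for tag, conditions, attr, pattern, replacement in rules:
            if elem.tag != tag:
                continue
            if any(elem.get(k) != v for k, v in conditions.items()):
                continue
            current = elem.get(attr)
            if pattern is None:
                # Plain default: never override an explicit value.
                if current is None:
                    elem.set(attr, replacement)
            elif current is not None:
                # Rewrite only values that match the whole pattern, so a
                # lemma that already has a prefix is left alone.
                m = re.fullmatch(pattern, current)
                if m:
                    elem.set(attr, m.expand(replacement))
    return root
```

With this, <w lemma="H3389"> would be read as lemma="strong:H3389", while an explicit marker on a <q> is never overwritten. Restricting rules to an element name plus simple attribute tests is a deliberate simplification; it covers the examples here but nothing like full path expressions.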
Patrick, in your example, I'd like to be able to say something like:
<default>//w[@subjectIdentifier="(.*)"]/@subjectIdentifier="http://crosswire.org/names/\1"</default>
so I could simply use in my doc:
<w subjectIdentifier="jerusalem1">Jerusalem</w>
But this is merely to clean up my markup in the event our docs are ever
opened in an editor by a human, and to potentially prevent errors when
hand editing. Sorry, I just like to factor stuff out when possible.
Patrick Durusau wrote:
> The question is one of how much information do you want to store in the
> identifier that appears when you mark a reference to a subject?
Yes, having this level of indirection that a subjectIdentifier provides
serves a great purpose and is perfect if I'm 'at' an element I want to
dig deeper into. But my current objective is to find all place names in
a document, which would require me to dereference each identifier,
querying the referent for the 'type' of each subject, e.g., "geo-city".
Hence my poorly applied lemma/morph scheme:
<w lemma="placenames:jerusalem1"
morph="placenamestype:geo-city">Jerusalem</w>
makes processing for my immediate objective easier. You mentioned above
that the question is 'how much information' to store in the identifier
itself... So does this suggest a solution like:
<w subjectIdentifier="geo/city/jerusalem1">Jerusalem</w>
This would give me what I need to easily process the data (even if we
had to specify the full:
subjectIdentifier="http://crosswire.org/names/geo/city/jerusalem1")
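To show why a type-bearing identifier makes my immediate objective easy, here is a rough sketch in Python that lists every place name without dereferencing anything. It assumes the geo/city/... path layout floated above, which is only a proposal, and the function name find_subjects and the un-namespaced <w> element are my own simplifications.

```python
import xml.etree.ElementTree as ET

# Prefix taken from the fully qualified example above.
BASE = "http://crosswire.org/names/"

def find_subjects(root, type_path):
    """Return (identifier, text) for every <w> whose subjectIdentifier
    falls under type_path, e.g. "geo/city" or just "geo"."""
    wanted = type_path.strip("/") + "/"
    hits = []
    for w in root.iter("w"):
        ident = w.get("subjectIdentifier")
        if ident is None:
            continue
        # Accept both the short form and the fully qualified form.
        if ident.startswith(BASE):
            ident = ident[len(BASE):]
        if ident.startswith(wanted):
            hits.append((ident, w.text))
    return hits
```

Calling find_subjects(doc, "geo") would collect all geographic subjects and find_subjects(doc, "geo/city") only the cities. The trade-off is that the type is baked into the identifier, so reclassifying a subject means changing its identifier everywhere it is used.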
Thanks for the discussion on this!
I feel your pain. My primary laptop died in December and I purchased a
netbooky hp dm3 thingy to hold me over until I could order a
replacement. I just finished MOVING all of my data over to this new
little thing's large (by comparison to my old system) 320Gig drive and
days later the new drive crashed. Now I'm booting Ubuntu on the new
computer with my old 100Gig drive plugged into the USB port (old drive
is PATA, new computer is SATA) until my real laptop replacement gets
here. And all my data on the 320Gig new drive is lost! I was picking
and choosing folders from my old drive and did moves instead of copies
so I could remember what I had already grabbed. Stupid me. Did you
find an affordable data recovery service?
Troy
>
> Take your example:
>
> <w
> subjectIdentifier="http://www.crosswire.org/names/jerusalem">Jerusalem</w>
>
> Elsewhere, there is a topic in a topic map that has that same
> subjectIdentifier property, and it records that the subject it
> represents is an instance of type place, along with names for it in
> other languages and any other information you want to record about that
> subject.
>
> The key is the use of a subjectIdentifier to identify the subject. Why?
>
> Because someone else, in another Bible project may have:
>
> <w
> subjectIdentifier="http://www.otherproject.org/geonames/israel/jerusalem">Jerusalem</w>
>
>
> Now what?
>
> Well, any topic can have a *set* of subjectIdentifier properties which
> signals that both subjectIdentifiers identify the same subject.
>
> (Note I have used the XTM syntax for the attributes but it would be
> possible to declare equivalent subject identifiers even if they were in
> different formats or structures. I am working on an example using XQuery
> to make that point. Probably won't be ready for a week or so. My main
> system died last night but due to disk mirroring and paying a lot of
> money, I got it back late this afternoon.)
>
> That will allow you to disambiguate all the names as well as to add far
> more information than you could possibly put in an attribute. Such as
> marking the morphology of a lemma and displaying for a user the
> distribution of that lemma over a book or range of books. (Assuming you
> represented all of those as occurrences or even associations with
> explicit roles if you liked.)
>
> Yes, I have been thinking about topic maps and biblical texts a lot. ;-)
>
> Hope you are having a great day!
>
> Patrick
>