[sword-devel] search idea

Trevor Jenkins sword-devel@crosswire.org
Tue, 28 Dec 1999 13:59:27 +0000


On Sunday, 26 December, 1999 21:02:27, Jerry Hastings <hastings@dancris.com>
wrote:

> At 01:51 PM 12/26/1999 +0000, Trevor Jenkins wrote:
>>At the moment I'm putting a specification together. When I've concocted a
>>first draft I'll put it around for comments.
>
> Please give some thought to adding a synonym feature to searches.

When I started down this track I thought that a thesaurus feature was
unnecessary. However, after receiving your message and remembering some
searches that I've made I've changed my mind. :-) One tedious search that I
did with Online Bible for Macintosh was for jewels. Sadly, this involved me
consulting a Bible dictionary and then searching for each individually named
jewel. What I wanted then was the ability to search on the broader term
(jewel) and have OLBM find sapphire, carnelian, etc. for me.

A thesaurus feature does not affect the indexing of the Bible texts, books,
dictionaries or any other material that might be of interest to us as
searchers. Primarily it is part of the search language. So instead of saying

    FIND sardius* or topaz* or carbuncle* or emerald* or sapphire* or
    diamond* or jacinth* or agate* or amethyst* or beryl* or onyx* or
    jasper* or carnelian* or chryso*

I could have said

    FIND NT(jewel)

and had all those terms found for me.
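
As a rough illustration of that expansion (a sketch only; the NARROWER table
and the expand_nt name are hypothetical, not part of any existing search
engine), NT(jewel) could simply be rewritten into the long query above:

```python
# Hypothetical narrower-term table; the terms are the ones from the query above.
NARROWER = {
    "jewel": [
        "sardius", "topaz", "carbuncle", "emerald", "sapphire", "diamond",
        "jacinth", "agate", "amethyst", "beryl", "onyx", "jasper",
        "carnelian", "chryso",
    ],
}

def expand_nt(term, narrower=NARROWER):
    """Rewrite NT(term) as an OR query over its narrower terms.

    A term with no thesaurus entry falls back to a search on itself.
    """
    terms = narrower.get(term, [term])
    return " or ".join(f"{t}*" for t in terms)

query = "FIND " + expand_nt("jewel")
```

The expansion happens before the search runs, so the indexes themselves are
untouched.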

The structure of a thesaurus could be:

   term
       broader term
       narrower terms
       related terms
       use for
       scope
       synonyms

There is an ANSI standard for this structure, which is similar to the one
I've given above. The trick is to use the same inverted file scheme for
thesaurus files as for other text.
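
To make that concrete, here is a minimal sketch of one thesaurus record and
an inverted lookup built over it. The field names follow the list above;
ThesaurusEntry, build_inverted and the (relation, word) key layout are
assumptions for illustration, not the ANSI layout and not a settled design:

```python
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass
class ThesaurusEntry:
    """One thesaurus record, with the fields listed above."""
    term: str
    broader: list = field(default_factory=list)
    narrower: list = field(default_factory=list)
    related: list = field(default_factory=list)
    use_for: list = field(default_factory=list)
    scope: str = ""
    synonyms: list = field(default_factory=list)

RELATIONS = ("broader", "narrower", "related", "use_for", "synonyms")

def build_inverted(entries):
    """Map each (relation, word) pair to the set of head terms listing it.

    This mirrors an inverted file for text, where a word maps to the
    set of verses that contain it.
    """
    index = defaultdict(set)
    for entry in entries:
        for relation in RELATIONS:
            for word in getattr(entry, relation):
                index[(relation, word)].add(entry.term)
    return index

entries = [
    ThesaurusEntry("jewel", narrower=["sapphire", "carnelian"]),
    ThesaurusEntry("cow", synonyms=["kine"]),
]
index = build_inverted(entries)
```

Looking up ("narrower", "sapphire") then leads back to "jewel" in the same
way that a word lookup leads back to verses.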

> Consider
> Genesis 32:15. The KJV has "forty kine", the BBE has "forty cows." I would
> like to be able to find it by either "cows" or "kine" and not have to
> select KJV or BBE. If I am in KJV give me the verse even though I searched
> for "cows."

One serious issue is how the thesaurus links are to be established (i.e.
that carnelian is a narrower term of jewel, or that kine is a synonym for
cow).

An automated solution might be possible if every Bible text was marked up
with Strong's numbers.
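
A sketch of how that might work, assuming each text is available as
(word, Strong's number) pairs. The tagged data below is made up for
illustration and "H-EXAMPLE" is a placeholder, not a real Strong's number:

```python
from collections import defaultdict

def synonym_sets(tagged_texts):
    """Group the words of all translations by the Strong's number they carry.

    Only numbers rendered by more than one distinct word are kept,
    since those are the candidate synonym sets.
    """
    by_number = defaultdict(set)
    for tokens in tagged_texts.values():
        for word, number in tokens:
            by_number[number].add(word.lower())
    return {n: words for n, words in by_number.items() if len(words) > 1}

# Placeholder markup: one tagged word per translation.
tagged = {
    "KJV": [("kine", "H-EXAMPLE")],
    "BBE": [("cows", "H-EXAMPLE")],
}
candidates = synonym_sets(tagged)
```

Any real run would still need someone to review the candidate sets, since
one original word is often translated differently in different contexts.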

> One way to do this is to search every bible index not just the index for
> the current module. A better way is to create an index that combines
> entries from other indexes.

I'd prefer to keep these as separate indices. I know it's feasible to have
multiple inverted files open and use them in parallel.
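
A minimal sketch of that approach, using the Genesis 32:15 example. The
in-memory dictionaries stand in for real inverted files, and search_all and
the SYNONYMS table are hypothetical names:

```python
# Stand-ins for two separate inverted files: word -> set of verse references.
KJV_INDEX = {"kine": {"Genesis 32:15"}}
BBE_INDEX = {"cows": {"Genesis 32:15"}}

# Hypothetical synonym table taken from the thesaurus.
SYNONYMS = {"cows": {"kine"}, "kine": {"cows"}}

def search_all(word, indexes, synonyms):
    """Look up a word and its synonyms in every index, keeping hits per index."""
    words = {word} | synonyms.get(word, set())
    hits = {}
    for name, index in indexes.items():
        refs = set()
        for w in words:
            refs |= index.get(w, set())
        hits[name] = refs
    return hits

hits = search_all("cows", {"KJV": KJV_INDEX, "BBE": BBE_INDEX}, SYNONYMS)
```

Each index stays separate on disk; only the query is widened, so a search
for "cows" still finds the KJV verse that says "kine".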

> An index of English words could be created by
> taking the data from all indexes of English Bibles.

Being a European (one of that rare breed of an Englishman who can speak more
than English and its American dialect) I would not restrict this feature to
English only translations. I might for example search the RSV, KJV, NIV
English translations against Swedish, Romanian and BSL translations.

> If we could use that kind of index there are some other enhancements we
> could add. A Bible text could be edited to include unsaid words that are
> implied or indirectly referred to. To verses that contain "Lamb of God" one
> could add the words "Jesus" and "Christ". Then when the text is indexed into
> the synonym index those verses will become hits for "Jesus" and/or "Christ"
> when a synonym search is done.

Whilst technically possible this is somewhat "dangerous". It could become
divisive due to differing theological assumptions.

> Some other indexes that would be great to have are indexes of mood,
> setting, topic, and action.

Interesting idea. However, when the topic of mood etc comes up on the Bible
Greek and Bible Hebrew (academic) lists the conversation goes on for weeks
without any real resolution as to the mood of specific verses.

> Of course there are already topical indexes.
> What I want to be able to do is search for verses that are of a certain
> type, like thankful in mood, and contain a word, like "Lord." I would like
> to search for verses that have a topic of "law" and contain a word, like
> "sin." Someone would have to compile the information for some of these
> indexes. But it would be good if in concept there was support for them. If
> the support was there I think we could get people to compile the
> information, though it will take some time.

A "better" solution might be to provide hooks for private thesaurus files so
that individuals and groups could create their own classification schemes.

I'll work on the inclusion of a thesaurus feature as part of the draft
specification.

Regards, Trevor

British Sign Language is not inarticulate handwaving; it's a living
language. So recognise it now.

--

<>< Re: deemed!