[sword-devel] diatheke issues

Eeli Kaikkonen eekaikko at mail.student.oulu.fi
Sun Aug 6 09:05:54 MST 2006


I have found some small issues in diatheke. Most of them are more like
feature requests than bugs. I would like to hear what you think. I'm also
interested in hearing what you think about the future of diatheke in general
because I'm using it with the JD Bible Bot.

As you can see, I have code-level proposals for some of the problems. I
could learn to use diff and make patches if that's better, but please give
some feedback first.

***************************************************************

One problem for me is parsing the output. I won't go into details,
but in my program I need to know the last verse key which diatheke returned.
If I use a range like joh1:1-5 I have to parse "John 1:5:" out of the
text, but that is unreliable because a commentary module may include that
text at the beginning of a line even though it's not a key returned by
diatheke.

My proposal is that diatheke could return output which is meant to
be a bit more machine readable. Every key would be alone on one
line, the verse content would follow, and the next key would
again be alone on one line. Additionally the key lines should have one
special character (e.g. "\v") at the beginning to ease and speed up regexp
search and to make it reliable. How complicated the regexp is depends
on key localization. We cannot be sure, for example, that some
language doesn't use : as a letter. If the key line begins with
some special character and has nothing else besides the key, parsing
would be very reliable and fast.

Because it would not be good to break backwards compatibility, this
could be made optional from the command line. This is quite a lot of work:
a new command line option in diatheke.cpp, passing that option to
doquery() in corediatheke.cpp, conditionally adding "\v" and ":\n" to the
output in at least one place in doquery(), and moving ":" to a
different place (e.g. from *output << ": <font face=\"";).
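
A minimal sketch of how such output could be written and parsed, assuming
the marker is the vertical tab character (what "\v" means in a C++ string)
and that a key line is exactly marker + key + ":". The names formatEntry
and lastKey are hypothetical and not part of diatheke:

```cpp
#include <sstream>
#include <string>

// Assumed marker for key lines; the proposal suggests "\v" (vertical tab).
const char KEY_MARKER = '\v';

// Producer side: the key goes alone on one line as marker + key + ":",
// and the verse text follows on the next line.
std::string formatEntry(const std::string &key, const std::string &text) {
    return std::string(1, KEY_MARKER) + key + ":\n" + text + "\n";
}

// Consumer side: return the last key found in the output, or "" if none.
std::string lastKey(const std::string &output) {
    std::istringstream in(output);
    std::string line, last;
    while (std::getline(in, line)) {
        // Only lines that start with the marker and end with ':' are keys.
        if (line.size() >= 2 && line[0] == KEY_MARKER &&
            line[line.size() - 1] == ':') {
            last = line.substr(1, line.size() - 2);
        }
    }
    return last;
}
```

With this layout a commentary line that happens to start with "John 1:5:"
is not mistaken for a key, because it lacks the marker.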

***********************************************************

I think diatheke should exit with some status and a helpful message if the
key has no content in the module, instead of showing <key>:(MODULE). This is,
by the way, a more general usability problem with frontends: people get
confused when they see nothing in the OT portion of an NT-only module.

This is debatable because returning empty content is not really an
error. It's only a usability issue. The programmer can of course
parse the output to find out if there's any text but using the
exit status is easier.

Solution: a local variable in doquery() which records whether sword has
returned any text. If no text was returned after all verses have been
looped over, exit from the program with exit(<status>). It would be
fine to still output the normal text, as long as the exit status is
available.
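
A minimal sketch of that idea, not the real doquery(). The function
writeVerses, the enum values and the plain std::string output are all
assumptions made for illustration:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical status codes; the actual values are left open above.
enum QueryStatus { QUERY_OK = 0, QUERY_NO_TEXT = 1 };

// Stand-in for the verse loop in doquery(): 'verses' holds the text the
// module returned for each key in the requested range.
QueryStatus writeVerses(const std::vector<std::string> &verses,
                        const std::string &moduleName, std::string &out) {
    bool gotText = false;  // becomes true once any verse has content
    for (std::size_t i = 0; i < verses.size(); ++i) {
        if (!verses[i].empty()) gotText = true;
        out += verses[i] + "\n";
    }
    out += "(" + moduleName + ")\n";  // the normal output is still written
    return gotText ? QUERY_OK : QUERY_NO_TEXT;
}
```

The caller could then pass the returned value on as the exit status of the
program.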

***********************************************************

Also, if the key is "bad", e.g. has non-ASCII characters in an ASCII-only
locale, diatheke gives only (MODULENAME). This happens with
Bible text, not with WebstersDict. Diatheke should exit with a status
and a message.

Solution: I'm not sure, but I think the place is in doquery(), in the
(querytype == QT_BIBLE || querytype == QT_COMM) branch.
There's an "if (element) {" inside which the keytext and verse
contents are put into the output. Maybe "if (!element){exit(<status>)}"
or something similar?

*************************************************************

If the book name in the -b argument is wrong, diatheke just exits. It
should exit with a status and a message.

Solution: corediatheke.cpp has if (it == manager.Modules.end())
{return;}. That could become a message and exit(<status>), or
doquery() could return an int which is used in diatheke.cpp.
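
A minimal sketch of the second variant, where the query function returns an
int. Here a std::map stands in for manager.Modules, and doQuerySketch and
STATUS_NO_MODULE are hypothetical names and values:

```cpp
#include <cstdio>
#include <map>
#include <string>

// Hypothetical status value; no particular number is proposed above.
const int STATUS_NO_MODULE = 2;

// Stand-in for the lookup in doquery(): 'modules' plays the role of
// manager.Modules and maps module names to (here) a description.
int doQuerySketch(const std::map<std::string, std::string> &modules,
                  const std::string &name) {
    std::map<std::string, std::string>::const_iterator it = modules.find(name);
    if (it == modules.end()) {
        // Report the problem instead of returning silently.
        std::fprintf(stderr, "Module \"%s\" not found.\n", name.c_str());
        return STATUS_NO_MODULE;
    }
    // ... the actual query would run here ...
    return 0;
}
```

main() in diatheke.cpp could then simply return the value it gets back.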

***********************************************************

Diatheke lacks a --help option. It prints the help text, but to stderr.
That is wrong in principle, because it is not an error when the user asks
to see the help text. Command line programs usually print to stderr and
exit with some status if the command line options are wrong, but if --help
is used the output goes to stdout.

This may seem nitpicking but I ran into this when I needed the help output
in my program.

Solution: in diatheke.cpp printsyntax() could take one argument, the
output stream (a FILE *). When options are parsed, "--help" would be
checked first and printsyntax(stdout) called, followed by "return 0". At
the end of the file printsyntax(stderr) would be called, followed by
return <status>.
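
A minimal sketch of that flow. The usage text is a placeholder, and
wantsHelp, usageStatus and STATUS_BAD_USAGE are hypothetical names; only
printsyntax() exists in diatheke.cpp, and there it currently takes no
stream argument:

```cpp
#include <cstdio>
#include <cstring>

// Hypothetical error status for a bad command line.
const int STATUS_BAD_USAGE = 1;

// Print the usage text to the given stream (placeholder text).
void printsyntax(FILE *out) {
    std::fprintf(out, "usage: diatheke <options>\n");
}

// Return true if "--help" appears among the arguments.
bool wantsHelp(int argc, const char *const argv[]) {
    for (int i = 1; i < argc; ++i) {
        if (std::strcmp(argv[i], "--help") == 0) return true;
    }
    return false;
}

// Return the exit status to use, or -1 if the program should carry on.
// 'optionsValid' stands for the result of the existing option parsing.
int usageStatus(int argc, const char *const argv[], bool optionsValid) {
    if (wantsHelp(argc, argv)) {
        printsyntax(stdout);  // help was requested, so this is not an error
        return 0;
    }
    if (!optionsValid) {
        printsyntax(stderr);  // bad options: usage goes to stderr
        return STATUS_BAD_USAGE;
    }
    return -1;
}
```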

*************************************************************

With KJV or another Bible/comm module, diatheke seems to put the module name
(KJV) alone on the last line. (WebstersDict) and other LD module names go
straight after the text. I think all module names should be alone on the
last line; again, it would be easier to parse if needed.

Solution: add "\n" to two lines in corediatheke.cpp (when 
querytype == QT_SEARCH or querytype == QT_LD).

*************************************************************

Diatheke doesn't support clucene search.
corediatheke.cpp has: listkey = target->Search(ref, st, REG_ICASE)
Maybe it could be optionally a clucene search. I cannot offer search at all
in Bible Bot because normal search is too slow. I put my hope in clucene.

diatheke.cpp: add "lucene" to options, add type ST_LUCENE 5 into
corediatheke.h.

When parsing options:

else if (!::stricmp("lucene",argv[i+1])) {
#ifndef USELUCENE
    // built without clucene: report it and stop
    fprintf (stderr, "Lucene search is not supported in this compilation.\n");
    return <status>;
#else
    searchtype = ST_LUCENE; i++;
#endif
}

In corediatheke.cpp, in doquery():

if (querytype == QT_SEARCH) {
   char st = 1 - searchtype;   // ST_LUCENE (5) becomes -4
   // check whether the module has a clucene index
   bool supported = false;     // filled in by Search()
   SWKey sc;
   target->Search(ref, st, 0, &sc, &supported);
   if (!supported && st == -4) {
      std::cerr << "No search index found for the module." << std::endl;
      exit(<status>);
   }

*******************************************************************

If there is no filter for the requested output format diatheke
returns the original format. Is it possible to use a queue of
filters?

*******************************************************************

The module list with descriptions is not utf8, though in my opinion
it should be. Mixing different kinds of encodings
makes a python programmer's life very hard.

I hope that localized keys are utf8! I have not tested it
because I have not found a way to change the localization. The point
is that ALL output should be utf8 if the user wants that.

******************************************************************

And one more: diatheke shows that Spurgeons Morning And Evening is
a Dictionary.

*****************************************************************

Some of these changes would require changes in the help text.
A separate man page would be even better.

-- 
Eeli Kaikkonen



