[sword-devel] Introducing myself, SwordHammer, and asking a ton of questions
Tom Sullivan
info at beforgiven.info
Fri Dec 22 07:17:12 MST 2017
Y’all:
This will be a long one, so let me first summarize, then provide details
for each section:
Personal Introduction:
My name is Tom Sullivan, info at beforgiven.INFO.
Context of my interest in this forum:
I have developed a program, SwordHammer, to translate from WYSIWYG word
processor output to OSIS and Sword Modules. It is in beta, and is at
beforgiven.info/SwordHammer. I hope it will be useful to some of you,
and to a wider audience of authors, editors, and translators.
Questions:
But, there are issues with my understanding of OSIS, Sword, and what is
to be expected of front end programs, and of the back ends, jsword and
sword. I have read the docs fully and more than once, but there are
places where there are contradictions, things that are unclear to me,
and undoubtedly, things that I stupidly missed. So I would appreciate
some help. I need this information so I can make SwordHammer do its job
well. I have extensively tested it; it mostly works well, but there are
warts and blemishes. A prime goal is to isolate the user from technical
details.
Perhaps I could help with the documentation? It seems to need work.
Perhaps Crosswire could specify a simpler OSIS if nobody else is using it?
Future Direction:
Someday I hope to write a front-end for interlinear Bible presentation.
A very simple front end sync standard would be helpful.
More details:
*** Personal Introduction:
My name is Tom Sullivan and you can reach me at info at BeForgiven.INFO. I
am a retired engineer, but in most jobs was also IT administrator or
involved with it. I started programming in FORTRAN on the IBM 360
mainframe in 1972. About a year and a half later, by God’s sovereign
grace alone, the Holy Spirit regenerated my heart and gave me faith in
Jesus Christ whose bondslave I have been ever since (not that I have
always acted the part very well).
What follows about myself is not all that important, but may perhaps
help others understand my strengths, weaknesses, and motivation.
At the end of my career I was mostly using C, C#, and (legacy) VB6. When
Microsoft released that unnatural chimera called Windows 8, started
putting spyware (“telemetry”) in Windows 7, tried to cram Windows 10
(replete with spyware and ads) down my throat, and so on, a divorce was
in order. (Microsoft may possibly have a somewhat different
perspective.) I settled on Debian and tried to stay with C# using Mono,
but it was buggy, and then Microsoft bought Xamarin (who owned Mono).
Forced to choose a new language, I learned Python 3 and have written an
awful lot of Python code to convert my own stuff over from Windows to
Linux. Although I must still use Windows for a few unmovable commercial
programs, I am now almost fully on Debian, still learning, and still
considering myself to be a Linux newbie even after two years.
Of course, Linux is a different world. The quantity and quality of the
software is amazing, especially considering its largely volunteer
origin, and the price is right. But everywhere, documentation tends to
be a weak spot. What struck me most is that, compared with commercial
offerings, there is a lack of good Bible programs for Linux. The
relative lack of modern translations, commentaries, and reference
material is even worse. Those of us who are Linux users are, in effect,
an unreached people group. It is clear that there are many who are hard
at work on Bible programs and that they are making good progress to
remedy the gap. With regard to input material, this obvious need is what
has motivated me to tackle the project of writing SwordHammer in
addition to my other retirement job of translating Puritan and other
classic works into modern English. (Of course, I hope to use SwordHammer
to publish my own works as well.) Readers who have used programs like
Libronix will understand the great value of linking Christian works and
their Bible references to the actual Bible passages. Too many readers do
not bother to look up references. This is in addition to linking
commentaries, dictionaries, etc., to the Word of God to facilitate
careful study.
So OK, I am the new employee who has still to find the bathrooms. I have
a lot to learn. But perhaps I can also bring some helpful outsider’s
viewpoints and suggestions. I have waited until now to join this list
because, first, I wanted to learn and test my learning about OSIS and
Sword modules, and second, because I wanted to present SwordHammer in
beta as an indication that I am serious about helping out with the Sword
project and making the whole “ecosystem” better, if God will so allow me.
*** Context of my interest in this forum:
I have developed a program, SwordHammer, to translate from WYSIWYG word
processor output to OSIS and Sword Modules. It is in beta, and is at
beforgiven.info/SwordHammer. I hope it will be useful to some of you,
and to a wider audience of authors, editors, publishers, and
translators. My target audience includes the kind of people who can do
their jobs well, even excellently, on their computer, but who will need
to call for support for almost every IT problem. I have written the
instruction manual accordingly.
I encourage you all to at least download the documentation, if not try
SwordHammer as well. It seems a lot to ask, but I would be grateful if
any of you could be so kind as to look at my documentation and point out
places where I am inaccurate with respect to OSIS or Sword modules. Such
errors are most likely to show up in what I have termed “General Questions.”
SwordHammer uses as input the .ODT files produced by LibreOffice, which
(should) conform to the OASIS OpenDocument Format, which is an
international standard, ISO/IEC 26300-1.
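For those unfamiliar with the format: an .ODT file is a ZIP archive, and the document body lives in a member named content.xml. The following is a minimal Python sketch of pulling paragraph text and style names out of such a file. It is only an illustration and not SwordHammer's actual code; the function name read_odt_paragraphs is made up for the example.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by OpenDocument for text elements and attributes.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def read_odt_paragraphs(path):
    """Return (style_name, text) pairs for every text:p in an .odt file.

    Hypothetical helper, for illustration only.
    """
    with zipfile.ZipFile(path) as odt:
        root = ET.fromstring(odt.read("content.xml"))
    result = []
    for para in root.iter("{%s}p" % TEXT_NS):
        style = para.get("{%s}style-name" % TEXT_NS, "")
        text = "".join(para.itertext())  # includes text inside nested spans
        result.append((style, text))
    return result
```

The style names returned here are only labels; their font and paragraph properties are defined elsewhere in the same archive, which is where the real work of interpreting appearance begins.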
The essential working principle of a WYSIWYG word processor is to enable
a user to present a document to a reader in terms of how the document
looks to the reader; the word processor is largely unaware of how the
user derives meaning or information from the appearance of the document.
By contrast, SwordHammer’s job is to extract meanings from appearance
and translate those meanings into OSIS, then Sword Modules. SwordHammer
does this by analyzing the document and picking out all of the section,
paragraph, font, and other information pertaining to the text. We will
call these kinds of things “attributes.” Each time SwordHammer encounters
a new set of attributes (even if only one attribute differs) it
generates a question for the user as to its meaning (verse, footnote,
italic, title, and so on). The questions are presented to the user in a
(new) question document that contains the user’s original input with
questions interspersed in context. When the user answers those
questions, SwordHammer learns the meaning of the formatting and is hence
able to translate the document into OSIS. If one thinks about it, this
is not an unreasonable approach; readers distinguish between items in a
document by appearance also.
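The idea of asking one question per distinct set of attributes can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions, not the code SwordHammer actually uses: the data model (a list of attribute dictionaries paired with text) and the name build_questions are hypothetical.

```python
from collections import OrderedDict


def build_questions(runs):
    """Emit one question per distinct attribute set, in order of first use.

    `runs` is a list of (attributes, text) pairs, where `attributes` is a
    dict such as {"font": "Times", "italic": True}. Hypothetical data model.
    """
    seen = OrderedDict()
    for attributes, text in runs:
        # The key changes if even one attribute differs.
        key = frozenset(attributes.items())
        if key not in seen:
            seen[key] = text  # keep the first sample as context for the user
    questions = []
    for number, (key, sample) in enumerate(seen.items(), start=1):
        described = ", ".join("%s=%s" % pair for pair in sorted(key))
        questions.append(
            "Q%d: What does [%s] mean? e.g. %r" % (number, described, sample)
        )
    return questions
```

Text that repeats an already seen attribute set generates no new question, which is why the number of questions grows with the variety of formatting and not with the length of the document.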
SwordHammer also asks a large number of questions which are driven in
part by the requirements of OSIS and Sword Module generation. So far,
SwordHammer has been tested on Bibles, Commentaries, and Generic
(General) Books.
It is my belief that SwordHammer is greatly easier to use, more
productive, and less error-prone than forcing authors to learn OSIS and
about Sword Modules. SwordHammer attempts to insulate and isolate the
author or document editor, etc., from having to learn and interact with
either OSIS or the details of Sword Modules. (SwordHammer currently does
not support all OSIS features, and even fully developed may never get
there, but I hope that it is a good start.)
In developing SwordHammer I have, of course, interacted deeply with
Sword Module creation and OSIS. There are a number of issues and
questions that have arisen that will appear below. Obviously, since
SwordHammer is an automated system, I will have a bias towards using
OSIS in the most consistent and simple manner possible. This bias may
perhaps show in some of my questions.
*** Questions (* marks a new question)
Before going further, let me just say that I fully appreciate the fact
that software development is hard work, that the work is being done by
unpaid volunteers, that it is painful, but necessary when someone tells
you that your program has a bug, and that not all back ends or front
ends will have the same feature set. My goal here is to learn so that I
may develop SwordHammer to its fullest potential as the Lord gives me
health, strength, and opportunity.
In particular, I have been testing with these front ends: Xiphos 4.04,
BibleTime 2.10.1, and BibleDesktop 2.0 Beta. My reports of problems with
these are not meant to slam them; they just happen to be three that work
on my Debian Stretch system. These may not be up to date. For example, I
have been informed by Xiphos that version 4.0.7 is current. But here is
why I use what may be old versions: I am expecting some of my users to
be unsophisticated; to expect them to compile from source is
unrealistic. It also seems unrealistic to expect them to use other than
stable distros. Debian does support package updates for bug fixes; may I
please ask developers to kindly supply such fixes as needed?
* When first studying the documentation for OSIS, I looked in vain for
further information. I was finally able to make contact with SIL, and
they said that they no longer used OSIS, even though they were involved
with its development. My impression is that only Crosswire supports OSIS
anymore.
* Is this correct?
* If it is correct, should the OSIS specification be pared down for
simplicity? For example, does it really matter if a section is tagged
“colophon” or “gazetteer” in terms of how front ends will handle the
material? Similarly, there are a number of what may be seen as
subdivisions of the general category of “translator’s notes,” such as
“transChange,” “translation,” or “variant.” Will not front ends present
all of these as non-cross-reference notes in the same way? Which, if
any, do actually deserve special treatment? (I realize this is a general
question and only expect general answers at this time.) In particular,
<quote> seems to me to be completely useless (except for its use for
red-letter). Why not just pass through the source material’s mark for
the language being used?
* Would it be worthwhile to re-write the OSIS documentation to reflect
the needs of Crosswire and incorporate the articles in the Wiki into the
same document, so there is a comprehensive and consistent guide? And,
yes, being an experienced author and editor, I believe I would be
willing to volunteer for this job if I can get both good input and
feedback; right now I don’t know enough just yet.
*** I present the following bug reports. Implied in all of them are the
following questions:
* Is this something that should work, but I must be doing something
wrong? If so, what?
* Is this something that will never be supported?
* Is this something that needs to be fixed?
* Etc. I am looking for education here. Many thanks.
* Using: osisCore.2.1.1-cw-latest.xsd, there are Work sub elements that
give errors:
<contributor>
<type>
<identifier>
<description>
<subject>
<source>
* <chapterLabel> is not accepted in chapter title by
osisCore.2.1.1-cw-latest.xsd
* In both Xiphos and BibleTime:
. Tabs not recognized
. <lg> and <l> delineated poetry does not work reliably
. Fonts are random sizes (sometimes)
. Images do not display description on mouseover for vision impaired
. underline does not work
. small caps does not work
. strikethrough does not work
. tables do not recognize <align> elements
. fail to properly handle U+2019 RIGHT SINGLE QUOTATION MARK (’) in a
title when used as an apostrophe. A title is truncated. In regular text,
it does not appear. Note: U+2019 (decimal 8217) is the recognized
standard for the English apostrophe.
. Non-ASCII characters do not print, in spite of documentation that says
to use UTF-8. BibleDesktop does not have this problem.
. fail with words of Christ in red if who="Jesus" used in quote
milestone (as opposed to non-milestone). The red never shuts off.
* BibleDesktop does not display all notes if there are a lot of them.
* On pg 67, the OSIS manual shows <item><list><item> to make second
level list items, that is, the nested <list> goes inside an <item>. Done
the way of the OSIS manual, you do not get validation errors, but you do
get double bullet points for nested entries. The manner that displays
correctly is to just use <list><item>, where the nested <list> is a
direct child of the previous list. BibleTime and Xiphos then display
correctly, but BibleDesktop shows an extra bullet point, AND this form
causes OSIS validation errors. In addition, ALL programs insert their own
markers in spite of the user supplying their own, such as numbering.
* In Windows, the xml2gbs that is installed with Xiphos produces only a
.bdt file. The same XML file works fine with the Linux version of
xml2gbs. There are no error messages generated. The output is similar,
but while the Linux version gives titles, the Windows version gives the
value of the osisID. The problem is the version that comes with Xiphos.
The version downloadable from
http://www.crosswire.org/ftpmirror/pub/sword/utils/win32/ works, but the
file versions seem old.
* In contrast to documentation, VC2012 redistributable is needed.
* A footnote marker of a dagger or special character will cause Xiphos
and BibleTime to mishandle the footnote. BibleDesktop works fine.
* BibleTime does not display an image in a generic book.
* BibleTime ignores paragraphing.
* I have not figured out how to include non-canonical and separate
introductory or extended explanatory text in a Bible. ESV2011 does it
however. How do you do that?
* The OSIS specification provides for internal hyperlinks: <a> should be
able to link to an osisID. I cannot get this to work. Bible passage
links work fine. The ability to use internal links/hyperlinks can be of
great value, especially in General books, to produce tables of contents,
alphabetical indexes, Scripture indexes and other kinds of useful links.
* I have seen the terms “General Books” and “Generic Books.” Which is to
be preferred? (Earthshaking question, right?)
*** Future direction:
Lord willing, I hope to develop a front end for interlinear Bibles such
as Bagster’s Interlinear. (The kind of interlinear I am considering is
one which displays a line with modern language, beneath it a line of
Greek or Hebrew, beneath that a line of morphology, etc., with each word
and its information vertically aligned.)
But this possible project brings up a question: I am aware of the
existence of BibleSync. It has a lot of neat features for groups. But
consider what plug-ins have done for Firefox in terms of user
flexibility and new features. In a loosely similar manner, a simple
inter-program communication system would allow users to use multiple
Bible front ends and sync them. Different programs have different
strengths and weaknesses. To sync them would be an overall enhancement
to all of us. Obviously, this would be beneficial to a proposed
interlinear-only front-end.
My proposal is a simple system that will not overly burden front end
developers: Use a local file. All front ends emit a passage reference to
a local file when a user selects a passage. All front ends read that
file about every second or so and, if the reference changes, go to that
reference. A menu item or button switches participation on or off. This
is a one-user, local system.
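To make the proposal concrete, here is a rough Python sketch of what each front end would have to do. Everything in it is an assumption made for discussion: the class name FileSync, the idea of storing a single OSIS-style reference such as "John.3.16" as plain UTF-8 text, and the location of the shared file would all need to be agreed upon.

```python
import os
import tempfile


class FileSync:
    """One-user, local passage sync through a shared file (illustrative only)."""

    def __init__(self, path, enabled=True):
        self.path = path
        self.enabled = enabled  # the menu item or button toggles this
        self.last_seen = None

    def publish(self, reference):
        """Call when the user selects a passage in this front end."""
        if not self.enabled:
            return
        # Write to a temporary file, then rename it into place, so that a
        # reader never sees a half-written reference.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(reference)
        os.replace(tmp, self.path)
        self.last_seen = reference  # do not react to our own update

    def poll(self):
        """Call about once a second; returns a new reference or None."""
        if not self.enabled:
            return None
        try:
            with open(self.path, encoding="utf-8") as handle:
                reference = handle.read().strip()
        except FileNotFoundError:
            return None
        if reference and reference != self.last_seen:
            self.last_seen = reference
            return reference
        return None
```

A front end would call publish() from its passage-selection handler and poll() from a one-second timer, navigating whenever poll() returns something other than None. Polling a small local file once a second costs almost nothing, which is the main attraction of this design over a network protocol.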
Comments?
--
Tom Sullivan
info at BeForgiven.INFO
FAX: 815-301-2835
---------------------
Great News!
God created you, owns you and gave you commands to obey.
You have disobeyed God - as your conscience very well attests to you.
God's holiness and justice compel Him to punish you in Hell.
Jesus Christ became Man, was crucified, buried and rose from the dead
as a substitute for all who trust in Him, redeeming them from Hell.
If you repent (turn from your sin) and believe (trust) in Jesus Christ,
you will go to Heaven. Otherwise you will go to Hell.
Warning! Good works are a result, not cause, of saving trust.
More info is at www.esig.beforgiven.info
Do you believe this? Copy this signature into your email program
and use the Internet to spread the Great News every time you email.