[sword-devel] Parsing Strong's to support more flexible layout in user interface
Troy A. Griffitts
scribe at crosswire.org
Sat Nov 9 16:54:40 EST 2019
Dear Tobias,
Yes, our StrongsGreek and StrongsHebrew modules need work. I know the
Xiphos repo has improved Strongs modules. I was actually surprised to
see that our Strongs modules are still not using any SourceType entry in
the .conf file. They should be updated to use TEI, as this is what we
decided to use for lexicon markup, but I guess we never updated our modules.
Regarding your use case: in SWORD, the common way to parse out and
expose useful information found in a module entry is the
EntryAttributes mechanism. This would allow requests like:
module.getEntryAttributes()["Word"]["Transcription"]["Main"];
module.getEntryAttributes()["Word"]["Transcription"]["Phonetic"];
module.getEntryAttributes()["Definition"]["Body"]["Text"];
module.getEntryAttributes()["Reference"]["000"]["Text"];
module.getEntryAttributes()["Reference"]["000"]["EntryKey"];
module.getEntryAttributes()["Reference"]["001"]["Text"];
module.getEntryAttributes()["Reference"]["001"]["EntryKey"];
If you haven't used EntryAttributes in SWORD yet, have a look at
something like the KJV with the tool: sword/examples/cmdline/lookup.
This utility prints out all the entry attributes associated with an
entry (among other things).
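
As a rough illustration of the three-level shape those requests rely on, here is a minimal sketch that uses plain std::map and std::string as stand-ins for SWORD's own attribute types, so it compiles without the library. The helper names (lookupAttribute, dumpEntryAttributes, sampleEntryAttributes) and all sample values are hypothetical placeholders, not real module data or SWORD API.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Stand-ins for SWORD's nested attribute maps (assumption: same three-level
// shape, with std::string used here so the sketch builds on its own).
using AttributeValue    = std::map<std::string, std::string>;
using AttributeList     = std::map<std::string, AttributeValue>;
using AttributeTypeList = std::map<std::string, AttributeList>;

// Look up one value by its full three-part key; returns "" when absent.
// Uses find() rather than operator[] so a miss does not insert empty nodes.
std::string lookupAttribute(const AttributeTypeList &attrs,
                            const std::string &type,
                            const std::string &list,
                            const std::string &name) {
    assert(!type.empty() && !list.empty() && !name.empty());
    auto t = attrs.find(type);
    if (t == attrs.end()) return std::string();
    auto l = t->second.find(list);
    if (l == t->second.end()) return std::string();
    auto a = l->second.find(name);
    return a == l->second.end() ? std::string() : a->second;
}

// Render every attribute as "Type/List/Name = value", one per line.
std::string dumpEntryAttributes(const AttributeTypeList &attrs) {
    std::ostringstream out;
    for (const auto &type : attrs)
        for (const auto &list : type.second)
            for (const auto &attr : list.second)
                out << type.first << "/" << list.first << "/" << attr.first
                    << " = " << attr.second << "\n";
    return out.str();
}

// Hypothetical attribute set for one lexicon entry (placeholder values only).
AttributeTypeList sampleEntryAttributes() {
    AttributeTypeList attrs;
    attrs["Word"]["Transcription"]["Main"]     = "main-transcription";
    attrs["Word"]["Transcription"]["Phonetic"] = "phonetic-transcription";
    attrs["Definition"]["Body"]["Text"]        = "placeholder definition";
    attrs["Reference"]["000"]["Text"]          = "see G1234";
    attrs["Reference"]["000"]["EntryKey"]      = "G1234";
    return attrs;
}
```

A frontend could call lookupAttribute(attrs, "Reference", "000", "EntryKey") to pull out a single field, or dumpEntryAttributes(attrs) to list everything, much as the lookup utility does for a real module.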
You may have noticed above one quirk of SWORD entry attributes: they
are always referenced by a three-level key. Searches use / notation,
with empty segments and a trailing '.' designating wildcards, e.g.,
Word//Lemma./G1234/. This would find any entry in the KJV with an
entry attribute holding Strong's number G1234 in the "Lemma.*" entries
for any Word number.
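
To make the notation concrete, here is a small sketch of how such a path could be matched against one entry's attributes. It is only an interpretation of the description above, not SWORD's actual search code: the functions splitPath, segmentMatches and entryMatches are hypothetical, the value is compared by exact equality (an assumption), and a trailing '.' is taken to match the bare name or the name followed by a ".suffix".

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Stand-ins for SWORD's nested attribute maps (assumption, see above).
using AttributeValue    = std::map<std::string, std::string>;
using AttributeList     = std::map<std::string, AttributeValue>;
using AttributeTypeList = std::map<std::string, AttributeList>;

// Split a query such as "Word//Lemma./G1234/" on '/', keeping empty segments.
std::vector<std::string> splitPath(const std::string &path) {
    std::vector<std::string> parts;
    std::string current;
    for (char c : path) {
        if (c == '/') { parts.push_back(current); current.clear(); }
        else current += c;
    }
    parts.push_back(current);
    return parts;
}

// Empty pattern matches anything; "Lemma." matches "Lemma", "Lemma.1", ...;
// any other pattern must equal the key exactly.
bool segmentMatches(const std::string &pattern, const std::string &key) {
    if (pattern.empty()) return true;
    if (pattern.back() == '.') {
        const std::string stem = pattern.substr(0, pattern.size() - 1);
        return key == stem || key.compare(0, pattern.size(), pattern) == 0;
    }
    return key == pattern;
}

// True when any attribute of the entry satisfies type/list/name/value.
bool entryMatches(const AttributeTypeList &attrs, const std::string &query) {
    std::vector<std::string> q = splitPath(query);
    q.resize(4);  // missing segments become wildcards, extras are dropped
    assert(q.size() == 4);
    for (const auto &type : attrs) {
        if (!segmentMatches(q[0], type.first)) continue;
        for (const auto &list : type.second) {
            if (!segmentMatches(q[1], list.first)) continue;
            for (const auto &attr : list.second) {
                if (!segmentMatches(q[2], attr.first)) continue;
                if (q[3].empty() || attr.second == q[3]) return true;
            }
        }
    }
    return false;
}
```

With a verse whose attributes contain Word/002/Lemma.2 = G1234, the query Word//Lemma./G1234/ matches under this reading, while Word//Lemma/G1234/ (no trailing '.') does not.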
So, to implement what you want, I would suggest we update our
StrongsGreek and StrongsHebrew with markup and use info from a source
whose copyright status we can state clearly (this is my concern with
other lexicon data out there). We could talk with Tyndale about the
pedigree of their data for their Greek and Hebrew Strong definitions and
possibly finalize a module we could share, or improve on ours with data
from various sources we can cite, like
http://crosswire.org/svn/sword-tools/trunk/flashtools/ for adding
"RealGreek" and "RealHebrew" word headings to our current lexica.
What we should really start with is a TEI markup of our current lexica,
if we decide to improve our current modules.
THEN, with markup in place, parsing the entries into entry attributes
would be really simple by adding a filter... in fact, looking for an
example, I strangely found:
http://crosswire.org/svn/sword/trunk/src/modules/filters/greeklexattribs.cpp
I wonder if we use this filter on any module, currently...
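
As a sketch of why markup makes this step simple, the following shows marked-up entry text being turned into the attribute keys listed earlier. It is not the greeklexattribs filter and not a real SWORD filter: it assumes a flat entry using TEI-style orth, pron, def and ref elements (how our lexica would actually be encoded is undecided), it uses std::map stand-ins instead of SWORD's types, and the names firstElementText and attributesFromTei are hypothetical. A regex is used for brevity; nested elements inside def are not handled.

```cpp
#include <cassert>
#include <cstdio>
#include <map>
#include <regex>
#include <string>

// Stand-ins for SWORD's nested attribute maps (assumption, see above).
using AttributeValue    = std::map<std::string, std::string>;
using AttributeList     = std::map<std::string, AttributeValue>;
using AttributeTypeList = std::map<std::string, AttributeList>;

// Text of the first <tag ...>text</tag> in markup, or "" if there is none.
// Only handles elements whose content has no child elements.
std::string firstElementText(const std::string &markup, const std::string &tag) {
    const std::regex re("<" + tag + "(?:\\s[^>]*)?>([^<]*)</" + tag + ">");
    std::smatch m;
    return std::regex_search(markup, m, re) ? m[1].str() : std::string();
}

// Map an assumed TEI-style entry onto the three-level attribute keys.
AttributeTypeList attributesFromTei(const std::string &markup) {
    AttributeTypeList attrs;
    attrs["Word"]["Transcription"]["Main"]     = firstElementText(markup, "orth");
    attrs["Word"]["Transcription"]["Phonetic"] = firstElementText(markup, "pron");
    attrs["Definition"]["Body"]["Text"]        = firstElementText(markup, "def");

    // Each <ref target="KEY">text</ref> becomes Reference/000, Reference/001, ...
    const std::regex refRe("<ref\\s+target=\"([^\"]*)\"\\s*>([^<]*)</ref>");
    int n = 0;
    for (std::sregex_iterator it(markup.begin(), markup.end(), refRe), end;
         it != end; ++it, ++n) {
        assert(n < 1000);  // the index is formatted with three digits
        char index[16];
        std::snprintf(index, sizeof index, "%03d", n);
        attrs["Reference"][index]["EntryKey"] = (*it)[1].str();
        attrs["Reference"][index]["Text"]     = (*it)[2].str();
    }
    return attrs;
}
```

In the engine itself this logic would live inside a filter and write into the module's entry attributes rather than return a fresh map; the point here is only that, once the entry is marked up, each attribute comes from one element instead of from line-by-line guessing.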
Troy
On 11/9/19 3:31 AM, Tobias Klein wrote:
>
> Hi,
>
> I'm currently working on Strong's support for Ezra Project.
>
> I've been implementing a Strong's parsing functionality that enables
> flexible formatting of the Strong's definitions (from StrongsGreek and
> StrongsHebrew) in my frontend.
> Without this functionality the frontend would have to "dump" the
> definition of a Strong's key and it wouldn't have freedom in how the
> definition is formatted / laid out.
> Having this functionality available, the frontend can work with
> individual parts of the Strong's definition and apply specific
> formatting and layout.
> The parsing divides a Strong's entry into:
> - Transcription
> - Phonetic transcription
> - Definition
> - List of references
>
> In case of Ezra Project the formatting looks like this now:
> https://raw.githubusercontent.com/tobias-klein/ezra-project/master/screenshots/strongs_formatting_example.png
>
> I'm pasting the definition of my StrongsEntry class below, which is
> the base for this implementation (see
> https://github.com/tobias-klein/node-sword-interface/blob/master/src/strongs_entry.hpp):
>
> class StrongsEntry
> {
> public:
>     StrongsEntry(std::string key, std::string rawEntry);
>     virtual ~StrongsEntry(){}
>     static StrongsEntry* getStrongsEntry(sword::SWModule* module, std::string key);
>
>     std::string rawEntry;
>     std::string key;
>     std::string transcription;
>     std::string phoneticTranscription;
>     std::string definition;
>     std::vector<StrongsReference> references;
>
> private:
>     void parseFromRawEntry(std::string rawEntry);
>     void parseFirstLine(std::string firstLine);
>     void eraseEmptyLines(std::vector<std::string>& lines);
>     void parseDefinitionAndReferences(std::vector<std::string>& lines);
> };
>
> Now I'm wondering whether something like this could actually be useful
> as part of the Sword engine, since the use case of "flexible Strong's
> formatting" may also be relevant for other frontends.
>
> Best regards,
> Tobias