<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<br>
<br>
<div class="moz-cite-prefix">On 15/03/2024 at 01:03, Kahunapule
Michael Johnson wrote:<br>
</div>
<blockquote type="cite"
cite="mid:93ad81b3-6a00-4773-a7f5-24c08aababe3@eBible.org">
<div class="moz-cite-prefix">Right now, all modules on eBible.org
force Strong's numbers to be G or H followed by 4 or 5 digits,
with leading zeroes as necessary to make 4 digits. The reason
for this is that Paratext and the DBL software choke on any
other format. The decision was forced on me, really.</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">Ideally, I would consider the Real
Solution to be that any process that READS Strong's numbers
should tolerate the presence or absence of leading zeroes.
Indeed, the G or H, if missing, should be inferred from the
Testament in which it is found. (Tagging of the longer Esther
and Daniel should require an explicit G or H.) But if you write
Strong's numbers, maximum compatibility would come from sticking
to the Paratext/DBL pattern. Maximum encoding efficiency, of
course, lies in the other direction: stripping out the
redundant leading zeroes and the implied G or H would save
space. At this point, though, I think maximum compatibility is
more important.</div>
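<div class="moz-cite-prefix">As a rough illustration of the tolerant
reader described above, here is a minimal Python sketch. The names
<code>parse_strongs</code> and <code>to_paratext</code> and the
"OT"/"NT" labels are hypothetical, not part of any SWORD or Paratext
API, and the sketch assumes plain numeric Strong's numbers with no
letter suffixes.</div>

```python
import re

# Hypothetical helpers for illustration only; not a SWORD or Paratext API.
# Optional G/H prefix, any number of leading zeroes, then the digits.
_STRONGS = re.compile(r"^([GHgh]?)0*(\d+)$")


def parse_strongs(raw, testament=None):
    """Return (prefix, number), e.g. ("G", 1401).

    testament is "OT" or "NT" and is consulted only when raw has no G/H.
    Pass None for books such as the longer Esther and Daniel, where the
    prefix cannot be inferred and must be explicit.
    """
    m = _STRONGS.match(raw.strip())
    if m is None:
        raise ValueError(f"not a Strong's number: {raw!r}")
    prefix = m.group(1).upper()
    if not prefix:
        if testament == "OT":
            prefix = "H"
        elif testament == "NT":
            prefix = "G"
        else:
            raise ValueError("an explicit G or H is required here")
    return prefix, int(m.group(2))


def to_paratext(prefix, number):
    """Write G or H followed by the number, zero-padded to at least 4 digits."""
    return f"{prefix}{number:04d}"
```

<div class="moz-cite-prefix">With this sketch, "G01401", "g1401" and a
bare "1401" read in the New Testament would all come back as the same
pair, and writing that pair back out would give "G1401".</div>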
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">Right now, asking for all modules to
be rebuilt one way or another is a really big ask. It is
probably easier to preprocess all Strong's numbers to make the
format consistent within the back end. That way a string
comparison in the search should work just fine. We would just
have to decide what the search format should be. G or H should
be supplied to disambiguate when necessary, and leading zeroes
either supplied or stripped. Make sense?<br>
</div>
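<div class="moz-cite-prefix">To make the preprocessing idea concrete,
here is a small Python sketch that rewrites every prefixed Strong's
tag in a string to one consistent form, so that a plain string
comparison works afterwards. The function name
<code>normalize_lemmas</code> is hypothetical, the
<code>strong:</code> tag syntax follows the OSIS lemma convention, and
padding to 4 digits is only one of the two choices left open above.
Tags without a G or H are left untouched, because resolving them
needs the Testament context.</div>

```python
import re

# Matches OSIS-style lemma tokens that already carry a G or H prefix.
_TAG = re.compile(r"strong:([GHgh])(\d+)")


def normalize_lemmas(text):
    """Rewrite each strong:G.../strong:H... token to prefix + 4-digit minimum.

    Hypothetical back-end preprocessing step, for illustration only.
    """
    def fix(match):
        prefix = match.group(1).upper()
        number = int(match.group(2))  # int() drops any leading zeroes
        return f"strong:{prefix}{number:04d}"

    return _TAG.sub(fix, text)
```

<div class="moz-cite-prefix">Running the same function over both the
indexed text and the search query would let the two be compared as
plain strings, whichever canonical form is finally chosen.</div>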
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">Of course, if a strong consensus on
Strong's number formatting could be obtained and manifested in
code in all relevant Sword Project front and back end software,
I could go either way. My Bible translation source would still
have the Paratext/DBL format, but stripping out leading zeroes
in writing OSIS files is not hard. For now, though, I must agree
with Karl about the probability of his trademarked Real Solution
coming to pass. Sigh. <br>
</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">On 3/14/24 11:23, Karl Kleinpaste
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:1c5c1a1f-f218-4bd5-bf48-fa0c97e7bd83@kleinpaste.org">
<font face="FreeSerif">Quite honestly, the Real Solution™ to
this problem is to bite the bullet, make a concrete decision
that Strong's numbers are to be encoded in exactly one way,
and re-work all existing modules to conform to that standard.
Personally, I advocate that such a standard would stipulate
Strong's numbers to be encoded in minimal (natural) digits:
Encoding an OT reference as "1" means a Heb Strong's
dictionary key of "00001" and an NT "1401" means a Grk
Strong's dictionary key of "01401", that is, zeroes to create
dictionary module keys are prepended to natural numbers to
fill exactly 5 digits.<br>
<br>
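The mapping described above, from natural digits in the text to a
5-digit dictionary key, can be sketched in a few lines of Python. The
function names are hypothetical and this is not code from Xiphos or
the SWORD engine; the choice of Hebrew or Greek dictionary is assumed
to be made separately, from the Testament or an explicit H or G.<br>

```python
def dictionary_key(natural):
    """Pad a natural Strong's number such as "1401" to a 5-digit key."""
    if not (natural.isascii() and natural.isdigit()):
        raise ValueError(f"expected plain digits, got {natural!r}")
    return f"{int(natural):05d}"


def natural_digits(key):
    """Inverse direction: strip the padding from a dictionary key."""
    if not (key.isascii() and key.isdigit()):
        raise ValueError(f"expected plain digits, got {key!r}")
    return str(int(key))
```

Under this sketch, an OT "1" maps to the key "00001" and an NT "1401"
maps to "01401", matching the two examples given above.<br>
<br>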
I've never bothered to attempt a final fix to this problem in
Xiphos for exactly the reason that, no matter which direction
I might take, it will be an unreliable hack; that in turn is
because the very concept of a leading '0' as a weak
discriminant between Heb and Grk Strong's numbers is itself an
unreliable hack. Whenever the subsequent conceptual change
came along, to distinguish Heb/Grk numbers according to a
leading H or G (that is, lucene search using e.g.
"lemma:G1401"), <i>that</i> was the point at which the
leading-zero-encoding nonsense should have been forced into
the trash bin.<br>
<br>
It was not, and here we are.<br>
<br>
Probability of the Real Solution™ coming to pass: Vanishingly
close to zero.<br>
</font> <br>
<fieldset class="moz-mime-attachment-header"></fieldset>
<pre class="moz-quote-pre" wrap="">_______________________________________________
sword-devel mailing list: <a
class="moz-txt-link-abbreviated moz-txt-link-freetext"
href="mailto:sword-devel@crosswire.org" moz-do-not-send="true">sword-devel@crosswire.org</a>
<a class="moz-txt-link-freetext"
href="http://crosswire.org/mailman/listinfo/sword-devel"
moz-do-not-send="true">http://crosswire.org/mailman/listinfo/sword-devel</a>
Instructions to unsubscribe/change your settings at above page
</pre>
</blockquote>
<p><br>
</p>
</blockquote>
<br>
Hello, I'm coming back to this so that we can actually settle on a
solution to this problem.<br>
Is it really so difficult to agree? It's such a shame to have modules
whose Strong's numbers are unusable, just for convenience. <br>
So I ask again: is the solution with H for the Hebrew numbers and G
for the Greek ones satisfactory, and can we therefore go in that
direction? <br>
If so, I'll take care of the modules and inform those responsible as
far as possible.<br>
Karl and Michael have already given their opinions, and Michael is
the one in charge of the most modules, so for my part, if nobody
reacts, it seems to me that we should treat their position as the de
facto decision.<br>
The idea of aligning with the USFM format seems to me to be a
relevant point that we should take into account.<br>
If someone could document this decision on the wiki, that would be
great.<br>
<br>
<br>
</body>
</html>