[sword-devel] OSHB module

Fr Cyrille fr.cyrille at tiberiade.be
Sat Apr 6 11:35:56 EDT 2024



On 15/03/2024 at 01:03, Kahunapule Michael Johnson wrote:
> Right now, all modules on eBible.org force Strong's numbers to be G or 
> H followed by 4 or 5 digits, with leading zeroes as necessary to make 
> 4 digits. The reason for this is that Paratext and the DBL software 
> choke on any other format. The decision was forced on me, really.
>
> Ideally, I would consider the Real Solution to be that any process 
> that READS Strong's numbers should tolerate the presence or absence of 
> leading zeroes. Indeed, the G or H, if missing, should be inferred 
> from the Testament in which it is found. (Tagging of the longer Esther 
> and Daniel should require an explicit G or H.) But if you write 
> Strong's numbers, maximum compatibility would come from sticking to 
> the Paratext/DBL pattern. Maximum encoding efficiency, of course, 
> would be in the other direction: stripping out the redundant leading 
> zeroes and the implied G or H would save space. But at this point, I 
> think maximum compatibility is more important.
>
> Right now, asking for all modules to be rebuilt one way or another is 
> a really big ask. It is probably easier to preprocess all Strong's 
> numbers to make the format consistent within the back end. That way a 
> string comparison in the search should work just fine. We would just 
> have to decide what the search format should be. G or H should be 
> supplied to disambiguate when necessary, and leading zeroes either 
> supplied or stripped. Make sense?
>
> Of course, if a strong consensus on Strong's number formatting could 
> be obtained and manifested in code in all relevant Sword Project front 
> and back end software, I could go either way. My Bible translation 
> source would still have the Paratext/DBL format, but stripping out 
> leading zeroes in writing OSIS files is not hard. For now, though, I 
> must agree with Karl about the probability of his trademarked Real 
> Solution coming to pass. Sigh.
>
> On 3/14/24 11:23, Karl Kleinpaste wrote:
>> Quite honestly, the Real Solution™ to this problem is to bite the 
>> bullet, make a concrete decision that Strong's numbers are to be 
>> encoded in exactly one way, and re-work all existing modules to 
>> conform to that standard. Personally, I advocate that such a standard 
>> would stipulate Strong's numbers to be encoded in minimal (natural) 
>> digits: Encoding an OT reference as "1" means a Heb Strong's 
>> dictionary key of "00001" and an NT "1401" means a Grk Strong's 
>> dictionary key of "01401", that is, zeroes to create dictionary 
>> module keys are prepended to natural numbers to fill exactly 5 digits.
>>
>> I've never bothered to attempt a final fix to this problem in Xiphos 
>> for exactly the reason that, no matter which direction I might take, 
>> it will be an unreliable hack; that in turn is because the very 
>> concept of a leading '0' as a weak discriminant between Heb and Grk 
>> Strong's numbers is itself an unreliable hack. Whenever the 
>> subsequent conceptual change came along, to distinguish Heb/Grk 
>> numbers according to a leading H or G (that is, lucene search using 
>> e.g. "lemma:G1401"), /that/ was the point at which the 
>> leading-zero-encoding nonsense should have been forced into the trash 
>> bin.
>>
>> It was not, and here we are.
>>
>> Probability of the Real Solution™ coming to pass: Vanishingly close 
>> to zero.

Hello, I'm coming back to this topic so that we can actually decide on a 
solution to this problem.
Is it really so difficult to agree? It's a shame to have modules whose 
Strong's numbers are unusable just for the sake of convenience.
So I ask again: is the solution of an H prefix for Hebrew numbers and a G 
prefix for Greek ones satisfactory, and can we therefore go in that 
direction?
If so, I'll take care of the modules and inform those responsible as far 
as possible.
Karl and Michael have already given their opinions, and Michael is the 
one in charge of the most modules, so if nobody else reacts, it seems to 
me that we should accept their position as the de facto decision.
Aligning with the USFM format also seems to me a relevant suggestion 
that we should take into account.
If someone could document this decision on the wiki, that would be great.
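
To make the proposal concrete, here is a rough sketch of the kind of 
preprocessing Michael describes: read Strong's numbers tolerantly (with 
or without leading zeroes, with or without G/H), then write them in the 
Paratext/DBL pattern. The function names and the "OT"/"NT" testament 
argument are my own assumptions for illustration, not existing SWORD 
API, and suffixed numbers such as "H1234a" are deliberately not handled.

```python
import re

# Optional G/H prefix (any case), then one or more ASCII digits.
_STRONGS_RE = re.compile(r"\s*([GgHh]?)([0-9]+)\s*")


def normalize_strongs(raw, testament=None):
    """Return a Strong's number as G/H plus at least 4 digits.

    `testament` ("OT" or "NT") is a hypothetical hint used only when
    the G/H prefix is missing from `raw`.
    """
    match = _STRONGS_RE.fullmatch(raw)
    if match is None:
        raise ValueError(f"not a Strong's number: {raw!r}")
    prefix = match.group(1).upper()
    if not prefix:
        # Infer the language from the testament, as suggested above.
        if testament == "OT":
            prefix = "H"
        elif testament == "NT":
            prefix = "G"
        else:
            raise ValueError(f"G or H required to disambiguate: {raw!r}")
    number = int(match.group(2))  # int() drops any leading zeroes
    return f"{prefix}{number:04d}"  # pad back to at least 4 digits


def dictionary_key(normalized):
    """Zero-pad the numeric part to 5 digits, per Karl's description
    of Strong's dictionary module keys."""
    return f"{int(normalized[1:]):05d}"
```

With this, "1" in the OT becomes "H0001" (dictionary key "00001") and 
"G01401" becomes "G1401" (dictionary key "01401"). A plain string 
comparison in the search then works once both the indexed value and the 
query have gone through the same normalization.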
