[sword-devel] Module release: FreCrampon

David Haslam dfhdfh at protonmail.com
Wed Mar 19 04:00:28 EDT 2025


A word frequency analysis of the FreCrampon text has uncovered 21 distinct words (24 occurrences in total) in which the ordinary ASCII apostrophe (U+0027) marks elision, instead of the proper character U+2019 RIGHT SINGLE QUOTATION MARK that is used everywhere else.

> Word Count
> C'était 1
> Qu'elle 2
> d'Ochran 1
> d'Og 1
> d'alentour 1
> d'après 2
> d'avoir 1
> d'insolence 1
> d'où 1
> d'épouvante 1
> j'en 1
> jusqu'à 1
> l'impie 1
> n'agissez 1
> n'avaient 1
> n'avez-vous 1
> n'est 1
> n'ont 1
> n'y 1
> qu'on 2
> s'y 1

These should each be changed to use U+2019.
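
As a rough illustration of the fix, here is a minimal Python sketch. It assumes that every ASCII apostrophe sitting directly between two word characters is an elision mark, which holds for the 21 words listed above but has not been checked against the whole text. The function name `fix_elision_apostrophes` is hypothetical and is not part of any SWORD tool.

```python
import re

# An ASCII apostrophe (U+0027) with a word character on both sides.
# Assumption: in this text such an apostrophe always marks elision.
ELISION_APOSTROPHE = re.compile(r"(?<=\w)'(?=\w)")


def fix_elision_apostrophes(text: str) -> str:
    """Replace elision apostrophes with U+2019 RIGHT SINGLE QUOTATION MARK."""
    return ELISION_APOSTROPHE.sub("\u2019", text)
```

Apostrophes at the start or end of a token are left alone, so anything used as a quotation mark would still need a manual check.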

The issue should be fixed upstream in https://fr.wikisource.org/wiki/Bible_Crampon_1923

FYI. The attached 7-Zip file contains the word frequency analysis.
NB. References were first stripped out, and both types of NBSP were replaced by an ordinary space.
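
For anyone wanting to repeat this kind of check, the following Python sketch shows one possible way to do it. It is not the tool that produced the attached file. It assumes the references have already been stripped from the input, and it assumes that "both types of NBSP" means U+00A0 and U+202F. The tokenising rule and the function names are illustrative choices only.

```python
import re
from collections import Counter

# Assumption: the two NBSP types are U+00A0 and U+202F.
NBSP_CHARS = ("\u00a0", "\u202f")

# A word is a run of word characters, optionally joined by an
# apostrophe (ASCII or U+2019) or a hyphen, e.g. n'avez-vous.
WORD = re.compile(r"\w+(?:['\u2019-]\w+)*")


def word_frequencies(text: str) -> Counter:
    """Count words after replacing both NBSP types by an ordinary space."""
    for ch in NBSP_CHARS:
        text = text.replace(ch, " ")
    return Counter(WORD.findall(text))


def words_with_ascii_apostrophe(counts: Counter) -> dict:
    """Return only the counted words that contain an ASCII apostrophe."""
    return {word: n for word, n in counts.items() if "'" in word}
```

Passing the result of `word_frequencies` to `words_with_ascii_apostrophe` gives a Word/Count listing of the same shape as the table above.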

Best regards,

David


On Tuesday, March 18th, 2025 at 10:02 PM, David Haslam <dfhdfh at protonmail.com> wrote:

> But see the continuation of my AI chat exchange in https://grok.com/share/bGVnYWN5_0023c289-2171-4f8b-8ad7-98e0f086eeb8
>
> It may still be preferable to use U+202F NARROW NO-BREAK SPACE [NNBSP] to separate the guillemets from the adjacent word at the start and end of each quotation.
>
> cf. the French module FreLXXGiguet, where this is already the case.
>
> If we go down that route consistently, then some preprocessing would be required before performing a Word Frequency analysis on the module's text content.
> i.e. As part of module testing, in order to uncover any further anomalies.
>
> Best regards,
>
> David
>
>
>
> On Tuesday, March 18th, 2025 at 9:42 PM, David Haslam <dfhdfh at protonmail.com> wrote:
>
>> I hadn't thought of doing this analysis until today.
>>
>> My recent detailed observations about the FreBBB module also apply to FreCrampon.
>>
>> FreCrampon contains 22143 U+00A0 NO-BREAK SPACE (NBSP) characters.
>>
>> Of these, 6152 are not followed immediately by a punctuation mark!
>> All but one of those are followed by a word character.
>>
>> This means that 15991 of the NBSP are followed by a punctuation mark.
>>
>> All 6151 of the other locations match the PCRE [[:punct:]]\xA0\w+
>>
>> i.e. These are each preceded by a punctuation mark.
>>
>> I therefore recommend that each NBSP be replaced by U+2008 PUNCTUATION SPACE
>>
>> The 1 exception out of the 6152 is where the NBSP occurs strangely at the end of a verse!
>>
>> Mark 2:22: Et personne ne met du vin nouveau dans des outres vieilles : autrement, le vin fait rompre les outres et le vin se répand, et les outres sont perdues. Mais le vin nouveau doit se mettre dans des outres neuves. «
>>
>> i.e. It's the invisible character after an ordinary space after the «
>> That « may well be a typo, as it should surely be a »
>>
>> To sum up: Replace all U+00A0 by U+2008 and correct the above typo!
>>
>> We should advise the upstream source to do likewise!
>>
>> Best regards,
>>
>> David
>>
>>
>>
>> On Monday, March 10th, 2025 at 9:51 AM, domcox at crosswire.org wrote:
>>
>>> Dear All,
>>>
>>> This is to announce that we have just now uploaded FreCrampon
>>> in the CrossWire (main) repository.
>>>
>>> ## Language:
>>> French
>>>
>>> ## Description:
>>> La Bible Augustin Crampon 1923
>>>
>>> ## Category:
>>> Biblical Text
>>>
>>> ## Version:
>>> This is an update. Version: 3.2
>>>
>>> ## What's new:
>>> Complete rereading of the whole Bible, correction of NT cross-references.
>>>
>>> Many thanks to Cyrille_LAfricain for the hard work.
>>>
>>> We wish you enjoyable reading,
>>> The Module Team
>>>
>>> P.S.: This email is sent automatically on upload of a new/updated module
>>>
>>> _______________________________________________
>>> sword-devel mailing list: sword-devel at crosswire.org
>>> http://crosswire.org/mailman/listinfo/sword-devel
>>> Instructions to unsubscribe/change your settings at above page
-------------- next part --------------
A non-text attachment was scrubbed...
Name: FreCrampon.diatheke.strip.space.word.count.tab.7z
Type: application/x-compressed
Size: 99977 bytes
Desc: not available
URL: <http://crosswire.org/pipermail/sword-devel/attachments/20250319/3f4dd236/attachment-0001.bin>

