<div dir="ltr"><div>Thank you both for your interest!</div><div><br></div><div>> What about commentary?<br><div>> <a href="https://www.awmi.net/reading/online-bible-commentary/" target="_blank">https://www.awmi.net/reading/online-bible-commentary/</a></div></div><div><br></div><div>Not yet; I'm focusing on Bibles for the time being, and that's already a lot of work!<br></div><div>But nothing prevents adapting the solution to commentaries in the future; I'll keep that idea in mind :-)</div><div><br></div><div>> If you want to use CzeBKR as your test case, I am ready to help<br>
> you with any testing or Czech issues or whatever</div><div><br></div><div>Thanks a lot!</div><div>I've just pushed a scraper configuration for this Bible: <a href="https://github.com/UnasZole/bible-scraper/blob/master/src/main/resources/scrapers/GenericHtml/KralickaWikisource.yaml">https://github.com/UnasZole/bible-scraper/blob/master/src/main/resources/scrapers/GenericHtml/KralickaWikisource.yaml</a></div><div>The main books were easy to parse; the deuterocanonical books, extracted from a different manuscript, were a bit messier.</div><div>I made a few assumptions (I interpret italics in verses as translation additions, side notes in deuterocanonical books as section titles, etc.).</div><div>Feel free to test it: after checking out and building the repository, you should only need to run, for example:</div><div><br></div><div>> ./run.sh scrape -s GenericHtml -i KralickaWikisource -b Ps -c 1 -w USFM</div><div><br></div><div>Cheers,</div><div><br></div><div>Arnaud<br></div><div><br></div><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sun, Jun 2, 2024 at 08:50, Matěj Cepl &lt;<a href="mailto:mcepl@cepl.eu">mcepl@cepl.eu</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Sun Jun 2, 2024 at 1:09 AM CEST, Arnaud Vié wrote:<br>
> I'm open to any kind of feedback or suggestions, of course!<br>
> In particular:<br>
><br>
> - if you have any specific website in mind that you would like to be<br>
> able to build sword modules from, let me know, we can try to add it.<br>
> (Currently I have only included a few French websites, but I'm interested in adding<br>
> some other languages).<br>
<br>
The Sword module CzeBKR is sourced from the Czech WikiSource [1],<br>
and there seems to be an official way [2] to get the source<br>
in some hopefully more useful formats (plain text, RTF, HTML,<br>
EPUB). I was using my own home-grown Python script [3], but,<br>
as with all web-scraping scripts, it seems to have rotted away (that<br>
script is under some kind of very free open source license,<br>
let’s say MIT/X11 … I am going to add the proper LICENSE file<br>
momentarily). It started at [4] (look at the source view), but it<br>
doesn’t seem to be that useful anymore.<br>
<br>
> - And if you are knowledgeable about the intellectual property laws in<br>
> other countries, I'm interested: currently, I've added a section to the<br>
> README explaining why the usage of the scraper on any public website is<br>
> allowed in France with references to the related texts, but it would<br>
> probably be useful to have similar information for users from other<br>
> countries.<br>
<br>
I am absolutely certain there are no problems with CzeBKR:<br>
<br>
1. It is WikiSource, so we have somebody else to blame ;)<br>
2. The original Bible of Kralice [5] is from the sixteenth<br>
century and it is absolutely in the public domain.<br>
3. The source for the WikiSource text was a scan [6] of the book<br>
from 1918, with no authors shown. The works of the only<br>
possible editor of that Bible I know about [7] (he is<br>
not shown on the title page, but he was working in the<br>
early 20th century with the International Bible Society on<br>
the revision of the Bible) are, under the Berne Convention<br>
(death in 1929 + 75 years), in the public domain as well.<br>
4. We are in the EU as well.<br>
<br>
If you want to use CzeBKR as your test case, I am ready to help<br>
you with any testing or Czech issues or whatever.<br>
<br>
Blessed Sunday!<br>
<br>
Matěj<br>
<br>
[1] <a href="https://cs.wikisource.org/wiki/Bible_kralick%C3%A1_(1918)" rel="noreferrer" target="_blank">https://cs.wikisource.org/wiki/Bible_kralick%C3%A1_(1918)</a> <br>
[2] <a href="https://ws-export.wmcloud.org/?lang=cs&title=Bible_kralick%C3%A1_%281918%29" rel="noreferrer" target="_blank">https://ws-export.wmcloud.org/?lang=cs&title=Bible_kralick%C3%A1_%281918%29</a><br>
[3] <a href="https://gitlab.com/crosswire-bible-society/CzeBKR/-/blob/master/kralicka.py" rel="noreferrer" target="_blank">https://gitlab.com/crosswire-bible-society/CzeBKR/-/blob/master/kralicka.py</a><br>
[4] <a href="https://cs.wikisource.org/wiki/Speci%C3%A1ln%C3%AD:Exportovat_str%C3%A1nky/Bible_kralick%C3%A1_(1918)" rel="noreferrer" target="_blank">https://cs.wikisource.org/wiki/Speci%C3%A1ln%C3%AD:Exportovat_str%C3%A1nky/Bible_kralick%C3%A1_(1918)</a><br>
[5] <a href="https://en.wikipedia.org/wiki/Bible_of_Kralice" rel="noreferrer" target="_blank">https://en.wikipedia.org/wiki/Bible_of_Kralice</a><br>
[6] <a href="http://archive.org/details/biblsvatanebvec00socigoog" rel="noreferrer" target="_blank">http://archive.org/details/biblsvatanebvec00socigoog</a><br>
[7] <a href="https://cs.wikipedia.org/wiki/Jan_Karafi%C3%A1t" rel="noreferrer" target="_blank">https://cs.wikipedia.org/wiki/Jan_Karafi%C3%A1t</a><br>
-- <br>
<a href="http://matej.ceplovi.cz/blog/" rel="noreferrer" target="_blank">http://matej.ceplovi.cz/blog/</a>, @mcepl@floss.social<br>
GPG Finger: 3C76 A027 CA45 AD70 98B5 BC1D 7920 5802 880B C9D8<br>
<br>
The ratio of literacy to illiteracy is a constant, but nowadays<br>
the illiterates can read.<br>
-- Alberto Moravia<br>
<br>
_______________________________________________<br>
sword-devel mailing list: <a href="mailto:sword-devel@crosswire.org" target="_blank">sword-devel@crosswire.org</a><br>
<a href="http://crosswire.org/mailman/listinfo/sword-devel" rel="noreferrer" target="_blank">http://crosswire.org/mailman/listinfo/sword-devel</a><br>
Instructions to unsubscribe/change your settings at above page<br>
</blockquote></div></div>