<!DOCTYPE html>
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    Hi Arnaud,<br>
    What do you think about moving bible-scraper from the GitHub repo to our
    GitLab repo? I have already imported it, but without the latest commits. I have
    made you a developer on it. <a class="moz-txt-link-freetext" href="https://gitlab.com/crosswire-bible-society/bible-scraper">https://gitlab.com/crosswire-bible-society/bible-scraper</a><br>
    <br>
    <div class="moz-cite-prefix">On 02/06/2024 at 11:46, Arnaud Vié
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CA+kNJPh296xGUcrn5fvH5q6mu_N7zTgbwH27NCgwtFWT-WxcfQ@mail.gmail.com">
      <meta http-equiv="content-type" content="text/html; charset=UTF-8">
      <div dir="ltr">
        <div>Thank you both for your interest!</div>
        <div><br>
        </div>
        <div>> What about commentary?<br>
          <div>> <a
href="https://www.awmi.net/reading/online-bible-commentary/"
              target="_blank" moz-do-not-send="true"
              class="moz-txt-link-freetext">https://www.awmi.net/reading/online-bible-commentary/</a></div>
        </div>
        <div><br>
        </div>
        <div>Not yet, I'm really focusing on Bibles for the time being;
          that's a lot of work already!<br>
        </div>
        <div>But nothing prevents adapting the solution to commentaries
          in the future, I'll keep that idea in mind :-)</div>
        <div><br>
        </div>
        <div>> If you want to use CzeBKR as your test case, I am
          ready to help<br>
          > you with any testing or Czech issues or whatever </div>
        <div><br>
        </div>
        <div>Thanks a lot!</div>
        <div>I've just pushed a scraper configuration for this Bible: <a
href="https://github.com/UnasZole/bible-scraper/blob/master/src/main/resources/scrapers/GenericHtml/KralickaWikisource.yaml"
            moz-do-not-send="true" class="moz-txt-link-freetext">https://github.com/UnasZole/bible-scraper/blob/master/src/main/resources/scrapers/GenericHtml/KralickaWikisource.yaml</a></div>
        <div>Main books were easy to parse - deuterocanonical books
          extracted from a different manuscript were a bit messier.</div>
        <div>I made a few assumptions (I interpret italics in verse as
          translation additions, and side notes in deuterocanonical
          books as section titles, etc.)</div>
        <div>Feel free to test it : after checking out and building the
          repository, you should just need to run for example:</div>
        <div><br>
        </div>
        <div>> ./run.sh scrape -s GenericHtml -i KralickaWikisource
          -b Ps -c 1 -w USFM</div>
        <div><br>
        </div>
        <div>Cheers,</div>
        <div><br>
        </div>
        <div>Arnaud<br>
        </div>
        <div><br>
        </div>
        <div class="gmail_quote">
          <div dir="ltr" class="gmail_attr">On Sun, 2 June 2024 at 08:50,
            Matěj Cepl <<a href="mailto:mcepl@cepl.eu"
              moz-do-not-send="true" class="moz-txt-link-freetext">mcepl@cepl.eu</a>>
            wrote:<br>
          </div>
          <blockquote class="gmail_quote"
style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On
            Sun Jun 2, 2024 at 1:09 AM CEST, Arnaud Vié wrote:<br>
            > I'm open to any kind of feedback or suggestions of
            course !<br>
            > In particular :<br>
            ><br>
            >    - if you have any specific website in mind that you
            would like to be<br>
            >    able to build sword modules from, let me know, we
            can try to add it.<br>
            >    (Currently I only included a few French websites,
            but I'm interested to add<br>
            >    some other languages).<br>
            <br>
            Sword module CzeBKR is sourced from the Czech WikiSource [1]<br>
            and there seems to be an official way [2] to get the source<br>
            in some hopefully more useful formats (plain text, RTF,
            HTML,<br>
            EPubs). I was using my own home-grown Python script [3], but
            it<br>
            seems that, like all web-scraping scripts, it has rotted away
            (that<br>
            script is under some kind of very free open source
            license,<br>
            let’s say MIT/X11 … I am going to add the proper LICENSE
            file<br>
            momentarily). It started at [4] (look at the source view),
            but it<br>
            doesn’t seem to be that useful anymore.<br>
            <br>
            >    - And if you are knowledgeable about the
            intellectual property laws in<br>
            >    other countries, I'm interested : currently, I've
            added a section to the<br>
            >    README explaining why the usage of the scraper on
            any public website is<br>
            >    allowed in France with references to the related
            texts, but it would<br>
            >    probably be useful to have similar information for
            users from other<br>
            >    countries.<br>
            <br>
            I am absolutely certain, there are no problems with CzeBKR:<br>
            <br>
                1. It is WikiSource, so we have somebody else to blame
            ;)<br>
                2. The original Bible of Kralice [5] is from the
            sixteenth<br>
                   century and it is absolutely in the public domain.<br>
                3. Source for the WikiSource was a scan [6] of the book<br>
                   from 1918, without any authors shown. The works of the
            only<br>
                   possible editor of that Bible I know about [7] (and
            he is<br>
                   not shown on the title page, but he was working in
            the<br>
                   early 20th century with the International Bible
            Society on<br>
                   the revision of the Bible) are under the Berne
            Convention<br>
                   (death in 1929 + 75 years) in the public domain as
            well.<br>
                4. We are in the EU as well.<br>
            <br>
            If you want to use CzeBKR as your test case, I am ready to
            help<br>
            you with any testing or Czech issues or whatever.<br>
            <br>
            Blessed Sunday!<br>
            <br>
            Matěj<br>
            <br>
            [1] <a
href="https://cs.wikisource.org/wiki/Bible_kralick%C3%A1_(1918)"
              rel="noreferrer" target="_blank" moz-do-not-send="true"
              class="moz-txt-link-freetext">https://cs.wikisource.org/wiki/Bible_kralick%C3%A1_(1918)</a>
            <br>
            [2] <a
href="https://ws-export.wmcloud.org/?lang=cs&title=Bible_kralick%C3%A1_%281918%29"
              rel="noreferrer" target="_blank" moz-do-not-send="true">https://ws-export.wmcloud.org/?lang=cs&title=Bible_kralick%C3%A1_%281918%29</a><br>
            [3] <a
href="https://gitlab.com/crosswire-bible-society/CzeBKR/-/blob/master/kralicka.py"
              rel="noreferrer" target="_blank" moz-do-not-send="true"
              class="moz-txt-link-freetext">https://gitlab.com/crosswire-bible-society/CzeBKR/-/blob/master/kralicka.py</a><br>
            [4] <a
href="https://cs.wikisource.org/wiki/Speci%C3%A1ln%C3%AD:Exportovat_str%C3%A1nky/Bible_kralick%C3%A1_(1918)"
              rel="noreferrer" target="_blank" moz-do-not-send="true"
              class="moz-txt-link-freetext">https://cs.wikisource.org/wiki/Speci%C3%A1ln%C3%AD:Exportovat_str%C3%A1nky/Bible_kralick%C3%A1_(1918)</a><br>
            [5] <a
              href="https://en.wikipedia.org/wiki/Bible_of_Kralice"
              rel="noreferrer" target="_blank" moz-do-not-send="true"
              class="moz-txt-link-freetext">https://en.wikipedia.org/wiki/Bible_of_Kralice</a><br>
            [6] <a
href="http://archive.org/details/biblsvatanebvec00socigoog"
              rel="noreferrer" target="_blank" moz-do-not-send="true"
              class="moz-txt-link-freetext">http://archive.org/details/biblsvatanebvec00socigoog</a><br>
            [7] <a
              href="https://cs.wikipedia.org/wiki/Jan_Karafi%C3%A1t"
              rel="noreferrer" target="_blank" moz-do-not-send="true"
              class="moz-txt-link-freetext">https://cs.wikipedia.org/wiki/Jan_Karafi%C3%A1t</a><br>
            -- <br>
            <a href="http://matej.ceplovi.cz/blog/" rel="noreferrer"
              target="_blank" moz-do-not-send="true"
              class="moz-txt-link-freetext">http://matej.ceplovi.cz/blog/</a>,
            @mcepl@floss.social<br>
            GPG Finger: 3C76 A027 CA45 AD70 98B5  BC1D 7920 5802 880B
            C9D8<br>
            <br>
            The ratio of literacy to illiteracy is a constant, but
            nowadays<br>
            the illiterates can read.<br>
                -- Alberto Moravia<br>
            <br>
            _______________________________________________<br>
            sword-devel mailing list: <a
              href="mailto:sword-devel@crosswire.org" target="_blank"
              moz-do-not-send="true" class="moz-txt-link-freetext">sword-devel@crosswire.org</a><br>
            <a href="http://crosswire.org/mailman/listinfo/sword-devel"
              rel="noreferrer" target="_blank" moz-do-not-send="true"
              class="moz-txt-link-freetext">http://crosswire.org/mailman/listinfo/sword-devel</a><br>
            Instructions to unsubscribe/change your settings at above
            page<br>
          </blockquote>
        </div>
      </div>
      <br>
    </blockquote>
    <br>
    <div class="moz-signature">-- <br>
      Do you love the Bible? Are you a theology student? Use
      the free application <a href="https://xiphos.org/">Xiphos</a> or <a
        href="https://andbible.github.io/">Andbible</a> to access
      source texts, commentaries, dictionaries, and many
      other features... Contact me for translations into
      French.</div>
  </body>
</html>