[sword-devel] Module upload: Shona

Michael H cmahte at gmail.com
Tue Aug 29 09:50:21 MST 2017


Aggregating the chapter files into one large USFM file takes a single
command line (cat .\*\*.dat > shona.sfm).

Splitting that into standard book files takes one more command (csplit /\\id /
shona.sfm).
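
For anyone who would rather script the split, here is a rough Python sketch
of the same idea: start a new book at every line that begins with \id and key
it by the book code that follows. The function name split_books is made up
for illustration, and it assumes every \id line carries a book code such as
GEN as its first field.

```python
def split_books(usfm_text):
    """Split combined USFM text into one string per book, keyed by book code.

    Assumption: each book starts with a line of the form "\\id CODE ...".
    Anything before the first \\id line is dropped.
    """
    books = {}
    current = None
    for line in usfm_text.splitlines(keepends=True):
        if line.startswith("\\id "):
            fields = line.split()
            # fields[0] is the marker itself; fields[1] is the book code.
            current = fields[1] if len(fields) > 1 else "UNKNOWN"
            books[current] = ""
        if current is not None:
            books[current] += line
    return books
```

Each value could then be written out to its own file (for example GEN.usfm),
which would also avoid a separate renaming step.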

You need to check that each \c tag is on its own line, in case a chapter file
ends abnormally without a final newline/return.
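
A minimal Python sketch of that check, in case it helps. Both function names
are hypothetical. The first joins chapter texts and adds a newline wherever
one is missing; the second reports any line where "\c " turns up somewhere
other than the start of the line.

```python
def join_chapters(chapters):
    """Concatenate chapter texts, making sure each one ends with a newline."""
    parts = []
    for text in chapters:
        if text and not text.endswith("\n"):
            text += "\n"  # repair a chapter file that lacks a final newline
        parts.append(text)
    return "".join(parts)


def misplaced_chapter_markers(usfm_text):
    """Return 1-based line numbers where "\\c " occurs after the line start."""
    bad = []
    for number, line in enumerate(usfm_text.splitlines(), start=1):
        # find() returns 0 for a well-placed marker and -1 when absent,
        # so only a positive index means the marker is glued to other text.
        if line.find("\\c ") > 0:
            bad.append(number)
    return bad
```

An empty list from misplaced_chapter_markers suggests the chapter markers
survived the concatenation intact.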

However, you end up with files numbered 001.dat, 002.dat, and so on, which
then need to be renamed. Still trivial, but measured in minutes, not seconds.
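
The renaming could probably be scripted as well. A sketch in Python, assuming
each split file begins with its \id line and that naming the result
<CODE>.usfm is acceptable; the function name rename_by_book_code and the
.usfm extension are my own choices, not anything the tools require.

```python
from pathlib import Path


def rename_by_book_code(directory, pattern="*.dat"):
    """Rename each matching file to <CODE>.usfm using its first \\id line.

    Files whose first line is not an \\id line are left untouched.
    Returns the new file names in the order they were processed.
    """
    renamed = []
    for path in sorted(Path(directory).glob(pattern)):
        # utf-8-sig tolerates a leading byte order mark, if there is one.
        with path.open(encoding="utf-8-sig") as handle:
            fields = handle.readline().split()
        if len(fields) < 2 or fields[0] != "\\id":
            continue
        target = path.with_name(fields[1] + ".usfm")
        path.rename(target)
        renamed.append(target.name)
    return renamed
```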


On Tue, Aug 29, 2017 at 11:27 AM, David Haslam <dfhmch at googlemail.com>
wrote:

> Teus has since added all the missing *\toc#* markers to the Shona
> <https://github.com/teusbenschop/shona> repo.
>
> After the last commit, the USFM tag statistics were as follows:
>
> Count   SFM tag Description (updated for USFM 3.0)
> -----   --------        -----------------------------------
> 04948   \add    Translator's added words begin
> 04948   \add*   Translator's added words end
> 01189   \c      Chapter
> 00066   \h      Running header (h=h1)
> 00066   \id     Identification
> 00065   \mt     Major title (mt=mt1)
> 00001   \mt1    Major title (portion 1)
> 00031   \mt2    Major title (portion 2)
> 00009   \nb     No break with previous paragraph
> 06445   \p      Paragraph
> 00066   \rem    Remark
> 01774   \s      Section heading (s=s1)
> 00066   \toc1   Table of contents 1 (Long  table of contents text)
> 00066   \toc2   Table of contents 2 (Short table of contents text)
> 00066   \toc3   Table of contents 3 (Book abbreviation)
> 31102   \v      Verse[s]
> 15739   \x      Cross reference element begin
> 15739   \x*     Cross reference element end
>
> Observation:
> The data structure in the GitHub repository is not one USFM file per book,
> but one [USFM] data file per chapter, each in a suitably numbered directory,
> plus a separate data file (in directory 0) for the USFM header lines.
>
> In order to convert the text to OSIS, some preprocessing would be required
> to get the source text to one USFM file per book (as used by ParaTExt).
>
> Best regards,
>
> David