[sword-devel] Persian Module
DM Smith
dmsmith555 at yahoo.com
Wed Jan 12 20:36:23 MST 2005
In OSIS, a verse tag can be either a container, as in <verse>text</verse>,
or a marker, as in <verse/>text. According to the spec, one or the other
should be used for a work, but not both.
The marker form is needed to handle overlapping structures, such as a verse
that starts in one paragraph and ends in another. Another example is a quote
that starts within a verse and spans several verses.
Alternatively, OSIS allows these kinds of text elements (paragraphs, quotes)
to be markers as well.
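
To make the overlap problem concrete, here is a minimal sketch in Python. It assumes the marker form is written with the sID/eID milestone attributes (my shorthand <verse/> above leaves them out), and the sample text and the helper name is_well_formed are made up for illustration. It only checks XML well-formedness, not validity against the OSIS schema.

```python
import xml.etree.ElementTree as ET

# Container form: the verse starts in one paragraph and ends in the next,
# so the <verse> and <p> elements overlap and the XML is not well-formed.
container = (
    '<div>'
    '<p><verse osisID="Gen.1.1">In the beginning'
    '</p><p>God created</verse></p>'
    '</div>'
)

# Marker form: two empty verse elements, paired by sID/eID, mark the start
# and end of the verse without enclosing the paragraph boundary.
marker = (
    '<div>'
    '<p><verse osisID="Gen.1.1" sID="Gen.1.1"/>In the beginning'
    '</p><p>God created<verse eID="Gen.1.1"/></p>'
    '</div>'
)


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print("container form parses:", is_well_formed(container))
print("marker form parses:", is_well_formed(marker))
```

The container version fails to parse because the paragraph closes while the verse is still open; the marker version parses because every element is properly nested.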
This only becomes an issue if text is richly marked up with text elements.
I am not sure if sword cares which form is used.
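
On q3 below, something along these lines might be a starting point. It is only a sketch, assuming the input really uses <chapter value="N"> and <verse value="N"> exactly as quoted, that a book is processed as one string, and that the book abbreviation (here "Matt") is supplied by hand. The function name to_osis_ids is made up, and this is not an existing Sword tool.

```python
import re


def to_osis_ids(text, book):
    """Replace value="N" on chapter and verse tags with full osisIDs.

    Chapters become osisID="Book.C" and verses osisID="Book.C.V", where C is
    the number of the most recently seen chapter tag.
    """
    chapter = None
    tag = re.compile(r'<(chapter|verse) value="(\d+)"')

    def repl(match):
        nonlocal chapter
        name, number = match.group(1), match.group(2)
        if name == "chapter":
            # Remember the chapter so later verses can refer to it.
            chapter = number
            return f'<chapter osisID="{book}.{chapter}"'
        if chapter is None:
            raise ValueError("verse found before any chapter")
        return f'<verse osisID="{book}.{chapter}.{number}"'

    # re.sub calls repl for each match from left to right, so the chapter
    # counter is always up to date when a verse is rewritten.
    return tag.sub(repl, text)


sample = (
    '<chapter value="1"><verse value="1">a</verse>'
    '<verse value="2">b</verse></chapter>'
)
print(to_osis_ids(sample, "Matt"))
```

Only the opening part of each tag is rewritten, so any other attributes and the verse text pass through unchanged.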
vkaehne at doctors.org.uk wrote:
>Dear knowing ones,
>
>There are a few things I am struggling with just now, and I wonder whether I could get some advice:
>
>As described previously, my text is some XML variety, the dump of Paratext. Everything is marked up, which is good, but it uses different tags than OSIS, which is bad. I am in the process of changing it over to OSIS, but as I cannot yet script I must do things by hand, which is a bit grim.
>
>q1) In an OSIS-prepared module, do the verses need an osisID? I converted the Swahili module back (mod2osis) and found that only the chapters are tagged, while the verses appear to be simply one verse per line. I assume that the software counts the verses "by hand". Is this true?
>
>q2) Do the chapters need a complete osisID such as "Matt.1", or are short versions possible? I read in the OSIS manual that a simple leading blank will be interpreted as referring to the current text, but the reference is a bit ambiguous and not covered by an example.
>
>q3) Currently the chapters are coded as <chapter value="1"> and the verses as <verse value="1">. A simple search and replace would need to be done at chapter level to get all verses coded, or at book level to get all chapters coded properly, but a sed script, for example, would probably do this in a minute for the whole book. Are there any sample scripts around that would do the above and that I could adapt? A regex would probably also cover this, but I am clueless about those too.
>
>q4) The Bible is obviously in Unicode, with intermittent changes between l->r and r->l. The result, odd to me, is that each verse follows the scheme "<verse value="1"></verse> edhfoo fgfuwgfp ", with the text trailing the end marker. At least this is what I see when I open the module in gedit, emacs and kate. Is this normal and OK?
>
>Thanks
>
>Peter