[jsword-devel] Re: [sword-devel] OSIS 2.0 and JSword

Joe Walker joseph.walker at gmail.com
Thu Aug 19 02:10:59 MST 2004


I have 1 more thought that is probably specific to J-Sword:

2. We could write a BookDriver to read OSIS files directly. The first
time we see an OSIS file we write 2 indexes. The first is a verse index that
says, for verse X, start here and end here. The start and end points
would delimit a well-formed XML superset that contains the start and end
markers requested. We would then trim it as described in thought 1.
The second index is a search index, the same as at the moment.
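
A minimal sketch of what the verse index lookup might look like, assuming
the index simply maps a verse key to a byte range in the OSIS file. The names
VerseIndex, put and fragment are hypothetical and are not part of JSword; the
demo finds its offsets with indexOf on an ASCII string, where a real driver
would record them while scanning the file.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical verse index: verse key -> byte range of an enclosing fragment. */
final class VerseIndex {
    /** Byte range [start, end) inside the OSIS file. */
    private static final class Span {
        final int start;
        final int end;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    private final Map<String, Span> spans = new HashMap<>();

    /** Records where the well-formed superset for a verse starts and ends. */
    void put(String verseKey, int start, int end) {
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("bad span for " + verseKey);
        }
        spans.put(verseKey, new Span(start, end));
    }

    /** Returns the untrimmed fragment for a verse; trimming is a separate step. */
    String fragment(byte[] osisBytes, String verseKey) {
        Span s = spans.get(verseKey);
        if (s == null) {
            throw new IllegalArgumentException("unknown verse: " + verseKey);
        }
        return new String(osisBytes, s.start, s.end - s.start, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // ASCII-only demo text, so character offsets equal byte offsets here.
        String osis = "<osis>blah<q><verseStart/>blah<verseEnd/>blah</q>blah</osis>";
        byte[] bytes = osis.getBytes(StandardCharsets.UTF_8);

        VerseIndex index = new VerseIndex();
        // The smallest well-formed element holding both markers is <q>...</q>.
        index.put("Gen.1.1", osis.indexOf("<q>"), osis.indexOf("</q>") + "</q>".length());

        System.out.println(index.fragment(bytes, "Gen.1.1"));
    }
}
```

The fragment returned still holds text outside the verse (the "blah" after
verseEnd), which is why the trimming step is still needed afterwards.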

Joe.

On Thu, 19 Aug 2004 10:06:19 +0100, Joe Walker <joseph.walker at gmail.com> wrote:
> Hi,
> 
> I have 1 thought:
> 
> 1. It would probably be possible to write some code to effectively
> remove all content outside of two arbitrary markers in an XML document.
> E.G.
> 
> <osis>
>     blah
>     <verseStart/>
>     <q>
>         blah
>         <verseEnd/>
>         blah
>     </q>
>     blah
> </osis>
> 
> Becomes:
> 
> <osis>
>     <verseStart/>
>     <q>
>         blah
>         <verseEnd/>
>     </q>
> </osis>
> 
> So you start with a full OSIS document and do that process for each
> and every verse, storing the result in a SWORD module.
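
One way the trimming above could be sketched with the JDK's DOM API: walk
from each marker up to the root, dropping the siblings on the outer side.
The class and method names (MarkerTrim, trim) are hypothetical, and the
sketch assumes each marker occurs once and that the start marker precedes
the end marker in document order.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical helper: keeps only what lies between two marker elements. */
final class MarkerTrim {
    static String trim(String xml, String startTag, String endTag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Node root = doc.getDocumentElement();
            Node start = doc.getElementsByTagName(startTag).item(0);
            Node end = doc.getElementsByTagName(endTag).item(0);
            if (start == null || end == null) {
                throw new IllegalArgumentException("marker not found");
            }
            // Drop everything that precedes the start marker, keeping its ancestors.
            for (Node n = start; n != null && n != root; n = n.getParentNode()) {
                while (n.getPreviousSibling() != null) {
                    n.getParentNode().removeChild(n.getPreviousSibling());
                }
            }
            // Drop everything that follows the end marker, keeping its ancestors.
            for (Node n = end; n != null && n != root; n = n.getParentNode()) {
                while (n.getNextSibling() != null) {
                    n.getParentNode().removeChild(n.getNextSibling());
                }
            }
            // Serialize the pruned tree back to text without an XML declaration.
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (IllegalArgumentException e) {
            throw e;
        } catch (Exception e) {
            throw new IllegalStateException("could not trim document", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<osis>blah<verseStart/><q>blah<verseEnd/>blah</q>blah</osis>";
        System.out.println(trim(xml, "verseStart", "verseEnd"));
    }
}
```

Because only siblings are removed, every ancestor of either marker survives,
so the result stays well-formed even when the verse ends inside an element
such as the q above.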
> 
> Joe.
> 
> 
> 
> On Thu, 19 Aug 2004 09:30:28 +1000, Kahunapule Michael P. Johnson
> <kahunapule at mpj.cx> wrote:
> > At 03:17 19-08-04, DM Smith wrote:
> > >Looking past a JSword 1.0 release, I was studying the OSIS 2.0 schema
> > >and it looks like it may be tough to handle well. Specifically, there
> > >are elements that can be either a marker or a container. With regard to
> > >Bibles specifically, a verse may start smack dab in the middle of one of
> > >these other elements. Or one of these elements may end in a verse. And
> > >it might not be just one element that is split by a verse, it may be
> > >several.
> > >...
> > >Does anyone know of a best practice for OSIS, or any other XML field?
> >
> > > I believe that the recommendation in the XSEM documentation is valid for OSIS as well. Basically, it gives lots of good reasons for making the natural poetry and prose structure of the document primary, and the chapter/verse structure secondary. It then goes on to say that for use in applications where verse priority is required (i.e., a Bible search engine), it would make sense to use XSLT to transform the XML to prioritize the chapters and verses as containers, and make the other elements milestones. This makes sense to me. The sword engine is entirely verse-oriented, as are most Bible search engines, but it is highly desirable to preserve poetry and prose formatting within each displayed verse range.
> >
> > > Of course, the details of what the chapter and verse priority version of OSIS, XSEM, or other similar XML standard should look like aren't fully clear, but I would think that if you start with OSIS, you would want to keep it as close to OSIS as you can while still making every verse well-formed XML, and while making the other stuff into milestones. OSIS already allows verses to be containers, for example, and it has a milestone mechanism that can be extended to the containers for anything that can cross verse boundaries, such as paragraphs. Perhaps using the milestone element just after every verse open tag to put in redundant reminders of what kind of paragraph you are in would make sense; likewise for character styles, quotations being continued, etc.
> >
> > Thoughts?
> >
> >
> >
> >
> > _______________________________________________
> > sword-devel mailing list
> > sword-devel at crosswire.org
> > http://www.crosswire.org/mailman/listinfo/sword-devel
> >
>

