org.crosswire.jsword.book
Class OSISUtil

java.lang.Object
  extended by org.crosswire.jsword.book.OSISUtil

public final class OSISUtil
extends Object

Some simple utilities to help work with OSIS classes.

Author:
Joe Walker [joe at eireneh dot com]
See Also:
for license details. The copyright to this program is held by its authors.

Nested Class Summary
static class OSISUtil.ObjectFactory
          A generic way of creating empty Elements of various types
 
Field Summary
static String ATTRIBUTE_DIV_LANG
           
static String ATTRIBUTE_DIV_OSISID
           
static String ATTRIBUTE_HI_TYPE
           
static String ATTRIBUTE_NOTE_TYPE
           
static String ATTRIBUTE_OSISTEXT_OSISIDWORK
           
static String ATTRIBUTE_REFERENCE_OSISREF
           
static String ATTRIBUTE_SEG_SUBTYPE
           
static String ATTRIBUTE_SEG_TYPE
           
static String ATTRIBUTE_SPEAKER_WHO
           
static String ATTRIBUTE_TEXT_OSISIDWORK
           
static String ATTRIBUTE_VERSE_OSISID
           
static String ATTRIBUTE_W_LEMMA
           
static String ATTRIBUTE_W_MORPH
           
static String ATTRIBUTE_WORK_OSISWORK
           
private static Set EXTRA_BIBLICAL_ELEMENTS
           
private static OSISUtil.ObjectFactory factory
           
static String HI_BOLD
          Constant to help narrow down what we use "hi" for.
static String HI_ITALIC
          Constant to help narrow down what we use "hi" for.
static String HI_UNDERLINE
          Constant to help narrow down what we use "hi" for.
static String LEMMA_STRONGS
          Constant for a Strongs numbering lemma
private static Logger log
          The log stream
static String MORPH_ROBINSONS
           
static String MORPH_STRONGS
          Constant for Strongs numbering morphology
static String NOTETYPE_STUDY
          Constant for the study note type
static String OSIS_ATTR_EID
           
static String OSIS_ATTR_SID
           
static String OSIS_ELEMENT_CELL
           
static String OSIS_ELEMENT_DIV
           
static String OSIS_ELEMENT_FOREIGN
           
static String OSIS_ELEMENT_HEADER
           
static String OSIS_ELEMENT_HI
           
static String OSIS_ELEMENT_ITEM
           
static String OSIS_ELEMENT_L
           
static String OSIS_ELEMENT_LB
           
static String OSIS_ELEMENT_LG
           
static String OSIS_ELEMENT_LIST
           
static String OSIS_ELEMENT_NAME
           
static String OSIS_ELEMENT_NOTE
           
static String OSIS_ELEMENT_OSIS
           
static String OSIS_ELEMENT_OSISTEXT
           
static String OSIS_ELEMENT_P
           
static String OSIS_ELEMENT_Q
           
static String OSIS_ELEMENT_REFERENCE
           
static String OSIS_ELEMENT_ROW
           
static String OSIS_ELEMENT_SEG
           
static String OSIS_ELEMENT_SPEAKER
           
static String OSIS_ELEMENT_SPEECH
           
static String OSIS_ELEMENT_TABLE
           
static String OSIS_ELEMENT_TITLE
           
static String OSIS_ELEMENT_VERSE
           
static String OSIS_ELEMENT_W
           
static String OSIS_ELEMENT_WORK
           
private static String OSISID_PREFIX_BIBLE
          Prefix for OSIS IDs that refer to Bibles
static String SEG_CENTER
          Constant to help narrow down what we use seg for.
static String SEG_COLORPREFIX
          Constant to help narrow down what we use seg for.
static String SEG_JUSTIFYRIGHT
          Constant to help narrow down what we use seg for.
static String SEG_SIZEPREFIX
          Constant to help narrow down what we use seg for.
static String SEG_SMALL
          Constant to help narrow down what we use seg for.
static String SEG_SUPERSCRIPT
          Constant to help narrow down what we use seg for.
static String VARIANT_CLASS
           
static String VARIANT_TYPE
          Constant for the variant type segment
 
Constructor Summary
private OSISUtil()
          Prevent instantiation
 
Method Summary
static org.jdom.Element createOsisFramework(BookMetaData bmd)
          Helper method to create the boilerplate headers in an OSIS document from the current metadata object
static OSISUtil.ObjectFactory factory()
          An accessor for the ObjectFactory that creates OSIS objects
static Collection getDeepContent(org.jdom.Element div, String name)
          Find all the instances of elements of type name under the element div.
static String getPlainText(org.jdom.Element root)
          A simplified plain text version of the data in this Element with all the markup stripped out.
private static String getTextContent(org.jdom.Element ele)
           
static Verse getVerse(org.jdom.Element ele)
          Walk up the tree from the W to find out what verse we are in.
private static void getVerseContent(Iterator iter, StringBuffer buffer)
           
static String getVerseText(org.jdom.Element root)
          Get the verse text from an OSIS document consisting of a single verse.
private static void recurseChildren(org.jdom.Element ele, StringBuffer buffer)
          Helper to extract the Strings from a nest of JDOM elements
private static void recurseDeepContent(org.jdom.Element start, String name, List reply)
          Find all the instances of elements of type name under the element start.
private static void recurseElement(Object sub, StringBuffer buffer)
          If we have a String just add it to the buffer, but if we have an Element then try to dig the strings out of it.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

HI_BOLD

public static final String HI_BOLD
Constant to help narrow down what we use "hi" for. In this case the bold tag

See Also:
Constant Field Values

HI_ITALIC

public static final String HI_ITALIC
Constant to help narrow down what we use "hi" for. In this case the italic tag

See Also:
Constant Field Values

HI_UNDERLINE

public static final String HI_UNDERLINE
Constant to help narrow down what we use "hi" for. In this case the underline tag

See Also:
Constant Field Values

SEG_JUSTIFYRIGHT

public static final String SEG_JUSTIFYRIGHT
Constant to help narrow down what we use seg for. In this case the justify right tag

See Also:
Constant Field Values

SEG_CENTER

public static final String SEG_CENTER
Constant to help narrow down what we use seg for. In this case the center tag

See Also:
Constant Field Values

SEG_SMALL

public static final String SEG_SMALL
Constant to help narrow down what we use seg for. In this case the small tag

See Also:
Constant Field Values

SEG_SUPERSCRIPT

public static final String SEG_SUPERSCRIPT
Constant to help narrow down what we use seg for. In this case the sup tag

See Also:
Constant Field Values

SEG_COLORPREFIX

public static final String SEG_COLORPREFIX
Constant to help narrow down what we use seg for. In this case the color tag

See Also:
Constant Field Values

SEG_SIZEPREFIX

public static final String SEG_SIZEPREFIX
Constant to help narrow down what we use seg for. In this case the font-size tag

See Also:
Constant Field Values

NOTETYPE_STUDY

public static final String NOTETYPE_STUDY
Constant for the study note type

See Also:
Constant Field Values

VARIANT_TYPE

public static final String VARIANT_TYPE
Constant for the variant type segment

See Also:
Constant Field Values

VARIANT_CLASS

public static final String VARIANT_CLASS
See Also:
Constant Field Values

LEMMA_STRONGS

public static final String LEMMA_STRONGS
Constant for a Strongs numbering lemma

See Also:
Constant Field Values

MORPH_ROBINSONS

public static final String MORPH_ROBINSONS
See Also:
Constant Field Values

MORPH_STRONGS

public static final String MORPH_STRONGS
Constant for Strongs numbering morphology

See Also:
Constant Field Values

OSIS_ELEMENT_TITLE

public static final String OSIS_ELEMENT_TITLE
See Also:
Constant Field Values

OSIS_ELEMENT_TABLE

public static final String OSIS_ELEMENT_TABLE
See Also:
Constant Field Values

OSIS_ELEMENT_SPEECH

public static final String OSIS_ELEMENT_SPEECH
See Also:
Constant Field Values

OSIS_ELEMENT_SPEAKER

public static final String OSIS_ELEMENT_SPEAKER
See Also:
Constant Field Values

OSIS_ELEMENT_ROW

public static final String OSIS_ELEMENT_ROW
See Also:
Constant Field Values

OSIS_ELEMENT_REFERENCE

public static final String OSIS_ELEMENT_REFERENCE
See Also:
Constant Field Values

OSIS_ELEMENT_NOTE

public static final String OSIS_ELEMENT_NOTE
See Also:
Constant Field Values

OSIS_ELEMENT_NAME

public static final String OSIS_ELEMENT_NAME
See Also:
Constant Field Values

OSIS_ELEMENT_Q

public static final String OSIS_ELEMENT_Q
See Also:
Constant Field Values

OSIS_ELEMENT_LIST

public static final String OSIS_ELEMENT_LIST
See Also:
Constant Field Values

OSIS_ELEMENT_P

public static final String OSIS_ELEMENT_P
See Also:
Constant Field Values

OSIS_ELEMENT_ITEM

public static final String OSIS_ELEMENT_ITEM
See Also:
Constant Field Values

OSIS_ELEMENT_FOREIGN

public static final String OSIS_ELEMENT_FOREIGN
See Also:
Constant Field Values

OSIS_ELEMENT_W

public static final String OSIS_ELEMENT_W
See Also:
Constant Field Values

OSIS_ELEMENT_VERSE

public static final String OSIS_ELEMENT_VERSE
See Also:
Constant Field Values

OSIS_ELEMENT_CELL

public static final String OSIS_ELEMENT_CELL
See Also:
Constant Field Values

OSIS_ELEMENT_DIV

public static final String OSIS_ELEMENT_DIV
See Also:
Constant Field Values

OSIS_ELEMENT_OSIS

public static final String OSIS_ELEMENT_OSIS
See Also:
Constant Field Values

OSIS_ELEMENT_WORK

public static final String OSIS_ELEMENT_WORK
See Also:
Constant Field Values

OSIS_ELEMENT_HEADER

public static final String OSIS_ELEMENT_HEADER
See Also:
Constant Field Values

OSIS_ELEMENT_OSISTEXT

public static final String OSIS_ELEMENT_OSISTEXT
See Also:
Constant Field Values

OSIS_ELEMENT_SEG

public static final String OSIS_ELEMENT_SEG
See Also:
Constant Field Values

OSIS_ELEMENT_LG

public static final String OSIS_ELEMENT_LG
See Also:
Constant Field Values

OSIS_ELEMENT_L

public static final String OSIS_ELEMENT_L
See Also:
Constant Field Values

OSIS_ELEMENT_LB

public static final String OSIS_ELEMENT_LB
See Also:
Constant Field Values

OSIS_ELEMENT_HI

public static final String OSIS_ELEMENT_HI
See Also:
Constant Field Values

ATTRIBUTE_TEXT_OSISIDWORK

public static final String ATTRIBUTE_TEXT_OSISIDWORK
See Also:
Constant Field Values

ATTRIBUTE_WORK_OSISWORK

public static final String ATTRIBUTE_WORK_OSISWORK
See Also:
Constant Field Values

ATTRIBUTE_VERSE_OSISID

public static final String ATTRIBUTE_VERSE_OSISID
See Also:
Constant Field Values

ATTRIBUTE_DIV_OSISID

public static final String ATTRIBUTE_DIV_OSISID
See Also:
Constant Field Values

OSIS_ATTR_SID

public static final String OSIS_ATTR_SID
See Also:
Constant Field Values

OSIS_ATTR_EID

public static final String OSIS_ATTR_EID
See Also:
Constant Field Values

ATTRIBUTE_W_LEMMA

public static final String ATTRIBUTE_W_LEMMA
See Also:
Constant Field Values

ATTRIBUTE_HI_TYPE

public static final String ATTRIBUTE_HI_TYPE
See Also:
Constant Field Values

ATTRIBUTE_SEG_TYPE

public static final String ATTRIBUTE_SEG_TYPE
See Also:
Constant Field Values

ATTRIBUTE_SEG_SUBTYPE

public static final String ATTRIBUTE_SEG_SUBTYPE
See Also:
Constant Field Values

ATTRIBUTE_REFERENCE_OSISREF

public static final String ATTRIBUTE_REFERENCE_OSISREF
See Also:
Constant Field Values

ATTRIBUTE_NOTE_TYPE

public static final String ATTRIBUTE_NOTE_TYPE
See Also:
Constant Field Values

ATTRIBUTE_SPEAKER_WHO

public static final String ATTRIBUTE_SPEAKER_WHO
See Also:
Constant Field Values

ATTRIBUTE_W_MORPH

public static final String ATTRIBUTE_W_MORPH
See Also:
Constant Field Values

ATTRIBUTE_OSISTEXT_OSISIDWORK

public static final String ATTRIBUTE_OSISTEXT_OSISIDWORK
See Also:
Constant Field Values

ATTRIBUTE_DIV_LANG

public static final String ATTRIBUTE_DIV_LANG
See Also:
Constant Field Values

OSISID_PREFIX_BIBLE

private static final String OSISID_PREFIX_BIBLE
Prefix for OSIS IDs that refer to Bibles

See Also:
Constant Field Values

EXTRA_BIBLICAL_ELEMENTS

private static final Set EXTRA_BIBLICAL_ELEMENTS

log

private static final Logger log
The log stream


factory

private static OSISUtil.ObjectFactory factory
Constructor Detail

OSISUtil

private OSISUtil()
Prevent instantiation

Method Detail

factory

public static OSISUtil.ObjectFactory factory()
An accessor for the ObjectFactory that creates OSIS objects


getVerseText

public static String getVerseText(org.jdom.Element root)
Get the verse text from an OSIS document consisting of a single verse. The document is assumed to be valid OSIS 2.0 XML. XML validity is rigidly defined as meaning that an XML parser can validate the document, but that does not mean the document is semantically valid OSIS, because semantic correctness is not checked by validation. This method assumes that the root element is also semantically valid.

This means that the top-level element's tag name is osis. It can contain either an osisText or an osisCorpus. If it is an osisCorpus, then it contains an osisText. However, as a simplification, since JSword constructs the whole document for the fragment, osisCorpus can be ignored.

The osisText element contains a div element that is either a container or a milestone. Again, JSword supplies the div element, and supplies it as a container. It is this div that "contains" the verse element.

The verse element may either be a container or a milestone. Sword OSIS books differ in whether they provide the verse element. Most do not. The few that do are using the container model, but it has been proposed that milestones are the best practice.

The verse may contain elements that are not a part of the original text. These are things such as notes.

Milestones require special handling. Beginning milestone elements have an sID attribute, while ending milestone elements have an eID attribute with the same value as the opening sID. Everything between the start and the corresponding end is therefore the content of the element. Also, the milestones for a given element, say div, have to be properly nested as if they were container elements.

Parameters:
root - the whole OSIS document.
Returns:
The Bible text without markup
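
The milestone rule described above (text between a verse element carrying sID and the verse element carrying the matching eID) can be illustrated with a minimal sketch. This is not the JSword implementation: it uses the JDK's org.w3c.dom API rather than JDOM so that it runs without extra libraries, the class and method names (MilestoneSketch, verseText, verseTextFromXml) are hypothetical, and treating note as the only element outside the original text is an assumption made for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch; names are illustrative and not part of JSword.
class MilestoneSketch {

    /** Collects the text between <verse sID=id/> and <verse eID=id/> under div. */
    static String verseText(Element div, String id) {
        StringBuilder out = new StringBuilder();
        boolean[] inside = { false };   // becomes true once the start milestone is seen
        walk(div, id, inside, out);
        return out.toString().trim();
    }

    private static void walk(Node parent, String id, boolean[] inside, StringBuilder out) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.TEXT_NODE) {
                if (inside[0]) {
                    out.append(n.getNodeValue());
                }
            } else if (n.getNodeType() == Node.ELEMENT_NODE) {
                Element e = (Element) n;
                boolean isVerse = "verse".equals(e.getTagName());
                if (isVerse && id.equals(e.getAttribute("sID"))) {
                    inside[0] = true;            // start milestone
                } else if (isVerse && id.equals(e.getAttribute("eID"))) {
                    inside[0] = false;           // matching end milestone
                } else if (!"note".equals(e.getTagName())) {
                    walk(e, id, inside, out);    // assumption: only notes are skipped
                }
            }
        }
    }

    /** Convenience wrapper that parses an XML string first. */
    static String verseTextFromXml(String xml, String id) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            return verseText(root, id);
        } catch (Exception ex) {
            throw new IllegalArgumentException("Could not parse XML", ex);
        }
    }
}
```

Calling MilestoneSketch.verseTextFromXml with a div that wraps a pair of verse milestones returns only the text between the two milestones, with markup and notes removed.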

getPlainText

public static String getPlainText(org.jdom.Element root)
A simplified plain text version of the data in this Element with all the markup stripped out.

Returns:
The Bible text without markup

getVerseContent

private static void getVerseContent(Iterator iter,
                                    StringBuffer buffer)

getTextContent

private static String getTextContent(org.jdom.Element ele)

getDeepContent

public static Collection getDeepContent(org.jdom.Element div,
                                        String name)
Find all the instances of elements of type name under the element div.
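
A deep search of this kind can be sketched as a simple recursion over child elements. The example below is only an illustration under stated assumptions: it uses the JDK's org.w3c.dom API instead of JDOM, the names DeepContentSketch, deepContent and countIn are hypothetical, and whether the real method also descends into elements that already matched is not documented here (this sketch does).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch; names are illustrative and not part of JSword.
class DeepContentSketch {

    /** Returns every descendant of start whose tag name equals name, in document order. */
    static List<Element> deepContent(Element start, String name) {
        List<Element> reply = new ArrayList<>();
        recurse(start, name, reply);
        return reply;
    }

    private static void recurse(Element start, String name, List<Element> reply) {
        for (Node n = start.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                Element child = (Element) n;
                if (name.equals(child.getTagName())) {
                    reply.add(child);
                }
                // Keep descending so that nested matches are found as well.
                recurse(child, name, reply);
            }
        }
    }

    /** Convenience wrapper: parses xml and counts the matching descendants. */
    static int countIn(String xml, String name) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            return deepContent(root, name).size();
        } catch (Exception ex) {
            throw new IllegalArgumentException("Could not parse XML", ex);
        }
    }
}
```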


getVerse

public static Verse getVerse(org.jdom.Element ele)
                      throws BookException
Walk up the tree from the W to find out what verse we are in.

Parameters:
ele - The start point for our verse hunt.
Returns:
The verse we are in
Throws:
BookException
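
Walking up the tree to find the enclosing verse can be sketched as a loop over parent nodes. This is a simplified illustration, not the JSword code: it uses the JDK's org.w3c.dom API instead of JDOM, returns the osisID string (or null) rather than a Verse and does not throw BookException, and it only handles verses in the container form, not milestones. The names VerseLookupSketch, enclosingVerseId and idForFirst are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch; names are illustrative and not part of JSword.
class VerseLookupSketch {

    /** Returns the osisID of the nearest enclosing verse container, or null if none. */
    static String enclosingVerseId(Node start) {
        for (Node n = start; n != null; n = n.getParentNode()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                Element e = (Element) n;
                if ("verse".equals(e.getTagName()) && e.hasAttribute("osisID")) {
                    return e.getAttribute("osisID");
                }
            }
        }
        return null;   // reached the top of the document without finding a verse
    }

    /** Convenience wrapper: starts the hunt at the first element named tag. */
    static String idForFirst(String xml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return enclosingVerseId(doc.getElementsByTagName(tag).item(0));
        } catch (Exception ex) {
            throw new IllegalArgumentException("Could not parse XML", ex);
        }
    }
}
```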

createOsisFramework

public static org.jdom.Element createOsisFramework(BookMetaData bmd)
Helper method to create the boilerplate headers in an OSIS document from the current metadata object


recurseDeepContent

private static void recurseDeepContent(org.jdom.Element start,
                                       String name,
                                       List reply)
Find all the instances of elements of type name under the element start. For internal use only.


recurseElement

private static void recurseElement(Object sub,
                                   StringBuffer buffer)
If we have a String just add it to the buffer, but if we have an Element then try to dig the strings out of it.


recurseChildren

private static void recurseChildren(org.jdom.Element ele,
                                    StringBuffer buffer)
Helper to extract the Strings from a nest of JDOM elements

Parameters:
ele - The JDOM Element to dig into
buffer - The place we accumulate strings.
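
The mutual recursion between recurseElement and recurseChildren can be sketched as follows. This is an illustration only: it uses the JDK's org.w3c.dom API and StringBuilder where the class itself uses JDOM and StringBuffer, and the names PlainTextSketch, recurseNode and plainText are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch; names are illustrative and not part of JSword.
class PlainTextSketch {

    /** Text nodes are appended directly; elements are searched for more text. */
    static void recurseNode(Node sub, StringBuilder buffer) {
        if (sub.getNodeType() == Node.TEXT_NODE) {
            buffer.append(sub.getNodeValue());
        } else if (sub.getNodeType() == Node.ELEMENT_NODE) {
            recurseChildren((Element) sub, buffer);
        }
    }

    /** Visits every child of ele in document order, accumulating text in buffer. */
    static void recurseChildren(Element ele, StringBuilder buffer) {
        for (Node n = ele.getFirstChild(); n != null; n = n.getNextSibling()) {
            recurseNode(n, buffer);
        }
    }

    /** Convenience wrapper: parses xml and returns its text with markup stripped. */
    static String plainText(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            StringBuilder buffer = new StringBuilder();
            recurseChildren(root, buffer);
            return buffer.toString();
        } catch (Exception ex) {
            throw new IllegalArgumentException("Could not parse XML", ex);
        }
    }
}
```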

Copyright © 2003-2005