[sword-devel] XML DOM
DJ Ortley
djortley at gmail.com
Fri Mar 9 13:54:02 MST 2007
Regarding the lowest common denominator comment you made, that's one of
the things I thought would probably come up, and I can understand it.
I didn't know that there was a lot of focus on speed, but it makes sense.
I've been impressed with how fast Sword seems to work at times.
I don't know much about how Sword stores its modules, as there is no
documentation I can find, and I've yet to actually ask what's going on.
Right now I'm still working through the API, trying to understand what's
happening, with the goal of finding a suitable way to implement access to
the various Deuterocanons. Looking through the archives, I've come to the
conclusion that implementing such changes, as long as it is done the
right way, is mostly acceptable to the community.
The thing that prompted me to ask about DOM support was when I was looking
through the source in the utilities folder. It seemed that a lot of work
could be saved if some library were used.
Maybe things could be broken into two parts: the core API and the
utilities, with the utilities having greater allowance for use of
third-party libraries that might not necessarily be suitable for a
handheld. One isn't going to be using a handheld to make modules anyway
(well, hopefully not, at least).
Just a thought. Maybe not a good one, but there it is.
By the way, aside from poking around through the code, is there any
documentation or outline (other than the API primer) of what's going on?
If not, could someone give me a quick and dirty sketch of some sort?
Thanks.
-DJ
On 3/9/07, DM Smith <dmsmith555 at yahoo.com> wrote:
>
> DJ Ortley wrote:
> > Looking through the source code, it seems to me (which are key words
> > that indicate this is only an opinion, one which may not be worth
> > much) that using a library such as Xerces or some sort of XML DOM like
> > library would be of benefit.
> >
> > I was wondering if any thought had been given to that previously?
>
> This is the approach that JSword uses. We actually use JAXP which is an
> interface layer over a plug-in implementation of XML. So in some cases
> we use Crimson and in others we use Xerces. It all depends upon what is
> bundled with the user's JDK. SAX is a better model for most processing
> than DOM, as most processing does not need an object representation of
>
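
To make the SAX point concrete, here is a minimal sketch of pulling plain
text out of a fragment through JAXP's SAX interface, without ever building
a DOM tree. It is not taken from JSword; the class name, the method name
and the sample markup (including the osisID and lemma values) are made up
for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of JSword or Sword.
public class SaxTextExtractor {

    /** Returns the concatenated character data of a well-formed XML fragment. */
    public static String extractText(String xml) {
        final StringBuilder text = new StringBuilder();

        // SAX pushes events at the handler as it reads; only the
        // character data is kept, and no object tree is ever built.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };

        try {
            // JAXP picks whichever parser implementation is available.
            SAXParserFactory.newInstance()
                    .newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse fragment", e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Made-up sample markup, loosely shaped like a verse element.
        String verse = "<verse osisID=\"Gen.1.1\">In the "
                + "<w lemma=\"x\">beginning</w> God created.</verse>";
        System.out.println(extractText(verse));
    }
}
```

A DOM version would parse the same input into a full node tree first and
then walk it, which is the extra object representation that most
processing does not need.
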
> That said, I think that there are significant advantages and also
> disadvantages to using it.
> To me the most significant advantages are that it is a full
> implementation of an XML parser and we don't need to maintain it.
>
> Disadvantages:
> It is a full implementation of the XML parser. Sword doesn't need a full
> implementation of the parser. Our documents have a well defined
> vocabulary (i.e. the DTD specs) and we only need a parser sufficient to
> parse that vocabulary.
>
> Parsing serves two purposes: search/indexing, i.e. stripping out only
> the text from the "verse" and display, i.e. converting the module raw
> source into some kind of presentation source. The former benefits from
> being very fast. Sword's "stripping" routines are built for speed. It
> would be a huge performance loss to use a true XML parser. For the most
> part, parsing for converting to a display representation can be slower
> because it will likely be fast enough.
>
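
For comparison, a "stripping" pass for search/indexing can be as small as
the sketch below. This is not Sword's actual routine (Sword is not written
in Java, and I have not reproduced its code); it is only an assumed,
simplified illustration of why such a pass is cheap: one loop over the
characters, no validation, no entity decoding, no allocation beyond the
output buffer.

```java
// Hypothetical illustration, not Sword's real stripping code.
public class TagStripper {

    /**
     * Copies everything that lies outside <...> markers.
     * Assumes tags are never nested inside attribute values and makes
     * no attempt to check well-formedness or to decode entities.
     * A stray '>' outside a tag is dropped.
     */
    public static String strip(String markup) {
        StringBuilder out = new StringBuilder(markup.length());
        boolean inTag = false;
        for (int i = 0; i < markup.length(); i++) {
            char c = markup.charAt(i);
            if (c == '<') {
                inTag = true;          // start skipping
            } else if (c == '>') {
                inTag = false;         // resume copying after the tag
            } else if (!inTag) {
                out.append(c);         // plain text, keep it
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(strip("<verse>In the <w>beginning</w></verse>"));
    }
}
```

The trade-off is the one described above: a loop like this accepts
malformed input silently and leaves things like &amp; undecoded, which a
true XML parser would handle, at a cost in speed and code size.
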
> The other thing is that the Sword library has taken a least common
> denominator approach to its requirements. It is targeted to small
> handhelds (phones, pdas and the like) and to computers of all ages,
> colors and creeds. Introducing a fairly large library would need to be
> optional (like curl, icu4c and lucene) and it would still leave the need
> for the current custom parsing.
>
> Earlier I submitted a patch to make the parser more accurate and it was
> rejected as a performance hit and too big/risky of a change. And these
> were the reasons that I was given.
>
> _______________________________________________
> sword-devel mailing list: sword-devel at crosswire.org
> http://www.crosswire.org/mailman/listinfo/sword-devel
> Instructions to unsubscribe/change your settings at above page
>