[sword-devel] Unicode Bible program

Paul Gear sword-devel@crosswire.org
Fri, 25 Feb 2000 09:53:50 +0000


Joel Mawhorter wrote:

> On Thu, 24 Feb 2000, Troy A. Griffitts wrote:
> > > Since SWORD is written in C++, Unicode support would have to be done with a third
> > > party library. Using a library would allow processing of Unicode text but there
> > > isn't any consistent way to render Unicode text across platforms. I'm not really
> > > familiar with the SWORD code; would anyone care to comment on how easy it would
> > > be to subclass SWText to create a Unicode text class? Does anything in the
> > > parent classes assume 8 bit chars?
> >
> > I'm sure we would all love for this project to support unicode.  If I
> > knew anything technical about unicode I'd comment on if there is
> > anything that would hinder you from subclassing SWText and creating your
> > own.  I believe the code is modular enough so that if say, the search
> > method did ASCII specific things like the call to strstr(...), you may
> > still override the search method in your own SWText subclass and provide
> > unicode searching algorithms.
> >
> > I would love to learn more and take out all 8bit dependencies to help
> > you, but it's just a question of time.
>
> I understand that completely! I don't expect you to put aside what you're
> doing to look into the Unicode issue. There are a lot of people who need SWORD
> who don't need Unicode. I am reading through the Unicode standard now and
> experimenting with different options. If there is anyone on this list who is a
> Unicode expert, I'd really appreciate some help in understanding some of the
> issues involved.
>
> I realize that SWORD is quite modular, however, since there would be very
> little code shared between the ASCII portion of SWORD and any Unicode
> extensions, I'm not sure that adding to SWORD would be the best option.
> Besides Java provides some very compelling reasons to use it for Unicode
> support. I was playing around with a Java Swing demo program today and I
> inserted some Hebrew characters into an English text string. Not only did it
> display them but it got the right to left order correct without me having to do
> anything. Java has a lot of things like that built in that would need to be
> done manually in C++.

Some more comments (i am not an expert, but i've looked into it a little):
- 7-bit ASCII is a subset of Unicode.  If you use the UTF-8 encoding (XML's default; Java
uses 16-bit chars internally), then every 7-bit ASCII code maps to the same single byte,
so existing ASCII text is already valid UTF-8.
- Eventually all of Sword _must_ support Unicode, because that is the way the Web is
going: XML is defined in terms of Unicode, and the best current standards for text
markup are built on it.
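
To make the UTF-8 point concrete, here is a minimal C++ sketch.  It is not taken from
the Sword source: encodeUtf8 and containsUtf8 are hypothetical helper names made up for
illustration.  It shows that ASCII code points come out as the same single byte, and
that a plain strstr() still finds valid UTF-8 substrings, because lead bytes and
continuation bytes never overlap.  It assumes well-formed input and does no
normalization or case folding, so treat it as a starting point only.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical helper: encode one Unicode code point (up to U+10FFFF) as UTF-8.
// Note: this sketch does not reject the surrogate range U+D800..U+DFFF.
std::string encodeUtf8(unsigned long cp) {
    assert(cp <= 0x10FFFFUL);
    std::string out;
    if (cp < 0x80) {
        // 7-bit ASCII: emitted unchanged as a single byte.
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        // Two bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        // Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        // Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical helper: byte-wise substring search over UTF-8 text.
// Assumes both strings are valid UTF-8 with no embedded NUL bytes.
bool containsUtf8(const std::string& haystack, const std::string& needle) {
    return std::strstr(haystack.c_str(), needle.c_str()) != nullptr;
}
```

For example, encodeUtf8(0x41) gives the one-byte string "A", while the Hebrew letter
alef (U+05D0) becomes two bytes.  A search method that only calls strstr() would keep
working on UTF-8 text for exact matches; anything that counts characters or indexes by
byte offset would still need attention.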

Because of this, i believe that adding to Sword is the best option.  If Troy's Java
bindings can be made to work, then there is no reason we can't use a Java frontend
plugged into the Sword libraries.  (That would make me a much happier and more
productive programmer, too.  :-)

> > > Troy, would
> > > it be possible to include new software like this under the SWORD umbrella?
> >
> > Of course.  That's what CrossWire is for:  to sponsor opensource
> > ministry projects.
>
> Great. I'm not interested in forging out on my own even if a program separate
> from SWORD/BibleTime is needed.
>
> ...
> > there is a cvs module: jsword that you can check out that includes many of
> > the sword classes ported to java.  I've done quite a bit more that I
> > have not yet checked in.  I was doing it for fun on the bus ride to
> > comdex last year and have since lost my laptop nic pcmcia card.  I'll
> > move it to a floppy and check it in if anyone is interested.

Definitely interested, Troy.
--
Paul
---------
"He must become greater; i must become less." - John 3:30
http://www.bigfoot.com/~paulgear