[jsword-devel] Some characters not returned correctly

Joe Walker jsword-devel@crosswire.org
Thu, 23 Oct 2003 17:35:39 +0100


Hi,

Can you give me an example module and verse that fails?
I've written a test harness that reads everything in every module, and 
the results are OK for me, but I probably don't test for 0xFFFD.
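
A check for 0xFFFD could be added to a harness like this along the following
lines. This is only a minimal sketch: the class name, the method name and the
sample verse string are hypothetical and are not part of JSword; the only
assumption is that the harness has each verse's text available as a String.

```java
import java.util.ArrayList;
import java.util.List;

public class ReplacementCharCheck {
    /** U+FFFD, the Unicode replacement ("unknown") character. */
    public static final char REPLACEMENT = '\uFFFD';

    /** Returns the index of every U+FFFD found in the given text. */
    public static List<Integer> findReplacementChars(String text) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == REPLACEMENT) {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical sample; a real harness would pass in the text
        // returned for each verse of each module.
        String verse = "Au commencement, Dieu cr\uFFFDa les cieux et la terre.";
        List<Integer> hits = findReplacementChars(verse);
        if (!hits.isEmpty()) {
            System.out.println("U+FFFD found at offsets " + hits);
        }
    }
}
```

An empty list means the text is clean; any other result gives the offsets to
report alongside the module and verse name.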

Joe.

Jeremy Brown wrote:

>It seems like your recent updates have done a lot of good.  Some versions
>like the Aleppo Codex and ChiNCVS/ChiNCVT, which I couldn't get to work
>before, are now working for me.  (I'm using your underlying Java to
>extract verses and create files readable on a Palm).
>
>However, both in the program I've written, and in the JSword GUI, certain
>European languages are having their accented characters returned as
>"unknown character", though these languages used to work.  
>
>When I use BookData.getPlainText() on these verses, and print out the
>character numbers, I get 0xFFFD (unknown character) for these. 
>
>Some examples are in the French Louis Segond, Spanish Reina Valera, Norsk,
>Danske.  
>
>The accented characters for these languages should fall within the
>first 256 Unicode characters (the ISO 8859-1 character set).  All the
>characters above this range seem to work fine; for example, Hebrew,
>Chinese, and Russian are coming out OK (Aleppo Codex, Chinese NCVS,
>Russian Makarij).
>
>Some languages that I would expect to have the same problems but don't
>are: Hungarian, Swedish.
>
>Did the way you determine the character encoding change?  As I said,
>these European languages used to work fine, but now some of their
>characters are being converted to 0xFFFD before they are output by
>BibleData.getPlainText().
>
>Thanks for all your great work, and for any help you can provide.
>
>Jeremy Brown
>Biola University
>
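
One way such a symptom can arise, offered as an assumption rather than a
diagnosis of JSword itself: if text stored as ISO 8859-1 bytes is decoded as
UTF-8, every accented letter is an invalid byte sequence and Java substitutes
0xFFFD, while plain ASCII passes through unchanged. The sketch below uses only
standard Java classes; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class Latin1AsUtf8Demo {
    /** Encodes text as ISO 8859-1 bytes, then decodes those bytes as UTF-8. */
    public static String misdecode(String text) {
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        // new String(byte[], Charset) replaces malformed input with the
        // charset's replacement string, which is U+FFFD for UTF-8.
        return new String(latin1, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String bad = misdecode("\u00E9t\u00E9"); // the French word "été"
        // Print each resulting character number in hex.
        for (int i = 0; i < bad.length(); i++) {
            System.out.printf("0x%04X%n", (int) bad.charAt(i));
        }
    }
}
```

Printing character numbers this way shows the accented letters coming back as
0xFFFD while the unaccented `t` survives.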