[jsword-devel] i18n basic
DM Smith
jsword-devel@crosswire.org
Mon, 05 Apr 2004 09:34:30 -0400
The first thing anyone thinks of when internationalizing a program is the
text a user sees.
I consider this i18n basic. The following is an analysis of it in JSword.
The upshot is that I think there are some opportunities for improvement. I
have specific proposals for these opportunities. If you like, I will work on
it.
For Java, the key mechanisms for this are ResourceBundle and MessageFormat:
the former locates the translations and the latter formats composite
text.
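
As a rough illustration of how the two fit together, here is a minimal sketch. The key, the German text and the class name are invented for the example, and the bundle is built from an in-memory string rather than located by ResourceBundle.getBundle(), so this is not how JSword itself loads anything.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.text.MessageFormat;
import java.util.Locale;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

class I18nBasicSketch {
    // Stand-in for the contents of a MyResources_de.properties file.
    // The key and the German text are invented for this example.
    static final String GERMAN_PROPERTIES = "verses.found={0} Verse in {1} gefunden\n";

    // Normally ResourceBundle.getBundle(...) would locate the file;
    // here the bundle is built directly from the string above.
    static ResourceBundle loadGerman() {
        try {
            return new PropertyResourceBundle(new StringReader(GERMAN_PROPERTIES));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Look up the pattern by key, then let MessageFormat fill in the arguments.
    static String format(ResourceBundle bundle, String key, Object... args) {
        MessageFormat fmt = new MessageFormat(bundle.getString(key), Locale.GERMAN);
        return fmt.format(args);
    }

    public static void main(String[] argv) {
        System.out.println(format(loadGerman(), "verses.found", 3, "Genesis"));
    }
}
```

The point of MessageFormat here is that the translator controls where the arguments land in the sentence, so the code never concatenates fragments of text.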
For JSword, the key design already used within the program is:
1) Resources for a class can be in three different areas:
   a) In the same package as the classes using the resources.
   b) In the resources jar, stored in one of the following locations:
      i) in a file in the root, with '.' between the parts of the
         package name, e.g.
            org.crosswire.common.util.MyResources_de_CH.properties
      ii) in a file with '/' between the parts of the package name, e.g.
            org/crosswire/common/util/MyResources.properties
   c) In ~/jsword, in the same kinds of locations as those stored in
      resource.jar.
2) Externalized strings are represented in an Enum (currently from
   Apache). This allows for the explicit cataloging of the externalized
   strings.
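
To make the two naming schemes in 1b) concrete, here is a small sketch that derives both file names from a package name and a bundle name. The class and method names are hypothetical, and this only builds the names; it is not a description of how JSword actually searches for them.

```java
import java.util.ArrayList;
import java.util.List;

class ResourceNameSketch {
    // Returns the dotted (root-level) name first, then the slashed
    // (directory-style) name for the same bundle.
    static List<String> candidates(String packageName, String bundleName) {
        List<String> names = new ArrayList<>();
        String file = bundleName + ".properties";
        names.add(packageName + "." + file);
        names.add(packageName.replace('.', '/') + "/" + file);
        return names;
    }

    public static void main(String[] argv) {
        for (String name : candidates("org.crosswire.common.util", "MyResources_de_CH")) {
            System.out.println(name);
        }
    }
}
```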
The current implementation uses MsgBase, LucidException and EventException
to represent the translations. (I have supplied an additional mechanism,
ActionFactory.) LogicError does not internationalize its messages.
EventException does its lookup in a resource bundle called Exception. There
are a few problems with the implementation (it is not a big deal, since it
is only used once):
1) It assumes that the message passed to it is a key in the resource
bundle.
2) This resource does not exist.
3) This resource is required to be in location 1a), the same package as
the class.
This means that every change to the program requires.
MsgBase is designed to be subclassed in every package that has strings that
need to be externalized. By convention the derived class is always called
Msg. MsgBase extends Apache's Enum, and each Msg is a member of the enum.
The ResourceBundle lookup is for Msg, which finds Msg.class and loads it as
the resource itself.
There are a few problems/weaknesses with this implementation.
1) The resource is required to be in the same package as the derived Msg
   class.
2) Access is protected for the MsgBase constructor and private for each
   derived Msg class. One cannot create a Msg_de.java (which probably is
   a good thing). This means that translations must be put into property
   files in the same package as the derived Msg class.
3) The Msg objects are enumerated explicitly as Msg objects and the
   literal text of the Msg is used as a key to look up the resource.
   a) If a spelling error or some other change happens to the text,
      every property file with its translation needs to be modified.
   b) The keys have spaces in them and need to be escaped:
         I\ think\ that\ it\ looks\ bad=Don't you?
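
Point 3b) can be checked with java.util.Properties directly. The sketch below loads the escaped line above and an unescaped variant from in-memory strings; the class name is made up for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

class EscapedKeySketch {
    // Parse properties-file text held in a string.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props;
    }

    public static void main(String[] argv) {
        // With escaped spaces, the whole sentence is the key.
        Properties escaped = load("I\\ think\\ that\\ it\\ looks\\ bad=Don't you?\n");
        System.out.println(escaped.getProperty("I think that it looks bad"));

        // Without escapes, the key ends at the first space.
        Properties unescaped = load("I think that it looks bad=Don't you?\n");
        System.out.println(unescaped.getProperty("I"));
    }
}
```

Without the backslashes the key silently becomes "I" and the rest of the line becomes the value, which is an easy mistake for a translator to make.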
Here are the changes that I think would solve these problems:
1) Keys are independent from their messages and are of the form:
       public class MsgKey extends Enum
   and
       public class MsgKeyImpl extends MsgKey
   (It does not matter to me what the class names are. I always struggle
   with finding good class names.)
2) Resources are allowed to be in all the locations but will be held in
   property files just like the others in resource. This can be done with
   the new CWClassLoader.
3) MsgBase derives from Object and uses MsgKey to do the lookup. With a
   little bit of magic it does not require init to do the resource loading:
       ResourceBundle resources =
           ResourceBundle.getBundle(
               getClass().getName(), // load the derived class's resources by class name
               Locale.getDefault(),  // for the user's locale
               new CWClassLoader()); // looking in all the right places
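
To show what "keys independent from their messages" could look like, here is a minimal sketch. It uses a plain Java enum as a stand-in for the Apache Enum, an in-memory bundle instead of CWClassLoader, and invented key names; the fallback to the key id is my assumption, not existing JSword behaviour.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.MissingResourceException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

class MsgKeySketch {
    // Hypothetical keys. The ids contain no spaces and never change
    // when the wording of a message is corrected.
    enum Key {
        BOOK_NOT_FOUND("book.not.found"),
        INDEX_FAILED("index.failed");

        final String id;

        Key(String id) {
            this.id = id;
        }
    }

    // Build a bundle from properties text held in a string.
    static ResourceBundle bundleOf(String properties) {
        try {
            return new PropertyResourceBundle(new StringReader(properties));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Look up the text for a key; fall back to the key id if it is missing.
    static String text(ResourceBundle bundle, Key key) {
        try {
            return bundle.getString(key.id);
        } catch (MissingResourceException e) {
            return key.id;
        }
    }

    public static void main(String[] argv) {
        ResourceBundle de = bundleOf("book.not.found=Buch nicht gefunden\n");
        System.out.println(text(de, Key.BOOK_NOT_FOUND));
        System.out.println(text(de, Key.INDEX_FAILED));
    }
}
```

With this arrangement a spelling fix in the English text touches one property file, and no translation has to be re-keyed.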
EventException and LogicError would be renamed to LucidRuntimeException and
LucidError, and the common code would be factored into static methods in
LucidException or a new class, LucidUtil. Since we are at Java 1.4, I suggest
a further change to use the chaining of throwables in these classes.
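
For reference, throwable chaining as introduced in Java 1.4 looks roughly like the sketch below. SketchException is a hypothetical stand-in, not the real LucidException, and the messages are invented.

```java
import java.io.IOException;

class LucidChainingSketch {
    // Hypothetical stand-in for a checked exception such as LucidException.
    static class SketchException extends Exception {
        SketchException(String message, Throwable cause) {
            // Throwable(String, Throwable) records the cause (since Java 1.4).
            super(message, cause);
        }
    }

    // Simulates a low-level failure and rethrows it with the cause attached.
    static void readIndex() throws SketchException {
        try {
            throw new IOException("disk unreadable");
        } catch (IOException e) {
            throw new SketchException("Could not read the index", e);
        }
    }

    public static void main(String[] argv) {
        try {
            readIndex();
        } catch (SketchException e) {
            System.out.println(e.getMessage() + " (caused by: " + e.getCause().getMessage() + ")");
        }
    }
}
```

Passing the cause to the superclass constructor means the original stack trace is kept, and the exception class no longer needs its own field for the nested exception.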
I think that these classes don't need to do the i18n themselves. This can be
done by the class that creates and throws the new lucid exception, using its
own Msg class. If we want to separate error messages from other messages, we
could have a convention:
    class ExceptionMsg extends MsgBase
in each pkg using lucid exceptions.
The advantage of reusing MsgBase is that we reuse code.