[sword-devel] Greek dictionary - input needed
Chris Little
chrislit at crosswire.org
Tue Jan 20 22:29:18 MST 2009
Ben Morgan wrote:
> On Wed, Jan 21, 2009 at 12:41 AM, DM Smith <dmsmith555 at yahoo.com
> <mailto:dmsmith555 at yahoo.com>> wrote:
>
>
> ICU has the notion of a collation key, which can be used for such a
> purpose. (I think we've gotten to the point where ICU is a
> requirement for UTF-8 modules.) In ICU, the collation key is locale
> dependent. (For example, Germans sort accent marks differently than
> French. In Spanish dictionaries, at least older ones, ch come before
> ca.) I really don't see any way around having a static collation for
> a module. If so, the collation would need to be fixed wrt either a
> fixed locale or a locale based upon the language of the module.
DM's suggestion (not merely the part pertaining to ICU) sounds good to
me. It does represent a rather radical change since it's a proposal for
a whole new driver type, but that might be what we need in order to get
the kind of flexibility we need going forward.
> ICU is not a requirement for using UTF-8 modules; rather than use ICU,
> most frontends (certainly BPBible, GnomeSword, BibleTime and I think
> MacSword as well) have defined their own string manager code (generally
> using the platform - qt, glib or python).
DM is really correct that we're coming to the point where ICU is going
to be a necessity for app i18n/l10n. ICU provides up-to-date collation
and normalization facilities that are a necessity for correctly managing
Unicode data in anything other than a braindead manner (like our
byte-ordered LD entries currently are). Searching, including functions
like accent normalization and correct case folding, isn't possible
without a certain level of Unicode knowledge within the app. And when we
actually think about doing lookup via transliteration (something every
other piece of professional Bible software handles) we can either go to
the effort of rolling our own transliteration facility or use the
ready-made one provided in ICU (as Logos does).
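
To make the byte-order problem concrete, here is a minimal sketch in Python using only the standard unicodedata module. It is not ICU and not a real locale-aware collation. The helper name fold_key and the three sample entries are assumptions made up for illustration. It only shows that sorting on raw UTF-8 bytes pushes a polytonic alpha past omega, while sorting on an accent-stripped, case-folded key does not.

```python
import unicodedata

# Sample dictionary keys, written as escapes so the code points are unambiguous.
AGAPE = "\u1f00\u03b3\u03ac\u03c0\u03b7"  # agape, starting with U+1F00 (alpha with psili)
BIOS = "\u03b2\u03af\u03bf\u03c2"         # bios, ending in final sigma
ODE = "\u03c9\u03b4\u03ae"                # ode, starting with omega


def fold_key(text):
    """Hypothetical helper: a rough accent- and case-insensitive key.

    This is an approximation for illustration, not a substitute for a
    real collation key such as the one ICU produces.
    """
    # Decompose so that accents and breathings become combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop nonspacing marks (general category Mn), then case-fold.
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    return stripped.casefold()


entries = [ODE, AGAPE, BIOS]

# Byte order of the UTF-8 encoding, i.e. plain code point order.
byte_sorted = sorted(entries, key=lambda s: s.encode("utf-8"))

# Order based on the folded key.
key_sorted = sorted(entries, key=fold_key)
```

Here byte_sorted puts the alpha entry last, because U+1F00 lies above the basic Greek block, while key_sorted puts it first. The same folded key could also serve as a crude search key, though it knows nothing about locale-specific rules.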
MacSword may be exempt from needing ICU for a while, as would any other
MacOS or iPhone program, for the simple fact that much of ICU's
functionality should be available through platform APIs. That's because
Apple has included ICU on both of these platforms, though it won't ever
be the most recent release and may lack some data.
> Personally, BPBible doesn't use ICU for two reasons - the extra size for
> ICU and the transliterators it supplies. When compiling with ICU, it
> adds transliteration filters, which are really buggy - crashes, mixed up
> xml, etc.
The extra download size added by ICU data is 3 MB, less than the size of
2 Bibles. In 2009, I can't see anyone complaining about a 3 MB increase
in download size. Even PDAs and cell phones are shipping with gigs of
memory.
Regarding stability of the transliterators, I've just disabled all but
the primary Latin transliterators, which should eliminate most problems.
If problems remain, please let us know (preferably via the bug tracker).
We can add some of the other Latin-oriented transliterators back at a
later date, once we've checked them and established their stability.
Put simply, complete i18n and l10n of Sword and Sword frontends aren't
within our reach without ICU.
--Chris