[osis-core] Problem with character counting in grains
Harry Plantinga
osis-core@bibletechnologieswg.org
Mon, 17 Jun 2002 10:50:43 -0400
Character counting for grains can be problematic. E.g. there
are Greek characters with accents and breathing marks that may be
represented with one, two, three, or more Unicode characters:
e.g. iota + acute accent + rough breathing characters, or a single
iota-acute-rough character. So it's not possible to tell just
by looking at some text how many characters are used in the
underlying representation.
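To make the ambiguity concrete, here is a minimal Python sketch. The names FORMS and code_points are made up for illustration and are not part of any OSIS proposal. The combining marks are written in the order that Unicode canonical decomposition produces (breathing first, then accent), which is an assumption about how such text would typically be stored.

```python
# Three spellings of the same visible letter:
# Greek small iota with rough breathing and acute accent.
# FORMS and code_points are hypothetical names used only for this example.
FORMS = {
    # one precomposed code point
    "precomposed": "\u1f35",
    # iota-with-rough-breathing + combining acute accent
    "partly composed": "\u1f31\u0301",
    # iota + combining rough breathing + combining acute accent
    "fully decomposed": "\u03b9\u0314\u0301",
}

def code_points(text):
    """List the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]

for label, text in FORMS.items():
    # len() counts code points, so the same letter yields different counts
    print(label, len(text), code_points(text))
```

All three strings should render identically, yet a naive count gives a different number for each.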
Do we count characters after normalizing the representation, or
ignore a certain list of accents and modifiers, or do something
else?
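The two options in the question could be sketched as follows. This is only an illustration under stated assumptions: "normalizing" is taken to mean Unicode NFC, and "a certain list of accents and modifiers" is approximated by every character in a Unicode Mark category. The function names are hypothetical.

```python
import unicodedata

def count_normalized(text):
    """Option 1 (assumed to mean NFC): count code points after composing."""
    return len(unicodedata.normalize("NFC", text))

def count_ignoring_marks(text):
    """Option 2 (assumed list = all Unicode marks): decompose, then skip
    every character whose general category starts with 'M'."""
    decomposed = unicodedata.normalize("NFD", text)
    return sum(
        1 for ch in decomposed
        if not unicodedata.category(ch).startswith("M")
    )

# The same three-letter Greek word, stored two ways.
precomposed = "\u1f35\u03bd\u03b1"              # 3 code points
decomposed = "\u03b9\u0314\u0301\u03bd\u03b1"   # 5 code points

for text in (precomposed, decomposed):
    print(len(text), count_normalized(text), count_ignoring_marks(text))
```

For this word both policies agree on 3 regardless of the stored form. They can disagree elsewhere: NFC only collapses a letter and its marks when a precomposed code point exists, so a letter with an unusual combination of marks would still count as more than one under option 1 but as one under option 2.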
-Harry