[osis-core] Problem with character counting in grains

Harry Plantinga osis-core@bibletechnologieswg.org
Mon, 17 Jun 2002 10:50:43 -0400


Character counting for grains can be problematic.  E.g. there
are Greek characters with accents and breathing marks that may be
represented with one, two, three, or more Unicode code points
(e.g. iota + acute accent + rough breathing as separate combining
characters, or a single precomposed iota-acute-rough character).
So it's not possible to tell just by looking at some text how many
characters are used in the underlying representation.
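
A rough illustration of the problem, as a Python sketch using the
standard unicodedata module. The code points are my own choice for
the example, and the combining marks are given in the order that
canonical decomposition produces (breathing before accent):

```python
import unicodedata

# Three encodings of the same visible letter: iota with rough
# breathing and acute accent.
forms = [
    "\u1f35",              # one precomposed code point
    "\u1f31\u0301",        # iota-with-rough-breathing + combining acute
    "\u03b9\u0314\u0301",  # iota + combining rough breathing + combining acute
]

# Number of code points in each form: 1, 2 and 3.
lengths = [len(f) for f in forms]

# NFC normalization maps all three forms to the same string.
normalized = {unicodedata.normalize("NFC", f) for f in forms}
```

All three strings render identically, yet a naive character count
gives a different answer for each.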

Do we count characters after normalizing the representation, or
ignore a certain list of accents and modifiers, or something else?
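
To make the two options concrete, here is a minimal Python sketch.
The function names are hypothetical, and "accents and modifiers" is
approximated as any code point with a non-zero canonical combining
class, which is an assumption rather than an agreed list:

```python
import unicodedata

def count_normalized(text):
    """Option 1: count code points after NFC normalization."""
    return len(unicodedata.normalize("NFC", text))

def count_base(text):
    """Option 2: decompose, then skip combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return sum(1 for ch in decomposed if not unicodedata.combining(ch))

# Greek "hina" in fully decomposed form: 5 code points, 3 letters.
word = "\u03b9\u0314\u0301\u03bd\u03b1"
```

Both functions return 3 for this word, but they can disagree: a
letter plus accent with no precomposed form, such as "q" followed
by a combining acute, counts as 2 under option 1 and 1 under
option 2.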

-Harry