[sword-devel] conf utf-8
DM Smith
dmsmith555 at yahoo.com
Mon Feb 14 05:22:32 MST 2005
UTF-16 has big- and little-endian byte orderings; UTF-8 itself has only one.
If there is no byte order mark (BOM), the reader has to assume a particular
byte ordering (either little-endian or big-endian).
If there is a BOM, then it can be inspected and the text can be
interpreted in either fashion.
Even so, I think that it would be best to settle upon a particular byte
ordering.
Windows does it backward from the rest of the world.
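
To make the BOM check concrete, here is a rough sketch in Python (not SWORD's
actual .conf reader). The names strip_bom and read_conf_text are made up for
illustration, and treating a file with no BOM as UTF-8 is only an assumption.
It looks for the encoded form of U+FEFF at the start of the data, picks the
encoding from it, and throws the mark away:

```python
import codecs

# (BOM bytes, encoding name). Hypothetical helper table, not SWORD code.
# UTF-32 is deliberately left out of this sketch; its little-endian BOM
# starts with the same two bytes as the UTF-16 little-endian BOM.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
]


def strip_bom(data: bytes, default: str = "utf-8"):
    """Return (encoding, payload) with any leading BOM removed.

    If no BOM is present, fall back to `default` (an assumption here).
    """
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return encoding, data[len(bom):]
    return default, data


def read_conf_text(data: bytes) -> str:
    """Decode raw .conf bytes to text, discarding a leading BOM if any."""
    encoding, payload = strip_bom(data)
    return payload.decode(encoding)
```

With something like this in front of the parser, a file that starts with
EF BB BF followed by [Section Name] would be handed on as plain
[Section Name], so the section header is recognized again.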
Chris Little wrote:
>
>
> Troy A. Griffitts wrote:
>
>> My guess about the characters which keep the .conf file from
>> being recognized... try adding a few newlines to the beginning of the
>> file. I would guess that XXX[Section Name] at the beginning is just
>> causing our .conf reader to not recognize the "Section Name".
>
>
> The three characters are the Unicode byte-order mark (BOM). See
> http://www.unicode.org/faq/utf_bom.html#BOM for full details. But,
> basically, it's the codepoint U+FEFF, encoded at the beginning of a
> file. From this character, you can tell whether you have UTF-16
> big-endian, UTF-16 little-endian, or UTF-8.
>
> I would recommend we go ahead and support it (to the extent that we
> check for it and throw it away) since it's not something that just
> Notepad adds to files. (No need to fix before the trip, though, I think.)
>
> --Chris
>