# *******************************************************************************
# *
# *   Copyright (C) 1995-2001, International Business Machines
# *   Corporation and others.  All Rights Reserved.
# *
# *******************************************************************************

# IMPORTANT NOTE
#
# This file is not read directly by ICU. If you change it, you need to
# run gencnval, and eventually pkgdata to update the representation that
# ICU uses for aliases.

# This is an alias file used by the character set converter.
#
# Format:
#
#     Actual file name || Algorithm name     alias1 alias2 ...
#
# Names are case-insensitive, except for column 1 (file names). Names are
# separated by whitespace.
#
# All names can be tagged by including a space-separated list of tags in
# curly braces, as in ISO_8859-1:1987{IANA} iso-8859-1 { MIME } or
# some-charset{MIME IANA}. The order of tags does not matter, and
# whitespace is allowed between the tagged name and the tags list.
#
# The tags can be used to get standard names using ucnv_getStandardName().
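#
# As a rough sketch in C (variable names are illustrative only, and error
# handling is omitted), looking up the MIME-tagged name for an alias:
#
#     UErrorCode errorCode=U_ZERO_ERROR;
#     const char *mimeName=ucnv_getStandardName("ibm-819", "MIME", &errorCode);
#     /* mimeName is NULL if the converter has no name tagged with MIME */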
#
# Here is a list of tags used in this file:
#
# IANA          The IANA charset name, as documented in RFC 1700.
# MIME          The MIME charset name, used for content type tagging. 

# The world is getting more complicated...
# To support XML parsers, HTML, MIME, and similar applications
# that mark encodings with unique charset names, this table has to
# be much more static than before.

# This means that a new encoding that differs from an
# old one by changing a code point, e.g., to the Euro sign,
# must not get an old alias, because old files with this
# alias would then be interpreted differently.

# If an encoding gets updated by assigning characters to previously
# unassigned code points, then a new name is not necessary.
# Also, some codepages map unassigned codepage byte values
# to the same numbers in Unicode for roundtripping. It may be
# industry practice to keep the encoding name in such a case, too
# (example: Windows codepages).

# In particular, the aliases in the list of character sets
# maintained by the IANA (http://www.iana.org/) must
# not be changed to mean encodings different from what that
# list shows.
# Currently, the IANA list is at
# http://www.isi.edu/in-notes/iana/assignments/character-sets

# Name matching is case-insensitive. Also, dashes '-', underscores '_'
# and spaces ' ' are ignored in names (thus cs-iso-latin-1 and csisolatin1
# are the same).
# However, the names in the left column are used directly as file names
# or names of algorithmic converters, and their case must not
# be changed, or else code and/or file names must also be changed.
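#
# A rough sketch in C of the matching rule above, assuming ucnv_compareNames()
# from unicode/ucnv.h is available in the ICU version in use:
#
#     /* the result is 0 when the two names are considered the same */
#     int result=ucnv_compareNames("cs-iso-latin-1", "csISOLatin1");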

# Fully algorithmic converters

UTF-8 { MIME }           ibm-1208 cp1208 
UTF-16BE { MIME }        UTF16_BigEndian x-utf-16be
UTF-16LE { MIME }        UTF16_LittleEndian x-utf-16le

# The ICU UTF-16 converter uses the current platform's endianness.
# It does not autodetect endianness from a BOM.
UTF-16 { MIME }          UTF16_PlatformEndian ISO-10646-UCS-2 { IANA } csUnicode ibm-17584 ibm-13488 ibm-1200 cp1200 ucs-2
UTF16_OppositeEndian

UTF-32BE { MIME }        UTF32_BigEndian
UTF-32LE { MIME }        UTF32_LittleEndian

# The ICU UTF-32 converter uses the current platform's endianness.
# It does not autodetect endianness from a BOM.
UTF-32 { MIME }          UTF32_PlatformEndian ISO-10646-UCS-4 { IANA } csUCS4 ucs-4 ibm-1232
UTF32_OppositeEndian

UTF-7 { IANA MIME }
# On UTF-7:
# RFC 2152 (http://www.imc.org/rfc2152) allows some US-ASCII characters
# to be encoded either directly or in base64. In particular, the characters in set O
# as defined in the RFC (!"#$%&*;<=>@[]^_`{|}) may be encoded directly but are not
# allowed in, e.g., email headers.
# By default, the ICU UTF-7 converter encodes set O directly.
# With the option "version=1", set O is escaped instead.
# For example:
#     utf7Converter=ucnv_open("UTF-7,version=1", &errorCode);

SCSU { IANA }

ISO-8859-1 { MIME }              LATIN_1 ibm-819 cp819 latin1 8859-1 csisolatin1 iso-ir-100 cp367 ISO_8859-1:1987 { IANA } l1 ANSI_X3.110-1983   819 # There is a whole lot of names for this
US-ASCII { MIME }                ascii ascii-7 ANSI_X3.4-1968 { IANA } ANSI_X3.4-1986 ISO_646.irv:1991 iso646-us us csASCII 646 iso-ir-6