Viewing the converter details

After you select a converter, click its link to see all of the details about that converter.

The following table describes the details that are listed on that page:

| Converter detail | Description |
|---|---|
| Type of converter | The internal converter implementation used. The ucnv_getType() API returns this value. |
| Minimum number of bytes | The minimum number of bytes that this encoding requires for a character. |
| Maximum number of bytes | The maximum number of bytes that this encoding requires for a character. |
| Substitution character | The byte sequence that the converter uses when it encounters an unmappable Unicode character. |
| Is ASCII [\x20-\x7E] compatible? | Indicates whether the byte range \x20 to \x7E is compatible with ASCII. For example, some codepages map the ASCII backslash \x5C to the yen sign \u00A5. Some codepages require special shift bytes to display the ASCII range, and ASCII does not know about codepage shifting or escape sequences. Some codepages are EBCDIC based. Only the range [\x20-\x7E] is used for this comparison, because some ISO controls are rotated and most users are interested in the graphical interpretation of ASCII. |
| Is ASCII [\u0020-\u007E] ambiguous? | The value that ucnv_isAmbiguous() returns. When this value is TRUE, it usually implies that this codepage is not ASCII compatible and that an ASCII-compatible codepage is available. |
| Contains ambiguous aliases? | Indicates that at least one of the aliases for this converter also belongs to a different converter. |
| Converters with conflicting aliases | If any converters have conflicting aliases, this is the list of those converters with their conflicting alias and standard. Take care when you use any alias on this list without specifying a standard in the ICU conversion API. This information can usually be queried with ucnv_getCanonicalName() or ucnv_getStandardName(). |
| Always generates Unicode NFC? | When this value is TRUE, a conversion from this codepage to Unicode always generates Unicode in Normalization Form Composed (NFC). When this value is UNKNOWN, the converter might generate Unicode text that is not in NFC, depending on the input. In that case, applying an NFC transformation could change the original text. This value is derived by creating a Unicode set with the value [[:NFC_Quick_Check=yes:]&[:ccc=0:]] and then confirming that it is a full superset of the codepage's Unicode set. More details about Unicode normalization can be found in Unicode Standard Annex #15. |
| Contains BiDi characters? | When this value is TRUE, a conversion to or from this codepage can contain bidirectional characters. These are right-to-left characters, for example, Hebrew and Arabic characters. When you display data from this codepage, you might need to apply the BiDi algorithm described in Unicode Standard Annex #9. |
| List of languages representable by this codepage | The languages that this codepage can represent. This data comes from ucnv_getUnicodeSet() and ulocdata_getExemplarSet(). A language is listed only when the UnicodeSet returned for that language is a complete subset of the set for the given codepage. The list of candidate languages comes from uloc_getAvailable(). |
| Set of Unicode characters representable by this codepage | The Unicode characters that this codepage can represent. For example, if text data is written as bytes of the codepage and then converted to Unicode, this is the set of Unicode characters that the conversion can produce. This set can contain multi-codepoint characters. This data comes from ucnv_getUnicodeSet(). |
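
Several of the details above are derived from the set of characters that a codepage can represent, and the idea can be approximated outside ICU. The following is a minimal sketch, not the ICU implementation. It assumes a single-byte codepage and uses Python's standard codecs and the unicodedata module as stand-ins for ucnv_getUnicodeSet() and the Unicode property sets. The function names (codepage_chars, is_ascii_compatible, contains_bidi, always_nfc) are hypothetical, and Python's codec tables can differ from ICU's mapping tables, so results might not match the page exactly.

```python
import unicodedata


def codepage_chars(codec):
    """Characters obtained by decoding each single byte 0x00-0xFF.

    Assumption: `codec` names a single-byte codepage known to Python.
    Bytes that do not decode on their own are skipped.
    """
    chars = set()
    for b in range(256):
        try:
            ch = bytes([b]).decode(codec)
        except UnicodeDecodeError:
            continue  # unassigned byte, or a lead byte of a longer sequence
        if len(ch) == 1:  # keep only one-byte-to-one-character mappings
            chars.add(ch)
    return chars


def is_ascii_compatible(codec):
    """True when bytes 0x20-0x7E decode to the same ASCII code points."""
    for b in range(0x20, 0x7F):
        try:
            if bytes([b]).decode(codec) != chr(b):
                return False
        except UnicodeDecodeError:
            return False
    return True


def contains_bidi(codec):
    """True when any character has a right-to-left class (R or AL)."""
    return any(unicodedata.bidirectional(c) in ("R", "AL")
               for c in codepage_chars(codec))


def always_nfc(codec):
    """Approximate the [[:NFC_Quick_Check=yes:]&[:ccc=0:]] superset test.

    unicodedata has no NFC_Quick_Check accessor, so this checks that each
    character has combining class 0 and is unchanged by NFC on its own
    (is_normalized requires Python 3.8+). Characters with
    NFC_Quick_Check=maybe and ccc=0 pass this looser test.
    """
    return all(unicodedata.combining(c) == 0
               and unicodedata.is_normalized("NFC", c)
               for c in codepage_chars(codec))
```

For example, under these assumptions is_ascii_compatible("cp037") is False because cp037 is EBCDIC based, contains_bidi("cp1255") is True because that codepage contains Hebrew letters, and always_nfc("cp1255") is False because it also contains Hebrew combining points.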