Internationalization encoding functionality

The internationalization encoding functionality is transparent for installations that use only 7-bit ASCII characters within scripts and other data sources. It works correctly without any user coding changes, because the UTF-8 representation of a 7-bit ASCII value is identical to its ASCII representation.
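The equivalence can be checked directly in a Tcl shell. The following is a minimal sketch, not part of the product: the variable names and sample strings are chosen for illustration only. It converts a pure ASCII string to both encodings and compares the resulting bytes, then shows that a character above the ASCII range no longer fits in one byte under UTF-8.

```tcl
# Sample string containing only 7-bit ASCII characters (illustrative).
set text "Hello, Tcl"

# Convert the same string to external bytes in both encodings.
set asUtf8  [encoding convertto utf-8 $text]
set asAscii [encoding convertto ascii $text]

# For 7-bit ASCII input the two byte sequences are identical.
puts [expr {$asUtf8 eq $asAscii}]          ;# prints 1
puts [string length $asUtf8]               ;# prints 10 (one byte per character)

# A character above the ASCII range (U+00E9) needs two bytes in UTF-8.
set eAcute \u00e9
puts [string length [encoding convertto utf-8 $eAcute]]   ;# prints 2
```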

Users whose scripts or data sources contain single-byte values above the ASCII range (that is, ordinal character values in the range 128-255) can modify their Tcl processing to be "encoding aware". These modifications take advantage of the Tcl encoding conversion functionality and prevent unintended default Tcl encoding conversions.

If the encoding of the input data is not the same as the system encoding, and the correct encoding is not configured for the channel, then input/output has the potential to corrupt the data. This is especially true for default code pages that have unassigned values.
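As an illustration of configuring a channel, the sketch below writes four raw bytes to a scratch file and reads them back with the channel encoding set explicitly through the Tcl fconfigure command. The file name encoding_demo.txt and the choice of ISO 8859-1 as the data encoding are assumptions made for this example; substitute the encoding that the data source actually uses.

```tcl
# Hypothetical scratch file in the current directory (illustrative name).
set path [file join [pwd] encoding_demo.txt]

# Write the raw bytes 0x63 0x61 0x66 0xE9 ("caf" plus e-acute in ISO 8859-1)
# with no encoding conversion applied on output.
set out [open $path w]
fconfigure $out -translation binary
puts -nonewline $out [binary format c* {99 97 102 233}]
close $out

# Read the file back, telling Tcl how the bytes are encoded. Without this
# step the channel would use its default encoding, which may not match the data.
set in [open $path r]
fconfigure $in -encoding iso8859-1
set text [read $in]
close $in
file delete $path

puts [string length $text]                 ;# prints 4 (four characters)
puts [scan [string index $text 3] %c]      ;# prints 233 (U+00E9)
```

The command encoding system reports the system encoding, which can be compared against the known encoding of the data to decide whether an explicit setting is needed.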

Encodings can be specified for a channel with the Tcl fconfigure command. Explicit encoding operations are performed on strings using the encoding convertfrom and encoding convertto commands. These commands are useful in cases where the default conversion of external data to UTF-8 does not achieve the expected result.
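The explicit string conversions can be sketched as follows. This is an illustrative example under the assumption that the external bytes are in ISO 8859-1; the variable names are not taken from the product. The encoding convertfrom command turns external bytes into a Tcl string, and encoding convertto produces the external byte sequence for a chosen encoding.

```tcl
# Raw external bytes: "caf" followed by 0xE9 (e-acute in ISO 8859-1).
set raw [binary format c* {99 97 102 233}]

# Interpret the bytes as ISO 8859-1 to obtain a Tcl string.
set text [encoding convertfrom iso8859-1 $raw]
puts [string length $text]                 ;# prints 4

# Produce the UTF-8 byte sequence for the same string.
set utf8 [encoding convertto utf-8 $text]
puts [string length $utf8]                 ;# prints 5 (e-acute takes two bytes)

# Show the UTF-8 bytes in hexadecimal.
binary scan $utf8 H* hex
puts $hex                                  ;# prints 636166c3a9
```

The command encoding names lists the encodings available to convertfrom and convertto in a given Tcl installation.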