Non-ASCII data converted to UTF-8
Existing configuration files containing non-ASCII data (byte values above 7F hex) are converted to UTF-8.
A site upgrade utility automatically converts standard configuration files. A separate binary-to-UTF-8 conversion utility lets you convert any non-standard configuration files.
These conversion utilities first scan input files:
- If a file contains only ASCII, then no conversion is required.
- If a file contains byte values from 80-FF hex and is not valid UTF-8, then it is converted.
- If a file contains byte values from 80-FF hex and is valid UTF-8, then it has probably already been converted. In this case, the utility prompts: "file xxx is already UTF-8, do you wish to convert it anyway?" You can then convert the file or skip it. This check reduces the risk of running the conversion twice and double-converting files.
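The scan rules above can be sketched as follows. This is only an illustration, not the actual utility: the names `ScanResult`, `classify`, and `convert_to_utf8` are hypothetical, and the sketch assumes that the "binary" source data maps each byte to the Unicode code point of the same value (ISO-8859-1), which the text above does not confirm.

```python
from enum import Enum


class ScanResult(Enum):
    ASCII_ONLY = "no conversion required"
    NEEDS_CONVERSION = "convert to UTF-8"
    ALREADY_UTF8 = "probably already converted; ask before converting"


def classify(data: bytes) -> ScanResult:
    """Apply the three scan rules to the raw bytes of one file."""
    # Rule 1: only byte values 00-7F hex means pure ASCII.
    if all(b < 0x80 for b in data):
        return ScanResult.ASCII_ONLY
    # Rules 2 and 3: bytes in 80-FF hex are present, so test UTF-8 validity.
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return ScanResult.NEEDS_CONVERSION
    return ScanResult.ALREADY_UTF8


def convert_to_utf8(data: bytes, source_encoding: str = "latin-1") -> bytes:
    """Re-encode bytes as UTF-8. The source encoding is an assumption."""
    return data.decode(source_encoding).encode("utf-8")


if __name__ == "__main__":
    raw = b"caf\xe9"                        # "café" in ISO-8859-1
    print(classify(raw))                    # needs conversion
    once = convert_to_utf8(raw)
    print(classify(once))                   # now valid UTF-8
    # Converting a second time corrupts the text, which is why
    # the utility asks before converting a file that is valid UTF-8.
    twice = convert_to_utf8(once)
    print(once, twice)
```

Under these assumptions, the byte E9 hex becomes C3 A9 after one conversion and C3 83 C2 A9 after a second one, which shows the double-conversion damage that the prompt is meant to prevent.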
These configuration files are automatically converted:
- NetConfig
- tclprocs/*
- Tables/*
- Xlate/*
XML processing behaves differently after the upgrade. In previous versions, a dynamic encoding conversion was performed during XML-to-XML translation if the encoding was changed by modifying ?xml.&encoding. This conversion no longer happens because all data is now normalized internally to Unicode. Upgraded protocol threads use binary encoding, so no conversion takes place in the threads. This can result in different behavior for sites that use an XML-to-XML translation where the encoding is altered in translation.
To compensate for this behavior change, manually specify the XML encoding mode for the inbound and outbound threads in this type of interface. To locate the affected interfaces, you can use a utility script that looks for translate files that have XML as both the input and output format, and then finds the NetConfig routes that invoke these translations. You can then choose what action to take. Such a utility is helpful but not required.