Configuration tips
- If garbage characters or question marks are displayed in place of non-English characters, then generally there is an encoding mismatch.
- If the codepage does not contain a character, then that character is displayed as a question mark, for example, Chinese Big5 characters on a U.S. workstation.
- Beware of cp437 on Windows as it is the DOS ASCII/Graphics codepage and some of the graphics glyphs are similar to standard symbols. cp437 is the default codepage for the Command window on many Windows locales.
- In the system, the Java IDE on Windows only reads and writes in the non-Unicode locale.
- Tcl defaults to UTF-8 for file operations and is used in the engine and testing routines to read and write configuration files. As a result, there is a mismatch between the IDE and Tcl.
- If using any of the standard I/O channels in hcitcl, hciwish, or for debugging a procedure, then check the encoding to ensure it is what you expect.
- Use caution with editors, for example, Notepad, that save files in UTF-8 but put the Byte Order Mark at the front of the file in the first three bytes. It might be necessary to remove these using a binary editor if the file is read by the engine or testing routines.
- Note that Tcl has two string-length functions. string length returns the character length and string bytelength returns the byte length of the string. They are not the same whenever the string contains non-ASCII characters.
- Tcl strings are not C strings. Tcl encodes a null character, a byte with a value of zero, into a two-byte UTF-8 encoded Unicode code point. Because of this, \x0, or any form with or without quotes, is not an empty string. Instead, it is a string with character length one and byte length two. Use "" for an empty string, quotes without any content, in commands such as msgget -cvtnull.
- Configuration files, for example, NetConfig, should use the UTF-8 encoding, not the ANSI locale codepage. This is necessary when the engine employs Tcl to input and decode the contents of the file.
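Several of the tips above can be reproduced in a few lines of Java. The following is a minimal sketch, not code from the product: the class EncodingTips and its method names are hypothetical. It shows an encoding mismatch, a character missing from the target codepage, the difference between character length and byte length, the two-byte form of a null character, and removal of a UTF-8 Byte Order Mark. The null-character case uses the modified UTF-8 written by Java's DataOutputStream.writeUTF as an analogy for the Tcl behavior; it is an illustration, not Tcl itself.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative only: the class and method names are hypothetical.
class EncodingTips {

    // Encoding mismatch: text written as UTF-8 but read as ISO-8859-1.
    // Each multi-byte character turns into several unrelated characters.
    static String misread(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Missing character: a charset that cannot represent a character
    // substitutes its replacement byte, which is '?' for US-ASCII.
    static String toAscii(String text) {
        byte[] ascii = text.getBytes(StandardCharsets.US_ASCII);
        return new String(ascii, StandardCharsets.US_ASCII);
    }

    // Byte length of the UTF-8 form, as opposed to String.length().
    static int utf8ByteLength(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // Modified UTF-8, as written by DataOutputStream.writeUTF, stores the
    // null character U+0000 as the two bytes 0xC0 0x80.
    static byte[] modifiedUtf8(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = buffer.toByteArray();
        // writeUTF prefixes the data with a two-byte length; skip it.
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Remove a leading UTF-8 Byte Order Mark (EF BB BF) if one is present.
    static byte[] stripBom(byte[] data) {
        boolean hasBom = data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
        return hasBom ? Arrays.copyOfRange(data, 3, data.length) : data;
    }

    public static void main(String[] args) {
        String accented = "\u00e9"; // e with an acute accent
        System.out.println("chars=" + accented.length()
                + " bytes=" + utf8ByteLength(accented));
        System.out.println("misread chars=" + misread(accented).length());
        System.out.println("ascii=" + toAscii("\u4e2d"));
        System.out.println("null bytes=" + modifiedUtf8("\u0000").length);
    }
}
```

Checking for the three BOM bytes in code, as stripBom does, is an alternative to removing them by hand in a binary editor when a file cannot be saved again without the mark.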
The IDE is implemented in Java, which invokes command-line utilities by exec for many of its testing and file functions. The command-line parameters must be passed in the locale ANSI codepage. Because Java is configured to use UTF-8 by default, the encoding is specified as ANSI when invoking exec.
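The same concern applies to the text that a command-line utility sends back. The following is a minimal sketch of launching a utility from Java and decoding its output with the locale (ANSI) charset instead of the UTF-8 default. It is not the IDE's actual code: the class ExecEncoding and its methods are hypothetical, and it assumes that the native.encoding system property (available in JDK 17 and later) names the locale charset, falling back to the JVM default charset otherwise. It covers only the output side; how the IDE encodes the parameters themselves is not shown here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: the class and method names are hypothetical.
class ExecEncoding {

    // Look up the locale charset. Assumption: JDK 17+ publishes it in the
    // "native.encoding" system property; otherwise use the JVM default.
    static Charset nativeCharset() {
        String name = System.getProperty("native.encoding");
        if (name != null) {
            try {
                return Charset.forName(name);
            } catch (IllegalArgumentException e) {
                // Unknown or malformed charset name: fall through.
            }
        }
        return Charset.defaultCharset();
    }

    // Run a command and decode its combined output with the given charset.
    static List<String> run(Charset charset, String... command)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout
        Process process = builder.start();
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        process.waitFor();
        return lines;
    }

    public static void main(String[] args) throws Exception {
        Charset charset = nativeCharset();
        System.out.println("Decoding child output as " + charset);
        if (args.length > 0) {
            for (String line : run(charset, args)) {
                System.out.println(line);
            }
        }
    }
}
```

Passing the charset explicitly to InputStreamReader keeps the decoding independent of the JVM default, so the result does not change if the default is switched to UTF-8.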