Limitations of Unicode
These are the limitations of Unicode:
- PDL start and stop bytes cannot occur within multi-byte characters.
Multi-byte refers to the encodings that are used outside of the system. Control characters meet this requirement in most encodings, so they usually do not cause a problem.
- Thread names, site names, and process names must be ASCII alphanumeric. This is due to limitations in the version of Raima that is used.
- Separator characters for VRL, HL7, X12, and similar message formats must be ASCII. Segment names must also be ASCII.
- Unicode characters are limited to the 16-bit Basic Multilingual Plane (BMP).
Unicode 3.1 was the first version to assign characters in the supplementary planes, which lie beyond the BMP and extend the code space to U+10FFFF. Tcl supports only the 16-bit code space, and Java added support for supplementary characters only in version 5. Supplementary characters cannot be used until the Tcl and Java issues are resolved.
- Object names that interact with advanced security server may be limited to ASCII.
- The FRL subfield prefix must be ASCII.
- Extended ASCII backward compatibility is limited.
European characters that have a value greater than 127 become 2-byte UTF-8 characters. Existing implementations that contain extended ASCII might therefore not be backward compatible.
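Several of the limitations above can be checked before data reaches the system. The following is a minimal sketch in Python, not part of the product. All function names are hypothetical, and the Shift-JIS and Latin-1 encodings mentioned in the comments are only assumed examples of external encodings.

```python
import re

# Hypothetical helpers for illustration only; none of these names come
# from the product.

# Explicit ASCII ranges: str.isalnum() would also accept non-ASCII letters.
_ASCII_ALNUM = re.compile(r"[A-Za-z0-9]+")


def is_ascii_alphanumeric(name: str) -> bool:
    """True if name is non-empty and uses only ASCII letters and digits."""
    return _ASCII_ALNUM.fullmatch(name) is not None


def is_bmp_only(text: str) -> bool:
    """True if every code point lies in the BMP (U+0000 to U+FFFF)."""
    return all(ord(ch) <= 0xFFFF for ch in text)


def utf8_length(ch: str) -> int:
    """Number of bytes that the character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


def delimiter_inside_multibyte(text: str, encoding: str, delimiter: int) -> bool:
    """True if the delimiter byte appears inside the encoded form of any
    multi-byte character of text in the given external encoding."""
    for ch in text:
        encoded = ch.encode(encoding)
        if len(encoded) > 1 and delimiter in encoded:
            return True
    return False
```

For example, a control character such as 0x0B does not appear inside Shift-JIS multi-byte characters, whereas the byte 0x5C can appear as the second byte of one, which is why control characters are the safer choice for start and stop bytes. Similarly, `utf8_length("é")` reports 2 bytes, although the same character occupies a single byte with a value above 127 in Latin-1.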