Limitations of Unicode

Unicode support has the following limitations:

  • PDL start and stop bytes cannot occur within multi-byte characters.

    Here, multi-byte refers to the encodings that are used outside of the system. In most encodings, control characters never appear as part of a multi-byte character, so using them as start and stop bytes is usually not a problem.

  • Thread names, site names, and process names must be ASCII alphanumeric. This is due to limitations in the Raima version that is used.
  • Separator characters for VRL, HL7, X12, and similar message formats must be ASCII. Segment names must also be ASCII.
  • Unicode characters are limited to the 16-bit Basic Multilingual Plane (BMP).

    Unicode 3.1 and later assign characters in supplementary planes beyond the BMP, with code points up to U+10FFFF. Tcl supports only the 16-bit code space, and Java added support for supplementary characters only in version 5. Characters outside the BMP cannot be used until the Tcl and Java issues are resolved.

  • Object names that interact with the advanced security server may be limited to ASCII.
  • The FRL subfield prefix must be ASCII.
  • Extended ASCII might not be backward compatible.

    European characters that have a value greater than 127 become 2-byte UTF-8 characters. Existing implementations that contain extended ASCII data might therefore not be backward compatible.
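
Several of the limitations above can be checked before data or names reach the system. The following Python sketch is illustrative only and is not part of the product. The function names, the sample names, and the byte values 0x0B, 0x1C, and 0x5C are hypothetical choices made for the example, and Latin-1 stands in for "extended ASCII".

```python
import string

# ASCII letters and digits only. str.isalnum() is not used because it
# also accepts non-ASCII letters such as "é".
ASCII_ALNUM = set(string.ascii_letters + string.digits)


def is_ascii_alphanumeric(name: str) -> bool:
    """True if name is non-empty and uses only ASCII letters and digits."""
    return bool(name) and all(ch in ASCII_ALNUM for ch in name)


def is_bmp_only(text: str) -> bool:
    """True if every code point fits in the 16-bit BMP (U+0000..U+FFFF)."""
    return all(ord(ch) <= 0xFFFF for ch in text)


def byte_is_safe_delimiter(delimiter: int, encoding: str, sample: str) -> bool:
    """True if the delimiter byte never occurs inside a multi-byte
    character of the sample text in the given external encoding."""
    for ch in sample:
        encoded = ch.encode(encoding)
        if len(encoded) > 1 and delimiter in encoded:
            return False
    return True


# Names: only ASCII letters and digits pass.
print(is_ascii_alphanumeric("site01"))   # True
print(is_ascii_alphanumeric("sité"))     # False

# BMP: U+1F600 lies in a supplementary plane.
print(is_bmp_only("é漢"))                # True
print(is_bmp_only("\U0001F600"))         # False

# Extended ASCII: one byte in Latin-1 becomes two bytes in UTF-8.
print(len("é".encode("latin-1")))        # 1
print(len("é".encode("utf-8")))          # 2

# Delimiters: a control byte is safe here, but 0x5C is the second byte
# of the Shift-JIS encoding of "ソ", so it is not a safe delimiter.
print(byte_is_safe_delimiter(0x1C, "shift_jis", "ソ"))   # True
print(byte_is_safe_delimiter(0x5C, "shift_jis", "ソ"))   # False
```

The delimiter check only covers the characters in the sample that is passed in. It shows the kind of collision the first limitation guards against and does not prove that a byte is safe for a whole encoding.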