Introduction
- The Unicode Standard is the universal character encoding standard used for representation of text for computer processing.
- The Unicode Standard provides the capacity to encode all of the characters used for the written languages of the world.
- Its basic form is a 16-bit encoding that provides code points for more than 65,000 characters (2^16 = 65,536 values).
- The Unicode Standard and ISO/IEC 10646 provide an extension mechanism called UTF-16 that allows for encoding as many as a million more characters (1,048,576 code points, each represented as a pair of 16-bit surrogate values), without the use of escape codes.
- The Unicode Standard endorses two forms that correspond to ISO/IEC 10646 transformation formats, UTF-8 and UTF-16.
- The ISO/IEC 10646 transformation formats UTF-8 and UTF-16 are essentially ways of turning the encoding into the actual bits that are used in implementation.
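
As a rough illustration of how the transformation formats turn a code point into bits, the sketch below computes the UTF-16 form by hand and prints it next to the UTF-8 bytes produced by Python's built-in encoder. The helper name `to_utf16_units` and the three sample characters are assumptions chosen for this example, not part of either standard.

```python
def to_utf16_units(code_point):
    """Return the 16-bit UTF-16 code units for one code point (illustrative helper)."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not an encodable Unicode code point")
    if code_point < 0x10000:
        # Fits in a single 16-bit unit.
        return [code_point]
    # Beyond 16 bits: split the 20-bit offset into a surrogate pair.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return [high, low]


# Sample characters (arbitrary choices): Latin A, euro sign, musical G clef.
for ch in ["A", "\u20ac", "\U0001D11E"]:
    units = " ".join(f"{u:04X}" for u in to_utf16_units(ord(ch)))
    utf8 = ch.encode("utf-8").hex(" ").upper()
    print(f"U+{ord(ch):04X}  UTF-16: {units}  UTF-8: {utf8}")
```

The surrogate pair carries 10 + 10 = 20 bits, which is where the figure of roughly a million additional characters (1,024 x 1,024 = 1,048,576) comes from. No escape code is needed because the surrogate values occupy their own reserved range and cannot be mistaken for ordinary 16-bit characters.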