
## Appendix C The ASCII and Unicode Character Sets

Java uses the Unicode character set for representing character data. Java's char type stores each character as a 16-bit unsigned integer. It can, therefore, represent $$2^{16} = 65{,}536$$ different values. This enables Java to represent characters from not only English but also a wide range of international languages. (Unicode has since grown beyond this range; Java represents the additional characters as pairs of char values.)
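The 16-bit range can be checked directly from constants in the standard `Character` class. The following is a minimal sketch; the class name `CharRange` and the helper `distinctCharValues` are illustrative names, not part of the text above.

```java
public class CharRange {
    // Number of distinct values a 16-bit char can hold: 2^16.
    static int distinctCharValues() {
        return 1 << Character.SIZE;
    }

    public static void main(String[] args) {
        System.out.println(Character.SIZE);            // number of bits in a char
        System.out.println((int) Character.MIN_VALUE); // smallest char code
        System.out.println((int) Character.MAX_VALUE); // largest char code
        System.out.println(distinctCharValues());      // count of distinct codes
    }
}
```

Casting a `char` to `int` exposes its integer code, which is why the minimum and maximum values print as numbers rather than as characters.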

Unicode supersedes the ASCII character set (American Standard Code for Information Interchange). Standard ASCII represents each character as a 7-bit unsigned integer, although extended variants use 8 bits. A 7-bit code can represent only $$2^7 = 128$$ characters. In order to make Unicode backward compatible with ASCII, the first 128 characters of Unicode have the same integer representations as the corresponding ASCII characters.
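Because the first 128 Unicode codes match ASCII, converting a Java char to an int yields the familiar ASCII value for these characters. The sketch below illustrates this; the class `AsciiCompat` and its helpers `codeOf` and `isAscii` are hypothetical names chosen for the example.

```java
public class AsciiCompat {
    // Returns the integer code of a character (char widens to int automatically).
    static int codeOf(char c) {
        return c;
    }

    // True if the character falls in the 7-bit ASCII range 0 through 127.
    static boolean isAscii(char c) {
        return c < 128;
    }

    public static void main(String[] args) {
        System.out.println(codeOf('A'));        // 65
        System.out.println(codeOf('a'));        // 97
        System.out.println(codeOf('0'));        // 48
        System.out.println((char) 66);          // B
        System.out.println(isAscii('z'));       // true
        System.out.println(isAscii('\u00e9'));  // false: e with acute accent has code 233
    }
}
```

The cast works in both directions: `(char) 66` turns an integer code back into the character it represents.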

The following table shows the integer representations for the printable subset of ASCII characters. The characters with codes 0 through 31 and code 127 are nonprintable characters, many of which are associated with keys on a standard keyboard. For example, the delete key is represented by 127, the backspace by 8, and the return key by 13.