Unicode is a worldwide character encoding standard developed to overcome the limitations of ASCII. It was first released in October 1991.

With Unicode, each character is assigned a unique number, called a code point, between U+0000 and U+10FFFF. Code points can be stored using 8-bit, 16-bit, or 32-bit units, depending on the encoding (UTF-8, UTF-16, or UTF-32). Letters, numbers, mathematical notation, popular symbols, and characters from most of the world's writing systems are assigned a code point. For example, U+0041 is the English letter “A.” Below is an example of how “Computer Hope” would be written as Unicode code points.

U+0043 U+006F U+006D U+0070 U+0075 U+0074 U+0065 U+0072 U+0020 U+0048 U+006F U+0070 U+0065
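
A listing like this can be produced programmatically. The following is a minimal Python sketch, assuming the two words are separated by an ordinary space (U+0020, as opposed to the no-break space U+00A0). The helper name `code_points` is hypothetical and not part of any library.

```python
def code_points(text: str) -> list[str]:
    """Return the U+XXXX label of every character in text."""
    # ord() gives the integer code point of a character; it is formatted
    # here as uppercase hexadecimal, zero-padded to at least four digits.
    return [f"U+{ord(ch):04X}" for ch in text]


# Print the code points of the example phrase on one line.
print(" ".join(code_points("Computer Hope")))
```

Going the other way, Python's built-in `chr()` turns a code point back into a character, so `chr(0x41)` gives “A.”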

A common Unicode encoding is UTF-8, which uses 8-bit code units and stores each code point in one to four bytes. It is often used in Linux environments to encode non-ASCII characters so they display properly when output to a text file.
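
To illustrate how the number of bytes depends on the encoding, here is a small Python sketch. The helper name `encoded_sizes` and the sample characters are illustrative assumptions. The little-endian codec names are used only so that Python does not prepend a byte order mark (BOM) to the output.

```python
def encoded_sizes(ch: str) -> dict[str, int]:
    """Return how many bytes ch occupies in three Unicode encodings."""
    # The "-le" codecs fix the byte order, so no BOM is added to the result.
    encodings = ("utf-8", "utf-16-le", "utf-32-le")
    return {name: len(ch.encode(name)) for name in encodings}


# Sample characters: "A", e with acute accent, euro sign, grinning face emoji.
for ch in ("A", "\u00e9", "\u20ac", "\U0001F600"):
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

In UTF-8, the ASCII letter takes one byte and the other samples take two, three, and four bytes respectively, while UTF-32 always uses four bytes per code point.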

ASCII, BOM, Character, Code page, Software terms, UTF

Microsoft Windows users can also find Unicode code points by running the Character Map utility.

In Microsoft Word, if you highlight a character and press the Alt+X keyboard shortcut, Word replaces the character with its Unicode code point. Pressing Alt+X again converts the code back to the character.

  • The official Unicode website.