Text in a computer or on the Web is composed of characters. Characters represent letters of the alphabet, punctuation, or other symbols.
Each character is represented in the computer or on a Web page using one or more bytes. The mapping that determines which bytes (or sequences of bytes) represent which characters is called a character encoding.
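To make this concrete, here is a minimal Python sketch (my own illustration, not part of the definition above; the character and the two encodings are just examples). It shows that the very same character can be represented by different byte sequences, depending on the encoding chosen:

```python
# The same character maps to different bytes under different encodings.
text = "é"
print(text.encode("latin-1"))  # b'\xe9'      (one byte)
print(text.encode("utf-8"))    # b'\xc3\xa9'  (two bytes)
```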
Web browsers and computer applications need to know which encoding was used for your text, so that they can correctly associate bytes with characters and produce readable text.
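You can see what goes wrong when an application guesses the encoding incorrectly. In this small Python sketch (again my own example), the same bytes are decoded once with the right encoding and once with the wrong one, producing the garbled text often called mojibake:

```python
# Decoding bytes with the wrong encoding produces unreadable text.
data = "café".encode("utf-8")  # b'caf\xc3\xa9'
print(data.decode("utf-8"))    # café   (correct encoding)
print(data.decode("latin-1"))  # cafÃ©  (wrong encoding: mojibake)
```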
In the past, different organizations assembled different sets of characters and created encodings for them – one set might cover just Latin-based Western European languages (excluding not only countries such as Bulgaria or Greece, whose languages use other scripts, but also Latin-script languages such as Turkish and Czech), another might cover a particular Far Eastern language (such as Japanese), and others were devised in a rather ad hoc way for representing some other language somewhere in the world.
Using multiple encodings to support the range of languages needed by an application is problematic, and an individual encoding may not even cover everything you need for a given language. In addition, it is usually impossible to combine different encodings on the same Web page or in a database, and so it becomes very difficult to support multilingual Web pages.
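The root of the problem is that legacy encodings assign conflicting meanings to the same byte values, so bytes from two encodings cannot safely sit side by side in one document. This Python sketch (my own illustration, using three well-known single-byte encodings) shows one byte meaning three different things:

```python
# The single byte 0xE9 is a different character in each legacy encoding,
# which is why different encodings cannot be mixed in one document.
b = bytes([0xE9])
print(b.decode("latin-1"))    # é  (Western European, ISO 8859-1)
print(b.decode("iso8859-7"))  # ι  (Greek, ISO 8859-7)
print(b.decode("koi8-r"))     # И  (Russian, KOI8-R)
```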
The Unicode Consortium provides a large, single character set, called Unicode, that aims to include all the characters needed for any writing system in the world, including ancient scripts (such as Cuneiform, Gothic and Egyptian Hieroglyphs). Unicode is now fundamental to the architecture of the Web and of operating systems, and is supported by all major web browsers and applications. The Unicode Standard also describes properties and algorithms for working with characters.
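In Unicode, every character has a single, universal number called a code point, which is independent of the bytes used to store it. This Python sketch (my own example characters) prints the code point of each character alongside its UTF-8 byte representation:

```python
# Each character has one Unicode code point; UTF-8 is just one way
# of turning that code point into bytes.
for ch in "aé語":
    print(f"U+{ord(ch):04X}", ch, ch.encode("utf-8"))
# U+0061 a b'a'
# U+00E9 é b'\xc3\xa9'
# U+8A9E 語 b'\xe8\xaa\x9e'
```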
This approach makes it much easier to deal with multilingual pages or systems, and provides much better coverage of your needs than most traditional encoding systems. For more information, see the Unicode home page or my tutorial on Unicode.