
Understanding ASCII and Unicode Coding Systems

Nov 14, 2024

Coding Text in Binary

Introduction

  • Discusses the systems for coding character data: ASCII and Unicode.
  • Explains why Unicode was introduced.

ASCII Coding System

  • Bits are grouped into 8-bit bytes in most computers.
  • A byte can represent 256 different combinations of zeros and ones.
  • One byte is used to code one unique character.
  • ASCII uses only 7 bits per character; the 8th bit was historically reserved as a parity bit for error checking.
  • Provides 128 unique codes for:
    • Lowercase and uppercase English alphabet
    • Digits 0 to 9
    • Punctuation and special characters
    • Non-printing control codes (e.g., null, escape)
  • Limitation before standardization: different computers used different code sets, making data exchange between them unreliable.
  • ASCII, the American Standard Code for Information Interchange, became the widely used standard on personal computers (illustrated in the sketch below).
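
To make the 7-bit codes concrete, here is a minimal sketch in Python (a language chosen here purely for illustration; the notes do not specify one) that prints the code of a few characters in binary:

```python
# Print the 7-bit ASCII code of a few characters.
for ch in ["A", "a", "0", "!"]:
    code = ord(ch)                        # numeric code of the character
    print(ch, code, format(code, "07b"))  # e.g. A 65 1000001
```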

Limitations of ASCII

  • ASCII's 128-character limit is restrictive for:
    • Accented and other special characters in European languages.
    • Ideographic and syllabic scripts in languages like Chinese and Korean.
  • ASCII cannot represent these characters because 128 unique values are far too few (demonstrated below).
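
As a quick demonstration (again an illustrative Python sketch, not part of the original notes), attempting to encode a non-ASCII character as ASCII fails:

```python
# Characters outside the 128 ASCII codes cannot be encoded as ASCII.
try:
    "é".encode("ascii")
except UnicodeEncodeError as err:
    print(err)  # 'ascii' codec can't encode character '\xe9' ...
```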

Unicode Coding System

  • Designed to overcome ASCII limitations by providing a larger character set.
  • Unicode characters are represented with at least 16 bits; the original design used a fixed 16 bits per character.
  • Offers a vast number of combinations:
    • Covers over 120,000 characters.
    • Includes 129 modern and historic scripts, as well as symbol sets.
  • "Unicode" signifies its universal capability to encode almost all modern and ancient languages.

Downsides of Unicode

  • A 16-bit encoding requires twice as much space per character as 8-bit ASCII text (compared below).
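
For example, this Python sketch compares the storage needed for the same text in ASCII versus UTF-16, one common 16-bit Unicode encoding (the encoding names are Python codec identifiers; "utf-16-le" is used to omit the byte-order mark):

```python
# Compare the byte count of the same text in ASCII and UTF-16.
text = "Hello"
print(len(text.encode("ascii")))      # 5 bytes: 1 byte per character
print(len(text.encode("utf-16-le")))  # 10 bytes: 2 bytes per character
```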

Conclusion

  • Unicode addresses the limitations of ASCII by offering more comprehensive language and character support, albeit with increased space requirements.