
Understanding ASCII and Unicode Coding Systems

Nov 14, 2024

Coding Text in Binary

Introduction

  • Discusses the systems for coding character data: ASCII and Unicode.
  • Explains why Unicode was introduced.

ASCII Coding System

  • Bits are grouped into 8-bit bytes in most computers.
  • A byte can represent 256 different combinations of zeros and ones.
  • One byte is used to code one unique character.
  • ASCII uses only 7 bits per character; the 8th bit was historically reserved as a parity bit for error checking.
  • Provides 128 unique codes for:
    • Lowercase and uppercase English alphabet
    • Digits 0 to 9
    • Punctuation and special characters
    • Non-printing control codes (e.g., null, escape)
  • Limitation before standardization: different computers used different code sets, making data exchange between them unreliable.
  • ASCII, the American Standard Code for Information Interchange, became the widely used standard on personal computers (illustrated in the sketch below).
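
To make the 7-bit codes concrete, here is a minimal sketch in Python (a language chosen here purely for illustration; the notes do not specify one) that prints the code of a few characters in binary:

```python
# Print the 7-bit ASCII code of a few characters.
for ch in ["A", "a", "0", "!"]:
    code = ord(ch)                        # numeric code of the character
    print(ch, code, format(code, "07b"))  # e.g. A 65 1000001
```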

Limitations of ASCII

  • ASCII's 128-character limit is restrictive for:
    • Accented and other special characters in European languages.
    • Ideographic and syllabic scripts in languages like Chinese and Korean.
  • ASCII cannot represent these characters because 128 unique values are far too few (demonstrated below).
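
As a quick demonstration (again an illustrative Python sketch, not part of the original notes), attempting to encode a non-ASCII character as ASCII fails:

```python
# Characters outside the 128 ASCII codes cannot be encoded as ASCII.
try:
    "é".encode("ascii")
except UnicodeEncodeError as err:
    print(err)  # 'ascii' codec can't encode character '\xe9' ...
```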

Unicode Coding System

  • Designed to overcome ASCII limitations by providing a larger character set.
  • Unicode characters are represented with at least 16 bits; the original design used a fixed 16 bits per character.
  • Offers a vast number of combinations:
    • Covers over 120,000 characters.
    • Includes 129 modern and historic scripts, as well as symbol sets.
  • "Unicode" signifies its universal capability to encode almost all modern and ancient languages.

Downsides of Unicode

  • A 16-bit encoding requires twice as much space per character as 8-bit ASCII text (compared below).
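
For example, this Python sketch compares the storage needed for the same text in ASCII versus UTF-16, one common 16-bit Unicode encoding (the encoding names are Python codec identifiers; "utf-16-le" is used to omit the byte-order mark):

```python
# Compare the byte count of the same text in ASCII and UTF-16.
text = "Hello"
print(len(text.encode("ascii")))      # 5 bytes: 1 byte per character
print(len(text.encode("utf-16-le")))  # 10 bytes: 2 bytes per character
```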

Conclusion

  • Unicode addresses the limitations of ASCII by offering more comprehensive language and character support, albeit with increased space requirements.