Understand how letters, symbols, and emojis are stored as binary numbers.
Have you ever wondered how a machine that only understands 'on' and 'off' can display a heartfelt text message or a laughing emoji?
Computers are essentially giant calculators that only speak in binary (0s and 1s). To display text, we need a 'codebook' that tells the computer which number represents which letter. This is where ASCII (American Standard Code for Information Interchange) comes in. Created in the 1960s, it originally used 7 bits to represent 128 different characters (2^7 = 128). This included uppercase letters, lowercase letters, numbers, and basic punctuation. For example, the capital letter 'A' is assigned the decimal value 65, which in 8-bit binary is 01000001.
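If you have Python handy, a short sketch shows this codebook in action; the built-in ord() function looks up a character's number, and chr() goes the other way:

```python
# Look up ASCII values and their 8-bit binary forms.
for char in ["A", "a", "0", "!"]:
    code = ord(char)              # decimal value from the codebook
    bits = format(code, "08b")    # the same value as 8 binary digits
    print(f"{char!r} -> {code} -> {bits}")

print(chr(65))  # prints 'A': chr() maps a number back to its character
```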
Let's find the binary for the letter 'B'. 1. In the ASCII table, 'A' is 65, so 'B' is 66. 2. Convert 66 to binary: 64 + 2 = 1000010. 3. This gives us the 8-bit sequence: 01000010.
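Each step of that example can be checked in a Python shell:

```python
print(ord("A"))                 # 65
print(ord("B"))                 # 66, one past 'A'
print(format(ord("B"), "08b"))  # 01000010
print(chr(0b01000010))          # B, converting back again
```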
Quick Check
If ASCII uses 7 bits for its primary characters, how many unique symbols can it represent?
Answer
With 7 bits there are 2^7 = 128 possible patterns, so 128 characters.
While ASCII worked for English, it couldn't handle the thousands of characters in languages like Chinese, Arabic, or even modern emojis. To solve this, Unicode was developed. Instead of being limited to 128 characters, Unicode can represent over a million characters using up to 32 bits per character. The most common version, UTF-8, is backwards compatible with ASCII. This means the first 128 characters of Unicode are identical to the ASCII table, ensuring that old files still work on new systems while allowing for a global reach.
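A quick way to see this backwards compatibility is to encode a few characters with UTF-8 and count the bytes; plain ASCII characters still take exactly one byte:

```python
# ASCII characters keep their 1-byte codes under UTF-8;
# other characters use longer multi-byte sequences.
for char in ["A", "é", "中", "😀"]:
    encoded = char.encode("utf-8")
    print(f"{char!r}: {len(encoded)} byte(s) -> {encoded.hex(' ')}")

# 'A': 1 byte(s) -> 41  (identical to its ASCII code)
# 'é': 2 byte(s) -> c3 a9
# '中': 3 byte(s) -> e4 b8 ad
# '😀': 4 byte(s) -> f0 9f 98 80
```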
Consider the word 'Hi!'. 1. In ASCII, this takes 3 bytes (24 bits). 2. In a 32-bit Unicode format (UTF-32), even a single emoji takes 4 bytes (32 bits), and a complex emoji built from several code points takes even more. 3. Even though the emoji is one 'character' to us, it requires more binary data than a short English word.
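You can verify these counts in Python; this sketch uses the big-endian UTF-32 variant so no byte-order mark is added, and a family emoji as the example of a complex, multi-code-point character:

```python
word = "Hi!"
emoji = "😀"        # one code point
family = "👨‍👩‍👧"    # a complex emoji: 5 code points joined together

print(len(word.encode("ascii")) * 8, "bits")        # 24 bits (3 bytes)
print(len(emoji.encode("utf-32-be")) * 8, "bits")   # 32 bits (4 bytes)
print(len(family.encode("utf-32-be")) * 8, "bits")  # 160 bits (20 bytes)
```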
Quick Check
Why is Unicode preferred over ASCII for modern web applications?
Answer
Because Unicode supports international characters and emojis, whereas ASCII is limited to basic English characters.
To read a binary message, you must first group the bits into bytes (8-bit chunks). Each byte corresponds to a decimal number on the ASCII/Unicode table. This process is called decoding. If the computer misinterprets the encoding (e.g., trying to read Unicode as ASCII), you get 'mojibake'—those strange, unreadable symbols you sometimes see on broken websites.
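Both halves of that idea fit in a few lines of Python; the second part deliberately reads UTF-8 bytes with the wrong codebook (cp1252, a legacy Western encoding) to reproduce mojibake:

```python
# Decoding done right: bytes -> characters via the agreed codebook.
raw = bytes([72, 105, 33])
print(raw.decode("ascii"))       # Hi!

# Decoding done wrong: UTF-8 bytes read as cp1252 -> mojibake.
snowman = "☃".encode("utf-8")    # b'\xe2\x98\x83'
print(snowman.decode("cp1252"))  # â˜ƒ  (strange, unreadable symbols)
```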
Decode the following sequence: 01000011 01000001 01010100. 1. First byte: 01000011. ASCII 67 is 'C'. 2. Second byte: 01000001. ASCII 65 is 'A'. 3. Third byte: 01010100. ASCII 84 is 'T'. 4. The message is 'CAT'.
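The same decoding steps can be automated: split the bit string into 8-bit chunks, convert each chunk to a number, then look each number up with chr():

```python
message = "01000011 01000001 01010100"
decoded = "".join(chr(int(byte, 2)) for byte in message.split())
print(decoded)  # CAT
```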
What is the decimal ASCII value for the letter 'A'?
Which encoding system was designed to support all the world's languages?
True or False: A 7-bit ASCII system can represent 256 different characters.
Review Tomorrow
In 24 hours, try to recall the difference between the number of bits used in ASCII versus Unicode.
Practice Activity
Find an ASCII table online and try to write your name in binary code by converting the decimal values of each letter.