ASCII and Unicode. ASCII Inside a computer, EVERYTHING is a number – that includes music, sound,...

ASCII and Unicode


ASCII

• Inside a computer, EVERYTHING is a number – that includes music, sound, and text.

• In the early days of computers, every manufacturer had their own code for characters (HP, IBM, Sperry, Digital)

• The users didn’t care as long as they pressed A on the keyboard and got A on the screen
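To see that text really is stored as numbers, here is a minimal Python sketch. `ord` and `chr` are Python built-ins; the sample word is only an illustration and does not come from the slides.

```python
# Each character is stored as a number (its character code).
word = "Cab"  # sample text, chosen only for illustration

# Character -> number
codes = [ord(ch) for ch in word]
print(codes)  # [67, 97, 98]

# Number -> character
restored = "".join(chr(n) for n in codes)
print(restored)  # Cab
```

Pressing A and seeing A works because the keyboard, the program, and the screen all agree on which number stands for A.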


As time passed

• More computers meant more data and more users

• People wanted to share data with colleagues, or to buy and sell it

• Incompatibilities between the character codes became a problem

• Eventually everyone decided that a “standard code” was needed


ASCII

• Several codes were considered, but ASCII won!

• ASCII = American Standard Code for Information Interchange

• The beginnings of the Internet in the 70’s also gave impetus to the desire for a standard code, so that email clients didn’t have to know a dozen different codes just to read email from different machines across the Net


ASCII

• So what? Why do I care? Mostly, you don’t.

• If you press A on the keyboard and get an A on the screen, what does the code matter?

• ASCII is efficient, known by almost every device, easy to transmit and receive

• Has codes for A-Z, a-z, 0-9, space, punctuation marks and a few control codes

• 128 different codes (7 bits, so 1 character fits into 1 byte of data)
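The "one character, one byte" point can be checked with a short Python sketch. `str.encode("ascii")` is Python's standard ASCII codec; the sample strings are illustrative assumptions, not taken from the slides.

```python
# ASCII stores each character in a single byte, using codes 0 to 127.
text = "Hello, World 42"            # illustrative sample
data = text.encode("ascii")         # bytes object, one byte per character

print(len(text), len(data))         # 15 15
print(all(b < 128 for b in data))   # True

# A character outside ASCII cannot be encoded at all.
try:
    "caf\u00e9".encode("ascii")     # "café" with an accented e
    fits_in_ascii = True
except UnicodeEncodeError:
    fits_in_ascii = False
print(fits_in_ascii)                # False
```

The failure on the accented letter is exactly the limitation that leads to a bigger code.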


Time passes

• The whole world starts using computers, not just countries that use the “Roman” alphabet

• First response: learn English

• But eventually the realization came that a new, bigger code was needed

• In 1991 Unicode 1.0 was released

• “Universal Code”

• Started as a 16-bit code with room for 65,536 (64K) different codes; the code space has since grown to 1,114,112


Unicode

• Covers the writing systems of virtually all human languages and has room for more!

• Includes ASCII as its first 128 codes

• How much space a character takes depends on the encoding: in UTF-16 a character takes 2 or 4 bytes (at least TWICE as much as ASCII); in UTF-8 the ASCII characters still take 1 byte and all others take 2 to 4

• UTF-32 uses 4 bytes for every character

• Becoming the default code for many applications

• Several programming languages have added a type for “fat” or “wide” characters = Unicode!
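How many bytes a Unicode character occupies depends on the encoding used to store it. The sketch below measures a few sample characters under UTF-8, UTF-16 and UTF-32; these encoding names and the helper `encoded_sizes` are assumptions added for illustration and do not appear in the slides.

```python
# Hypothetical helper: bytes needed for one character in three encodings.
# The "-le" variants are used so that no byte-order mark is counted.
def encoded_sizes(ch: str) -> dict:
    encodings = ("utf-8", "utf-16-le", "utf-32-le")
    return {enc: len(ch.encode(enc)) for enc in encodings}

samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]  # A, e-acute, euro sign, an emoji
for ch in samples:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))

# ASCII characters keep the same code in Unicode.
print(ord("A"), "A".encode("ascii") == "A".encode("utf-8"))  # 65 True
```

In this sketch the letter A needs 1, 2 and 4 bytes respectively, while the emoji needs 4 bytes in all three encodings.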


Unicode

• Note: Unicode is NOT a “translator program”

• What it does do is allow you (if you know a foreign language) to write the foreign words properly spelled with the correct characters

• They will get transmitted correctly

• The recipient still has to know how to read them, but at least the words will be correctly spelled


Comparing strings

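• Easy to test for equality: it has to be an exact match, including spaces, case, and length

• What does it mean to say one string is greater than another (or less than)?

• The comparison is made character by character, left to right, using the ASCII codes of the characters; the first pair that differs decides, and if one string runs out first, the shorter string is the lesser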
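The comparison rule can be written out by hand. This is a minimal sketch, not how any particular language implements it internally; the function name `compare` is hypothetical. Python's own `<` on strings follows the same character-code rule, so the two can be checked against each other.

```python
def compare(a: str, b: str) -> int:
    """Return -1 if a < b, 0 if equal, 1 if a > b, using character codes."""
    # Walk both strings in step; the first differing pair decides.
    for x, y in zip(a, b):
        if ord(x) != ord(y):
            return -1 if ord(x) < ord(y) else 1
    # All shared positions match, so the shorter string is the lesser.
    if len(a) == len(b):
        return 0
    return -1 if len(a) < len(b) else 1

print(compare("apple", "apple"))   # 0
print(compare("Zebra", "apple"))   # -1  ('Z' is 90, 'a' is 97)
print(compare("cat", "catalog"))   # -1  (prefix is less)
print(compare("cat ", "cat"))      # 1   (trailing space makes it longer)
```

Note that "Zebra" comes before "apple" here, which is the opposite of dictionary order.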


ASCII – order you need to know

• Upper case letters all in alphabetical order

• ‘A’ < ‘B’ < ‘C’ < ‘D’ < … < ‘Y’ < ‘Z’

• Lower case letters all in alphabetical order too

• ‘a’ < ‘b’ < ‘c’ < ‘d’ < … < ‘y’ < ‘z’

• How do the two alphabets relate to each other?

• Lowercase letters are higher (greater) (after) the uppercase letters
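Because every uppercase letter comes before every lowercase letter, a plain sort can surprise you. The sketch below shows the effect and one common workaround, sorting on a lowercased key; the word list is an illustrative assumption.

```python
words = ["pear", "Zebra", "apple"]  # illustrative sample

# Plain sort compares character codes, so uppercase 'Z' (90) precedes 'a' (97).
by_code = sorted(words)
print(by_code)        # ['Zebra', 'apple', 'pear']

# Sorting on a lowercased copy ignores case.
ignoring_case = sorted(words, key=str.lower)
print(ignoring_case)  # ['apple', 'pear', 'Zebra']

# Upper and lower case versions of a letter are exactly 32 codes apart in ASCII.
print(ord("a") - ord("A"))  # 32
```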


More on order

• ‘A’ < … < ‘Z’ < … < ‘a’ < … < ‘z’

• Where do the digits fit in there?

• ‘0’ < ‘1’ < ‘2’ < ‘3’ < … < ‘8’ < ‘9’

• Digits come before the uppercase letters

• ‘0’ < … < ‘9’ < … < ‘A’ < … < ‘Z’ < … < ‘a’ < … < ‘z’

• Only one more!!


One more character

• There are lots of punctuation marks and control codes in ASCII (128 codes, after all!)

• You do NOT have to know these!

• Except for one special character – the space

• ‘ ’ is the first printable character – it comes before all the others you have seen here

• ‘ ’ < … < ‘0’ < … < ‘9’ < … < ‘A’ < … < ‘Z’ < … < ‘a’ < … < ‘z’ Know this!
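The whole ordering can be confirmed in a few lines of Python. The character and word lists are illustrative assumptions; `sorted` and `ord` are Python built-ins.

```python
# Space, then digits, then uppercase, then lowercase.
chars = ["z", "A", "9", " ", "a", "0", "Z"]
ordered = sorted(chars)
print(ordered)                    # [' ', '0', '9', 'A', 'Z', 'a', 'z']
print([ord(c) for c in ordered])  # [32, 48, 57, 65, 90, 97, 122]

# The same rule decides how whole strings sort.
names = ["apple", "Zebra", "42", " space first"]
sorted_names = sorted(names)
print(sorted_names)               # [' space first', '42', 'Zebra', 'apple']
```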