Quiz on Ch - Tarleton State University · Many quantities of interest in the real-world are...
Transcript of Quiz on Ch - Tarleton State University · Many quantities of interest in the real-world are...
Data and Computers
Computers are multimedia devices, dealing with a vast array of information categories
Computers store, present, and help us modify • Numbers
• Text
• Audio
• Images and graphics
• Video
• Smell (machine olfaction!)
4
Data compression
Reduction in the amount of space (memory) needed to store the data
Measured by the Compression ratio = The size of the compressed data divided by the size of the original data
Example: Two files are compressed with the ZIP utility:
• One is originally 200 MB, and becomes 150 MB after compression
• The other is originally 15 MB, and 11 MB after compression
Which file is better compressed?
5
Quiz
A video file is originally 3.5 GB long.
We compress it to 490 MB. What is the compression ratio?
6
Data compression
The Compression ratio is always between 0 and 1 (between 0% and 100%)
Compression techniques can be
Lossless → the data can be retrieved without any loss of the original information
Lossy → some information may be lost in the process (but it doesn’t matter for the purposes of the intended application)
7
Analog vs. Digital
Many quantities of interest in the real-world are infinite and continuous
• Zeno’s paradox: “That which is in locomotion must arrive at the half-way stage before it arrives at the goal. ” — Aristotle, Physics VI:9
But computers are finite and discrete!
How do we represent an infinite and continuous quantity in a computer?
Answer: We approximate → represent only enough to satisfy our computational needs and our senses of sight and sound.
8
Information can be represented in one of two ways: analog or digital
Analog data
A continuous representation, similar to the actual information it represents
Digital data
A discrete representation, breaking the information up into separate elements
9
Analog and Digital Information
Computers cannot work well with analog data, so we digitize the data
Digitizing = Breaking data into pieces and representing those pieces separately, by using a finite number of binary digits
Fine distinction: there are two operations performed:
• one in time (a.k.a. sampling)
• the other in amplitude (a.k.a. quantizing)
Explain on thermometer, using a waveform example!
10
Digitizing = Breaking data into pieces and representing those pieces separately, by using a finite number of binary digits
Fine distinction: there are two operations performed:
• one in time (a.k.a. sampling)
• the other in amplitude (a.k.a. quantizing)
11
Quiz
A digital thermometer has a scale from 50 to 100 degrees (F). The temperature is represented on 7 bits. What is the smallest temperature difference that it can measure?
12
Analog and Digital Information
Why do we use binary to represent digitized data?
• Price: transistors are cheap to produce
–Remember Babbage!
• Reliability: transistors don’t get jammed
–Remember Babbage!
13
Electronic Signals
Important facts about electronic signals
• An analog signal continually fluctuates in voltage up and down
• A digital signal has only a high or low state, corresponding to the two binary digits
14
Figure 3.2
An analog and a digital signal
• All electronic signals (both analog and digital) degrade due to absorption in transmission lines
• The amplitude (voltage) of electronic signals (both analog and digital) fluctuates due to environmental effects, a.k.a. noise
15
Figure 3.3
Degradation of analog and digital signals
The difference is that digital signals can be regenerated!
Binary Representations
One bit can be either 0 or 1
• One bit can represent two things
Two bits can represent four things (Why?)
How many things can three bits represent?
How many things can four bits represent?
16
Binary Representations
How many things can n bits represent?
What happens every time you increase the number of bits by one?
18 EoL
For next time:
• Read pp.54-57 of text
• Solve end of chapter ex. 1 through 5, 27, 28 in notebook
19
Quiz
A weather log file is originally 4.2 GB long.
We compress it to 150 KB. What is the compression ratio?
20
Quiz
An Analog-to-Digital Converter (ADC) accepts an input voltage between -3 and +12 V, and uses 9 bits to represent it digitally.
What is the precision?
21
Binary Representations
How many things can n bits represent?
Reversing the problem: How many bits are
needed to represent N things?
• All desktops in this lab
22
3.2 Representing Numeric Data
Negative integers
Signed-magnitude representation
The sign represents the ordering, and the digits represent the magnitude of the number
24
Negative Integers
There is a problem with the sign-magnitude representation: plus zero and minus zero.
• More complex hardware is required!
Solution: Let’s not represent the sign explicitly!
“Complement” representation
25
Ten’s complement
Using two decimal digits, represent 100 numbers
• If unsigned, the range would be 0…?
• Let 1 through 49 represent 1 … 49
• Let 50 through 99 represent -50 … -1
26
Ten’s complement
27
Top: representations (the “label on the jar”)
Bottom: the actual numbers that are being
represented (the “content of the jar”)
QUIZ
Given the following representations, find in each case what actual number is being represented:
• 51
• 52
• 96
• 47
28
Why the “complement” in ten’s complement?
100 – 50 = 50
100 – 49 = 51
……………………..
100 – 1 = 99
In general:
100 – i is representation of – i
30
Let’s use ten’s complement!
To perform addition, add the numbers and discard any carry
32
Now you try it
48 (signed-magnitude)
- 1
47
How does it work in
the new scheme?
Important conclusions
In the complement representation:
• Positive and negative numbers are treated the same! We can add without knowing if they’re positive or negative!
• Subtraction is performed as addition, by changing signs: a – b = a + (-b). This greatly simplifies the hardware!
34
For next time:
• Read pp.58-61 of text
• Go through all examples presented so far in Ch.3 and make sure you understand them. Ask next time if you have questions!
• Solve end-of-chapter ex. 33, 34, 35 in notebook
35
Two’s complement Addition and subtraction are the same as in unsigned:
-127 1000 0001
+ 1 0000 0001
-126 1000 0010
Ignore any Carry out of the MSB:
-1 1111 1111
+
-1 1111 1111
-2 1111 1110
40
Two’s complement Formula to compute the negative of a number on k digits:
• for ten’s comp: Negative(I) → 10k - I
• for two’s comp: Negative(I) → 2k - I
Practice: find the 8-bit two’s comp. representations of:
-2
-51
+ 128 (trick question!)
- 129 (trick question!)
0
41
Warning: The example on p.62 has a typo in the previous (4th) edition:
28 is 256, not 128.
“Fast” two’s complement Easier way to change the sign of a number:
Flip all bits, then add 1
Try it out! Find the negatives of the following
two’s complement numbers:
0000 0011
1000 0000
1000 0001
1000 0011
1001 0110
1111 1111 43
This is how subtraction is implemented in
computer hardware! A – B = A + (-B)
What happens if the computed value won't fit in the given number of bits k?
Overflow
If k = 8 bits, adding 127 to 3 overflows 1111 1111
+ 0000 0011
0 1000 0010
… but adding -1 to 3 doesn’t!
Conclusion: overflow is specific to the representation (unsigned, sign-mag., two’s comp., floating point etc.)
47
Representing Real Numbers
Real numbers
A number with a whole part and a fractional part
104.32, 0.999999, 357.0, and 3.14159
Positions to the right of the decimal point are the fractional part (tenths, hundredths, thousandths etc.):
10-1, 10-2 , 10-3 …
49
Binary fractions
Same rules apply in binary as in decimal
Decimal point is actually the radix point
Positions to the right of the radix point in binary are
2-1 (one half) = 0.5
2-2 (one quarter) = 0.25
2-3 (one eighth) = 0.125
…
50
Converting fractions from binary to decimal
Easy! Just multiply with the powers of 2, as we did for unsigned binary. Only difference is that now the powers are negative.
Example: .10012 = 0. 10
51
Converting fractions from decimal to binary
Remember the repeated division algorithm?
We apply it for the integer part of the number.
To covert the fractional part, we use the repeated multiplication algorithm!
Example: 0.43510 = 0. 2
53
QUIZ
Finite decimal fractions may have infinite binary representation!
0.310 = 0. 0100110011 2
55
Stop after 8 bits!
Representing Real Numbers
58
In general, we use “sign-mantissa-exponent”
N = mantissa * 10exp
N = mantissa * 2exp
Depending on the form of the mantissa, we have:
• Floating-point notation
• Scientific notation
Floating point
59
This representation is called floating point because, although
the number of digits is fixed (5 in the example above),
the radix point floats (according to the exponent)
The radix point is always at the
extreme right, e.g. 12001.0
Scientific notation
Particular form of floating point, in which the
decimal point is kept to the right of the leftmost
digit.
12001.32708 is represented 1.200132708 E+4
60
Exactly one digit to the left of the radix point
Floating point in binary?
62
It does exist - remember floating point numbers in Python!
It can be implemented either through software, or directly in
the hardware of the computer.
Until 1989, all desktops had an optional chip, separate from
the main CPU, called FPU (a.k.a. math coprocessor).
The Intel 80486, introduced that year, had the first integrated
CPU + FPU.
Not in text!
80486 was also the first chip with over 1 million transistors!
Floating point in binary?
63
Motorola launched their first integrated CPU+FPU, the
68040 the next year, in 1990.
The detailed binary implementation of floating point is
beyond the scope of our class.
If you want to learn more:
• Search for “IEEE 754” on the web
• Take CS 343 or CS 344
Not in text!
3.3 Representing Text
Basic idea: There are finite number of characters to represent, so list them all and assign each a (binary) number, a.k.a. code. Character set A list of characters and the codes used to represent each one Computer manufacturers (eventually) agreed to standardize
– Read “Character Set Maze” on p.67 64
The ASCII Character Set
ASCII = American Standard Code for Information Interchange
ASCII originally used seven bits to represent each character, allowing for 128 unique characters
Later extended ASCII evolved so that all eight bits were used
• How many characters can be represented?
65
The ASCII Character Set
69
The first 32 characters in the ASCII character chart do not have a simple character representation to print to the screen.
They are called control characters
8-bit (“extended”) ASCII Character Sets
71
By using 8 bits instead of 7, the number of codes extends from 128 to 256.
Extended ASCII is always a superset of 7-bit ASCII:
• The first 128 characters correspond exactly to 7-bit ASCII
Problem ???
Not in text
Extended ASCII: IBM code page 437
72
http://en.wikipedia.org/wiki/Code_page_437
Not in text
Extended ASCII: Latin-1
73 http://en.wikipedia.org/wiki/ISO/IEC_8859-1
Not in text
Trick QUIZ
What do these bits represent?
1101 1110
75
Unsigned integer: …
Signed integer (2’s complement): …
Fraction: …
IBM 437 character: …
Latin-1 character: …
The Unicode Character Set
None of the Extended ASCII character sets were enough for international use (256!)
Unicode uses 16 bits per character
How many characters can UNICODE represent?
Unicode is a superset of Latin-1: The first 256 characters correspond exactly to Latin-1 characters (http://unicode.org/charts/PDF/U0080.pdf )
76
Simplified Chinese has
6500!
Miscellaneous Characters in Unicode
79
See more online at the official Unicode site
For next time:
Read pages 62-69 of text
Homework: 8, 9, 29, 30, 31, 41, 44, 48, 49
Due Monday at the beginning of lab, before
the midterm!
80
QUIZ:
Your boss tells you to develop a webpage using the extended ASCII character set. What do you reply?
81
QUIZ Select all that apply:
The Latin-1 character set:
• Is a 5-bit representation
• Is a 7-bit representation
• Is a 16-bit representation
• Is an extension of ASCII
• Is an extension of Unicode
• Contains letters used in European languages
82
QUIZ Select all that apply:
The Unicode representation:
• Uses 16 bits
• Is an extension of ASCII
• Is an extension of Latin-1
• Is an extension of IBM-437
• Contains letters used in all world languages
• Can accommodate over 65,000 characters
• Is used in the majority of web pages today
83
Text Compression
Sometimes, assigning 8 or 16 bits to each character in a document uses too much memory We need ways to store and transmit text efficiently Text compression techniques:
– keyword encoding – run-length encoding – Huffman encoding
84
Keyword Encoding
Replace frequently used words with a single character, for example here’s a substitution chart:
85
Keyword Encoding Given the following paragraph:
We hold these truths to be self-evident, that all men are
created equal, that they are endowed by their Creator with certain unalienable Rights, that among these are Life, Liberty and the pursuit of Happiness. That to secure these rights, Governments are instituted among Men, deriving their just powers from the consent of the governed, That whenever any Form of Government becomes destructive of these ends, it is the Right of the People to alter or to abolish it, and to institute new Government, laying its foundation on such principles and organizing its powers in such form, as to them shall seem most likely to effect their Safety and Happiness.
86
Keyword Encoding The encoded paragraph is:
We hold # truths to be self-evident, $ all men are created
equal, $ ~y are endowed by ~ir Creator with certain unalienable Rights, $ among # are Life, Liberty + ~ pursuit of Happiness. $ to secure # rights, Governments are instituted among Men, deriving ~ir just powers from ~ consent of ~ governed, $ whenever any Form of Government becomes destructive of # ends, it is ~ Right of ~ People to alter or to abolish it, + to institute new Government, laying its foundation on such principles + organizing its powers in such form, ^ to ~m shall seem most likely to effect ~ir Safety + Happiness.
87
Keyword Encoding
How much did we compress?
Original paragraph
656 characters
Encoded paragraph
596 characters
Characters saved
60 characters
Compression ratio
596/656 = 0.9085
Could we use this substitution chart for all text?
88
Keyword Encoding
More advanced idea:
Encode only some parts of the words, like common prefixes and suffixes, e.g. “pre”, “ex”, “ing”, “tion”
89
Run-Length Encoding
A single character may be repeated over and over again in a long sequence.
Replace a repeated sequence with – a flag character
– repeated character
– number of repetitions
*n8 – * is the flag character
– n is the repeated character
– 8 is the number of times n is repeated 90
Run-Length Encoding Encoding example:
Original text is
bbbbbbbbjjjkllqqqqqq+++++
Encoded text is
*b8jjjkLL*q6*+5
(Why isn't L encoded? J?)
The compression ratio is 15/25 = .6
Q: This type of repetition doesn’t occur in English text; can you think of a situation where it is very likely to occur?
91
Run-Length Encoding
Decoding example:
Encoded text is
*x4*p4l*k7
Original text is
xxxxpppplkkkkkkk
92
Huffman Codes
Conclusion: each language and each topic have specific frequencies of characters and groups of characters (digraphs, trigraphs etc.)
Why should the characters “X" or "z" take up the same number of bits as "e" or "t"?
Huffman codes use variable-length bit strings to
represent each character. More frequently-used letters
have shorter strings to represent them, and vice-versa!
95
Huffman encoding example
“ballboard” would be 1010001001001010110001111011
compression ratio
28/56
QUIZ:
Encode “roadbed”
96
Huffman decoding
In Huffman encoding no character's bit string is the prefix of any other character's bit string. Codes with this property are called prefix codes.
To decode
look for match left to right, bit by bit
record letter when the first match is found
continue where you left off, going left to right
97
3.4 Representing Audio Data
99
We perceive sound when a series of air waves cause
to vibrate a membrane in our ear (eardrum), which
sends signals to our brain.
Analog Audio
Record players and stereos send analog signals to speakers to produce sound.
These signals are analog representations of the sound waves.
The voltage in the signal varies in direct proportion to the amplitude of the sound wave.
100
From Analog to Digital Audio
Digitize the signal by sampling and quantizing
– periodically measure the voltage
– record the numeric value
How often should we sample? A sampling rate of about 40,000 times per second is enough to create a reasonable sound reproduction
101
44,000 for audio CD, to be exact
Remember: Sampling and Quantizing
102
Some information
is lost, but a
reasonable
sound is
reproduced
Not in text
Digital Audio on a CD
On the surface of the CD are microscopic pits
and lands that represent binary digits
A low intensity laser is pointed as the disc. The
laser light reflects strongly if the surface is
smooth and poorly if the surface is pitted ???
(p.75 of text)
104
105
Pit height is about ¼ the
laser’s wavelength
“destructive interference”
FYI: How the pits and lands are actually read
Not in text
106
Both halves of the
laser beam reflect off
pit or both halves off
land.
The two halves are “in
phase”.
Half of the laser beam
reflects off pit and half
off land.
The 2 halves are “out
of phase”.
FYI: How the pits and lands are actually read
Not in text
Audio Formats
Audio Formats
– WAV, AU, AIFF, VQF, and MP3
MP3 (MPEG-2, audio layer 3 file) is dominant
– analyzes the frequency spread and discards information that can’t be heard by humans (>16 kHz)
– bit stream is compressed using a form of Huffman encoding to achieve additional compression
Is this a lossy or lossless compression?
107
3.5 Representing Images and Graphics
Color
Perception of the frequencies of light that reach
the retinas of our eyes
Human retinas have three types of color
photoreceptor cone cells that correspond to the
colors of red, green, and blue
108
Color is expressed as an RGB (red-green-blue) value = three numbers that indicate the relative contribution of each of these three primary colors
An RGB value of (255, 255, 0) maximizes the contribution of red and green, and minimizes the contribution of blue, which results in a bright yellow.
109
110
Dark means low number, light
means high.
Look at the snow and the black
side of the barn! Source: Wikipedia – RGB color model
Representing Images and Graphics
Can you understand this HTML code?
<font color="#FF0000"> Blah blah …
</font>
111
RGB Color Chart in hex
Representing Images and Graphics
color depth
The amount of data that is used to represent a color
HiColor
A 16-bit color depth: five bits used for each number in an RGB value with the extra bit sometimes used to represent transparency
TrueColor
A 24-bit color depth: eight bits used for each number in an RGB value
114
Extra-credit question
TrueColor
A 24-bit color depth: eight bits used for each number in an RGB value
How many different colors can be represented in TrueColor?
Please show your work.
115
Indexed Color
A browser may support only a certain number of specific colors, creating a palette from which to choose
117
Figure 3.11
The Netscape color palette
EOL
Extra-credit question
How many bits are needed to represent this palette?
Please show your work.
118
How to digitize a picture
• Sample it → Represent it as a collection of individual dots called pixels
• Quantize it → Represent each pixel as one of 224 possible colors (TrueColor)
Resolution = The # of pixels used to represent a picture
119
Digitized Images and Graphics
120
Figure 3.12 A digitized picture composed of many individual pixels
Whole
picture
Digitized Images and Graphics
121
Figure 3.12 A digitized picture composed of many individual pixels
Magnified portion
of the picture
See the pixels?
Hands-on: paste the
high-res image from
the previous slide in
Paint, then choose
ZOOM = 800
Raster Graphics = Storage of data on a pixel-by-pixel basis
• Bitmap (BMP), GIF, JPEG, and PNG are the most popular
raster-graphics formats
GIF format
Each image is made up of only 256 colors (indexed color)
• But they can be a different 256 for each image!
Supports animation! Example
Optimal for line art
PNG format (“ping” = Portable Network Graphics)
Like GIF but achieves greater compression with wider range of color depth
No animations
122
Bitmap format
Contains the pixel color values of the image from left to right and from top to bottom
• Great candidate for run-length compression!
• Losless
JPEG format
Averages color hues over short distances
• Lossy compression
Optimal for color photographs
123
Vector Graphics
A format that describes an image in terms of lines and geometric shapes
A vector graphic is a series of commands that describe a line’s direction, thickness, and color
The file sizes tend to be smaller because not every pixel is described
Example: Flash
124
Vector Graphics
The good side:
Vector graphics can be resized mathematically and changes can be calculated dynamically as needed
The bad side:
Vector graphics are not good for representing real-world images
125
3.6 Representing Video
The problem: huge amount of data!
Example: In HDTV, the Frame size is defined as the number of horizontal pixels × number of vertical pixels:
• 1280 × 720
• 1920 × 1080
Calculate: 1] Data rate (bits per second) for 25 fps
2] Size (bytes) of 2-hour movie
126
Video codec = COmpressor/DECompressor Methods used to shrink the size of a movie to allow it to be played on a computer or over a network
Almost all video codecs use lossy compression to minimize the huge amounts of data associated with video
127
Compressing Video
Temporal compression
A technique based on differences between consecutive frames: If most of an image in two frames hasn’t changed, why should we waste space to duplicate all of the similar information?
Spatial compression
A technique based on removing redundant information within a frame: This problem is essentially the same as that faced when compressing still images.
128
Individual work for Monday:
Read pages 69-80 of text.
Solve in the notebook these problems :
10 through 20
50 through 53
129
Not homework! Do not turn in, it’s
for your own review
Chapter Review Questions
• Distinguish between analog and digital information
• Explain data compression and calculate compression ratios
• Explain the binary formats for negative (two’s complement), fractional, and floating-point values
• Describe the characteristics of the ASCII and Unicode character sets
• Perform various types of text compression with pencil and paper: Keyword, Run-length, Huffman
130
Chapter Review Questions
• Explain the nature of sound and its representation
• Explain how RGB values define a color
• Distinguish between raster and vector graphics
• Explain temporal and spatial video compression
131
Don’t forget!
Homework: 8, 9, 29, 30, 31, 41, 44, 48, 49
Due Monday at the beginning of lab, before
the midterm!
Midterm is Monday in the lab.
Bring:
-- notebook (for grading)
-- cheat-sheet (max. 6 pages total, including ASCII table)
132
How to study for the midterm:
-- read and understand your notes (from the notebook – this
is a good opportunity to improve the notebook, if needed)
-- go over the quizzes we solved together in class (you can find them in your notebook, and in the instructor’s notes posted on the
webpage)
-- solve the homework for Ch.3
-- solve the end-of-chapter quizzes for Chs. 1
and 2 (webpage)
-- write your cheat-sheet (max. 6 pages total, including ASCII
table)
133