Chapter 7 – End-to-End Data
Two main topics: presentation formatting and compression
We will go over the main issues in presentation formatting, but not in much detail
Compression will be covered in more detail, especially JPEG and MPEG
Presentation Formatting/Encoding
The receiver must be able to extract the same message from the signal as the transmitter sent
Encoding is sometimes called argument marshalling
Marshalling is actually not trivial, because compilers and application programs have a lot of latitude in how they lay out structures (records)
Look over the next high-level graphic
Taxonomy
Base types (lowest level): integers, floating point, characters, etc.
Flat types: structures, arrays
Complex types (highest level): types requiring pointers
Examples
Case 1 – sending an ordered string of integers (say, financial market data) over the Internet. There is no problem breaking this up into a string of bytes and no problem reassembling the data at the receiving end
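A minimal sketch of Case 1 in Python. The wire layout is an assumption made for illustration (a 32-bit count followed by 32-bit signed integers, all in big-endian network byte order), and the function names are hypothetical.

```python
import struct

def pack_ints(values):
    """Encode a list of 32-bit signed integers as a count followed by the values."""
    # "!" selects network (big-endian) byte order; "I" is an unsigned 32-bit count.
    return struct.pack(f"!I{len(values)}i", len(values), *values)

def unpack_ints(data):
    """Decode the byte string produced by pack_ints back into a list."""
    (count,) = struct.unpack_from("!I", data, 0)
    return list(struct.unpack_from(f"!{count}i", data, 4))

prices = [101, -7, 2500000]          # made-up sample data
wire = pack_ints(prices)             # bytes ready to send
assert unpack_ints(wire) == prices   # receiver recovers the same integers
```

Fixing the byte order and integer width up front is what makes the reassembly straightforward on any receiving machine.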
Examples continued
Case 2 – sending a database of student records over a network. Students would have taken different numbers of courses, so the records would be of different lengths, but the fields would probably be fixed length
Packing and unpacking the data would involve some difficulties
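One common way to handle Case 2 is to put a count in front of the variable part of each record. The sketch below assumes a hypothetical layout (4-byte student id, 16-byte name field, 2-byte course count, then one 8-byte course code per course); none of these sizes come from the slides.

```python
import struct

# Hypothetical record layout, network byte order, no padding.
HEADER = struct.Struct("!I16sH")   # student id, fixed-length name, course count
COURSE = struct.Struct("!8s")      # fixed-length course code

def pack_student(student_id, name, courses):
    """Encode one record: fixed header, then a variable number of courses."""
    out = HEADER.pack(student_id, name.encode("ascii"), len(courses))
    for code in courses:
        out += COURSE.pack(code.encode("ascii"))
    return out

def unpack_student(data, offset=0):
    """Decode one record at offset; return the record and the next offset."""
    student_id, name, count = HEADER.unpack_from(data, offset)
    offset += HEADER.size
    courses = []
    for _ in range(count):
        (code,) = COURSE.unpack_from(data, offset)
        courses.append(code.rstrip(b"\0").decode("ascii"))
        offset += COURSE.size
    return (student_id, name.rstrip(b"\0").decode("ascii"), courses), offset

# Two records of different lengths packed back to back.
stream = pack_student(1, "Ann", ["CS101", "MA201"]) + pack_student(2, "Bob", ["CS101"])
first, pos = unpack_student(stream)
second, pos = unpack_student(stream, pos)
```

The course count tells the receiver where one record ends and the next begins, which is the part that fixed-length fields alone do not solve.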
Examples continued
Case 3 – a hierarchical database stored in a format with pointers needs to be transmitted over the Internet
Packing and unpacking is a large problem
Pointers are implemented by memory addresses and will change from one machine (sender) to another (receiver)
Marshalling must serialize a complex, pointer implementation of a database – quite difficult!
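A small sketch of the idea, assuming the hierarchy is a plain tree (no shared nodes or cycles). Pointers are replaced by positions in a flat table, which mean the same thing on both machines. The `Node` class and function names are hypothetical.

```python
class Node:
    """A tree node; `children` holds in-memory references (pointers)."""
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []

def flatten(root):
    """Return a list of (value, [child indices]) with the root at index 0."""
    table = []
    def visit(node):
        index = len(table)
        table.append(None)                      # reserve this node's slot
        child_ids = [visit(c) for c in node.children]
        table[index] = (node.value, child_ids)  # indices replace pointers
        return index
    visit(root)
    return table

def rebuild(table):
    """Recreate the tree on the receiver from the flat table."""
    nodes = [Node(value) for value, _ in table]
    for node, (_, child_ids) in zip(nodes, table):
        node.children = [nodes[i] for i in child_ids]
    return nodes[0]

tree = Node("db", [Node("a", [Node("a1")]), Node("b")])
table = flatten(tree)        # pointer-free form, safe to transmit
copy = rebuild(table)        # new objects at new addresses, same shape
```

A structure with shared or cyclic pointers would need extra bookkeeping (for example, remembering which nodes were already visited), which this sketch leaves out.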
Two Conversion Strategies
Sender converts to a common format, the common format is transmitted, and the receiver decodes from the common format. Seems natural, but …
Receiver-makes-right – transmit and let the receiver figure it all out. Surprisingly, this is often the better approach; see the reasons on page 533
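A rough sketch of receiver-makes-right for byte order only. It assumes the sender writes 32-bit integers in its own native order and prefixes a one-byte tag saying which order that is; the tag values and function names are made up for illustration.

```python
import struct
import sys

def send(values, byte_order=sys.byteorder):
    """Sender does no conversion: it writes its own byte order plus a tag."""
    tag = b"L" if byte_order == "little" else b"B"
    prefix = "<" if byte_order == "little" else ">"
    return tag + struct.pack(f"{prefix}{len(values)}i", *values)

def receive(message):
    """Receiver reads the tag and interprets the payload accordingly."""
    prefix = "<" if message[:1] == b"L" else ">"
    count = (len(message) - 1) // 4          # 4 bytes per 32-bit integer
    return list(struct.unpack_from(f"{prefix}{count}i", message, 1))

# The same data arrives correctly from either kind of sender.
from_little = receive(send([1, -2, 3], "little"))
from_big = receive(send([1, -2, 3], "big"))
```

When both machines use the same format, nothing needs converting at either end, whereas a common format forces a conversion on both sides every time.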
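[Figure: a stub compiler takes the interface descriptor for procedure P (the specification) and generates the code for the client stub and the server stub. On a call to P, the client stub marshals the arguments and hands them to RPC, which sends a message; on the server side, RPC passes the marshalled arguments to the server stub, which recovers the arguments and invokes P.]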
Data Compression
Blue.bmp = 293 KB, Blue.jpeg = 4 KB
The image contains not much information:
Length, width of each area
Color of each area
Data Compression - Why
Bandwidth is a scarce resource; someone still has to pay for it
It is often important to compress the data at the sender, transmit the compressed form, and then decompress it at the receiver
.bmp is a good format for application programs like “Paint”, but it is much better to transmit in the .jpeg file format
Two classes of Compression
Lossless: data recovered from the compression/decompression process is the same as the original
Lossy: some information might be removed by the compression/decompression process
Why not always Lossless?
Lossy algorithms typically achieve an order of magnitude (10x) better compression than a lossless algorithm
10x makes a big difference in the amount of data that must be transmitted
Still images, video and audio are all intended for human eyes or ears – which can tolerate errors and imperfections – because the brain can compensate
Lossless Algorithms
Run length encoding (RLE): replaces consecutive occurrences of a symbol with a single symbol and the number of times it occurs (example: AAACCCC is 3A4C)
Differential Pulse Code Modulation (DPCM): records differences from the base symbol
Dictionary-based: each string is replaced with its index in a dictionary
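The first two methods can be sketched in a few lines of Python. The RLE version assumes the symbols are never digits, and the DPCM version takes the first symbol as the base; both function sets are illustrative, not a standard implementation.

```python
from itertools import groupby

def rle_encode(text):
    """Replace each run of a symbol with its length followed by the symbol."""
    return "".join(f"{len(list(run))}{symbol}" for symbol, run in groupby(text))

def rle_decode(encoded):
    """Expand count/symbol pairs back into the original text."""
    out, count = [], ""
    for ch in encoded:
        if ch.isdigit():
            count += ch              # counts may have several digits
        else:
            out.append(ch * int(count))
            count = ""
    return "".join(out)

def dpcm_encode(symbols):
    """Keep the first symbol as the base and record each difference from it."""
    base = symbols[0]
    return base, [ord(s) - ord(base) for s in symbols]

def dpcm_decode(base, deltas):
    """Add each difference back to the base symbol."""
    return "".join(chr(ord(base) + d) for d in deltas)

packed = rle_encode("AAACCCC")                 # the slide's example
base, deltas = dpcm_encode("AAABBCDDDD")
```

The DPCM differences are small numbers, so they can be stored in fewer bits than the full symbols.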
Lossless Example – Differential Encoding
Basic idea is to encode changes. Concept is also used in some lossy algorithms
No need to store all the information in each of the following pictures; for the last two, store just the changes, which are much smaller
Frame 1: A B C
Frame 2: A B C D E F
Then just store “D E F” for Frame 2 and “add” it to Frame 1 to restore Frame 2
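The frame example can be sketched as follows, treating a frame as a list of symbols. The dictionary-of-changes representation and the function names are assumptions chosen for illustration.

```python
def frame_diff(prev, curr):
    """Record only positions where curr differs from prev or extends past it."""
    return {i: v for i, v in enumerate(curr) if i >= len(prev) or prev[i] != v}

def apply_diff(prev, diff, length):
    """Rebuild the current frame from the previous frame plus the changes."""
    frame = list(prev[:length]) + [None] * max(0, length - len(prev))
    for i, v in diff.items():
        frame[i] = v
    return frame

frame1 = ["A", "B", "C"]
frame2 = ["A", "B", "C", "D", "E", "F"]
changes = frame_diff(frame1, frame2)             # only D, E, F are stored
restored = apply_diff(frame1, changes, len(frame2))
```

Only the three new symbols are stored for Frame 2; everything unchanged is taken from Frame 1.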
Image Compression (JPEG)
JPEG = Joint Photographic Experts Group
More than a compression algorithm: it also defines the format for image or video data
JPEG compression takes place in stages
Aside first: Fourier transforms and filtering
Fourier Transform
Consider the following graph
It is a weighted sum of 5 sine waves
But the coefficients of the higher frequency terms are very small
So the entire figure can be approximated well by the low order terms
5th Order Approximation would be Exact
Since the original function is a weighted sum of the first 5 sine terms, sin(kt) for k = 1 to 5, the information that uniquely represents the function is the set of coefficients [10; 5; 2; 1; 0.5]
As we saw, we could drop the 0.5 coefficient and retain most of the shape of the curve, so our information loss would be very slight. This is a simple example of lossy compression.
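A quick numerical check of this claim, using the coefficients from the slide. The sampling grid of 1000 points over one period is an arbitrary choice.

```python
import math

COEFFS = [10, 5, 2, 1, 0.5]   # weights of sin(t), sin(2t), ..., sin(5t)

def partial_sum(t, n_terms):
    """Sum of the first n_terms weighted sine terms at time t."""
    return sum(a * math.sin(k * t) for k, a in enumerate(COEFFS[:n_terms], start=1))

samples = [2 * math.pi * i / 1000 for i in range(1000)]

# Largest difference between the exact curve (5 terms) and the 4-term version.
max_error = max(abs(partial_sum(t, 5) - partial_sum(t, 4)) for t in samples)

# Largest magnitude reached by the exact curve, for comparison.
peak = max(abs(partial_sum(t, 5)) for t in samples)
```

Because the dropped term is 0.5 sin(5t), the error can never exceed 0.5, which is small next to the peak of the curve.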
Fourier vs. DCT
The Discrete Cosine Transform (DCT) is very similar to the Fourier Transform (see pages 544-5 and note that we are using a 2-dimensional transform)
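To give a feel for the transform, here is a one-dimensional DCT and its inverse in plain Python. This is a sketch of the unnormalized DCT-II; the two-dimensional version in JPEG, which works on 8 × 8 blocks, may use a different scaling. The sample row of pixel values is made up.

```python
import math

def dct(x):
    """Unnormalized 1-D DCT-II of a list of samples."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi / n * (i + 0.5) * k) for i in range(n))
            for k in range(n)]

def idct(coeffs):
    """Inverse of dct() above (a scaled DCT-III)."""
    n = len(coeffs)
    return [coeffs[0] / n
            + (2 / n) * sum(coeffs[k] * math.cos(math.pi / n * (i + 0.5) * k)
                            for k in range(1, n))
            for i in range(n)]

row = [52, 55, 61, 66, 70, 61, 64, 73]   # hypothetical row of 8 pixel values
coeffs = dct(row)                        # coeffs[0] is the sum of the samples
restored = idct(coeffs)                  # matches row up to rounding error
```

The transform itself loses nothing; like the sine coefficients in the Fourier aside, it only re-expresses the samples as weights on cosine terms of increasing frequency.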
MPEG
A very difficult algorithm!
Like JPEG for a single frame, but it has three basic kinds of frames
Encoding is very difficult and computationally intensive, hence slow, and is often done offline
Decoding is the only part usually done in real time
Three Phases
Study the three phases of JPEG:
DCT
Quantization: similar to our example of dropping the 5th coefficient and retaining a graph that was very similar to the original
Encoding phase
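The quantization idea can be sketched with the coefficients from the Fourier example. A single step size is a simplifying assumption here; JPEG uses a table with a separate step for each coefficient.

```python
def quantize(coeffs, step):
    """Divide each coefficient by the step and round to the nearest integer."""
    return [round(c / step) for c in coeffs]

def dequantize(levels, step):
    """Multiply back; the rounding error is the information that was lost."""
    return [q * step for q in levels]

coeffs = [10, 5, 2, 1, 0.5]          # coefficients from the Fourier example
levels = quantize(coeffs, 3)         # small integers, cheap to encode
approx = dequantize(levels, 3)       # what the receiver reconstructs
```

The smallest coefficients round to zero and disappear, while the large ones survive only approximately. This is the lossy step.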
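[Figure: MPEG compression turns the input stream of frames into a compressed stream of I, P, and B frames; arrows in the figure are labelled forward prediction and bidirectional prediction.]
Input stream:      Frame 1  Frame 2  Frame 3  Frame 4  Frame 5  Frame 6  Frame 7
Compressed stream: I frame  B frame  B frame  P frame  B frame  B frame  I frame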
[Figure: each 16 × 16 pixel region of a color frame is represented by a 16 × 16 macroblock with the Y component, an 8 × 8 macroblock with the U component, and an 8 × 8 macroblock with the V component.]