Chapter 7 – End-to-End Data

Two main topics: presentation formatting and compression

We will go over the main issues in presentation formatting, but not in much detail

More detail will be covered in compression, especially JPEG and MPEG

Presentation Formatting/Encoding

The receiver must be able to extract the same message from the signal as the transmitter sent

Encoding is sometimes called argument marshalling

Marshalling is actually not trivial, because compilers and application programs have a lot of latitude in how they lay out structures (records)

Look over the next high-level graphic

[Figure: application data → presentation encoding → Message, Message, Message… → presentation decoding → application data]

Taxonomy

Base types (lowest level): integers, floating point, characters, etc.

Flat types: structures, arrays

Complex types (highest level): types requiring pointers

Examples

Case 1 – sending an ordered string of integers (say financial market data) over the internet – no problem breaking this up into a string of bytes and no problem reassembling the data at the end
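Case 1 can be sketched in a few lines. This is only an illustration, assuming 32-bit signed integers sent in network (big-endian) byte order; the function names and the sample values are hypothetical and do not come from the slides.

```python
import struct

def pack_ints(values):
    """Encode a list of 32-bit signed integers as bytes in network byte order."""
    return struct.pack(f"!{len(values)}i", *values)

def unpack_ints(data):
    """Decode bytes produced by pack_ints back into a list of integers."""
    count = len(data) // 4          # each integer occupies exactly 4 bytes
    return list(struct.unpack(f"!{count}i", data))

prices = [10150, 10175, 10120]      # hypothetical market data
wire = pack_ints(prices)            # the string of bytes that is transmitted
assert unpack_ints(wire) == prices  # reassembled unchanged at the receiver
```

Because every element has the same size and there is no nesting, the receiver only needs the byte count to recover the data.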

Examples continued

Case 2 – sending a database of student records over a network. Students would have taken different numbers of courses, so the records would be of different lengths, but the fields would probably be fixed length

Packing and unpacking the data would have some difficulties involved
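One possible way to handle Case 2 is to put a count in front of the variable part of each record. The sketch below assumes a record layout that is not in the slides: a 4-byte student ID, a 2-byte course count, and course codes padded to 8 bytes each. All names are hypothetical.

```python
import struct

COURSE_LEN = 8  # assumed fixed width of one course code, in bytes

def pack_student(student_id, courses):
    """Fixed-size header (ID and course count) followed by fixed-width course codes."""
    header = struct.pack("!IH", student_id, len(courses))
    body = b"".join(struct.pack(f"!{COURSE_LEN}s", c.encode("ascii")) for c in courses)
    return header + body

def unpack_students(data):
    """Walk the byte string, using each record's count to find where it ends."""
    records, offset = [], 0
    while offset < len(data):
        student_id, n = struct.unpack_from("!IH", data, offset)
        offset += 6                                   # 4-byte ID + 2-byte count
        courses = []
        for _ in range(n):
            (raw,) = struct.unpack_from(f"!{COURSE_LEN}s", data, offset)
            courses.append(raw.rstrip(b"\x00").decode("ascii"))  # drop padding
            offset += COURSE_LEN
        records.append((student_id, courses))
    return records

wire = pack_student(1, ["CS101", "MATH200"]) + pack_student(2, [])
print(unpack_students(wire))
```

The count field is what lets the receiver unpack records of different lengths from one continuous byte string.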

Examples continued

Case 3 – a hierarchical database stored in a format with pointers needs to be transmitted over the Internet

Packing and unpacking is a large problem

Pointers are implemented by memory addresses and will change from one machine (sender) to another (receiver)

Marshalling must serialize a complex, pointer implementation of a database – quite difficult!
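A common way to serialize a pointer-based structure is to replace each pointer with a position in a flat table. The sketch below assumes the hierarchy is a plain tree (no shared nodes and no cycles); the Node class and function names are hypothetical.

```python
class Node:
    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []   # in-memory references ("pointers")

def flatten(root):
    """Return a list of (name, [child indices]); list positions replace pointers."""
    table = []
    def visit(node):
        index = len(table)
        table.append([node.name, []])
        for child in node.children:
            table[index][1].append(visit(child))
        return index
    visit(root)
    return [(name, kids) for name, kids in table]

def rebuild(table):
    """Recreate the nodes on the receiver and relink them from the indices."""
    nodes = [Node(name) for name, _ in table]
    for node, (_, kids) in zip(nodes, table):
        node.children = [nodes[k] for k in kids]
    return nodes[0]

tree = Node("root", [Node("a", [Node("c")]), Node("b")])
flat = flatten(tree)          # contains no memory addresses, safe to transmit
copy = rebuild(flat)          # new addresses on the receiver, same shape
assert flatten(copy) == flat
```

The indices mean the same thing on both machines, while the memory addresses they replace do not.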

[Figure: an argument marshaller operating on an application data structure]

Two Conversion Strategies

Common format: the sender converts to a common format, the common format is transmitted, and the receiver decodes from the common format. Seems natural, but …

Receiver-makes-right: transmit and let the receiver figure it all out. Surprisingly, this is often the better approach. See the reasons on page 533
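Receiver-makes-right can be sketched as follows. The one-byte tag and the message layout are assumptions made for this illustration and do not correspond to any particular protocol; the function names are hypothetical.

```python
import struct
import sys

def send(values, byte_order=sys.byteorder):
    """Sender transmits 32-bit integers in its own byte order, tagged with one byte."""
    flag = b"L" if byte_order == "little" else b"B"
    prefix = "<" if byte_order == "little" else ">"
    return flag + struct.pack(f"{prefix}{len(values)}i", *values)

def receive(message):
    """Receiver reads the tag and decodes using the sender's byte order."""
    prefix = "<" if message[:1] == b"L" else ">"
    count = (len(message) - 1) // 4
    return list(struct.unpack(f"{prefix}{count}i", message[1:]))

# The receiver recovers the same values whichever order the sender used.
assert receive(send([1, 2, 3], "little")) == [1, 2, 3]
assert receive(send([1, 2, 3], "big")) == [1, 2, 3]
```

When both machines use the same byte order, which is the usual case, no byte swapping is needed at either end.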

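[Figure: a stub compiler takes the interface descriptor for Procedure P (the specification) and generates code for the client stub and the server stub. On the client, Call P passes arguments to the client stub, which hands marshalled arguments to RPC. The message travels to the server's RPC, which hands the marshalled arguments to the server stub, which passes the arguments to P.]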

Data Compression

Blue.bmp = 293 KB, Blue.jpeg = 4 KB

The image does not contain much information:

Length, width of each area

Color of each area

Data Compression - Why

Bandwidth is a scarce resource; someone still has to pay for it

It is often important to compress the data at the sender, transmit the compressed form, and then decompress it at the receiver
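The compress, transmit, decompress pattern can be tried with Python's standard zlib module. The payload here is made-up, highly repetitive data, so the saving is much larger than it would be for typical input.

```python
import zlib

def sender(payload: bytes) -> bytes:
    """Compress at the sender; the result is what crosses the network."""
    return zlib.compress(payload)

def receiver(wire: bytes) -> bytes:
    """Decompress at the receiver to recover the original bytes."""
    return zlib.decompress(wire)

original = b"blue " * 1000              # hypothetical repetitive payload
wire = sender(original)
assert receiver(wire) == original       # nothing was lost
print(len(original), "bytes before,", len(wire), "bytes on the wire")
```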

.bmp is a good format for application programs like “Paint” but it is much better to transmit with the .jpeg file format

Two classes of Compression

Lossless: data recovered from the compression/decompression process is the same as the original

Lossy: some information might be removed by the compression/decompression process

Why not always Lossless?

Lossy algorithms typically achieve an order of magnitude (10x) better compression than a lossless algorithm

10x makes a big difference in the amount of data that must be transmitted

Still images, video and audio are all intended for human eyes or ears – which can tolerate errors and imperfections – because the brain can compensate

Lossless Algorithms

Run length encoding (RLE): replaces consecutive occurrences of a symbol with a single symbol and the number of times it occurs (example: AAACCCC is 3A4C)

Differential Pulse Code Modulation: records differences from the base symbol

Dictionary-based: each string is replaced with its index in a dictionary
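Run length encoding is the simplest of the three to try out. The sketch below uses the count-then-symbol form from the AAACCCC example and assumes the input contains no digit characters; the function names are hypothetical.

```python
from itertools import groupby

def rle_encode(text):
    """AAACCCC becomes 3A4C. Assumes the input contains no digit characters."""
    return "".join(f"{len(list(group))}{symbol}" for symbol, group in groupby(text))

def rle_decode(encoded):
    """Read a run of digits as the count, then repeat the symbol that follows."""
    out, count = [], ""
    for ch in encoded:
        if ch.isdigit():
            count += ch
        else:
            out.append(ch * int(count))
            count = ""
    return "".join(out)

assert rle_encode("AAACCCC") == "3A4C"
assert rle_decode("3A4C") == "AAACCCC"
```

Note that RLE only helps when runs are long: a string with no repeats, such as ABC, grows to 1A1B1C.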

Lossless Example – Differential Encoding

Basic idea is to encode changes. Concept is also used in some lossy algorithms

There is no need to store all the information in each of the following pictures. For the last two, store just the changes, which are much smaller

Frame 1: A B C. Frame 2: A B C D E F. Then just store “D E F” for Frame 2 and “add” it to Frame 1 to restore Frame 2
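The same idea can be sketched with numbers. This illustration assumes each frame is an equal-length list of numeric pixel values, which is a simplification of the lettered example; the function names and values are hypothetical.

```python
def frame_diff(previous, current):
    """Store only the change at each position between two frames."""
    return [c - p for p, c in zip(previous, current)]

def frame_restore(previous, diff):
    """Add the stored changes back onto the earlier frame."""
    return [p + d for p, d in zip(previous, diff)]

frame1 = [10, 10, 10, 10, 10, 10]
frame2 = [10, 10, 10, 12, 11, 10]        # only two pixels changed
diff = frame_diff(frame1, frame2)        # mostly zeros
assert frame_restore(frame1, diff) == frame2
```

The difference is mostly zeros, so a scheme such as run length encoding can then compress it well.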

Image Compression (JPEG)

JPEG = Joint Photographic Experts Group

More than a compression algorithm, it also defines the format for image or video data

JPEG compression takes place in stages

Aside first: Fourier transforms and filtering

Fourier Transform

Consider the following graph

It is a weighted sum of 5 sine waves

But the coefficients of the higher frequency terms are very small

So the entire figure can be approximated well by the low order terms

1st Order Approximation

Only the first term – not a good approximation

2nd Order Approximation

The first two terms give a better approximation

4th Order Approximation

Skipping ahead to 4 terms the approximation is excellent – almost exact

5th Order Approximation would be Exact

Since the original function is a weighted sum of the first 5 sine terms, sin(kt) for k = 1 to 5, the information that uniquely represents the function is the set of coefficients [10;5;2;1;.5]

As we saw, we could drop the .5 coefficient and retain most of the shape of the curve, so our information loss would be very slight. This is a simple example of lossy compression.
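The approximations can be checked numerically. The sketch below uses the coefficients [10, 5, 2, 1, 0.5] from the slide and measures the largest error of each lower-order approximation; sampling one period at 1000 points is an arbitrary choice, and the function names are hypothetical.

```python
import math

COEFFS = [10, 5, 2, 1, 0.5]   # weights of sin(t), sin(2t), ..., sin(5t)

def partial_sum(t, terms):
    """Value at t of the approximation that keeps only the first `terms` sine terms."""
    return sum(c * math.sin(k * t) for k, c in enumerate(COEFFS[:terms], start=1))

def max_error(terms, samples=1000):
    """Largest gap between the full 5-term function and its approximation over one period."""
    ts = [2 * math.pi * i / samples for i in range(samples)]
    return max(abs(partial_sum(t, 5) - partial_sum(t, terms)) for t in ts)

for n in range(1, 6):
    print(n, "terms: max error", round(max_error(n), 3))
```

Dropping only the fifth term can never cost more than 0.5 at any point, because the dropped term is 0.5 sin(5t), while the full function is built mostly from the coefficients 10 and 5.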

Fourier vs. DCT

The Discrete Cosine Transform (DCT) is very similar to the Fourier Transform (see pages 544-5 and note that we are using a 2-dimensional transform)
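To get a feel for the DCT, here is a small sketch of the one-dimensional orthonormal DCT-II and its inverse. JPEG applies a 2-dimensional transform to blocks of pixels; the 1-dimensional form and the sample row of pixel values below are simplifications chosen for illustration only.

```python
import math

def dct(x):
    """1-D orthonormal DCT-II: turn n samples into n frequency coefficients."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        out.append(scale * sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)))
    return out

def idct(coeffs):
    """Inverse of dct: rebuild the samples from the coefficients."""
    n = len(coeffs)
    return [
        sum(math.sqrt((1 if k == 0 else 2) / n) * coeffs[k]
            * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for k in range(n))
        for i in range(n)
    ]

row = [52, 55, 61, 66, 70, 61, 64, 73]      # hypothetical row of pixel values
coeffs = dct(row)
restored = idct(coeffs)
assert all(abs(a - b) < 1e-9 for a, b in zip(row, restored))
print([round(c, 1) for c in coeffs])
```

The transform by itself loses nothing. As with the sine coefficients, the saving comes later, when the small high-frequency coefficients are coarsened or dropped.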

[Figure: JPEG compression. Source image → DCT → Quantization → Encoding → Compressed image]

MPEG

A very difficult algorithm!

Like JPEG for a single frame, but it has three basic kinds of frames

Encoding is very difficult and computationally intensive, hence slow, and is often done offline

Decoding is usually the only part done in real time

Three Phases

Study the three phases of JPEG:

DCT

Quantization: similar to our example of dropping the 5th coefficient and retaining a graph that was very similar to the original

Encoding phase
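The quantization phase can be sketched with the same coefficients as the sine example. This illustration assumes a single uniform step size of 3, whereas JPEG uses a table of step sizes; the function names are hypothetical.

```python
def quantize(coeffs, step):
    """Divide each coefficient by the step and keep only the nearest whole number."""
    return [round(c / step) for c in coeffs]

def dequantize(levels, step):
    """Multiply back by the step; the rounding error is not recoverable."""
    return [q * step for q in levels]

coeffs = [10, 5, 2, 1, 0.5]          # coefficients from the sine-wave example
levels = quantize(coeffs, step=3)    # small coefficients become 0
restored = dequantize(levels, step=3)
print(levels, restored)
```

The large coefficients survive approximately and the small ones become zero, which is where the loss occurs and what makes the later encoding phase effective.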

[Figure: MPEG compression turns an input stream into a compressed stream. Frames 1 to 7 become I frame, B frame, B frame, P frame, B frame, B frame, I frame. Labels: forward prediction, bidirectional prediction.]

[Figure: each 16 × 16 pixel region of a color frame yields a 16 × 16 macroblock with Y component, an 8 × 8 macroblock with U component, and an 8 × 8 macroblock with V component.]

[Figure: layout of the compressed stream, from the outermost level to the innermost]

SeqHdr, Group of pictures, SeqHdr, Group of pictures, …, SeqEndCode

GOPHdr, Picture, Picture, …, Picture

PictureHdr, Slice, Slice, …, Slice

SliceHdr, Macroblock, Macroblock, …, Macroblock

MBHdr, Block(0), Block(1), Block(2), Block(3), Block(4), Block(5)
