A Picture is Worth a Thousand Words Milton Chen. What’s a Picture Worth? A thousand words -...

Date posted: 21-Dec-2015


A Picture is Worth a Thousand Words

Milton Chen


What’s a Picture Worth?

• A thousand words - Descartes (1596-1650)

• A thousand bytes - modern translation
  – 1000 * 5 * 5 / 3 ≈ 8,000 bits

• 75,000 bytes - ATSC/MPEG-2
  – 20 M / 30 ≈ 600,000 bits
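
The two estimates above can be checked with a few lines of arithmetic. The meaning attached to each factor is an assumption, since the slide gives only the numbers: 1000 words, 5 letters per word, 5 bits per letter, a 3:1 compression factor, and a 20 Mbit/s channel carrying 30 frames per second.

```python
# Back-of-the-envelope check of the slide's numbers.
# The interpretation of each factor is an assumption, not stated on the slide.
WORDS = 1000
LETTERS_PER_WORD = 5      # assumed
BITS_PER_LETTER = 5       # assumed (2**5 = 32 symbols)
COMPRESSION = 3           # assumed 3:1 text compression

text_bits = WORDS * LETTERS_PER_WORD * BITS_PER_LETTER / COMPRESSION
text_bytes = text_bits / 8

CHANNEL_BITS_PER_S = 20_000_000   # "20 M"
FRAMES_PER_S = 30

frame_bits = CHANNEL_BITS_PER_S / FRAMES_PER_S
frame_bytes = frame_bits / 8

print(f"text:  {text_bits:,.0f} bits = {text_bytes:,.0f} bytes")
print(f"frame: {frame_bits:,.0f} bits = {frame_bytes:,.0f} bytes")
```

The exact values are about 8,333 bits and 666,667 bits. The slide rounds them down to 8,000 bits (a thousand bytes) and 600,000 bits (75,000 bytes).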


Frequency Response of the Eye

• Lens - low pass

• Photoreceptors - low pass

• Lateral inhibition - high pass
  – edges are important
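
A minimal sketch of the low-pass versus high-pass distinction on a one-dimensional step edge. Both kernels are illustrative assumptions (a box blur and a simple center-surround difference), not models of the actual optics or retina.

```python
def convolve(signal, kernel):
    """'Valid' 1-D correlation: the kernel never hangs over the ends."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

step = [0, 0, 0, 0, 10, 10, 10, 10]      # a sharp edge

low_pass = [1 / 3, 1 / 3, 1 / 3]         # box blur (assumed kernel)
center_surround = [-1, 2, -1]            # excite center, inhibit neighbors

blurred = convolve(step, low_pass)       # the edge is smeared over 2 samples
edges = convolve(step, center_surround)  # nonzero only next to the edge

print(blurred)
print(edges)
```

The center-surround output is zero in the flat regions and responds only at the transition, which is the sense in which a high-pass stage makes edges important.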


Today’s Video Coding

YUV (lossy) -> Motion -> DCT -> Quantize (lossy) -> Order -> Entropy

Designed for natural scenes
=> Higher-frequency DCT coefficients are quantized more
=> Sharp edges are not well preserved
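
A rough sketch of why coarse quantization of high-frequency DCT coefficients hurts sharp edges, using a one-dimensional 8-sample block. The step sizes in `STEPS` are hypothetical and chosen only to grow with frequency; they are not taken from any MPEG quantization matrix.

```python
import math

N = 8

def dct(x):
    """Orthonormal 1-D DCT-II of an N-sample block."""
    out = []
    for k in range(N):
        scale = math.sqrt((1 if k == 0 else 2) / N)
        out.append(scale * sum(
            x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
            for n in range(N)))
    return out

def idct(X):
    """Inverse of dct() (DCT-III with matching scaling)."""
    out = []
    for n in range(N):
        out.append(sum(
            math.sqrt((1 if k == 0 else 2) / N) * X[k]
            * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
            for k in range(N)))
    return out

# Hypothetical step sizes: coarser for higher frequencies.
STEPS = [16 + 16 * k for k in range(N)]

def quantize(X):
    """Round each coefficient to the nearest multiple of its step size."""
    return [round(c / s) * s for c, s in zip(X, STEPS)]

edge = [0, 0, 0, 0, 100, 100, 100, 100]   # a sharp edge inside the block
coeffs = dct(edge)
decoded = idct(quantize(coeffs))
max_error = max(abs(a - b) for a, b in zip(edge, decoded))

print([round(c, 1) for c in coeffs])
print([round(v, 1) for v in decoded])
print(round(max_error, 1))
```

With these step sizes the upper half of the spectrum quantizes to zero, so the decoded block no longer contains a clean step.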


What’s Wrong with Today’s Video Coding

• Poor performance for
  – text (channel logos, stock tickers)
  – graphics
  – anything with sharp edges


Desirable Features

• Postproduction support

• Personalized delivery / presentation

• Interactive

• Error resilience

• More compression

• Facilitate search / indexing (MPEG-7)


Outline

• Why

• MPEG-4 Overview

• Systems Layer

• Visual Coding
  – Arbitrarily shaped video
  – Meshed video
  – Face and body


Goals of MPEG-4

• One content
  – convergence of DTV, computer graphics, and WWW
  – broadcast, internet, local

• User interactivity

• Higher compression rates

• Robustness in mobile environment


MPEG-4 Applications

• Interactive TV (broadcast)
  – Home-shopping, Interactive game show

• Virtual workspace (internet)
  – virtual meeting, collaborative design

• Infotainment (local)
  – Virtual-City-Guide


MPEG-4 Key Concepts

• Independent coding of objects
  – allow user interactivity (client & server)
  – higher compression rates

• Provide tools as well as solutions
  – allow content-specific and user-defined compression algorithms


MPEG-4 History

• Started in July 1993

• Originally for low-bit-rate applications

• Version 1 to be standardized by January 1999

• Continue work on version 2, etc.


MPEG-4 Standard

1) Systems (manage streams, composition)

2) Visual (natural and synthetic)

3) Audio (natural and synthetic)

4) Conformance Testing

5) Reference Software

6) Delivery Multimedia Integration Framework (medium abstraction layer)


[Figure: MPEG-4 scene composition. Labels: audiovisual objects (3D objects, 2D background, voice, sprite); scene coordinate system (x, y, z); hypothetical viewer; projection plane; video compositor; audio compositor; audiovisual presentation; display; speaker; user events; user input; hierarchically multiplexed downstream control / data; hierarchically multiplexed upstream control / data.]


[Figure: MPEG-4 Systems layer model. From the Transmission/Storage Medium upward: TransMux Layer (examples: (RTP)/UDP/IP, (PES)/MPEG-2 TS, AAL2/ATM, H223/PSTN, DAB Mux, ...) carrying TransMux Streams; TransMux Interface; FlexMux Layer carrying FlexMux Streams; Stream Multiplex Interface; Access Unit Layer (AL) carrying AL-Packetized Streams; Elementary Stream Interface; Compression Layer carrying Elementary Streams (Primitive AV Objects, Scene Description Information, Object Descriptor, Return Channel Coding); Composition and Rendering; Display and User Interaction (Audiovisual Interactive Scene).]


Previous Work in Object Coding

• Synthetic High System (Schreiber ’59)

• Contour-Texture Approach (Kocher & Kunt ’82)

• Object-Based Video Coder (Musmann et al. ’89)

• Talisman (Torborg & Kajiya ’96)

• Blue screen matting (Vlahos ’64)


Shape Coding

• Bitmap-based
  – 1 means in, 0 means out
  – Chroma-keying, GIF89a
  – G4 fax standard

• Contour-based
  – chain code
  – polygon/curve approximation
  – Fourier descriptor


Chain Code

• Follow the contour and encode the direction of the next boundary pel

• 4 or 8 directions for an avg. of 1.2 or 1.4 bits per boundary pel

• Extensions
  – length
  – angular resolution
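
A minimal sketch of an 8-direction chain code, assuming the boundary pels are already available as an ordered, 8-connected list (contour tracing itself is not shown). The direction numbering in `DIRS` is one common convention and is an assumption here. A fixed-length code would spend 3 bits per pel for 8 directions; the lower averages quoted above presumably rely on entropy coding, which this sketch omits.

```python
# Freeman-style 8-direction codes: index -> (dx, dy). Numbering is a convention.
DIRS = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]

def encode(contour):
    """Return (start_pel, codes) for an ordered list of 8-connected pels."""
    codes = []
    for (x0, y0), (x1, y1) in zip(contour, contour[1:]):
        codes.append(DIRS.index((x1 - x0, y1 - y0)))
    return contour[0], codes

def decode(start, codes):
    """Rebuild the pel list by walking the direction codes from start."""
    pels = [start]
    for c in codes:
        dx, dy = DIRS[c]
        x, y = pels[-1]
        pels.append((x + dx, y + dy))
    return pels

# Boundary of a 3x3 square, traversed as a closed loop.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2),
          (1, 2), (0, 2), (0, 1), (0, 0)]
start, codes = encode(square)
print(start, codes)
```

Only the starting pel and the list of small direction codes need to be transmitted; the decoder recovers every boundary pel exactly.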


Polygon Approximation

• Add control points until maximum error is below threshold

• Threshold <= 1.4 pel for CIF (352*288) video

• Extension
  – curves of various order
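
One way to realize "add control points until the maximum error is below a threshold" is a recursive split at the worst-fitting pel, in the style of the Douglas-Peucker algorithm. That choice is an assumption made for illustration; the slide does not say which procedure was used. The sketch handles an open contour, and the function names are hypothetical.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def approximate(contour, threshold):
    """Return control points so every contour pel is within threshold."""
    first, last = contour[0], contour[-1]
    worst_i, worst_d = 0, 0.0
    for i in range(1, len(contour) - 1):
        d = point_segment_distance(contour[i], first, last)
        if d > worst_d:
            worst_i, worst_d = i, d
    if worst_d <= threshold:
        return [first, last]
    # Add the worst pel as a control point and refine both halves.
    left = approximate(contour[:worst_i + 1], threshold)
    right = approximate(contour[worst_i:], threshold)
    return left[:-1] + right

# An L-shaped open contour, one pel per step.
contour = [(x, 0) for x in range(6)] + [(5, y) for y in range(1, 6)]
points = approximate(contour, threshold=1.4)
print(points)
```

Here eleven boundary pels reduce to three control points, with the corner of the L picked up as the only interior vertex.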


Fourier Descriptor

• Translation, rotation, and scale invariant

• Sample contour -> ( x_i, y_i )

• For each i, slope ( y_{i+1} - y_i ) / ( x_{i+1} - x_i )

• Compute Fourier Series coefficients

• Good for recognition, but not an efficient shape coder
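
A sketch of how a Fourier descriptor obtains its invariances. Note that this uses the complex-coordinate variant (each sample treated as x + iy), swapped in because it is compact to show; it is not the slope-based formulation outlined on the slide. It also assumes the closed contour is traversed counter-clockwise so that the first harmonic is nonzero. The function name and `count` parameter are hypothetical.

```python
import cmath

def fourier_descriptor(contour, count=4):
    """Invariant descriptor from the DFT of the contour as complex samples."""
    z = [complex(x, y) for x, y in contour]
    n = len(z)
    coeffs = [
        sum(z[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)) / n
        for k in range(n)
    ]
    # Skipping coeffs[0] removes translation; magnitudes remove rotation
    # and the choice of starting pel; dividing by |coeffs[1]| removes scale.
    scale = abs(coeffs[1])
    return [abs(coeffs[k]) / scale for k in range(2, 2 + count)]

square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]

# Same shape scaled by 3, rotated by 0.7 rad, and shifted by (5, -2).
w = 3 * cmath.exp(0.7j)
moved = [w * complex(x, y) + complex(5, -2) for x, y in square]
moved = [(p.real, p.imag) for p in moved]

a = fourier_descriptor(square)
b = fourier_descriptor(moved)
print([round(v, 6) for v in a])
print([round(v, 6) for v in b])
```

The two descriptors agree up to floating-point noise, which is what makes the representation useful for recognition. Because the phases and the low-order terms are discarded, the descriptor alone cannot reproduce the contour, consistent with the remark that it is not an efficient shape coder.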


MPEG-4 Experiments

• Chroma-keying
  – color bleeding
  – need to decode whole frame to get shape

• Bitmap and contour-based coding are similar in:
  – error resilience
  – coding efficiency

• Bitmap-based is simpler for hardware due to regular memory access


MPEG-4 Shape Coding

• Three types of macroblocks
  – transparent, opaque, and object boundary

• Context-based arithmetic encoder

• Macroblocks can be subsampled

• Texture padded with 0 or mean value

• Transparency
  – constant: one 8-bit value
  – arbitrary: treat it like color
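
A minimal sketch of the three-way macroblock classification, assuming a binary mask whose dimensions are multiples of the 16x16 macroblock size. The function name and label strings are hypothetical, and the context-based arithmetic coding of the boundary blocks is not shown.

```python
MB = 16  # macroblock size in pels

def classify_macroblocks(mask):
    """mask: list of rows of 0/1 (1 = inside the object).

    Dimensions are assumed to be multiples of MB.
    Returns a grid of labels, one per macroblock.
    """
    rows, cols = len(mask) // MB, len(mask[0]) // MB
    labels = []
    for r in range(rows):
        row = []
        for c in range(cols):
            inside = sum(
                mask[y][x]
                for y in range(r * MB, (r + 1) * MB)
                for x in range(c * MB, (c + 1) * MB))
            if inside == 0:
                row.append("transparent")
            elif inside == MB * MB:
                row.append("opaque")
            else:
                row.append("boundary")
        labels.append(row)
    return labels

# 32x48 mask: the object covers columns 8..31 in every row.
mask = [[1 if 8 <= x < 32 else 0 for x in range(48)] for _ in range(32)]
labels = classify_macroblocks(mask)
for row in labels:
    print(row)
```

Transparent and opaque blocks can be signalled with a label alone, so per-pel shape information is only needed for the boundary blocks.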


Meshed Video

• 2D mesh tessellates the video into patches

• Motion vector for each vertex

• Texture warped in each patch
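
A sketch of the per-patch warp for a triangular mesh: the three vertex motion vectors define an affine map, and any point in the patch moves with it. Triangular patches and the use of barycentric weights are assumptions made for illustration, and the function names are hypothetical.

```python
def barycentric(p, a, b, c):
    """Weights (wa, wb, wc) with p = wa*a + wb*b + wc*c and wa+wb+wc = 1."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (bx - ax) * (cy - ay) - (cx - ax) * (by - ay)
    wb = ((px - ax) * (cy - ay) - (cx - ax) * (py - ay)) / det
    wc = ((bx - ax) * (py - ay) - (px - ax) * (by - ay)) / det
    return 1 - wb - wc, wb, wc

def warp_point(p, triangle, motion):
    """Move p with the affine map defined by the three vertex motion vectors."""
    weights = barycentric(p, *triangle)
    moved = [(x + dx, y + dy) for (x, y), (dx, dy) in zip(triangle, motion)]
    return (sum(w * x for w, (x, _) in zip(weights, moved)),
            sum(w * y for w, (_, y) in zip(weights, moved)))

triangle = [(0, 0), (4, 0), (0, 4)]
# Vertex motion vectors that rotate the patch 90 degrees about the origin.
motion = [(0, 0), (-4, 4), (-4, -4)]

print(warp_point((1, 1), triangle, motion))
```

Three motion vectors are enough to express rotation, scaling, and shear of the whole patch, which a single translational block vector cannot do.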


Meshed Video - Motivation

• Motion Modeling
  – Translational-block motion does not model rotation, scaling, reflection, and shear

• Shape Modeling
  – Possible without depth


Meshed Video - Applications

• Compression
  – better motion compensation
  – transmit texture only at key frames
  – spatio-temporal interpolation (zooming, frame-rate up-conversion)

• Manipulation
  – augmented reality
  – transfiguration (replace billboards)

• Indexing / searching


Face

• Face object
  – Default face model with terminal
  – Facial Definition Parameter or user-supplied model/texture
  – Facial Animation Parameter plus Amplification and Filters
  – Lip Shape Animation from phoneme


Facial Definition Parameter

[Figure: facial feature points, numbered by group from 2.x to 11.x, shown on the head with X/Y/Z axes and in detail views of the Left Eye, Right Eye, Nose, Mouth, Teeth, and Tongue. Legend: feature points affected by FAPs; other feature points.]


Facial Animation Parameter

[Figure: face annotated with the distances ES0, ENS0, MNS0, MW0, and IRISD0.]


Body

• Like the face


Ultimate Compression Technique

Computer Graphics ???

• Block-based DCT (MPEG-1/2)

• Arbitrarily shaped video (MPEG-4)

• Meshed video (MPEG-4)

• Image-based rendering

• Textured 3D graphics

• Geometry-only 3D graphics