ENEE631 Digital Image Processing (Spring'06)
Basics on Video Communications and Other Video Coding Approaches/Standards
Spring ’06 Instructor: K. J. Ray Liu
ECE Department, Univ. of Maryland, College Park
UMCP ENEE631 Slides (created by M.Wu © 2004)
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [2]
Quick Review – A Few Basics on Video
Acquisition, Display, Analog & Digital Formats
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [3]
Video Camera
Frame-by-frame capturing
CCD sensors (Charge-Coupled Devices)
– 2-D array of solid-state sensors
– Each sensor corresponds to a pixel
– Stored in a buffer and read out sequentially
– Widely used; small and light
CMOS sensors
– Each sensor is a transistor
UMCP ENEE631 Slides (created by M.Wu © 2001)
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [4]
Video Display
CRT (Cathode Ray Tube)
– Large dynamic range
– Bulky for large displays: CRT physical depth has to be similar to screen width
LCD flat-panel display
– Uses an electric field to change the optical properties, and hence the brightness/color, of the liquid crystal
– Generating the electric field: by an array of transistors (active-matrix thin-film transistors), or by plasma
“Active-matrix display” (also known as TFT) has a transistor located at each pixel, allowing the display to be switched more frequently and with less current to control pixel luminance. A passive-matrix LCD has a grid of conductors with pixels located at the grid intersections.
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [5]
Composite vs. Component Video
Component video
– Three separate signals for tristimulus color representation or luminance-chrominance representation
– Pro: higher quality
– Con: needs high bandwidth and synchronization
Composite video
– Multiplexed into a single signal
– Historical reason: transmitting color TV through a monochrome channel
– Pro: saves bandwidth
– Con: cross talk
S-video: luminance signal + single multiplexed chrominance signal
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [6]
Analog Video Raster
Line-by-line “raster scan”
– Represents an image frame line by line with a 1-D analog waveform
– Synchronization signals for horizontal and vertical retrace
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [7]
Forming Picture on TV Tube (Monochrome)
How many lines?
From B.Liu EE330S’01 Princeton
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [8]
How Many TV Lines?
Determined by the spatial frequency response of the HVS
[Figure: two dots viewed from a distance]
Cannot resolve the two dots if distance > 2000 x separation (~ 0.03 degree viewing angle)
N = 500 for D = 4H
From B.Liu EE330S’01 Princeton
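The arithmetic behind N = 500 can be sketched as follows. This is a minimal illustration that assumes H is the picture height, D the viewing distance, and N the number of lines, so that adjacent lines are separated by H/N; these readings of the symbols, and the function names, are assumptions and not taken from the slide.

```python
import math

def lines_at_resolution_limit(distance_in_heights, ratio=2000.0):
    """Number of lines N at which adjacent lines stop being resolvable.

    Assumed model: line separation is H/N, and two lines cannot be
    resolved once D > ratio * (H/N), i.e. N > ratio * H / D.
    """
    return ratio / distance_in_heights

def limit_angle_deg(ratio=2000.0):
    """Viewing angle subtended by the separation at the resolution limit."""
    return math.degrees(math.atan(1.0 / ratio))

n_lines = lines_at_resolution_limit(4.0)   # viewing distance D = 4H
angle = limit_angle_deg()                  # roughly 0.03 degree
```

With D = 4H the limit works out to 2000 / 4 = 500 lines, matching the slide.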
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [9]
Review: Progressive vs. Interlaced scan
From B.Liu EE330S’01 Princeton
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [10]
Analog Color TV Systems
Historical notes
– Color TV system had to be compatible with the earlier monochrome TV system
3 formats
– NTSC ~ North America + Japan/Taiwan
– PAL ~ Western Europe + Asia (China) + Middle East
– SECAM ~ Eastern Europe + France
– What format in your home country?
From Wang’s Preprint Fig.1.5
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [11]
Comparison of Three Analog TV Systems
– Spatial and temporal resolution
– Color coordinate
– Signal bandwidth
– Multiplexing of luminance, chrominance, and audio
(From Wang’s Preprint)
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [12]
NTSC
4:3 aspect ratio (width:height)
525 lines/frame, 2:1 interlace at field rate 59.94 Hz
– 483 active lines per frame; vertical retrace takes the time of 9 lines
– The rest carry broadcaster’s information such as closed captions
YIQ color coordinate for transmission
– RGB primaries slightly different from PAL
– Orthogonal chrominance: I ~ orange-to-cyan; Q ~ green-to-purple (needs less bandwidth)
Multiplexing over 6 MHz total bandwidth
– Artifacts due to cross talk between luminance and chrominance
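The NTSC numbers above imply a few derived timing quantities. The short sketch below only does that arithmetic; the variable names are illustrative, and the only inputs are the figures quoted on the slide.

```python
LINES_PER_FRAME = 525      # from the slide
FIELD_RATE_HZ = 59.94      # from the slide, 2:1 interlace
ACTIVE_LINES = 483         # from the slide

# Two interlaced fields make one frame.
frame_rate_hz = FIELD_RATE_HZ / 2
lines_per_field = LINES_PER_FRAME / 2

# Lines scanned per second and the time spent on each line.
line_rate_hz = LINES_PER_FRAME * frame_rate_hz
line_duration_us = 1e6 / line_rate_hz

# Lines per frame that do not carry picture content.
inactive_lines = LINES_PER_FRAME - ACTIVE_LINES
```

This gives 29.97 frames/s, 262.5 lines per field, about 15.7 thousand lines per second, and 42 non-picture lines per frame.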
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [13]
NTSC 6 MHz Bandwidth
From Wang’s Preprint Fig.1.6(b)
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [14]
Analog Video Recording
Comparison of common formats
From Wang’s Preprint Table 1.2
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [15]
Digital Video Formats
ITU-R BT.601 recommendation (ITU-R: International Telecommunication Union, Radio sector)
Downsampled chrominance
– Y Cb Cr coordinate and four subsampling formats
From Wang’s Preprint Fig.1.8
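To see what chrominance downsampling buys, the sketch below compares raw bit rates for the four subsampling formats commonly associated with BT.601 (4:4:4, 4:2:2, 4:1:1, 4:2:0). Treating these as the four formats meant here is an assumption, as are the 720x480 active picture size, 30 frames/s, and 8 bits per sample used in the example.

```python
# Chroma samples (per chroma channel) relative to luma samples, as (num, den).
CHROMA_FRACTION = {
    "4:4:4": (1, 1),   # no chroma subsampling
    "4:2:2": (1, 2),   # chroma halved horizontally
    "4:1:1": (1, 4),   # chroma quartered horizontally
    "4:2:0": (1, 4),   # chroma halved horizontally and vertically
}

def raw_bit_rate(width, height, fps, fmt, bits_per_sample=8):
    """Uncompressed bit rate in bits/s for Y plus two chroma channels."""
    num, den = CHROMA_FRACTION[fmt]
    luma_samples = width * height
    chroma_samples = 2 * luma_samples * num // den   # Cb and Cr together
    return (luma_samples + chroma_samples) * bits_per_sample * fps

# Hypothetical example picture: 720x480 active pixels at 30 frames/s.
rates = {fmt: raw_bit_rate(720, 480, 30, fmt) for fmt in CHROMA_FRACTION}
```

Under these assumptions 4:2:2 needs two thirds of the 4:4:4 rate, and 4:1:1 or 4:2:0 need half of it.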
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [16]
From Wang’s Preprint Table 1.3
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [17]
Resource
Background and Motivation on Multimedia Coding / Communications
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [18]
Generations of Video Coding
UMCP ENEE408G Slides (created by M.Wu & R.Liu © 2002)
From R.Liu Seminar Course ’00 @ UMCP
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [19]
Channel Bandwidth
From R.Liu Seminar Course ’00 @ UMCP
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [20]
Storage Capacity
From R.Liu Seminar Course ’00 @ UMCP
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [21]
Source Video Formats
From R.Liu Seminar Course ’00 @ UMCP
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [22]
Application Requirements
From R.Liu Seminar Course ’00 @ UMCP
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [23]
Other Standards and Considerations for Digital Video Coding
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [24]
Performance Tradeoff for Video Coding
From R.Liu’s Handbook Fig.1.2:
“mos” ~ 5-pt mean opinion scale of bad, poor, fair, good, excellent
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [25]
H.26x for Video Telephony
Remote face-to-face communication: a dream for years
H.26x
– Video coding targeting low bit rates
– Through ISDN or a regular analog telephone line ~ on the order of 64 kbps
– Needs roughly symmetric complexity at encoder and decoder
H.261 (early 1990s)
– Similar to a simplified MPEG-1 ~ block-based DCT/MC hybrid coder
– Integer-pel motion compensation with I/P frames only ~ no B frames
– Restricted picture size/fps formats and M.V. range
H.263 (mid 1990s) and H.263+/H.263++ (late 1990s)
– Support half-pel motion compensation & many options for improvement
H.264 (latest, 2001-): also known as H.26L / JVT / MPEG-4 Part 10
– Hybrid coding framework with many advanced techniques
– Focuses on greatly improving compression ratio at a cost in complexity
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [26]
MPEG-2
Extends MPEG-1
Targets high-resolution, high-bit-rate applications
– Digital video broadcasting, HDTV, …
– Also used for DVD
Supports scalability
Supports interlaced video
– Frame pictures vs. field pictures
– New prediction modes for motion compensation related to interlaced video: use previously encoded fields to do M.E.-M.C.
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [27]
Scalability in Video Codecs
Scalability: provide different qualities in a single stream
– Stack more bits on top of the base layer to provide improved quality
Possible ways of achieving scalability
– SNR scalability ~ multiple-quality video services (basic vs. premium quality)
– Spatial scalability ~ multiple-dimension displays (display on PDA vs. PC vs. super-resolution display)
– Temporal scalability ~ multiple frame rates
Layered coding concept facilitates:
– Unequal error protection
– Efficient use of resources
– Different needs from customers
– Multiple services
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [28]
SNR Scalability
Two layers with the same spatio-temporal resolution but different qualities
[Block diagram: video in; base-layer encoder; base-layer decoder; subtractor (+ -); enhancement-layer encoder; multiplexer; base-layer bitstream, enhancement-layer bitstream, and output bitstream]
From R.Liu Seminar Course @ UMCP
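The two-layer idea can be sketched with plain scalar quantizers standing in for the real encoders. This is a toy model under stated assumptions: the quantizer step sizes and function names are hypothetical, the base-layer decoder is modeled as the quantizer's own reconstruction, and no transform, motion compensation, or entropy coding is included.

```python
import numpy as np

def quantize(x, step):
    """Uniform quantizer; returns the reconstructed (dequantized) values."""
    return np.round(x / step) * step

def snr_scalable_encode(frame, base_step=16.0, enh_step=4.0):
    """Coarse base layer plus a finer layer coding the base-layer error."""
    base = quantize(frame, base_step)        # base-layer reconstruction
    residual = frame - base                  # what the base layer missed
    enhancement = quantize(residual, enh_step)
    return base, enhancement

# Deterministic example frame (seeded), same resolution for both layers.
rng = np.random.default_rng(0)
frame = rng.uniform(0.0, 255.0, size=(8, 8))
base, enhancement = snr_scalable_encode(frame)

err_base_only = np.abs(frame - base).max()
err_both_layers = np.abs(frame - (base + enhancement)).max()
```

A decoder that receives only the base layer sees an error of at most half the coarse step; adding the enhancement layer brings the bound down to half the fine step.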
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [29]
Spatial Scalability
Two layers with different spatial resolutions
[Block diagram: video in; down-sampler; base-layer encoder; base-layer decoder; up-sampler; subtractor (+ -); enhancement-layer encoder; multiplexer; base-layer bitstream, enhancement-layer bitstream, and output bitstream]
From R.Liu Seminar Course @ UMCP
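A minimal sketch of the down-sample / up-sample structure follows. It assumes 2x2 averaging for the down-sampler and pixel replication for the up-sampler, and it leaves both layers unquantized so the layering itself is easy to check; a real codec would compress each layer. All function names are hypothetical.

```python
import numpy as np

def downsample2(frame):
    """Halve each dimension by averaging 2x2 neighborhoods (assumed filter)."""
    h, w = frame.shape
    return frame.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample2(frame):
    """Double each dimension by pixel replication (assumed interpolator)."""
    return np.repeat(np.repeat(frame, 2, axis=0), 2, axis=1)

def spatial_scalable_encode(frame):
    base = downsample2(frame)                 # low-resolution base layer
    enhancement = frame - upsample2(base)     # full-resolution prediction error
    return base, enhancement

def decode(base, enhancement=None):
    """Base layer alone gives low resolution; both layers give full resolution."""
    if enhancement is None:
        return base
    return upsample2(base) + enhancement

frame = np.arange(64, dtype=float).reshape(8, 8)
base, enhancement = spatial_scalable_encode(frame)
low_res = decode(base)
full_res = decode(base, enhancement)
```

A small display can stop after the base layer, while a full-size display adds the enhancement layer on top of the up-sampled base.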
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [30]
Temporal Scalability
Enhancement layer carries additional frames at the same spatial resolution
[Block diagram: temporal demux producing base-layer video in and enhancement-layer video in; base-layer encoder; base-layer decoder (base-layer decoded video out); enhancement-layer encoder; multiplexer; base-layer bitstream, enhancement-layer bitstream, and output bitstream]
From R.Liu Seminar Course @ UMCP
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [31]
MPEG-4
Many functionalities targeting a variety of applications
Introduced an object-based coding strategy
– For better support of interactive applications & graphics/animation video
– Requires the encoder to perform object segmentation, which is difficult for general applications
Introduced error-resilient coding techniques
– “Streaming video profile” for wireless multimedia applications
Part 10 converged into H.264
– Focused on improving compression ratio and error resilience
– Sticks with the hybrid coding framework
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [32]
Object-based Coding in MPEG-4
Interactive functionalities
Higher compression efficiency by separately handling
– Moving objects
– Unchanged background
– New regions
– M.C.-failure regions
=> “Sprite” encoding
Object segmentation needed (not easy)
– Based on color, motion, edge, texture, etc.
– Possible for targeted applications
Revised from R.Liu Seminar Course @ UMCP
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [33]
Object-based Coding in MPEG-4 (cont’d)
From Wang’s book preprint Fig. 13.30
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [34]
Model-Based Video Coding
From R.Liu Seminar Course @ UMCP
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [35]
Analysis-Synthesis Coding
From R.Liu Seminar Course @ UMCP
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [36]
Some Coding Models
From R.Liu Seminar Course @ UMCP
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [37]
MPEG-7
“Multimedia Content Description Interface”
– Not a video coding/compression standard like the previous MPEG standards
– Emphasis on how to describe video content for efficient indexing, search, and retrieval
Standardizes the description mechanism of content
– Descriptors, Description Schemes, Description Definition Language
– Examples of MPEG-7 visual descriptors: color, texture, shape, …
Figure from MPEG-7 Document N4031 (March 2001)
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [38]
Summary
Scalable coding
Standards evolved from or similar to MPEG-1
– MPEG-2, H.26x
Brief intro. to model-based coding
– Object-based video coding & MPEG-4
Additional MPEG-4 activities
– Error resilience
– Intellectual property management/protection
What is after MPEG-4?
– MPEG-7 for facilitating image/video search and indexing
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [39]
Reading Assignment
Readings
– Wang’s book Chapt.13, Sec.11.1, Sec.10.5
– [Electronic Handout] R.Liu’s Handbook Chapt.1-3
Chapter 7 “Data Compression” (handout)
– Sec. 7.6 => H.261 & H.263
– Sec. 7.7.5 & 7.7.6 => MPEG-4 & MPEG-7
Tutorial on MPEG Video Coding (handout)
– IEEE Signal Processing Magazine, Sept. 1997
UMCP ENEE408G Slides (created by M.Wu © 2002)
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [40]
Video Content Analysis
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [41]
Introduction to Video Content Analysis
Teach computer to “understand” video content
– Define features that a computer can learn to measure and compare: color (RGB values or other color coordinates); motion (magnitude and direction); shape (contours); texture and patterns
– Give example correspondences so that the computer can learn: build connections between features & higher-level semantics/concepts; statistical classification and recognition techniques
Video understanding
– Break a video sequence into chunks, each with consistent content ~ “shot”
– Group similar shots into scenes that represent certain events
– Describe connections among scenes via story boards or scene graphs
– Associate each shot/scene with representative features/semantics for future queries
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [42]
Video Understanding (step-1)
– Break a video sequence into chunks, each with consistent content ~ “shot”
From Yeung-Yeo-Liu: STG (Princeton)
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [43]
Video Understanding (step-2)
– Group similar shots into scenes
From Yeung-Yeo-Liu: STG (Princeton)
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [44]
Video Understanding (step-3)
– Describe connections among scenes via story boards or scene graphs
From Yeung-Yeo-Liu: STG (Princeton)
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [45]
Video Temporal Segmentation
A first step toward video content understanding
Two types of transitions
– “Cut” ~ abrupt transition
– Gradual transition: fade out and fade in; dissolve; wipe
Detecting transitions
– Detecting a cut is relatively easy ~ check frame-wise difference
– Detecting dissolve and fade by checking linearity: f0 (1 – t/T) + f1 (t/T)
– Detecting a wipe ~ more difficult; via projection, edge pattern, or linearity of color histogram
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [46]
Types of Transitions
– [above] Transition types offered by Adobe Premiere– See also transition demos provided by PowerPoint
From talks by Joyce-Liu (Princeton)
Video transition collection (Rob Joyce) www.ee.princeton.edu/~robjoyce/research/transitions/
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [47]
Examples of Wipes
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [48]
Compressed-Domain Processing
Use I & P frames only, to reduce computation and to enhance robustness in scene change detection
– … I b b P b b P b b P b b I b b P …
Working in the compressed domain
– Process video with only partial decoding (inverse VLC, etc.), without full decoding (IDCT), to save computation
Low-resolution version already provides enough information for transition detection
– DC-image
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [49]
DC Image
– Put the DC of each block together
– Already contains most information of the video
DC Frame
Example From Joyce-Liu (Princeton)
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [50]
Fast Extraction of DC Image From MPEG-1
I frame
– Take the DC coeff. from each block and put them together
P/B frame
– Fast approximation of the reference block’s DC
– Add the DC of the motion compensation residue (recall DCT is a linear transform)
\[ [DCT(P_{cur})]_{00} \approx [DCT(P_{ref})]_{00} + [DCT(P_{diff})]_{00} \]
\[ [DCT(P_{ref})]_{00} \approx \sum_{i=1}^{4} \frac{h_i w_i}{64} [DCT(P_i)]_{00} \]
[Figure: blocks 1, 2, 3, 4; current block C; reference block R]
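The reference-block approximation can be tried out on pixel data. The sketch below assumes 8x8 blocks, an orthonormal 2-D DCT (so the DC term is the block sum divided by 8), and that h_i and w_i are the height and width of the overlap between the reference block and neighboring block i; these interpretations and all function names are assumptions, not taken from the slide.

```python
import numpy as np

B = 8  # block size

def dc_image(frame):
    """DC term of every BxB block: block sum / 8 for an orthonormal DCT."""
    h, w = frame.shape
    return frame.reshape(h // B, B, w // B, B).sum(axis=(1, 3)) / B

def exact_ref_dc(frame, top, left):
    """DC of the (generally unaligned) reference block, from pixels."""
    return frame[top:top + B, left:left + B].sum() / B

def approx_ref_dc(dc_img, top, left):
    """Weighted sum of the DCs of the (up to) four blocks it overlaps."""
    r0, c0 = top // B, left // B
    dy, dx = top % B, left % B
    total = 0.0
    for i, h in enumerate((B - dy, dy)):          # overlap heights
        for j, w in enumerate((B - dx, dx)):      # overlap widths
            if h and w:
                total += h * w / (B * B) * dc_img[r0 + i, c0 + j]
    return total

# Test frame whose blocks are each constant, so only DC terms are nonzero.
block_values = np.arange(9, dtype=float).reshape(3, 3) * 10.0
frame = np.kron(block_values, np.ones((B, B)))

exact = exact_ref_dc(frame, top=5, left=3)
approx = approx_ref_dc(dc_image(frame), top=5, left=3)
```

For blocks with no AC content the weighted sum is exact; on real video it is only an approximation, since the AC coefficients of the four neighbors are ignored.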
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [51]
Compressed-Domain Scene Change Detection
Compare nearby frames
– Take the pixel-wise difference of nearby DC-frames
– Or take the pixel-wise difference of every N frames to accumulate more changes => useful for detecting gradual transitions
Observe the pixel-wise difference for different frame pairs
– Peaks @ cuts, and plateaus @ gradual transitions
Figure from Yeo-Liu CSVT’95 paper
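A minimal sketch of the difference-sequence idea is given below. The sum of absolute pixel differences and the median-based threshold are illustrative choices made here, not the detection rule of the cited paper, and the function names and synthetic DC frames are hypothetical.

```python
import numpy as np

def frame_differences(dc_frames, gap=1):
    """Sum of absolute pixel-wise differences between frames `gap` apart."""
    frames = np.asarray(dc_frames, dtype=float)
    return np.abs(frames[gap:] - frames[:-gap]).sum(axis=(1, 2))

def detect_cuts(diffs, ratio=3.0):
    """Indices of the first frame of each new shot (assumed rule:
    difference exceeds `ratio` times the median difference)."""
    return np.flatnonzero(diffs > ratio * np.median(diffs)) + 1

# Synthetic 4x4 DC frames: two shots with a slow brightness drift in each.
shot_a = [np.full((4, 4), 50.0 + k) for k in range(5)]
shot_b = [np.full((4, 4), 200.0 + k) for k in range(5, 10)]
diffs = frame_differences(shot_a + shot_b)
cuts = detect_cuts(diffs)
```

The difference sequence stays small within each shot and shows a single peak where the second shot starts; using a larger `gap` would accumulate change across gradual transitions.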
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [52]
Scene Change Detection (cont’d)
Figure from Yeo-Liu CSVT’95 paper
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [53]
Dissolve: DC Frame Space
Dissolve: a linear combination of g and h
Detect straight lines in DC frame space
– Correlation detection on triplets
[Figure: dissolve from g_k to h_k plotted in DC frame space; labels m, n; axes Pixel 1, Pixel 2, Pixel 3]
From talks by Joyce-Liu (Princeton)
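One way to picture "straight lines in DC frame space" is to treat each DC frame as a point and test whether three frames are collinear. The sketch below uses the normalized correlation between successive frame-to-frame displacement vectors as that test; this particular statistic, the function names, and the synthetic frames are assumptions for illustration, not the detector from the cited talks.

```python
import numpy as np

def dissolve(g, h, num_frames):
    """Synthetic dissolve: frames move linearly from g to h."""
    alphas = np.linspace(0.0, 1.0, num_frames)
    return [(1.0 - a) * g + a * h for a in alphas]

def triplet_correlation(f0, f1, f2):
    """Normalized correlation of the displacements f1-f0 and f2-f1.
    Close to 1 when the three frames lie on a straight line, in order."""
    u = (f1 - f0).ravel()
    v = (f2 - f1).ravel()
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0

g = np.eye(4) * 100.0                 # hypothetical DC frame before
h = np.fliplr(np.eye(4)) * 100.0      # hypothetical DC frame after

frames = dissolve(g, h, 5)
corr_dissolve = triplet_correlation(frames[0], frames[1], frames[2])

# Three unrelated frames do not line up in frame space.
corr_unrelated = triplet_correlation(g, np.full((4, 4), 30.0), h)
```

During the synthetic dissolve the correlation is 1 up to rounding, while the unrelated triplet scores well below that.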
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [54]
Wipe Detection
– Convert the 2-D problem to 1-D by projection
– Perform horizontal, vertical, and diagonal projections to detect diverse wipe types
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [55]
Color Histogram
What is a color histogram?
– Count the # of pixels with the same color
– Plot color value vs. corresponding pixel count
Similarly for luminance histogram
Gives an idea of the dominant color and color distribution
– Ignores the exact spatial location of each color value
– Useful in image and video analysis
Color histogram can be used to:
– Detect gradual shot transitions, esp. fancy wipes
– Measure content similarity between images / video shots
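A small sketch of a quantized color histogram and a histogram-based similarity measure is shown below. The 4 bins per channel and the half-L1 distance are illustrative choices, and the function names and test images are hypothetical.

```python
import numpy as np

def color_histogram(image, bins_per_channel=4):
    """Normalized histogram of an HxWx3 uint8 image over quantized colors."""
    q = (image.astype(np.int64) * bins_per_channel) // 256   # 0..bins-1
    index = (q[..., 0] * bins_per_channel + q[..., 1]) * bins_per_channel + q[..., 2]
    hist = np.bincount(index.ravel(), minlength=bins_per_channel ** 3)
    return hist / hist.sum()

def histogram_distance(h1, h2):
    """Half the L1 distance: 0 for identical, 1 for disjoint histograms."""
    return 0.5 * np.abs(h1 - h2).sum()

# Left half red, right half blue; and the same picture mirrored.
image = np.zeros((8, 8, 3), dtype=np.uint8)
image[:, :4, 0] = 255
image[:, 4:, 2] = 255
mirrored = image[:, ::-1]

# A uniformly green picture shares no colors with the first one.
green = np.zeros((8, 8, 3), dtype=np.uint8)
green[..., 1] = 255

d_mirrored = histogram_distance(color_histogram(image), color_histogram(mirrored))
d_green = histogram_distance(color_histogram(image), color_histogram(green))
```

The mirrored picture has the same histogram because spatial location is ignored, while the green picture lands in entirely different bins.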
ENEE631 Digital Image Processing (Spring'06) Lec20 – Video Coding (3) [56]
Wipe Detection (cont’d)
More diverse and fancy wipes
Linear change in color histogram
[Figure: wipe from G_k to H_k plotted in color histogram space; labels m, n; axes Bin 1, Bin 2, Bin 3]
From talks by Joyce-Liu (Princeton)