IEE 5037 Multimedia Communications Lecture 2: Basics of...
Transcript of IEE 5037 Multimedia Communications Lecture 2: Basics of...
1
IEE 5037 Multimedia CommunicationsLecture 2: Basics of Video
Dept. Electronics Engineering,N
ational Chiao T
ung University Adapted from Prof. Yao Wang’s slides
And Prof. Hang’s slides
Reading: ch.1. video processing and communications
Video Basics
Outline
• Color perception and specification• Video capture and display• Analog raster video• Analog TV systems• Digital video
Video Basics 3
1. Color Perception and Specification
• Light -> color perception• Human perception of color• Type of light sources• Trichromatic color mixing theory• Specification of color
– Tristimulus representation – Luminance/Chrominance representation
• Color coordinate conversion
Video Basics 4
Light is part of the EM wave
from [Gonzalez02]
Video Basics 5
Illuminating and Reflecting Light
• Illuminating sources: – emit light (e.g. the sun, light bulb, TV monitors)– perceived color depends on the emitted freq. – follows additive rule
• R+G+B=White
• Reflecting sources: – reflect an incoming light (e.g. the color dye, matte surface,
cloth)– perceived color depends on reflected freq (=emitted freq-
absorbed freq.)– follows subtractive rule
• R+G+B=Black
Video Basics 6
Color Representation
(N&H, Sec. 1.8)• Colors (human perception) are related to the wavelength of light,
but they are subjective, depending upon the physical structure of human eye.
• Human retina contains 3 different color receptor (cones): red, green, and blue
Basis of trichromatic theory.• Tristimulus: Any color appeared to human eye can be specified by
a weighted combination of 3 primary colors.
Video Basics 7
Eye Anatomy
From http://www.stlukeseye.com/Anatomy.asp
Video Basics 8
Eye vs. Camera
Optic nerve send the info to the brainCable to transfer images
RetinaFilm
Iris, pupilShutter
Lens, corneaLens
Eye componentsCamera components
Video Basics 9
Human Perception of Color
• Retina contains photo receptors– Cones: day vision, can perceive color
tone• Red, green, and blue cones• Different cones have different frequency
responses• Tri-receptor theory of color vision
[Young1802]– Rods: night vision, perceive brightness
only• Color sensation is characterized by
– Luminance (brightness)– Chrominance
• Hue (color tone)• Saturation (color purity)
From http://www.macula.org/anatomy/retinaframe.html
Video Basics 10
Human Vision
• Cross-section of the human eye (N&H, p.263)
Cone: fat, clustered around fovea; color, daylight
Rod: slender, dense; intensity, low light
Video Basics 11
The Retina
• Schematic diagram of the retina (N&H, p. 263)
Video Basics 12
Retina Circuitry
• Photoreceptors first respond to the light. The produced signals are “processed” by the bipolar and ganglion cells. Then, visual information (pulses) are passed through optical nerves into the brain. (Carlson, p.145~)
Video Basics 13
Primary Visual Path
• The left half of the visual fields of both eyes are sent to the right hemisphere. (Carlson, p.149)
Video Basics 14
Color Receptors
• Relative absorbance of light of various wavelengths by rods and cones(Carlson, p.155)
Video Basics 15
Frequency Responses of Cones
ybgridaCC ii ,,,,)()( == ∫ λλλ
from [Gonzalez02]
Video Basics 16
Cones Sensitivity: R, G, B and Luminous
100
0400 500 600 700
20
40
60
80 R
elat
ive
sens
itivi
ty
Wavelength
LuminosityfunctionRedGreen
Blue�20
ybgridaCC ii ,,,,)()( == ∫ λλλ
RG
B
Luminous
Video Basics 17
Color Hue Specification
Video Basics 18
Trichromatic Theory
• Select 3 primary stimuli, for example,R (700.0), G (546.1) ,and B (435.8)Any color S can be represented as combination of these 3 primaries R, G, and B.
• Any 3 independent colors can be selected as primaries as long as one is not a mix of the other two.
• Different sets of primaries are related by linear transformations.
BGR ⋅+⋅+⋅= sss BGRS
Video Basics 19
Trichromatic Color Mixing
• Trichromatic color mixing theory– Any color can be obtained by mixing three primary colors with a
right proportion
• Primary colors for illuminating sources:– Red, Green, Blue (RGB)– Color monitor works by exciting red, green, blue phosphors using
separate electronic guns• Primary colors for reflecting sources (also known as
secondary colors):– Cyan, Magenta, Yellow (CMY)– Color printer works by using cyan, magenta, yellow and black
(CMYK) dyes
valuessTristimulu :,3,2,1
kk
kk TCTC ∑=
=
Video Basics 20
RGB vs CMY
Video Basics 21
red
Green Blue
Video Basics 22
Color Representation Models
• Specify the tristimulus values associated with the three primarycolors– RGB– CMY
• Specify the luminance and chrominance– HSI (Hue, saturation, intensity)– YIQ (used in NTSC color TV)– YCbCr (used in digital color TV)
• Amplitude specification: – 8 bits for each color component, or 24 bits total for each pixel– Total of 16 million colors – A true RGB color display of size 1Kx1K requires a display buffer
memory size of 3 MB
Video Basics 23
Color Coordinate Conversion
• Conversion between different primary sets are linear (3x3 matrix)
• Conversion between primary and XYZ/YIQ/YUV are also linear
• Conversion to LSI/Lab are nonlinear– LSI and Lab coordinates
• coordinate Euclidean distance proportional to actual color difference
• Conversion formulae between many color coordinates can be found in [Gonzalez92]
Video Basics 24
CIE Color Specification
─ CIE (Commission Internationale de L'eclairage─ International Commission on Illumination)
• R (700.0), G (546.1), and B (435.8) ─1931• X, Y, Z─ 1931; (1) all color matching functions are positive, and
(2) Y = Luminance. (N&H, p. 51)
⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡=
⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡
⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡
−−−
−=
⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡−
BGR
MBGR
ZYX
1
009.1089.0468.014.426.1897.005.0515.365.2
Video Basics 25
CIE Color Specification (cont.)
⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡=
⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡
⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡=
⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡
ZYX
MZYX
BGR
990.0010.0200.0010.0813.0310.0000.0177.0490.0
(Basis vectors transformation)
Tristimulus values transformation: (Coordinates transformation)
;1
⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡=
⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡−
BGR
MZYX
⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡=
⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡
ZYX
MBGR
Video Basics 26
Chromaticity Coordinates
ZYXZz
ZYXYy
ZYXXx
++=
++=
++=
Note: Hence, (x,y,Y) completely specifies a color.1931 CIE-xychromaticity diagram (N&H, p. 51)
Video Basics 27
CIE Chromaticity Diagram
Concept chart of L*a*b*L: luminancea: from green to redb: from blue to yellow
CIE L*a*b diagram CIE XYZ diagram
Source: http://www.experience.epson.com.au/help/understandingcolour/COL_G/0503_3.htm
Video Basics 28
Just Noticeable Color Difference
• MacAdam's ellipses (just noticeable regions, ten times enlarged) on CIE-xychromaticity diagram(N&H, p.293)
Video Basics 29
2. Video Capture and Display
• Light reflection physics• Imaging operator• Color capture• Color display• Component vs. composite video
Video Basics 30
Video Capture
• For natural images we need a light source (λ: wavelength of the source) .
– E(x, y, z, λ): incident light on a point (x, y, z world coordinates of the point)
• Each point in the scene has a reflectivity function.
– r(x, y, z, λ): reflectivity function
• Light reflects from a point and the reflected light is captured by an imaging device.
– c(x, y, z, λ) = E(x, y, z, λ) × r(x, y, z, λ): reflected light.
?
Courtesy of Onur Guleryuz
Video Basics 31
More on Video Capture
• Reflected light to camera– Camera absorption function
– Projection from 3-D to 2-D
• The projection operator is non-linear – Perspective projection– Othographic projection
λλλψ datCt c )(),,(),( ∫= XX
)),((),(or ),()),(( 1 tPtttPP
xxXX
xX−==
→
ψψψψ
Video Basics 32
Perspective Projection Model
Y
Y
X
Z
C
X
X
x
x
y
x
y
F
Z
Cameracenter
Imageplane
2-Dimage
3-Dpoint
The image of an object is reversed from its 3-D position. The object appears smaller when it is farther away.
ZYFy
ZXFx == ,
Video Basics 33
How to Capture Color
• Need three types of sensors• Complicated digital processing is incorporated in
advanced cameras
Inte
rpol
atio
n
Non
linea
rpr
oces
sing
A/D
RLens
B
G
( fs,1)CCDs
fs,1 fs,1 2fs,1
2fs,1/fs,2
Imageenhancer
D/A
D/A
Rateconv.
Digital CNoutput
Analog CN &CS output
Viewfinderoutput
2fs,1 13.5 MHz
Pre-
proc
ess
Ana
log
proc
ess
Col
or c
orre
ctor
Mat
rix
& e
ncod
er
Figure 1.2 Schematic block diagram of a professional color video camera. Reprinted fromY. Hashimoto, M. Yamamoto, and T. Asaida, Cameras and display systems, IEEE (July 1995),83(7):1032–43. Copyright 1995 IEEE.
Video Basics 34
Video Display
• CRT vs LCD• Need three light sources projecting red, green, blue
components respectively
Video Basics 35
3. Analog Video
• Video raster• Progressive vs. interlaced raster• Analog TV systems
Video Basics 36
Raster Scan
• Real-world scene is a continuous 3-D signal (temporal, horizontal, vertical)
• Analog video is stored in the raster format– Sampling in time: consecutive sets of frames
• To render motion properly, >=30 frame/s is needed– Sampling in vertical direction: a frame is represented by a
set of scan lines• Number of lines depends on maximum vertical frequency and
viewing distance, 525 lines in the NTSC system– Video-raster = 1-D signal consisting of scan lines from
successive frames.
Video Basics 37
Two Scan Methods: Progressive and Interlaced Scans
Field 1 Field 2
Progressive Frame Interlaced Frame
Interlaced scan is developed to provide a trade-off between temporal and vertical resolution, for a given, fixed data rate (number of line/sec).
Horizontal retrace
Vertical retrace
Video Basics 38
Two Scan Methods: Progressive and Interlaced Scans
Video Basics 39
Color TV Broadcasting and Receiving
Luminance,Chrominance,Audio Multiplexing
Modulation
De-Modulation
De-Multiplexing
YC1C2--->RGB
RGB--->
YC1C2
Video Basics 40
Why not using RGB directly?
• R,G,B components are correlated– Transmitting R,G,B components separately is redundant– More efficient use of bandwidth is desired
• RGB->YC1C2 transformation– Decorrelating: Y,C1,C2 are uncorrelated– C1 and C2 require lower bandwidth– Y (luminance) component can be received by B/W TV sets
• YIQ in NTSC– I: orange-to-cyan– Q: green-to-purple (human eye is less sensitive)
• Q can be further bandlimited than I– Phase=Arctan(Q/I) = hue, Magnitude=sqrt (I^2+Q^2) = saturation– Hue is better retained than saturation
Color Image Y image
I image (orange-cyan) Q image (green-purple)
Video Basics 42
I and Q on the color circle
I: orange-cyan
Q: green-purple
Video Basics 43
Conversion between RGB and YIQ
Y = 0.299 R + 0.587 G + 0.114 BI = 0.596 R -0.275 G -0.321 BQ = 0.212 R -0.523 G + 0.311 B
R =1.0 Y + 0.956 I + 0.620 Q,G = 1.0 Y - 0.272 I -0.647 Q,B =1.0 Y -1.108 I + 1.700 Q.
• RGB -> YIQ
• YIQ -> RGB
Video Basics 44
TV signal bandwidth
• Luminance– Maximum vertical frequency (cycles/picture-height)= black and
white lines interlacing
– Maximum horizontal frequency (cycles/picture-width)
– Corresponding temporal frequency (cycles/second or Hz)
– For NTSC,
• Chrominance– Can be bandlimited significantly
• I: 1.5 MHz, Q: 0.5 MHz.
2/' ,max, ysv Kff =
IAR max,max, ⋅= vh ff
lyslh TKfTff '/2' IAR '/ ,max,max ⋅==
MHz 2.4max =f
Video Basics 45
Bandwidth of Chrominance Signals
• Theoretically, for the same line rate, the chromiance signal can have as high frequency as the luminance signal
• However, with real video signals, the chrominance component typically changes much slower than luminance
• Furthermore, the human eye is less sensitive to changes in chrominance than to changes in luminance
• The eye is more sensitive to the orange-cyan range (I) (the color of face!) than to green-purple range (Q)
• The above factors lead to – I: bandlimitted to 1.5 MHz– Q: bandlimitted to 0.5 MHz
Video Basics 46
NTSC Composite Video
Video Basics 47
Multiplexing of luminance, chrominance and audio
(Composite Video Spectrum)
Video Basics 48
Adding Color Bursts for Synchronization
Figure from From Grob, Basic Color Television Principles and Servicing, McGraw Hill, 1975http://www.ee.washington.edu/conselec/CE/kuhn/ntsc/95x417.gif
For accurate regeneration of the color sub-carrier signal at the receiver, a color burst signal is added during the horizontal retrace period
Video Basics 49
Different Color TV Systems
Parameters NTSC PAL SECAM
Field Rate (Hz) 59.95 (60) 50 50
Line Number/Frame 525 625 625
Line Rate (Line/s) 15,750 15,625 15,625
Color Coordinate YIQ YUV YDbDr
Luminance Bandwidth (MHz) 4.2 5.0/5.5 6.0
Chrominance Bandwidth (MHz) 1.5(I)/0.5(Q) 1.3(U,V) 1.0 (U,V)
Color Subcarrier (MHz) 3.58 4.43 4.25(Db),4.41(Dr)
Color Modulation QAM QAM FM
Audio Subcarrier 4.5 5.5/6.0 6.5
Total Bandwidth (MHz) 6.0 7.0/8.0 8.0
Video Basics 50
Who uses what?
From http://www.stjarnhimlen.se/tv/tv.html#worldwide_0
Video Basics 51
NTSC Color TV
(N&H, Sec.2.2)• Selection of primaries is limited by the physical display devices and
camera• NTSC (National Television System Committee): 1953 Primaries:
(chromaticity coordinates)
x y z R: 0.67 0.33 0.00G: 0.21 0.71 0.08B: 0.14 0.08 0.78
Video Basics 52
NTSC and PAL Color Ranges
Primaries used for PAL (left) and NTSC (right). The rangeof achievable colors is the interior of the triangle. (N&H, p.100)
Video Basics 53
NTSC Color TV (cont.)
• The tristimulus values R, G, B (using R,G,B primaries) are normalized to
Luminance:
Color-difference:
• Components Transmitted:
.8370~9870~0881~ B.BG,.GR,.R ===
BGRY ~114.0~587.0~299.0 ++=
14.1
~;
03.2
~ YRVYBU −=
−=
oo
oo
33cos33sin33sin33cos
UVQUVI
−=−=
BGR ~,~,~
Video Basics 54
PAL Color TV
• PAL (Phase Alternate Line): 1967 (N&H, Sec. 2.2.2)Primaries: (late CRT display phosphor)
• The tristimulus values R, G, B are normalized to
•• Same formulas in NTSC are used to define Y,U,V in terms of • Components Transmitted: Y, U, V.
.911.0~,494.0~,190.1~ BBGGRR ===
x y z R: 0.64 0.33 0.03 G: 0.29 0.60 0.11 B: 0.15 0.06 0.79
BGR ~,~,~
Video Basics 55
Today's Color TV
NTSC PAL SECAMLines/frame
Active lines
Frame rate
Interlaced
Aspect ratio
Color mod.
Luminance
Chrominance
525
>483
29.97 frams
2:1
4(H):3(V)
QAM
4.2Mhz
1.3(I):0.6(Q)
625
>575
25
2:1
4:3
QAM
5.0,5.5
1.3(U,V)
625
575
25
2:1
4:3
FM
6.0
>1.0(U,V)
Video Basics 56
Today's Color TV (cont.)
• NTSC National Television System Committee): 1953
• PAL (Phase Alternative Line): 1967
• SECAM (Sequentiel Couleur Avec Memoire, or sequential color with memory): 1967
(D.H. Pritchard and J.J. Gibson, "Worldwide Color Television Standards -- Similarities and Differences," SMPTE Journal, pp.111-120, Feb. 80; CCIR Rec.624)
Video Basics 57
TV Picture
Video Basics 58
4. Digital Video
• Digital video by sampling/quantizing analog video raster BT.601 video
• Other digital video formats and their applications
Video Basics 59
Why Digitized?
• The advantages of digital representation for video are many. Forexample:
• (a) Video can be stored on digital devices or in memory, ready to be processed (noise removal, cut and paste, etc.), and integrated to various multimedia applications;
• (b) Direct access is possible, which makes nonlinear video editing achievable as a simple, rather than a complex, task;
• (c) Repeated recording does not degrade image quality;• (d) Ease of encryption and better tolerance to channel noise.
Video Basics 60
Digitizing A Raster Video
• Sample the raster waveform = Sample along the horizontal direction
• Sampling rate must be chosen properly– For the samples to be aligned vertically, the sampling rate should
be multiples of the line rate– Horizontal sampling interval = vertical sampling interval– Total sampling rate equal among different systems
MHz 5.13)(PAL/SECAM864(NTSC)858 === lls fff
Video Basics 61
Digital Color TV
• CCIR 601 (International Radio Consultative Committee): Based on NTSC, PAL, and SECAM (N&H, Sec.2.2.5)
0~1 Volt;⇓ Digitized to 8 bits
Range: 16 - 235
Range: 16 - 240
Range: 16 – 240
Sampled at about 13.5M Hz
:,,~~~BGR BGRY
~~~114.0587.0299.0 ++=
,16219 += YYd
,128701.0
)(112~
+−= YRCR
,128886.0
)(112~
+−= YBCB
Video Basics 62
Chroma Subsampling
• Since humans see color with much less spatial resolution than they see black and white, it makes sense to “decimate" the chrominance signal.
• Interesting (but not necessarily informative!) names have arisen to label the different schemes used.
• To begin with, numbers are given stating how many pixel values, per four original pixels, are actually sent:– (a) The chroma subsampling scheme “4:4:4" indicates that
no chroma subsampling is used: each pixel's Y, Cb and Cr values are transmitted, 4 for each of Y, Cb, Cr.
Video Basics 63
Chroma Subsampling
• (b) The scheme “4:2:2" indicates horizontal subsampling of the Cb, Cr signals by a factor of 2. That is, of four pixels horizontally labelled as 0 to 3, all four Ys are sent, and every two Cb's and two Cr's are sent, as (Cb0, Y0)(Cr0, Y1)(Cb2, Y2)(Cr2, Y3)(Cb4, Y4), and so on (or averaging is used).
• (c) The scheme “4:1:1" subsamples horizontally by a factor of 4.
• (d) The scheme “4:2:0" subsamples in both the horizontal and vertical dimensions by a factor of 2. Theoretically, an average chroma pixel is positioned between the rows and columns
Video Basics 64
Chrominance Subsampling Formats
4:2:0For every 2x2 Y Pixels
1 Cb & 1 Cr Pixel(Subsampling by 2:1 bothhorizontally and vertically)
4:2:2 For every 2x2 Y Pixels
2 Cb & 2 Cr Pixel(Subsampling by 2:1
horizontally only)
4:4:4 For every 2x2 Y Pixels
4 Cb & 4 Cr Pixel(No subsampling)
Y Pixel Cb and Cr Pixel
4:1:1For every 4x1 Y Pixels
1 Cb & 1 Cr Pixel(Subsampling by 4:1
horizontally only)
Video Basics 65
Digital Video Formats
Video Format Y Size Color Sampling
Frame Rate (Hz)
Raw Data Rate (Mbps)
HDTV Over air. cable, satellite, MPEG2 video, 20-45 Mbps SMPTE296M 1280x720 4:2:0 24P/30P/60P 265/332/664 SMPTE295M 1920x1080 4:2:0 24P/30P/60I 597/746/746 Video production, MPEG2, 15-50 Mbps BT.601 720x480/576 4:4:4 60I/50I 249 BT.601 720x480/576 4:2:2 60I/50I 166 High quality video distribution (DVD, SDTV), MPEG2, 4-10 Mbps BT.601 720x480/576 4:2:0 60I/50I 124 Intermediate quality video distribution (VCD, WWW), MPEG1, 1.5 Mbps SIF 352x240/288 4:2:0 30P/25P 30 Video conferencing over ISDN/Internet, H.261/H.263, 128-384 Kbps CIF 352x288 4:2:0 30P 37 Video telephony over wired/wireless modem, H.263, 20-64 Kbps QCIF 176x144 4:2:0 30P 9.1
Video Basics 66
Video Terminology
• Component video– Three color components stored/transmitted separately – Use either RGB or YIQ (YUV) coordinate– New digital video format (YCrCb)– Betacam (professional tape recorder) use this format
• Composite video– Convert RGB to YIQ (YUV)– Multiplexing YIQ into a single signal– Used in most consumer analog video devices– Artifacts due to cross talk between color and luminance signals
• S-video– Y and C (QAM of I and Q) are stored separately– Used in high end consumer video devices
• High end monitors can take input from all three
Video Basics 67
5. Video Quality Measurement
PSNR (Peak Signal Peak Signal-to to-Noise Ratio Noise Ratio): ∑⋅
2
2255log10e
N
Objectives of video coding: high average PSNR, low PSNR variation
Video Basics 68
Subjective Test
(N&H, p.255)
5 Excellent Imperceptible 4 Good Perceptible but not annoying3 Fair Slightly annoying2 Poor Annoying1 Bad Very annoying
Video Basics 69
Objective Test
• MSE: mean square error
• MAD: mean absolute difference, simplified from MSE
• PSNR (dB): most widely used index
– Rule of thumb– > 40dB: excellent– 30 ~ 40 dB: good, visible distortion but acceptable– 20 ~ 30 dB: quite poor– <20 dB: unacceptable
∑ 21 errorN
∑ ||1 errorN
∑⋅
2
2
log10e
peakN