C.-Y. Tsai, 2005/8/17, Communication & Multimedia
MCTF in Current Scalable Video Coding Schemes
Student: Chia-Yang Tsai
Advisor: Prof. Hsueh-Ming Hang
Institute of Electronics, NCTU

Outline
Overview
MCTF in Interframe Wavelet
MCTF in JSVM
Comparison
References

Outline
Overview
  Scalable Video Coding
MCTF in Interframe Wavelet
MCTF in JSVM
Comparison
References

Scalable Video Coding
Ability to adjust to different client requirements
Scalabilities
  Rate/SNR
  Spatial
  Temporal
[Figure: network with Server, Database, Workstation, Personal Computer, and Mobile]

MCTF
MCTF = Motion Compensated Temporal Filtering

Rate/SNR Scalability
Progressive approximation
Bitstream layout: GOP Header, Motion Info., Image Data
300 kbps: PSNR = 32.2 dB
500 kbps: PSNR = 34.6 dB
1000 kbps: PSNR = 38.2 dB
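Progressive approximation can be illustrated with a minimal Python sketch, assuming an embedded bit-plane representation of integer coefficients: decoding a longer prefix of the bitstream (more bit-planes) never lowers the PSNR. The function names, the 8-plane depth, and the sample coefficients are illustrative assumptions and are not taken from the codecs discussed in these slides.

```python
import math

def truncate_bitplanes(coeffs, kept_planes, total_planes=8):
    """Keep only the `kept_planes` most significant magnitude bit-planes."""
    drop = total_planes - kept_planes
    out = []
    for c in coeffs:
        mag = (abs(c) >> drop) << drop  # zero the dropped low-order planes
        out.append(mag if c >= 0 else -mag)
    return out

def psnr(ref, rec, peak=255):
    """PSNR in dB between two equally long sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(ref, rec)) / len(ref)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)

# Hypothetical coefficient values, chosen only for illustration.
coeffs = [113, -42, 7, 0, -128, 95, 3, -61]
quality = [psnr(coeffs, truncate_bitplanes(coeffs, k)) for k in range(1, 9)]
```

Each entry of `quality` corresponds to one more decoded bit-plane, which mirrors the way a longer prefix of an embedded bitstream gives a higher PSNR.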

Spatial Scalability
Wavelet decomposition provides spatial scalability
[Figure: wavelet subbands and Bit-plane Coder]
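As a rough illustration of how a wavelet decomposition yields a lower spatial resolution, the sketch below computes the LL band of one 2D decomposition level. It assumes the simplest possible filter (an averaging Haar low-pass), which is a stand-in chosen for brevity and not the filter used by the coders in these slides; `haar_ll` and the synthetic test image are hypothetical.

```python
def haar_ll(image):
    """One 2D decomposition level: return the LL band as 2x2 block averages."""
    rows, cols = len(image), len(image[0])
    assert rows % 2 == 0 and cols % 2 == 0, "even dimensions expected"
    return [
        [(image[r][c] + image[r][c + 1]
          + image[r + 1][c] + image[r + 1][c + 1]) / 4
         for c in range(0, cols, 2)]
        for r in range(0, rows, 2)
    ]

# Synthetic CIF-sized frame (352x288); the LL band has QCIF size (176x144).
cif = [[(r + c) % 256 for c in range(352)] for r in range(288)]
qcif = haar_ll(cif)
```

Decoding only the LL band gives the half-resolution picture; the remaining subbands refine it to full resolution.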

Temporal Scalability
[Figure: temporal decomposition of a GOP into the subbands H1, H2, H3, H4, and L]
30 Hz video sequence
15 Hz video sequence
7.5 Hz video sequence
3.75 Hz video sequence
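The frame-rate ladder follows from halving the number of low-pass frames at each temporal level. The sketch below is a minimal bookkeeping example, assuming a dyadic decomposition; the function names and the 16-frame GOP are illustrative assumptions.

```python
def temporal_subbands(gop_size, levels):
    """Count high-pass frames per level and the remaining low-pass frames."""
    counts = {}
    frames = gop_size
    for level in range(1, levels + 1):
        assert frames % 2 == 0, "dyadic split needs an even frame count"
        frames //= 2                    # half become H frames, half stay L
        counts[f"H{level}"] = frames
    counts["L"] = frames
    return counts

def frame_rates(full_rate, levels):
    """Frame rate obtained after discarding the high-pass frames of each level."""
    return [full_rate / 2 ** k for k in range(levels + 1)]

subbands = temporal_subbands(16, 4)
rates = frame_rates(30, 3)
```

Dropping the H1 frames of a 30 Hz sequence leaves 15 Hz, dropping H2 as well leaves 7.5 Hz, and so on.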

Scalable Video Coding
History
Timeline: 2004.3, 2004.7, 2005
MSRA (wavelet)
RPI (wavelet)
UNSW (wavelet)
HHI (AVC-based), JSVM

Approaches
An AVC/H.264-based approach (also DCT-based)
[Figure: block diagram with Video (CIF), 2D Spatial Decimation (QCIF), 5/3 MCTF, Motion Coding, AVC 4x4 integer transform, Layered AVC Texture Coding, 2D Spatial Interpolation, Spatio-Temporal "Transform", Entropy Coding, Bitstream]

Approaches
A wavelet-based approach with t+2D structure
[Figure: block diagram with Video, Spatio-Temporal "Transform" (5/3-based MCTF, 2D Spatial DWT), Motion Coding, Texture Coding, Entropy Coding, Bitstream]

Approaches
A wavelet-based approach with 2D+t structure
[Figure: block diagram with Video, Spatio-Temporal "Transform" (2D Spatial DWT, 2D Spatial Decomposition, LL, HF, MCTF-based Temporal Transform, MCTF-based In-band Temporal Transform), Motion Coding, Texture Coding, Entropy Coding, Bitstream]

Lifting Scheme
5/3 lifting scheme
[Figure (a): Lifting Scheme (Analysis Filterbank). The input S_k is split into S_2k and S_2k+1; the P and U steps and the scalings Fh and Fl produce h_k and l_k]
[Figure (b): Inverse Lifting Scheme (Synthesis Filterbank). The scalings Fh^-1 and Fl^-1 and the U and P steps recover S_2k and S_2k+1, which are merged into S_k]
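The analysis and synthesis filterbanks can be sketched in a few lines of Python. This is a minimal 1D version of the 5/3 lifting steps without motion compensation, assuming an even-length signal, symmetric boundary extension, and no Fh/Fl scaling; the function names are illustrative.

```python
def lift_53_forward(s):
    """5/3 analysis: split into even/odd samples, then predict and update."""
    assert len(s) >= 2 and len(s) % 2 == 0, "even-length signal expected"
    even, odd = s[0::2], s[1::2]
    n = len(odd)
    # Predict (P): high-pass = odd sample minus the mean of its even neighbours.
    h = [odd[k] - 0.5 * (even[k] + even[min(k + 1, n - 1)]) for k in range(n)]
    # Update (U): low-pass = even sample plus a quarter of adjacent high-pass.
    l = [even[k] + 0.25 * (h[max(k - 1, 0)] + h[k]) for k in range(n)]
    return l, h

def lift_53_inverse(l, h):
    """5/3 synthesis: undo the update, then undo the prediction."""
    n = len(l)
    even = [l[k] - 0.25 * (h[max(k - 1, 0)] + h[k]) for k in range(n)]
    odd = [h[k] + 0.5 * (even[k] + even[min(k + 1, n - 1)]) for k in range(n)]
    s = [0.0] * (2 * n)
    s[0::2], s[1::2] = even, odd
    return s

signal = [10, 12, 14, 16, 18, 20, 22, 24]
low, high = lift_53_forward(signal)
restored = lift_53_inverse(low, high)
```

Because the inverse applies the same P and U operators in reverse order with opposite signs, reconstruction is exact whatever the operators are. This is what allows MCTF to put motion compensation inside P and U.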

Outline
Overview
MCTF in Interframe Wavelet
  Barbell lifting
  In-band MCTF
  Base-layer structure
MCTF in JSVM
Comparison
References

Barbell Lifting Scheme
Purpose:
  Improve the accuracy of the motion field.
Methods:
  Take the (5,3) wavelet kernel.
  Use a “barbell function” to generate prediction/update values.
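
Barbell Lifting Scheme
$$t = a\,\hat{s}_0 + s_1 + a\,\hat{s}_2, \qquad \hat{s}_0 = f_0(S_0), \quad \hat{s}_2 = f_2(S_2)$$
Barbell functions: $f_0$ and $f_2$
[Figure: frames S0, S1, S2 along the time axis t; the barbell functions map S0 and S2 to $\hat{s}_0$ and $\hat{s}_2$, which are weighted by a and combined with the pixel $s_1$]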

Barbell Lifting Scheme
[Figure: Prediction Stage. The high-pass frames H0 and H1 are formed from the frames X0 to X4 using the barbell values $\hat{x}_i = f_i(X_i)$ with weights -a]
[Figure: Update Stage. The low-pass frames L0, L1, L2 are formed from x0, x2, x4 and the barbell values $\hat{h}_i = g_i(H_i)$ of H0 and H1 with weights b and 2b]
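A barbell prediction step can be sketched as follows. The sketch assumes 1D "frames", represents each barbell function as a list of (position, weight) taps on a neighbouring frame, and uses a = -1/2 as in the (5,3) kernel; `barbell`, `predict_pixel`, the tap lists, and the pixel values are all hypothetical.

```python
def barbell(frame, taps):
    """Barbell function: weighted sum of several pixels of a neighbouring frame."""
    return sum(weight * frame[pos] for pos, weight in taps)

def predict_pixel(s1, frame0, frame2, taps0, taps2, a=-0.5):
    """Lifting step t = a*s0_hat + s1 + a*s2_hat with barbell inputs."""
    return s1 + a * (barbell(frame0, taps0) + barbell(frame2, taps2))

frame0 = [10, 20, 30]           # previous frame (illustrative values)
frame2 = [12, 22, 32]           # next frame (illustrative values)
taps0 = [(0, 0.5), (1, 0.5)]    # half-pel motion: two pixels share the weight
taps2 = [(1, 1.0)]              # integer-pel motion: a single pixel
residual = predict_pixel(20, frame0, frame2, taps0, taps2)
```

A conventional motion-compensated prediction is the special case in which each tap list contains a single pixel; letting several pixels contribute is what distinguishes the barbell function.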

In-Band MCTF
Purpose:
  Improve coding performance with spatial scalability
Methods:
  Leaky motion compensation
  Mode-based temporal filtering

In-Band MCTF
The forming of different-quality references of LL
  Low-quality reference as IP_DIR
  High-quality reference as IP_LBS
[Figure: one path takes LL through Sinc 4x4 interpolation to a 1/4-pixel interpolation reference based on LL; the other path takes LL, LH, HL, HH through IDWT + ODWT and Sinc 2x2 interpolation to a 1/4-pixel interpolation reference based on ODWT LL]

In-Band MCTF
Leaky motion compensation
  Leaky factor $\alpha$
  Attenuate the prediction based on information unknown at the decoder
  Make a good trade-off between drifting errors and coding efficiency
$$P(L^n_{2i+1}) = \tfrac{1}{2}\,MC\big((1-\alpha)\,IP\_DIR(L^n_{2i}) + \alpha\,IP\_LBS(L^n_{2i}),\ MV_{2i+1\to 2i}\big) + \tfrac{1}{2}\,MC\big((1-\alpha)\,IP\_DIR(L^n_{2i+2}) + \alpha\,IP\_LBS(L^n_{2i+2}),\ MV_{2i+1\to 2i+2}\big)$$
$$H^n_i = L^n_{2i+1} - P(L^n_{2i+1}), \qquad i = 0, \ldots, N-1$$
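The idea of leaky motion compensation can be sketched as a blend of the two references. The sketch assumes that the leaky factor `alpha` weights the high-quality reference (IP_LBS) and `1 - alpha` the low-quality one (IP_DIR), and it uses zero motion so that motion compensation reduces to the identity; both are simplifying assumptions, and the function names are hypothetical.

```python
def leaky_reference(ip_dir, ip_lbs, alpha):
    """Blend the low-quality (IP_DIR) and high-quality (IP_LBS) references."""
    return [(1 - alpha) * d + alpha * b for d, b in zip(ip_dir, ip_lbs)]

def leaky_highpass(cur, prev_refs, next_refs, alpha):
    """High-pass frame = current frame minus the mean of two leaky references.

    prev_refs and next_refs are (ip_dir, ip_lbs) pairs; zero motion is assumed.
    """
    prev = leaky_reference(prev_refs[0], prev_refs[1], alpha)
    nxt = leaky_reference(next_refs[0], next_refs[1], alpha)
    return [c - 0.5 * (p + n) for c, p, n in zip(cur, prev, nxt)]

refs = ([10, 20], [14, 28])     # illustrative (IP_DIR, IP_LBS) sample values
blend = leaky_reference(refs[0], refs[1], 0.5)
highpass = leaky_highpass([12, 24], refs, refs, 0.5)
```

With `alpha = 0` only the low-quality reference is used, which a low-resolution decoder can reproduce exactly (no drift); with `alpha = 1` the prediction relies fully on information that such a decoder lacks.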

In-Band MCTF
Mode-based temporal filtering
  Mode 1: Low-quality reference
  Mode 2: High-quality reference
  Mode is selected by RD cost
[Figure: LL references and the LL, HL, LH, HH subbands used by the two modes]
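Selecting a mode by RD cost is commonly expressed as minimizing a Lagrangian cost J = D + lambda * R. The sketch below assumes that form; the mode names, distortion and rate figures, and `select_mode` are hypothetical and serve only to show the comparison.

```python
def select_mode(candidates, lam):
    """Return the mode with the smallest cost J = D + lam * R.

    candidates maps a mode name to a (distortion, rate) pair.
    """
    return min(candidates,
               key=lambda mode: candidates[mode][0] + lam * candidates[mode][1])

# Illustrative numbers only: the high-quality reference predicts better
# (lower distortion) but is assumed here to cost more bits.
candidates = {
    "mode1_low_quality_ref": (100.0, 40),
    "mode2_high_quality_ref": (80.0, 60),
}
choice_low_lambda = select_mode(candidates, 0.5)
choice_high_lambda = select_mode(candidates, 2.0)
```

A small lambda favours the mode with lower distortion, while a large lambda favours the cheaper mode.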

Base-Layer Structure
Purpose:
  Coding efficiency improvement at low rates
  AVC compatible
Methods:
  Insert AVC encoding module into MCTF

Base-Layer Structure
[Figure: Encoder and Decoder block diagrams]

Outline
Overview
MCTF in Interframe Wavelet
MCTF in JSVM
  Base layer structure
  Inter-layer prediction
  Adaptive prediction/update steps
Comparison
References

Base Layer Structure
Purpose
  Coding efficiency improvement at low rates
  Compatibility with AVC
Methods
  Unrestricted MCTF (UMCTF)
  Hierarchical B pictures

Base Layer Structure
UMCTF
  Update step is omitted.
Hierarchical B pictures
  Fully compatible with AVC Main profile
  Non-dyadic decomposition is available
[Figure: AVC Main Profile compatible base layer with pictures A, B1, B2, B3, and MCTF enhancement layer with pictures L3, H3, H2, H1; GOP boundaries are marked]
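The picture labels of a dyadic hierarchical B structure can be derived from the picture index alone. The sketch below assumes a power-of-two GOP size, key pictures labelled A at the GOP boundaries, and B pictures labelled so that B1 is the coarsest temporal level; `hierarchical_b_labels` is a hypothetical helper.

```python
def hierarchical_b_labels(gop_size):
    """Label pictures 0..gop_size of one GOP in display order."""
    assert gop_size > 0 and gop_size & (gop_size - 1) == 0, "power of two expected"
    levels = gop_size.bit_length() - 1          # log2(gop_size)
    labels = []
    for i in range(gop_size + 1):
        if i % gop_size == 0:
            labels.append("A")                  # key picture at a GOP boundary
        else:
            trailing_zeros = (i & -i).bit_length() - 1
            labels.append(f"B{levels - trailing_zeros}")
    return labels

labels = hierarchical_b_labels(8)
```

For a GOP of 8 this yields one B1, two B2, and four B3 pictures between two key pictures. Because no update step is applied, each B picture depends only on pictures of coarser levels.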

Non-Dyadic Decomposition
[Figure: non-dyadic temporal decomposition showing the frames L0, H1, L1, H2, L2, H3, and L3 across four levels]
Level 0: full resolution
Level 1: 1/2 of the full resolution
Level 2: 1/4 of the full resolution
Level 3: 1/12 of the full resolution
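The level resolutions are consistent with temporal decimation factors of 2, 2, and 3, since 1/4 divided by 3 gives 1/12. The sketch below checks that arithmetic with exact fractions; the factor list is inferred from the listed resolutions, and `temporal_resolutions` is a hypothetical helper.

```python
from fractions import Fraction

def temporal_resolutions(factors):
    """Resolution of each level relative to level 0, given per-level factors."""
    resolutions = [Fraction(1)]
    for factor in factors:
        resolutions.append(resolutions[-1] / factor)
    return resolutions

# Assumed factors: two dyadic levels followed by one factor-3 level.
levels = temporal_resolutions([2, 2, 3])
```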

Inter-Layer Prediction
Purpose
  Reduce redundancy between layers
Methods
  Inter-layer texture prediction
  Inter-layer motion prediction

Inter-Layer Prediction
[Figure: three-layer encoder. The Video input and its 2D Decimation (by 2) and 2D Decimation (by 4) versions each pass through MCTF, Motion Coding, Prediction, Spatial Transform, and SNR Scalable Entropy Coding; Interpolation blocks link the layers, and a Multiplex block forms the Bitstream]
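Inter-layer texture prediction can be sketched as predicting the enhancement layer from the interpolated base layer and coding only the residual. The sketch uses 1D signals, averaging for decimation, and sample repetition for interpolation; these filters and the function names are simplifying assumptions, not the filters of the JSVM.

```python
def decimate2(x):
    """Base layer: average each pair of samples (decimation by 2)."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]

def interpolate2(x):
    """Upsample by 2 using sample repetition."""
    out = []
    for value in x:
        out += [value, value]
    return out

def inter_layer_residual(enhancement, base_reconstruction):
    """Residual left after predicting the enhancement layer from the base layer."""
    prediction = interpolate2(base_reconstruction)
    return [e - p for e, p in zip(enhancement, prediction)]

enhancement = [10, 12, 20, 22]          # illustrative sample values
base = decimate2(enhancement)
residual = inter_layer_residual(enhancement, base)
rebuilt = [p + r for p, r in zip(interpolate2(base), residual)]
```

The residual is much smaller than the original samples, which is the redundancy reduction that inter-layer prediction aims for.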

Adaptive Prediction/Update Steps
Purpose:
  Delay (memory) control
Method:
  Sub-partitioning of the GOP

Adaptive Prediction/Update Steps
[Figure: four levels of prediction and update steps over frames 0 to 16, producing H and L frames at each level; GOP borders and sub-partition borders are marked]
Coding order: 1 0 2 4 6 5 7 3 9 8 10 12 14 13 15 11

Outline
Overview
MCTF in Interframe Wavelet video
MCTF in JSVM
Comparison
  Pros and cons
  Experimental results
References

Wavelet Based SVC
Key features
  3D wavelet decomposition
  Open-loop prediction structure
  Spatial-temporal resolution scalability
  SNR scalability

Wavelet Based SVC
Advantages
  Natural fit for multi-resolution scalability
  Open-loop prediction structure
  Provides elegant SNR scalability without impairing full exploitation of spatial-temporal correlation
  Simplifies the R-D model of the bitstreams
  Facilitates bitstream truncation
    Each subband is independent of the other subbands

Wavelet Based SVC
Disadvantages
  Decomposition modes (coding modes) selection
  Texture & side information trade-off
  Intra-prediction
  Badly-matched blocks
  Downsampling filter problems

AVC Based SVC
Key features
  MCTF/Hierarchical B structure for temporal scalability
  Hierarchical B structure with closed-loop structure for base layer
  Multiple spatial layers for spatial scalability
  Multiple FGS layers at each spatial resolution for SNR scalability
  DCT coding of all the frames

AVC Based SVC
Advantages
  All the RDO and intra-prediction can be used.
    It guarantees the quality of the first testing point.
  MPEG filter for low-resolution video
    The target low-resolution video is visually good.

AVC Based SVC
Disadvantages
  Redundancy between spatial layers

Experiments
[Figure: Foreman. PSNR (dB), 30 to 36, versus Rate (kbps), 0 to 450. Curves: JSVM1 CIF30, JSVM1 CIF15, JSVM1 CIF7.5, MSRA CIF30, MSRA CIF15, MSRA CIF7.5, JSVM1 CIF30 with default config, JSVM1 CIF15 with default config, JSVM1 CIF7.5 with default config]

Experiments
[Figure: Bus. PSNR (dB), 26 to 33, versus Rate (kbps), 0 to 900. Curves: JSVM1 CIF30, JSVM1 CIF15, JSVM1 CIF7.5, MSRA CIF30, MSRA CIF15, MSRA CIF7.5, JSVM1 CIF30 with default config, JSVM1 CIF15 with default config, JSVM1 CIF7.5 with default config]

References
[1] “Draft of joint scalable video model (JSVM) 3.0 reference encoding algorithm description”, ISO/IEC JTC1/SC29/WG11, N7311, Poznan, July 2005.
[2] D. Zhang, J. Xu, H. Xiong, and F. Wu, “Improvement for in-band video coding with spatial scalability”, ISO/IEC JTC1/SC29/WG11, M11681, Hong Kong, Jan. 2005.
[3] V. Bottreau, G. Pau, and J. Xu, “Vidwav evaluation software manual”, ISO/IEC JTC1/SC29/WG11, M12176, Poznan, July 2005.
[4] X. Ji, J. Xu, D. Zhao, and F. Wu, “Responses of CE1d: base-layer”, ISO/IEC JTC1/SC29/WG11, M11127, Redmond, July 2004.
[5] R. Xiong, J. Xu, and F. Wu, “Coding performance comparison between MSRA wavelet video coding and JSVM”, ISO/IEC JTC1/SC29/WG11, M11975, Busan, April 2005.
[6] R. Xiong, J. Xu, and F. Wu, “Response to VidWav EE1”, ISO/IEC JTC1/SC29/WG11, M12286, Poznan, July 2005.
[7] J. Reichel, K. Hanke, and B. Popescu, “Scalable Video Model V1.0”, ISO/IEC JTC1/SC29/WG11, N6372, Munich, March 2004.