Communication & Multimedia C. -Y. Tsai 2005/8/17 1 MCTF in Current Scalable Video Coding Schemes...


MCTF in Current Scalable Video Coding Schemes

Student: Chia-Yang Tsai
Advisor: Prof. Hsueh-Ming Hang

Institute of Electronics, NCTU


Outline

Overview
MCTF in Interframe Wavelet
MCTF in JSVM
Comparison
References


Outline

Overview
  Scalable Video Coding
MCTF in Interframe Wavelet
MCTF in JSVM
Comparison
References


Scalable Video Coding

Ability to adjust
  Different client requirements
Scalabilities
  Rate/SNR
  Spatial
  Temporal

[Figure: network of Server, Database, Workstation, Personal Computer, and Mobile]


MCTF

MCTF = Motion Compensated Temporal Filtering


Rate/SNR Scalability

Progressive approximation

[Figure: bitstream layout (GOP Header, Motion Info., Image Data) decoded at three rates]
  300 kbps: PSNR = 32.2 dB
  500 kbps: PSNR = 34.6 dB
  1000 kbps: PSNR = 38.2 dB


Spatial Scalability

Wavelet decomposition provides spatial scalability

[Figure: wavelet decomposition followed by a Bit-plane Coder]


Temporal Scalability

[Figure: temporal decomposition of a 30 Hz video sequence into high-pass frames H1, H2, H3, H4 and a low-pass frame L; the decodable sequences are 30 Hz, 15 Hz, 7.5 Hz, and 3.75 Hz]
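The dyadic frame rates in the figure can be reproduced with a short sketch. It only illustrates the halving rule; the function name `temporal_layers` and the choice of three discarded levels are assumptions made here to match the 30 Hz sequence in the figure, not part of any codec.

```python
def temporal_layers(full_rate_hz, levels):
    """Return the frame rates obtained by dropping 0..levels high-pass levels."""
    # Each temporal decomposition level keeps every second frame as a
    # low-pass frame, so discarding its high-pass frames halves the rate.
    return [full_rate_hz / (2 ** k) for k in range(levels + 1)]


rates = temporal_layers(30.0, 3)
print(rates)  # [30.0, 15.0, 7.5, 3.75]
```

A decoder that wants a lower frame rate simply ignores the high-pass frames of the finest remaining level.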


Scalable Video Coding

History

[Figure: timeline marked 2004.3, 2004.7, and 2005 with MSRA (wavelet), RPI (wavelet), UNSW (wavelet), HHI (AVC-based), and JSVM]


Approaches

An AVC/H.264-based approach (also DCT-based)

[Figure: block diagram. Blocks: Video, 2D Spatial Decimation, CIF and QCIF layers, 5/3 MCTF (per layer), AVC 4x4 integer transform (per layer), 2D Spatial Interpolation, Spatio-Temporal "Transform", Layered AVC Texture Coding, Motion Coding, Entropy Coding, Bitstream]



A wavelet-based approach with t+2D structure

[Figure: block diagram. Blocks: Video, 5/3 based MCTF, 2D Spatial DWT, Texture Coding, Motion Coding, Entropy Coding, Bitstream]



A wavelet-based approach with 2D+t structure

[Figure: block diagram. Blocks: Video, 2D Spatial DWT, 2D Spatial Decomposition (LL and HF branches), Temporal Transform (MCTF based) on LL, In-band Temporal Transform (MCTF based) on HF, Texture Coding, Motion Coding, Entropy Coding, Bitstream]


Lifting Scheme

5/3 lifting scheme

[Figure: (a) Lifting scheme (analysis filterbank): the input S_k is split into S_2k and S_2k+1 (delay z^-1, downsampling by 2); the prediction step P and update step U, followed by the scaling factors F_h and F_l, give the outputs h_k and l_k. (b) Inverse lifting scheme (synthesis filterbank): scaling by F_h^-1 and F_l^-1, the U and P steps in reverse order, upsampling by 2, and the advance z reconstruct S_k]
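The analysis and synthesis filterbanks can be sketched in a few lines. This is a minimal illustration of 5/3 lifting on a 1-D signal without motion compensation; the function names, the mirrored boundary handling, and the omission of the scaling factors F_h and F_l are assumptions made here for brevity.

```python
def lift_53_forward(s):
    """Split an even-length signal into low-pass l and high-pass h bands."""
    even, odd = s[0::2], s[1::2]
    n = len(odd)
    # Prediction step P: each odd sample minus the mean of its even
    # neighbours (the last even sample is mirrored at the right border).
    h = [odd[k] - 0.5 * (even[k] + even[min(k + 1, n - 1)]) for k in range(n)]
    # Update step U: each even sample plus a quarter of the neighbouring
    # high-pass samples (the first one is mirrored at the left border).
    l = [even[k] + 0.25 * (h[max(k - 1, 0)] + h[k]) for k in range(n)]
    return l, h


def lift_53_inverse(l, h):
    """Undo the update step, then the prediction step, and re-interleave."""
    n = len(h)
    even = [l[k] - 0.25 * (h[max(k - 1, 0)] + h[k]) for k in range(n)]
    odd = [h[k] + 0.5 * (even[k] + even[min(k + 1, n - 1)]) for k in range(n)]
    s = [0.0] * (2 * n)
    s[0::2], s[1::2] = even, odd
    return s


low, high = lift_53_forward([3, 5, 8, 6, 4, 4, 7, 9])
restored = lift_53_inverse(low, high)
```

Because the inverse subtracts exactly what the forward transform added, reconstruction is perfect whatever the P and U operators are. That property is what lets MCTF place motion compensation inside the lifting steps.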


Outline

Overview
MCTF in Interframe Wavelet
  Barbell lifting
  In-band MCTF
  Base-layer structure
MCTF in JSVM
Comparison
References


Barbell Lifting Scheme

Purpose:
  Improve the accuracy of motion field.
Methods:
  Take (5,3) wavelet kernel.
  Use "barbell function" to generate prediction/update values.


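Barbell Lifting Scheme

$$t = a\,\hat{s}_0 + s_1 + a\,\hat{s}_2, \qquad \hat{s}_0 = f_0(S_0), \quad \hat{s}_2 = f_2(S_2)$$

[Figure: frames S0, S1, S2 along the time axis t; the barbell functions f_0 and f_2 map pixel groups of S0 and S2 to \hat{s}_0 and \hat{s}_2, which are combined with s_1 using the weight a]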
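A single barbell lifting step can be sketched on 1-D frames as follows. The sketch assumes that a barbell function is a weighted sum of several pixels around the motion-displaced position in the neighbouring frame; the helper names, the integer motion vectors, the border clamping, and the example weights are all hypothetical choices made for illustration.

```python
def barbell(frame, pos, weights):
    """Barbell function: weighted sum of the pixels around position pos.

    weights maps an integer offset to its weight; positions outside the
    frame are clamped to the border.
    """
    last = len(frame) - 1
    return sum(w * frame[min(max(pos + d, 0), last)] for d, w in weights.items())


def barbell_lifting_step(s0, s1, s2, mv0, mv2, weights, a):
    """Compute t = a*f0(S0) + s1 + a*f2(S2) for every pixel of frame S1."""
    return [
        a * barbell(s0, x + mv0[x], weights)
        + s1[x]
        + a * barbell(s2, x + mv2[x], weights)
        for x in range(len(s1))
    ]


frame = [10.0] * 6
zero_mv = [0] * 6
weights = {-1: 0.25, 0: 0.5, 1: 0.25}  # hypothetical three-tap barbell
# a = -1/2 corresponds to the prediction step of the (5,3) kernel.
residual = barbell_lifting_step(frame, frame, frame, zero_mv, zero_mv, weights, -0.5)
```

With three identical frames and zero motion the high-pass residual is zero everywhere, which is the behaviour expected from a prediction step.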


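Barbell Lifting Scheme

[Figure: barbell lifting over frames X0 to X4.
Prediction Stage: the high-pass frames H0 and H1 are formed using the barbell predictions \hat{x}_0 = f_0(X_0), \hat{x}_2 = f_2(X_2), \hat{x}''_2 = f''_2(X_2), \hat{x}_4 = f_4(X_4), each weighted by -a.
Update Stage: the low-pass frames L0, L1, L2 are formed from x_0, x_2, x_4 using the barbell updates \hat{h}_0 = g_0(H_0), \hat{h}''_0 = g''_0(H_0), \hat{h}_1 = g_1(H_1), \hat{h}''_1 = g''_1(H_1), with weights b and 2b]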


In-Band MCTF

Purpose:
  Improve coding performance with spatial scalability
Methods:
  Leaky motion compensation
  Mode-based temporal filtering



The forming of different quality references of LL

Low quality reference as IP_DIR
  [Figure: LL -> Sinc 4x4 interpolation -> 1/4-pixel interpolation reference based on LL]
High quality reference as IP_LBS
  [Figure: LL, LH, HL, HH -> IDWT + ODWT -> ODWT LL -> Sinc 2x2 interpolation -> 1/4-pixel interpolation reference based on ODWT LL]



Leaky motion compensation

Leaky factor $\alpha$
  Attenuate the prediction based on the unknown information at the decoder
  Make a good trade-off between drifting errors and coding efficiency

$$H^n_{2i+1} = L^n_{2i+1} - P(L^n_{2i}, L^n_{2i+2}), \qquad i = 0, \ldots, N-1$$

$$P(L^n_{2i}, L^n_{2i+2}) = \tfrac{1}{2}\, MC\big((1-\alpha)\, IP\_DIR(L^n_{2i}) + \alpha\, IP\_LBS(L^n_{2i}),\ MV_{2i+1 \to 2i}\big) + \tfrac{1}{2}\, MC\big((1-\alpha)\, IP\_DIR(L^n_{2i+2}) + \alpha\, IP\_LBS(L^n_{2i+2}),\ MV_{2i+1 \to 2i+2}\big)$$
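The effect of the leaky factor can be sketched on 1-D rows of pixels. The code assumes that the prediction blends the low quality reference IP_DIR and the high quality reference IP_LBS before motion compensation; the function names, the symbol `alpha` for the leaky factor, and the integer motion vectors are assumptions made for illustration.

```python
def motion_compensate(ref, mv):
    """Shift a 1-D reference by an integer motion vector, clamping at borders."""
    last = len(ref) - 1
    return [ref[min(max(x + mv, 0), last)] for x in range(len(ref))]


def leaky_reference(ip_dir, ip_lbs, alpha):
    """Blend the low quality (IP_DIR) and high quality (IP_LBS) references."""
    return [(1 - alpha) * d + alpha * b for d, b in zip(ip_dir, ip_lbs)]


def leaky_highpass(cur, prev_refs, next_refs, mv_prev, mv_next, alpha):
    """High-pass row: cur minus the average of the two leaky predictions."""
    p = motion_compensate(leaky_reference(*prev_refs, alpha), mv_prev)
    q = motion_compensate(leaky_reference(*next_refs, alpha), mv_next)
    return [c - 0.5 * (a + b) for c, a, b in zip(cur, p, q)]


cur = [10.0] * 4
refs = ([8.0] * 4, [10.0] * 4)  # (IP_DIR, IP_LBS) of a neighbouring frame
for alpha in (0.0, 0.5, 1.0):
    print(alpha, leaky_highpass(cur, refs, refs, 0, 0, alpha))
```

With `alpha = 0` only the reference available to the low-resolution decoder is used, so there is no drift but the residual is larger. With `alpha = 1` the prediction is best at full resolution but relies on information the low-resolution decoder does not have.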



Mode-based temporal filtering
  Mode 1: Low quality reference
  Mode 2: High quality reference
  Mode is selected by RD cost

[Figure: reference subbands for the two modes, LL only and LL, HL, LH, HH]
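Mode selection by RD cost can be sketched as follows, assuming the common Lagrangian form J = D + lambda * R. The slides do not state the cost function, so the formula, the function name `select_mode`, and the numbers in the example are illustrative assumptions.

```python
def select_mode(candidates, lam):
    """Pick the mode with the smallest Lagrangian cost J = D + lam * R.

    candidates maps a mode name to a (distortion, rate_bits) pair.
    """
    return min(candidates, key=lambda m: candidates[m][0] + lam * candidates[m][1])


# Hypothetical distortion/rate measurements for one block.
candidates = {
    "mode1_low_quality_ref": (120.0, 40),
    "mode2_high_quality_ref": (90.0, 70),
}
print(select_mode(candidates, 0.5))  # mode2_high_quality_ref
print(select_mode(candidates, 2.0))  # mode1_low_quality_ref
```

A small multiplier favours the mode with lower distortion, while a large one favours the cheaper mode.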


Base-Layer Structure

Purpose:
  Coding efficiency improvement at low rates
  AVC compatible
Methods:
  Insert AVC encoding module into MCTF


Base-Layer Structure

[Figure: Encoder and Decoder block diagrams]


Outline

Overview
MCTF in Interframe Wavelet
MCTF in JSVM
  Base layer structure
  Inter-layer prediction
  Adaptive prediction/update steps
Comparison
References


Base Layer Structure

Purpose
  Coding efficiency improvement at low rates
  Compatibility with AVC
Methods
  Unrestricted MCTF (UMCTF)
  Hierarchical B pictures


Base Layer Structure

UMCTF
  Update step is omitted.
Hierarchical B pictures
  Fully compatible with AVC Main profile
  Non-dyadic decomposition is available

[Figure: one GOP between GOP boundaries. AVC Main Profile compatible base layer: pictures A, B1, B2, B3. MCTF enhancement layer: frames L3, H3, H2, H1]
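The picture labels in the figure follow a dyadic hierarchy, which the sketch below reproduces in display order. It assumes a GOP size that is a power of two (8 here, as the labels suggest) and covers only the dyadic case; the function name is hypothetical.

```python
def hierarchical_b_labels(gop_size):
    """Label frames 0..gop_size of a dyadic GOP in display order."""
    labels = ["A"]  # anchor picture at the GOP boundary
    for i in range(1, gop_size):
        level, step = 1, gop_size // 2
        # The coarsest B level sits at the GOP midpoint; each finer level
        # halves the spacing until it divides the frame index.
        while i % step != 0:
            step //= 2
            level += 1
        labels.append(f"B{level}")
    labels.append("A")
    return labels


print(hierarchical_b_labels(8))
# ['A', 'B3', 'B2', 'B3', 'B1', 'B3', 'B2', 'B3', 'A']
```

Dropping all pictures of the highest B level halves the frame rate, which is how this structure provides temporal scalability.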


Non-Dyadic Decomposition

[Figure: non-dyadic temporal decomposition into low-pass frames L0, L1, L2, L3 and high-pass frames H1, H2, H3]
  Level 0: full resolution
  Level 1: 1/2 of the full resolution




Inter-Layer Prediction

Purpose
  Reduce redundancy between layers
Methods
  Inter-layer texture prediction
  Inter-layer motion prediction


Inter-Layer Prediction

[Figure: encoder with three spatial layers. The input video is coded at full resolution and after 2D decimation by 2 and by 4. Each layer has MCTF producing motion and texture, motion coding, prediction, and a spatial transform with SNR scalable entropy coding; interpolation links each lower layer to the prediction of the next layer, and a multiplexer forms the bitstream]
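Inter-layer texture prediction can be sketched on a 1-D row of pixels. The decimation by pair averaging and the interpolation by sample repetition are stand-ins chosen for brevity, since the slides do not specify the filters; all function names are hypothetical.

```python
def decimate_by_2(signal):
    """Low-resolution layer: average each pair of samples."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]


def interpolate_by_2(signal):
    """Upsample by sample repetition (a stand-in for a real interpolation filter)."""
    return [v for v in signal for _ in range(2)]


def inter_layer_residual(enh, base_recon):
    """Enhancement-layer texture after subtracting the upsampled base layer."""
    pred = interpolate_by_2(base_recon)
    return [e - p for e, p in zip(enh, pred)]


enh = [10, 12, 20, 22]
base = decimate_by_2(enh)                   # [11.0, 21.0]
residual = inter_layer_residual(enh, base)  # [-1.0, 1.0, -1.0, 1.0]
```

Only the small residual has to be coded in the enhancement layer, which is the redundancy reduction that inter-layer prediction aims for.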


Adaptive Prediction/Update Steps

Purpose:
  Delay (memory) control
Method:
  Sub-partitioning of GOP


Adaptive Prediction/Update Steps

[Figure: MCTF of a 16-frame GOP (frames 1 to 16) split into sub-partitions. Four prediction/update stages produce the H and L frames of each level. Coding order: 1 0 2 4 6 5 7 3 9 8 10 12 14 13 15 11. GOP borders and sub-partition borders are marked]


Outline

Overview
MCTF in Interframe Wavelet video
MCTF in JSVM
Comparison
  Cons and pros
  Experimental results
References


Wavelet Based SVC

Key features
  3D wavelet decomposition
  Open-loop prediction structure
  Spatial-temporal resolution scalability
  SNR scalability

Advantages
  Natural fit for multi-resolution scalability
  Open-loop prediction structure
    Provides elegant SNR scalability without impairing full exploitation of spatial-temporal correlation
    Simplifies the R-D model of the bitstreams
    Facilitates the bitstream truncation
      Each subband is independent of the other subbands

Disadvantages
  Decomposition modes (coding modes) selection
  Texture & side information trade-off
  Intra-prediction
  Badly-matched blocks
  Downsampling filter problems


AVC Based SVC

Key features
  MCTF/Hierarchical B structure for temporal scalability
  Hierarchical B structure with closed-loop structure for base layer
  Multiple spatial layers for spatial scalability
  Multiple FGS layers at each spatial resolution for SNR scalability
  DCT coding of all the frames

Advantages
  All the RDO and intra-prediction can be used.
    It guarantees the quality of the first testing point.
  MPEG filter for low resolution video
    The target low resolution video is visually good.

Disadvantages
  Redundancy between spatial layers


Experiments

[Figure: Foreman, PSNR (dB, 30 to 36) versus Rate (kbps, 0 to 450). Curves: JSVM1 CIF30, JSVM1 CIF15, JSVM1 CIF7.5, MSRA CIF30, MSRA CIF15, MSRA CIF7.5, JSVM1 CIF30 with default config, JSVM1 CIF7.5 with default config]


Experiments

[Figure: Bus, PSNR (dB, 26 to 33) versus Rate (kbps, 0 to 900). Curves: JSVM1 CIF30, JSVM1 CIF15, JSVM1 CIF7.5, MSRA CIF30, MSRA CIF15, MSRA CIF7.5, JSVM1 CIF7.5 with default config]


References

[1] “Draft of joint scalable video model (JSVM) 3.0 reference encoding algorithm description”, ISO/IEC JTC1/SC29/WG11, N7311, Poznan, July 2005.

[2] D. Zhang, J. Xu, H. Xiong, and F. Wu, “Improvement for in-band video coding with spatial scalability”, ISO/IEC JTC1/SC29/WG11, M11681, HongKong, Jan. 2005.

[3] V. Bottreau, G. Pau, and J. Xu, “Vidwav evaluation software manual”, ISO/IEC JTC1/SC29/WG11, M12176, Poznan, July 2005.

[4] X. Ji, J. Xu, D. Zhao, and F. Wu, “Responses of CE1d: base-layer”, ISO/IEC JTC1/SC29/WG11, M11127, Redmond, July 2004.

[5] R. Xiong, J. Xu, and F. Wu, “Coding performance comparison between MSRA wavelet video coding and JSVM”, ISO/IEC JTC1/SC29/WG11, M11975, Busan, April 2005.

[6] R. Xiong, J. Xu, and F. Wu, “Response to VidWav EE1”, ISO/IEC JTC1/SC29/WG11, M12286, Poznan, July 2005.

[7] J. Reichel, K. Hanke and B. Popescu, “Scalable Video Model V1.0”, ISO/IEC JTC1/SC29/WG11, N6372, Munich, March 2004.
