A WYNER-ZIV TO H.264 VIDEO TRANSCODER

José Luis Martínez, Pedro Cuenca, Gerardo Fernández-Escribano, Francisco José Quiles and Hari Kalva

February 27, 2009

Barcelona, Spain

RTCTCM 2009 2

OVERVIEW

1. Introduction/Motivation

2. H.264 and WZ Video Coding Paradigms

3. Proposed WZ/H.264 Video Transcoder

4. Results

5. Conclusion

Introduction/Motivation (I)

[Figure: communication tower with a transcoder in the network]

• Requirements:
  • Low-complexity devices: low-complexity encoding / decoding algorithms
  • Low battery consumption
  • Low cost
• Solution adopted:
  • Traditional video coding with low-complexity tools
  • Low-complexity profiles (baseline, etc.)
  • Rate-distortion loss

Introduction/Motivation (II)

• What is video transcoding?
  • The process of converting a video sequence encoded in format A to format B. The conversion can affect one or more of the coding parameters, such as frame rate, bitrate, resolution, quality, error resilience, etc.
• When is it necessary?
  • When there is a mismatch in video capabilities and resources between the sender and the receiver
    – Resource mismatch, e.g. resolution (sender has CIF and the receiver requires QCIF), bandwidth, computing resources, battery life, etc.
    – Format mismatch, e.g. sender has MPEG-2 video and the receiver requires H.264 video
  • Then we need transcoding
• How to transcode?
  – Decode video A and then encode to video B
• How to transcode efficiently?
  – Complexity reduction is the key problem
  – Manage quality vs. complexity tradeoffs

[Figure: video transcoder]

H.264 Video Coding Paradigm

• H.264 is becoming popular, why?

• The H.264/AVC standard achieves much higher coding efficiency than the H.263, MPEG-2, and MPEG-4 standards, due to its improved inter and intra prediction modes at the expense of higher computation complexity. Therefore, H.264/AVC is a strong candidate for a wide range of applications in the near future.

• Reducing the complexity of motion estimation (ME) and motion compensation (MC) in H.264-based transcoders is the key to developing fast real-time systems.

H.264
• Profiles: Baseline, Main, Extended, and FRExt
• Motion compensation block size: from 16x16 down to 4x4 (frame/field)
• Motion vector accuracy: quarter-pixel
• Transform: 4x4 integer approximation of the DCT
• Reference frames: up to 16
• Built-in deblocking filter: yes
• Intra prediction: yes

WZ Video Coding Paradigm

• Low-complexity encoding, made possible by more complex decoding

• Side information: an estimate (X´i) available at the decoder.

• What do we use to correct errors in communications?

– ERROR CORRECTION CODES

– Coset codes, Turbo codes, LDPC, …

– Turbo Trellis Coded Modulation (TTCM)

Emerging Challenges

• Applications (from down-link to up-link)
  • Wireless digital video cameras
  • Multimedia mobile phones and PDAs
  • Low-power video sensors and surveillance cameras
  • Wireless video teleconferencing systems
• Requirements
  • Light and flexible distribution of codec complexity
  • Robustness to packet/frame losses
  • High compression efficiency
  • Low latency
• Target
  • Inter coding efficiency
  • Intra coding complexity (encoder)
  • Intra coding robustness

Proposed WZ/H.264 Transcoder

[Figure: block diagram of the proposed transcoder, a Wyner-Ziv decoder followed by an H.264 encoder. Wyner-Ziv decoder blocks: regular intra-frame decoder for the key frames bitstream (regular coded bitstream, H.264 I frames), side information generation from X'2i-1 and X'2i+1, DCT (Y2i), TTCM decoder fed by the parity bit stream (q'2i), reconstruction, IDCT (X'2i) and buffer. H.264 encoder blocks: ME, MC, inter/intra selection, intra prediction, T, Q, reorder, entropy encode, T^-1, Q^-1, deblocking filter and NAL, with F n-1 (reference), F n (current) and F'n (reconstructed). The diagram also labels the MVs.]

1st Step: Reducing the Motion Estimation

Exhaustive Search
• Block size = $M \times N$
• Search area: $(M + 2d_{max}) \times (N + 2d_{max})$
• Number of search points = $(2d_{max} + 1)^2$

DMW Search
• Side information motion vector: $(MV_x, MV_y)$
• Search area: $x^2 + y^2 \le MV_x^2 + MV_y^2$, where $x$ and $y$ are the coordinates of the candidate points
• Number of search points: the points inside the circumference
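The two search-window definitions on this slide can be compared numerically. The sketch below is only an illustration under stated assumptions: candidates are integer-pixel displacements, both windows are centred on zero displacement, and the values `d_max = 16` and the side information motion vector `(3, 4)` are hypothetical. The function names are not taken from the transcoder.

```python
import math


def exhaustive_points(d_max):
    """Candidate displacements in a square window of +/- d_max pixels."""
    return [(x, y)
            for x in range(-d_max, d_max + 1)
            for y in range(-d_max, d_max + 1)]


def circular_points(mv_x, mv_y):
    """Integer candidates satisfying x^2 + y^2 <= MVx^2 + MVy^2."""
    radius_sq = mv_x * mv_x + mv_y * mv_y
    r = math.isqrt(radius_sq)  # integer bounding box that contains the disc
    return [(x, y)
            for x in range(-r, r + 1)
            for y in range(-r, r + 1)
            if x * x + y * y <= radius_sq]


full = exhaustive_points(16)     # hypothetical d_max
reduced = circular_points(3, 4)  # hypothetical side information MV
print(len(full), len(reduced))   # 1089 81
```

With these example values the circular window keeps 81 of the 1089 exhaustive candidates. A zero side information vector reduces the window to the single point (0, 0).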

2nd Step: Reducing the Inter Prediction

• Inter-frame coding in H.264 supports seven block partition modes (16x16, 16x8, 8x16, 8x8, 8x4, 4x8 and 4x4)

• H.264 adopts spatial-domain intra prediction for block sizes 16x16 and 4x4, which offer four and nine prediction modes, respectively.

• Motion estimation is carried out for each of these partitions
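To give a feel for the cost of evaluating every inter partition, the following sketch counts how many blocks need a motion search when each of the seven partition sizes is tried over a whole 16x16 macroblock. It is a simplification: it assumes one search per block and a single reference frame, and it treats the sub-8x8 sizes as if they were applied to the entire macroblock. The names are illustrative only.

```python
MB_SIZE = 16  # an H.264 macroblock covers 16x16 luma samples

# The seven inter partition sizes (width, height) listed on the slide.
INTER_MODES = [(16, 16), (16, 8), (8, 16), (8, 8), (8, 4), (4, 8), (4, 4)]


def blocks_per_macroblock(width, height):
    """Number of width x height blocks needed to tile one macroblock."""
    return (MB_SIZE // width) * (MB_SIZE // height)


searches = {f"{w}x{h}": blocks_per_macroblock(w, h) for w, h in INTER_MODES}
total = sum(searches.values())

print(searches)
print("motion searches per macroblock:", total)  # 41
```

Under these assumptions an exhaustive mode decision runs 41 block searches per macroblock and per reference frame, which is why pruning the mode list pays off.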

Background

• What is data mining?
  • Algorithms and techniques that allow computers to learn
• A decision tree is built by mapping the observations about a set of data onto a tree made of arcs and nodes
• The goal is to reduce the macroblock (MB) mode search in H.264 to a classification problem
• The decision trees use information from the incoming WZ video:
  • Sum of Absolute Differences (SAD) values, motion vector (MV) length and the amount of pixels that have to be reconstructed
• The coding mode of the corresponding MBs in H.264 is also saved for training purposes

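[Figure: training setup. A video sequence passes through the WZ decoder and the H.264 encoder (pixel data). SAD, MV length and the amount of pixels from the WZ decoder, together with the H.264 MB coding modes, feed the data mining tools that produce the decision trees.]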
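As an illustration of the kind of per-macroblock attributes named on this slide, the sketch below computes a SAD value and a motion vector length and packs them, with a pixel count, into one training record. The slides do not say which two blocks the SAD compares or how the MV length is measured, so the block pairing, the Euclidean length and all function and key names here are assumptions.

```python
import math


def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized pixel blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def mv_length(mv_x, mv_y):
    """Euclidean length of a motion vector (an assumed definition)."""
    return math.hypot(mv_x, mv_y)


def feature_record(block, prediction, mv, pixels_to_reconstruct, h264_mode):
    """One training row: WZ-side attributes plus the H.264 mode as the class."""
    return {
        "sad": sad(block, prediction),
        "mv_length": mv_length(*mv),
        "pixels": pixels_to_reconstruct,
        "class": h264_mode,
    }


# Tiny 2x2 example with made-up pixel values.
row = feature_record([[1, 2], [3, 4]], [[1, 1], [5, 4]], (3, 4), 2, "16x16")
print(row)
```

Rows like this, one per macroblock, are what a data mining tool would consume to learn the mapping from WZ information to H.264 coding modes.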

Background

• The decision tree for MB mode classification was built using the WEKA data mining tool

• WEKA is a collection of machine learning algorithms for data mining tasks. The algorithms can be applied directly to a dataset

• It is open source software

• ARFF files are used to prepare the data sets for training

@RELATION 4x4_mean_variance
@ATTRIBUTE mean0 NUMERIC
@ATTRIBUTE variance0 NUMERIC
@ATTRIBUTE mean1 NUMERIC
@ATTRIBUTE variance1 NUMERIC
..............................
@ATTRIBUTE mean15 NUMERIC
@ATTRIBUTE variance15 NUMERIC
@ATTRIBUTE h263_CBPC_0 {0,1}
@ATTRIBUTE h263_CBPC_1 {0,1}
@ATTRIBUTE h263_CBPC_2 {0,1}
@ATTRIBUTE h263_CBPC_3 {0,1}
@ATTRIBUTE h263_mode {0,3}
@ATTRIBUTE class {1,8,9}

Background

[Figure: hierarchical decision tree, built up in four steps. Diagram labels: WZ Information, WEKA Tree, H.264 freedom. The decision nodes are numbered 1 to 7.]

• All MB modes
  • LOW COMPLEXITY MODES {SKIP, 16x16, 16x8, 8x16}
    • {SKIP, 16x16}
    • {16x8, 8x16}
  • HIGH COMPLEXITY MODES {INTRA, 8x8, 8x4, 4x8, 4x4}
    • {8x8, 8x4, 4x8}
      • {8x8}
      • {8x4, 4x8}
    • {INTRA, 4x4}
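The hierarchy of mode sets on this slide can be sketched as a cascade of binary decisions that ends in a small candidate set. This is not the trained classifier: in the presented work each decision is a WEKA-generated tree over WZ information, whereas the sketch below stands in for every decision with a single SAD threshold. All threshold values and names are hypothetical.

```python
# Hypothetical thresholds standing in for the trained WEKA decision nodes.
HYPOTHETICAL_THRESHOLDS = {
    "low_vs_high": 1000,  # low- vs high-complexity modes
    "low_split": 300,     # {SKIP, 16x16} vs {16x8, 8x16}
    "high_split": 4000,   # {8x8, 8x4, 4x8} vs {INTRA, 4x4}
    "sub8x8_split": 2000, # {8x8} vs {8x4, 4x8}
}


def candidate_modes(sad, t=HYPOTHETICAL_THRESHOLDS):
    """Walk the hierarchy and return the mode set left for the encoder."""
    if sad < t["low_vs_high"]:          # low-complexity branch
        if sad < t["low_split"]:
            return ("SKIP", "16x16")
        return ("16x8", "8x16")
    if sad >= t["high_split"]:          # high-complexity branch
        return ("INTRA", "4x4")
    if sad < t["sub8x8_split"]:
        return ("8x8",)
    return ("8x4", "4x8")


for value in (100, 500, 1500, 3000, 5000):
    print(value, candidate_modes(value))
```

Whatever the real decision rules are, the point of the structure is the same: the encoder only evaluates the one or two modes in the returned set instead of all of them.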

Proposed Transcoder

1. Reduced motion estimation based on dynamic search windows

2. Reduced inter prediction based on the decision tree algorithm

RESULTS

• Training sequence: 10 QCIF frames of Flower Garden
• Test sequences: Akiyo, Coastguard, Container, Hall Monitor and Mother
• Simulation conditions
  • The transcoder was implemented using the H.264/AVC reference software (JM 13.2)
  • QP values 28, 32, 36 and 40 were used for testing
  • The I-WZ GOP format was transcoded as I-P
  • The approach is compared with a cascade WZ to H.264 transcoder

RESULTS

• RD differences

Sequence      Format  ∆PSNR (dB)  ∆Bitrate (%)  ∆Time (%)
Akiyo         QCIF    -0.004      0.10          95.23
Coastguard    QCIF    -0.036      0.95          94.99
Container     QCIF    -0.001      0.03          96.00
Hall Monitor  QCIF    -0.006      0.14          95.43
Mother        QCIF    -0.005      0.14          95.32
Mean          QCIF    -0.011      0.27          95.39

G. Bjontegaard, "Calculation of average PSNR differences between RD-curves," Document VCEG-M33, 13th VCEG Meeting, Austin, TX, April 2001
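The slides do not spell out how ∆Time is computed. A common convention, assumed here, is the relative encoding-time saving of the proposed transcoder with respect to the reference (cascade) transcoder. The sketch below uses that assumed definition with made-up timings. It does not reproduce the measurements.

```python
def time_saving_percent(t_reference, t_proposed):
    """Relative time saving in percent (assumed definition of delta-Time)."""
    if t_reference <= 0:
        raise ValueError("reference time must be positive")
    return 100.0 * (t_reference - t_proposed) / t_reference


# Hypothetical timings in seconds, chosen only to illustrate the formula.
saving = time_saving_percent(200.0, 9.22)
print(round(saving, 2))
```

Under this definition, a value near 95% means the proposed transcoder needs roughly one twentieth of the reference encoding time.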

Conclusions

• We propose a WZ to H.264 video transcoder as a solution for mobile-to-mobile video communications.

• The transcoder uses data mining techniques to exploit the correlation between the WZ coding information and the H.264 mode decisions.

• The motion estimation in H.264 can be reduced by using the incoming side information motion vectors.

• The proposed solution eliminates the need for complex WZ decoders and H.264 encoders in the end-user devices.

• These first results show an RD performance extremely close to that of the cascade transcoder, with a complexity reduction of about 95%.

Collaborations

• International
  • Florida Atlantic University. Florida, USA
    • Dr. Hari Kalva
  • University of Surrey. Centre for Communication Systems Research (CCSR). United Kingdom
    • Dr. W.A.C. Fernando
• National
  • Universidad Miguel Hernandez. Elche
    • Dr. Manuel Perez Malumbres

Thank you!

RESULTS

• WZ encoding time
