Fast Mode Decision for H.264/AVC Based on Rate-Distortion Clustering

IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 14, NO. 3, JUNE 2012

Yu-Huan Sung
Jia-Ching Wang, Senior Member, IEEE

Outline

• Introduction
• Related Works
• Feature Selection
• Proposed Fast Mode Decision
• Experiment Results
• Conclusion

Introduction

• The up-to-date video coding standard H.264/AVC
  – achieves about twice the compression ratio of other video coding standards
  – while maintaining nearly the same visual quality.

• However, these performance gains come at the cost of extremely high computational complexity, which matters for applications such as:
  – Video conferencing
  – Live TV broadcasting
  – Mobile computing

Introduction

• H.264/AVC adopts many features that enhance coding performance:
  – Variable block-size motion compensation (MC)
  – Sub-pixel motion estimation (ME)
  – Multiple reference picture selection
  – Directional intra prediction
  – In-the-loop de-blocking filtering, etc.

• These features impose a heavy computational burden on the encoding process.

Introduction

• Reducing the computational time has received considerable attention recently.

• Reducing the encoding time involves two main parts, both carried out according to an RD cost optimization scheme:
  1. Inter-mode decision
  2. Intra-mode decision

Introduction

• The proposed method is a Multi-Phase Classification (MPC) scheme that
  – uses a nearest mean criterion.
  – determines both inter-modes and intra-modes.

• MPC is a hierarchical classification scheme that allows an MB to be classified into a category phase by phase.

Introduction

• MPC is a three-phase classification scheme.
  – each phase consists of several categories.
  – categories are partitioned from the current phase into the next phase.
  – the categories of a phase are subsets of those in the phase above it.

• Each category within a phase is represented as a feature point in the feature space.
  – an MB is assigned to the category at minimum distance.

Related Works

• Previous works have developed fast mode decision algorithms in four ways.

• The first approach is SKIP-mode detection
  – identifies early whether an MB can be skipped.
  – Kannangara et al. [3] and Zhao et al. [4].

[3] C. Kannangara et al., “Low-complexity skip prediction for H.264 through Lagrangian cost estimation,” IEEE Trans. Circuits Syst. Video Technol., vol. 16, no. 2, pp. 202–208, Feb. 2006.
[4] Y. Zhao, M. Bystrom, and I. E. G. Richardson, “A MAP framework for efficient skip/code mode decision in H.264,” in Proc. ICIP 2006, Atlanta, GA, Oct. 8–11, 2006.

Related Works

• The second approach is mode prediction
  – directly or indirectly predicts the best mode for the current MB.
  – Wu et al. [5], Ri et al. [6], and Paul et al. [17].

• The third approach is mode classification
  – classifies the current MB into a specific category.
  – the corresponding candidate modes are then checked to find the best one.
  – Kim et al. [7], Yu et al. [8], Liu et al. [9], Zeng et al. [10], and Zhao et al. [11].

[5] D. Wu, F. Pan, K. P. Lim, S. Wu et al., “Fast intermode decision in H.264/AVC video coding,” IEEE Trans. Circuits Syst. Video Technol., vol. 15, no. 7, pp. 953–958, Jul. 2005.
[6] S. H. Ri, Y. Vatis, and J. Ostermann, “Fast inter-mode decision in an H.264/AVC encoder using mode and Lagrangian cost correlation,” IEEE Trans. Circuits Syst. Video Technol., vol. 19, no. 2, pp. 302–306, Feb. 2009.
[17] M. Paul, W. Lin, C. T. Lau, and B. S. Lee, “Direct inter-mode selection for H.264 video coding using phase correlation,” IEEE Trans. Image Process., vol. 20, no. 2, pp. 461–473, Feb. 2011.

Related Works

• The last approach redefines the optimization cost function
  – so that the number of operations needed for mode selection can be reduced.

[7] C. Kim and C.-C. Jay Kuo, “Feature-based intra-/inter coding mode selection for H.264/AVC,” IEEE Trans. Circuits Syst. Video Technol., vol. 17, no. 4, pp. 441–453, Apr. 2007.
[8] A. C. W. Yu, G. R. Martin, and H. Park, “Fast inter-mode selection in the H.264/AVC standard using a hierarchical decision process,” IEEE Trans. Circuits Syst. Video Technol., vol. 18, no. 2, pp. 186–195, Feb. 2008.
[9] Z. Liu, L. Shen, and Z. Zhang, “An efficient intermode decision algorithm based on motion homogeneity for H.264/AVC,” IEEE Trans. Circuits Syst. Video Technol., vol. 19, no. 1, pp. 128–132, Jan. 2009.
[10] H. Zeng, C. Cai, and K.-K. Ma, “Fast mode decision for H.264/AVC based on macroblock motion activity,” IEEE Trans. Circuits Syst. Video Technol., vol. 19, no. 4, pp. 491–499, Apr. 2009.
[11] T. Zhao, H. Wang, S. Kwong, and C.-C. Jay Kuo, “Fast mode decision based on mode adaptation,” IEEE Trans. Circuits Syst. Video Technol., vol. 20, no. 5, pp. 697–705, May 2010.

Outline

• Introduction
• Related Works
• Feature Selection
  – Feature Vector
  – Feature Space and Classifier
• Proposed Fast Mode Decision
• Experiment Results
• Conclusion

Feature Vector

• The RD cost of the best mode is strongly correlated with the RD costs of the temporally and spatially neighboring MBs.

• A three-dimensional feature vector that comprises RD costs of neighboring MBs is used to discriminate between the different modes for mode decision.

Feature Vector

• RD costs vary widely across coding modes and motion content, so they should not be used directly as a universal criterion.

• Using a three-dimensional feature vector
  – ensures that an MB is accurately assigned to the most probable category.
  – adapts properly to the varying motion content of different video sequences.

Feature Vector

• The three components of a feature vector, fskip, fspat, and ftemp, are expressed as :

Flowchart of Initialization

Feature Vector

• RD cost is expressed as :
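The equation from this slide is not preserved in the transcript. As a reminder only, and assuming the slide follows the usual H.264/AVC rate-distortion optimization, the Lagrangian mode cost is conventionally written as below, where s is the original block, c the reconstructed block, SSD the sum of squared differences, R the bits needed to code the MB in the given mode, and λ_MODE the Lagrange multiplier:

```latex
J(s, c, \mathrm{MODE} \mid QP, \lambda_{\mathrm{MODE}})
  = \mathrm{SSD}(s, c, \mathrm{MODE} \mid QP)
  + \lambda_{\mathrm{MODE}} \cdot R(s, c, \mathrm{MODE} \mid QP)
```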

Feature Space and Classifier

• The 3D feature space

Feature Space and Classifier

• Feature Space and Voronoi Diagram (figure; axes: fskip and ftemp)

Fast Mode Decision

• Nearest Mean Criterion
  – assigns each MB to a specific category.
  – classifies MBs using Euclidean distance.
  – predicts the best mode of an MB by finding the nearest mean Mi (cluster center).
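The nearest mean criterion can be sketched in a few lines of Python. This is an illustration only: the function name, the category labels, and the numeric cluster centers are hypothetical, and the real means would come from the training statistics.

```python
import math

def nearest_mean(feature, means):
    """Return the label of the category whose mean is closest to `feature`.

    `feature` is an (fskip, fspat, ftemp) tuple and `means` maps a category
    label to its cluster center in the same 3-D feature space.
    """
    return min(means, key=lambda label: math.dist(feature, means[label]))

# Hypothetical cluster centers, for illustration only.
example_means = {
    "large": (0.9, 1.0, 1.1),
    "small": (2.5, 2.0, 2.2),
}

category = nearest_mean((1.0, 1.2, 0.9), example_means)
```

Because each MB is assigned to the closest mean under Euclidean distance, the decision regions are exactly the Voronoi cells of the cluster centers.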

Fast Mode Decision

• Category Organization
  – directly assigning the mode with minimum distance to the given MB
    • gives unsatisfactory prediction accuracy.
  – grouping modes with similar characteristics into a category
    • reduces the probability of a false prediction.

Fast Mode Decision

• Multi-Phase Classification
  – an MB passes through multiple phases.
  – avoids assigning an MB to a category too hastily.

• Phase-I identifies
  – Large-Middle category (SKIP/DIRECT, 16×16, 16×8, 8×16, I16×16)
  – Middle-Small category (16×8, 8×16, P8×8, I4×4)

• Phase-II and Phase-III then divide each motion category into much smaller categories.
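A minimal sketch of the phase-by-phase descent follows. The Phase-I categories and their modes are taken from the slide. The `Category` class, the sub-categories "Large" and "Middle", and all numeric means are hypothetical, because the transcript does not list the Phase-II and Phase-III categories.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Category:
    name: str
    mean: tuple    # cluster center in (fskip, fspat, ftemp) space
    modes: tuple   # candidate modes checked if this category is final
    children: list = field(default_factory=list)

def classify(feature, phase_one):
    """Descend one phase at a time, picking the nearest child mean each time."""
    path, candidates, chosen = [], phase_one, None
    while candidates:
        chosen = min(candidates, key=lambda c: math.dist(feature, c.mean))
        path.append(chosen.name)
        candidates = chosen.children
    return chosen, path

# Phase-I categories come from the slide; everything numeric is made up.
large_middle = Category(
    "Large-Middle", (1.0, 1.0, 1.0),
    ("SKIP/DIRECT", "16x16", "16x8", "8x16", "I16x16"),
    children=[
        # Hypothetical Phase-II split; each child is a subset of its parent.
        Category("Large", (0.8, 0.9, 0.9), ("SKIP/DIRECT", "16x16")),
        Category("Middle", (1.3, 1.2, 1.2), ("16x8", "8x16", "I16x16")),
    ],
)
middle_small = Category(
    "Middle-Small", (3.0, 3.0, 3.0), ("16x8", "8x16", "P8x8", "I4x4"))

final, path = classify((0.9, 1.0, 0.8), [large_middle, middle_small])
```

Only the modes of the final category would then be evaluated by RD cost, which is where the time saving comes from.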

Fast Mode Decision

Flowchart of Phases

Fast Mode Decision

• The mode decision process can be further accelerated by Early Termination.
  – activated if fskip is below a specific threshold.
  – SKIP mode is then taken as the best mode.

• The initial threshold Tskip is set to the average RD cost of SKIP MBs in the training sequences and is dynamically updated according to:

Flowchart of Early Termination (fskip compared against Tskip)
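The early termination test can be sketched as follows. The threshold update equation is not preserved in the transcript, so the running-average update below is a stand-in assumption and not the authors' rule. The class name and the numbers are hypothetical as well.

```python
class SkipThreshold:
    """Early-termination threshold for SKIP mode (illustrative sketch)."""

    def __init__(self, initial):
        # Initial value: average RD cost of SKIP MBs in the training set.
        self.value = initial
        self._count = 1

    def should_terminate(self, f_skip):
        """SKIP is taken as the best mode when fskip is below the threshold."""
        return f_skip < self.value

    def update(self, skip_rd_cost):
        # ASSUMPTION: incremental running mean over observed SKIP-MB costs;
        # the update rule on the original slide is not available.
        self._count += 1
        self.value += (skip_rd_cost - self.value) / self._count

threshold = SkipThreshold(100.0)
terminate_early = threshold.should_terminate(80.0)
```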

Error Propagation and Performance Degradation Control

• A performance control process is incorporated into the proposed method.

• It avoids serious performance degradation caused by repeated use of wrongly predicted results or by accidental false predictions.

• The idea is to inspect the coding result of each MB produced by the fast mode decision algorithm.

Error Propagation and Performance Degradation Control

• An adaptive RD cost inspection is proposed; everything it needs is already available:
  – temporal RD costs
  – spatial RD costs

• Once a fast mode decision is made and the corresponding RD cost is obtained, an inspection is performed by:

Flowchart of Inspection
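The inspection formula itself is not preserved in the transcript. The sketch below only illustrates the general idea of checking a fast decision against RD costs that are already known. The comparison rule, the `tolerance` parameter, and the function name are all assumptions, not the authors' criterion.

```python
def passes_inspection(rd_cost, spatial_costs, temporal_cost, tolerance=1.5):
    """Accept the fast mode decision if its RD cost is not far above the
    average RD cost of the spatial and temporal neighbors.

    ASSUMPTION: both the averaging and the tolerance factor are hypothetical.
    """
    reference = (sum(spatial_costs) + temporal_cost) / (len(spatial_costs) + 1)
    return rd_cost <= tolerance * reference

# Neighbor costs average to 100.0, so the acceptance limit is 150.0.
accepted = passes_inspection(120.0, [100.0, 110.0], 90.0)
rejected = passes_inspection(200.0, [100.0, 110.0], 90.0)
```

An MB that fails such a check would fall back to a fuller mode search, which keeps one wrong prediction from propagating to later MBs.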

Training and Test Conditions

• The means of each category and the related statistics are generated by JM17.0 [15].

• The ten video sequences are Silent, Ice, Hall, Highway, Miss-America, Carphone, Tempete, Soccer, Bus, and Table Tennis, all in QCIF format.

• QP values are 20, 24, 28, 32, and 36.

• Two GOP structures (IPPP and IBBP) are used for training.

Training and Test Conditions

• The number of frames to be encoded is set to 100.

• The search range of motion estimation is 16, and the search strategy is full search.

• The number of reference frames is 1, and the intra-period is set to 4.

Performance of Used Mode

Performance Comparisons (1/2)

Performance Comparisons (2/2)

Performance Comparisons with GOP size

RD Curves

Conclusion

• Experimental results indicate that the quality loss and bitrate increase are only 0.02 dB and 1.65%, respectively.

• Encoding time is reduced by 67.5% on average across the 12 video sequences with different GOP structures.

• The test sequences encompass a wide variety of motion content and different resolutions.
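The slides do not define how these three figures are computed. The sketch below uses the conventional definitions from the fast mode decision literature, each relative to the reference encoder with exhaustive mode search. The function names and the sample timings are hypothetical.

```python
def time_saving_percent(t_reference, t_fast):
    """Encoding time saved relative to the reference encoder, in percent."""
    return (t_reference - t_fast) / t_reference * 100.0

def bitrate_increase_percent(rate_reference, rate_fast):
    """Bitrate increase relative to the reference encoder, in percent."""
    return (rate_fast - rate_reference) / rate_reference * 100.0

def psnr_loss_db(psnr_reference, psnr_fast):
    """PSNR drop in dB relative to the reference encoder."""
    return psnr_reference - psnr_fast

# Hypothetical timings: 200 s with full search, 65 s with fast mode decision.
saving = time_saving_percent(200.0, 65.0)
```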