Fast Mode Decision for H.264/AVC Based on Rate-Distortion Clustering
IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 14, NO. 3, JUNE 2012
Yu-Huan Sung
Jia-Ching Wang, Senior Member, IEEE
Outline
• Introduction
• Related Works
• Feature Selection
• Proposed Fast Mode Decision
• Experiment Results
• Conclusion
Introduction
• The H.264/AVC video coding standard
– achieves twice the compression ratio of other video coding standards.
– maintains nearly the same visual quality.
• However, these performance gains come at the cost of extremely high computational complexity, which is a burden for applications such as:
– Video conferencing
– Live TV broadcasting
– Mobile computing
• H.264/AVC adopts many features that enhance coding performance:
– Variable block-size motion compensation (MC)
– Sub-pixel motion estimation (ME)
– Multiple reference picture selection
– Directional intra prediction
– In-loop deblocking filtering, etc.
• These features impose a heavy computational burden on the encoding process.
• Reducing the computational time has received considerable attention recently.
• Reducing the encoding time involves two main parts, both of which are decided according to an RD cost optimization scheme:
1. Inter-mode decision
2. Intra-mode decision
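For orientation, RD cost optimization in H.264/AVC encoders is commonly written as a Lagrangian cost minimized over the candidate modes. The formula below is the standard textbook form, not one taken from the slides; the symbols J, D, R, \lambda_{\text{mode}}, and \mathcal{M} are assumed notation.

```latex
J(m) = D(m) + \lambda_{\text{mode}} \, R(m),
\qquad
m^{*} = \arg\min_{m \in \mathcal{M}} J(m)
```

Here D(m) is the distortion of coding the MB in mode m, R(m) is the number of bits it costs, and \mathcal{M} is the set of candidate inter- and intra-modes. Evaluating J(m) for every mode is what makes exhaustive mode decision expensive, and it is the work a fast mode decision tries to avoid.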
• The proposed method presents a Multi-Phase Classification (MPC) scheme that
– uses a nearest mean criterion.
– determines both inter-modes and intra-modes.
• MPC is a hierarchical classification scheme that classifies an MB into a category phase by phase.
• MPC is a three-phase classification scheme.
– Each phase consists of several categories.
– Categories are partitioned from the current phase into the next phase.
– The categories of a phase are subsets of the categories of the phase above it.
• Each category within a phase is represented as a feature point in the feature space.
– An MB is assigned to the category at minimum distance.
Related Works
• Previous works develop fast mode decision algorithms in four ways.
• The first approach is SKIP-mode detection
– identifies early whether an MB can be skipped.
– Kannangara et al. [3] and Zhao et al. [4].
[3] C. Kannangara et al., “Low-complexity skip prediction for H.264 through Lagrangian cost estimation,” IEEE Trans. Circuits Syst. Video Technol., vol. 16, no. 2, pp. 202–208, Feb. 2006.
[4] Y. Zhao, M. Bystrom, and I. E. G. Richardson, “A MAP framework for efficient skip/code mode decision in H.264,” in Proc. ICIP 2006, Atlanta, GA, Oct. 8–11, 2006.
• The second approach is mode prediction
– directly or indirectly predicts the best mode for the current MB.
– Wu et al. [5], Ri et al. [6] and Paul et al. [17].
• The third approach is mode classification
– classifies the current MB into a specific category.
– checks only the candidate modes of that category to find the best one.
– Kim et al. [7], Yu et al. [8], Liu et al. [9], Zeng et al. [10] and Zhao et al. [11].
[5] D. Wu, F. Pan, K. P. Lim, and S. Wu et al., “Fast intermode decision in H.264/AVC video coding,” IEEE Trans. Circuits Syst. Video Technol., vol. 15, no. 7, pp. 953–958, Jul. 2005.
[6] S. H. Ri, Y. Vatis, and J. Ostermann, “Fast inter-mode decision in an H.264/AVC encoder using mode and Lagrangian cost correlation,” IEEE Trans. Circuits Syst. Video Technol., vol. 19, no. 2, pp. 302–306, Feb. 2009.
[17] M. Paul, W. Lin, C. T. Lau, and B. S. Lee, “Direct inter-mode selection for H.264 video coding using phase correlation,” IEEE Trans. Image Process., vol. 20, no. 2, pp. 461–473, Feb. 2011.
• The fourth approach redefines the optimization cost function
– reduces the number of operations needed for mode selection.
[7] C. Kim and C. C. Jay Kuo, “Feature-based intra-/inter coding mode selection for H.264/AVC,” IEEE Trans. Circuits Syst. Video Technol., vol. 17, no. 4, pp. 441–453, Apr. 2007.
[8] A. C. W. Yu, G. R. Martin, and H. Park, “Fast inter-mode selection in the H.264/AVC standard using a hierarchical decision process,” IEEE Trans. Circuits Syst. Video Technol., vol. 18, no. 2, pp. 186–195, Feb. 2008.
[9] Z. Liu, L. Shen, and Z. Zhang, “An efficient intermode decision algorithm based on motion homogeneity for H.264/AVC,” IEEE Trans. Circuits Syst. Video Technol., vol. 19, no. 1, pp. 128–132, Jan. 2009.
[10] H. Zeng, C. Cai, and K.-K. Ma, “Fast mode decision for H.264/AVC based on macroblock motion activity,” IEEE Trans. Circuits Syst. Video Technol., vol. 19, no. 4, pp. 491–499, Apr. 2009.
[11] T. Zhao, H. Wang, S. Kwong, and C.-C. Jay Kuo, “Fast mode decision based on mode adaptation,” IEEE Trans. Circuits Syst. Video Technol., vol. 20, no. 5, pp. 697–705, May 2010.
Outline
• Introduction
• Related Works
• Feature Selection
– Feature Vector
– Feature Space and Classifier
• Proposed Fast Mode Decision
• Experiment Results
• Conclusion
Feature Vector
• There is a strong correlation between the RD cost of the best mode and the RD costs of temporally and spatially neighboring MBs.
• A three-dimensional feature vector comprising RD costs of neighboring MBs is used to discriminate between the different modes for mode decision.
• RD costs vary widely across coding modes and motion content, so they should not be used directly as a universal criterion.
• Using a three-dimensional feature vector
– ensures that an MB can be accurately assigned to the most probable category.
– adapts properly to the varying motion content of different video sequences.
• The three components of a feature vector, fskip, fspat, and ftemp, are expressed as :
Fast Mode Decision
• Nearest Mean Criterion
– assigns each MB to a specific category.
– classifies MBs by Euclidean distance.
– predicts the best mode of an MB by finding the nearest mean Mi (cluster center).
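The nearest mean criterion can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the category names and the mean values in `MEANS` are hypothetical, and the feature is assumed to be the three-component vector (fskip, fspat, ftemp).

```python
import math

def nearest_mean(feature, means):
    """Return the label whose mean (cluster center) is closest to `feature`.

    Distance is Euclidean, as named on the slide. `means` maps a label to a
    point with the same number of components as `feature`.
    """
    return min(means, key=lambda label: math.dist(feature, means[label]))

# Hypothetical cluster centers in (fskip, fspat, ftemp) space.
# The values are illustrative only and do not come from the paper.
MEANS = {
    "large": (0.2, 0.3, 0.3),
    "small": (0.9, 0.8, 0.7),
}

label = nearest_mean((0.25, 0.35, 0.2), MEANS)
```

In practice the means would be estimated offline from training sequences, so the per-MB cost at encoding time is only a handful of distance computations.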
• Category Organization
– Directly assigning the mode with minimum distance to the given MB gives unsatisfactory prediction accuracy.
– Grouping modes with similar characteristics into a category reduces the probability of a false prediction.
• Multi-Phase Classification
– An MB passes through multiple phases.
– This avoids assigning an MB to a category too hastily.
• Phase-I identifies
– Large-Middle category (SKIP/DIRECT, 16×16, 16×8, 8×16, I16×16)
– Middle-Small category (16×8, 8×16, P8×8, I4×4)
• Phase-II and Phase-III then divide each motion category into much smaller categories.
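The phase-by-phase assignment can be sketched as a walk down a category tree, applying the nearest mean criterion at each level. In the sketch below only the two Phase-I category names and their mode lists come from the slides; every mean value and the Phase-II split of the Large-Middle category (`LM-a`, `LM-b`) are hypothetical, and the tree is kept to two levels for brevity where the method uses three phases.

```python
import math

# Hypothetical category tree. Means are made-up points in
# (fskip, fspat, ftemp) space; "LM-a"/"LM-b" are invented sub-categories.
TREE = {
    "children": [
        {
            "name": "Large-Middle",
            "mean": (1.0, 1.0, 1.0),
            "modes": ["SKIP/DIRECT", "16x16", "16x8", "8x16", "I16x16"],
            "children": [
                {"name": "LM-a", "mean": (0.5, 0.8, 0.9),
                 "modes": ["SKIP/DIRECT", "16x16"], "children": []},
                {"name": "LM-b", "mean": (1.5, 1.2, 1.1),
                 "modes": ["16x8", "8x16", "I16x16"], "children": []},
            ],
        },
        {
            "name": "Middle-Small",
            "mean": (3.0, 3.0, 3.0),
            "modes": ["16x8", "8x16", "P8x8", "I4x4"],
            "children": [],
        },
    ],
}

def classify(feature, tree):
    """Descend the tree one phase at a time, picking the nearest mean.

    Returns the list of category names visited and the candidate modes of
    the final category; only those modes would then be RD-checked.
    """
    node, path = tree, []
    while node["children"]:
        node = min(node["children"],
                   key=lambda child: math.dist(feature, child["mean"]))
        path.append(node["name"])
    return path, node["modes"]

path, modes = classify((0.6, 0.9, 0.9), TREE)
```

Because each phase only chooses among the sub-categories of the category chosen in the previous phase, the candidate mode list shrinks at every step.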
• The mode decision process can be further accelerated by early termination.
– It is activated if fskip is below a threshold Tskip (fskip < Tskip).
– SKIP mode is then selected as the best mode.
• The initial threshold is set to the average RD cost of SKIP MBs in the training sequences, and is dynamically updated according to:
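The early-termination test itself is a single comparison, sketched below. The threshold update rule is not reproduced in this transcript, so the `update` method uses a plain running mean of the RD costs of MBs coded as SKIP purely as a placeholder assumption; it should not be read as the paper's rule. The class and method names are hypothetical.

```python
class SkipEarlyTermination:
    """Early SKIP decision: choose SKIP when fskip < Tskip."""

    def __init__(self, initial_threshold):
        # Initial Tskip: average RD cost of SKIP MBs in the training sequences.
        self.threshold = initial_threshold
        self._count = 1  # the initial threshold counts as one sample

    def should_skip(self, f_skip):
        """Return True when early termination applies (fskip below Tskip)."""
        return f_skip < self.threshold

    def update(self, skip_rd_cost):
        """Placeholder update (ASSUMPTION): running mean of SKIP RD costs."""
        self._count += 1
        self.threshold += (skip_rd_cost - self.threshold) / self._count

et = SkipEarlyTermination(initial_threshold=100.0)
decision_before = et.should_skip(120.0)  # 120 is not below 100
et.update(200.0)                         # threshold becomes the mean of 100 and 200
decision_after = et.should_skip(120.0)   # 120 is below 150
```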
Error Propagation and Performance Degradation Control
• A performance control process is incorporated into the proposed method.
• It avoids serious performance degradation caused by repeated use of wrongly predicted results or by accidental false predictions.
• The idea is to inspect the coding result of each MB produced by the fast mode decision algorithm.
• An adaptive RD cost inspection is proposed; everything it needs is already available:
– temporal RD costs
– spatial RD costs
• After a fast mode decision is made and the corresponding RD cost is obtained, an inspection is performed by:
Training and Test Conditions
• The means of each category and the related statistics are generated with the JM 17.0 reference software [15].
• Ten video sequences are used: Silent, Ice, Hall, Highway, Miss-America, Carphone, Tempete, Soccer, Bus, and Table Tennis. The video format is QCIF.
• QP values are 20, 24, 28, 32, and 36.
• Two GOP structures (IPPP and IBBP) are used for training.
• The number of frames to be encoded is set to 100.
• The search range of motion estimation is 16, and the search strategy is full search.
• The number of reference frames is 1, and the intra-period is set to 4.
Conclusion
• Experimental results indicate that the quality loss and bitrate increase are only 0.02 dB and 1.65%, respectively.
• Encoding time is reduced by 67.5% on average across the 12 video sequences with different GOP structures.
• These sequences encompass a wide variety of motion content and different resolutions.
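Figures of this kind are usually obtained by comparing the fast encoder against the reference encoder run on the same sequence. The slides do not say exactly how the numbers were computed, so the sketch below assumes simple differences and ratios; the function name, the dictionary keys, and the sample values are all hypothetical.

```python
def compare_encoders(ref, fast):
    """Compare a fast-mode-decision run against the reference run.

    `ref` and `fast` are dicts with 'psnr_db', 'bitrate_kbps' and 'time_s'.
    Negative delta_psnr_db means quality loss; positive delta_bitrate_pct
    means the fast encoder spends more bits.
    """
    return {
        "delta_psnr_db": fast["psnr_db"] - ref["psnr_db"],
        "delta_bitrate_pct": 100.0 * (fast["bitrate_kbps"] - ref["bitrate_kbps"])
        / ref["bitrate_kbps"],
        "time_saving_pct": 100.0 * (ref["time_s"] - fast["time_s"]) / ref["time_s"],
    }

# Illustrative numbers only, not measurements from the paper.
result = compare_encoders(
    ref={"psnr_db": 36.0, "bitrate_kbps": 200.0, "time_s": 100.0},
    fast={"psnr_db": 35.5, "bitrate_kbps": 210.0, "time_s": 25.0},
)
```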