Power analysis of H.264/AVC for mobile platforms
Student: Alexandru IOVANOVICI, B.Eng.
Supervisor: Lucian PRODAN, Ph.D, B.Eng.
Master’s Thesis, June 2011
What ... is the problem?
1. Customers are mobile
2. Customers expect PC-like features from their “cell phones”
What ... solutions do we have?
CODECs
Video CODECs basics
RAW video -> minimum 216 Mbps [Richardson2010];
Lossless compression = removing data redundancy (3-4 times less space);
Lossy compression = removing subjective redundancy; much higher compression ratios;
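The slide does not say how the 216 Mbps figure was derived. One reading that reproduces it, assumed here, is ITU-R BT.601 standard-definition sampling: 4:2:2 at 13.5 MHz for luma and 6.75 MHz for each chroma component, with 8 bits per sample. The function name and parameters below are illustrative only.

```python
def raw_bitrate_bps(luma_hz, chroma_hz, bits_per_sample=8):
    """Raw bitrate of a Y/Cb/Cr stream: one luma plus two chroma channels."""
    samples_per_second = luma_hz + 2 * chroma_hz
    return samples_per_second * bits_per_sample

# Assumed BT.601 4:2:2 sampling rates (not stated on the slide).
bt601_bps = raw_bitrate_bps(13_500_000, 6_750_000)
print(bt601_bps / 1e6, "Mbps raw")

# Range left after lossless compression at the slide's 3-4x ratio.
print(bt601_bps / 4 / 1e6, "to", bt601_bps / 3 / 1e6, "Mbps")
```

Even at the lossless ratio quoted on the slide, the stream stays in the tens of Mbps, which is why lossy coding is the practical option on a mobile link.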
Video ENCODER basics
Prediction model: exploiting the similarities between neighbouring video frames;
Spatial model: compressing the residual into transform coefficients;
Entropy encoder: removing statistical redundancy;
Typical ENCODER architecture
DPCM/DCT architecture: basis for all modern encoders, including H.264;
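The DPCM half of that architecture can be illustrated on a one-dimensional signal. This is a toy sketch, not the encoder from the thesis: it predicts each sample from the previously reconstructed one, quantises the residual with a hypothetical step size, and leaves out the DCT and entropy stages. In a real video encoder the prediction comes from neighbouring blocks or frames.

```python
def dpcm_encode(samples, step):
    """Quantise the difference between each sample and its prediction."""
    prediction = 0                      # the decoder starts from the same value
    codes = []
    for s in samples:
        residual = s - prediction
        q = round(residual / step)      # lossy part: coarse residual
        codes.append(q)
        prediction += q * step          # track what the decoder will rebuild
    return codes

def dpcm_decode(codes, step):
    """Rebuild the samples by accumulating the dequantised residuals."""
    prediction = 0
    out = []
    for q in codes:
        prediction += q * step
        out.append(prediction)
    return out

signal = [100, 102, 105, 105, 104, 110]   # illustrative pixel values
codes = dpcm_encode(signal, step=4)
print(codes)
print(dpcm_decode(codes, step=4))
```

The encoder predicts from its own reconstruction rather than from the original samples, so encoder and decoder stay in step and the quantisation error does not accumulate.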
Typical DECODER architecture
The H.264/AVC
H.264/AVC gives specifications only for the decoder;
Published by ITU in 2003 with several revisions;
Based on MPEG-4 Visual;
10 to 50 times better compression ratio at same visual quality [Richardson2010, Xe2007];
2 to 10 times more power dissipation than MPEG-4 [Xe2007]
The H.264 ENCODER
Transform and quantization
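As a rough illustration of the transform stage, the sketch below applies the 4x4 forward integer core transform used by H.264, Y = Cf · X · Cf^T, to one residual block. The scaling and quantisation steps that follow in a real encoder are deliberately left out, and the helper names are illustrative; the Zexia VHDL may organise the computation differently.

```python
# Forward 4x4 integer core transform matrix used by H.264.
CF = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]

def matmul(a, b):
    """Multiply two 4x4 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def core_transform(block):
    """Return Cf * block * Cf^T (unscaled coefficients)."""
    return matmul(matmul(CF, block), transpose(CF))

flat = [[1] * 4 for _ in range(4)]   # a flat residual block
for row in core_transform(flat):
    print(row)
```

For a flat block all the energy ends up in the top-left (DC) coefficient, which is what makes the subsequent quantisation and entropy coding effective. The transform needs only additions and shifts, which suits hardware.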
H.264 Profiles
There is great flexibility in choosing the appropriate combination of tools that best suits the specific needs of a particular task [e.g. CAVLC+DCT];
H.264 profiles define a specific set of tools;
A profile-compliant decoder must be able to decode with all the tools in that profile → constraints on the capabilities required by a decoder;
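The profile rule can be read as a subset test. The sketch below is an assumption-laden illustration: the tool sets are deliberately incomplete and the names are hypothetical labels, not identifiers from the standard.

```python
# Deliberately incomplete, illustrative tool sets (hypothetical labels).
PROFILE_TOOLS = {
    "baseline": {"I-slices", "P-slices", "CAVLC"},
    "main": {"I-slices", "P-slices", "B-slices", "CAVLC", "CABAC"},
}

def can_decode(decoder_tools, profile):
    """A decoder conforms to a profile only if it supports every tool in it."""
    return PROFILE_TOOLS[profile] <= set(decoder_tools)

def stream_fits_profile(stream_tools, profile):
    """A stream fits a profile if it uses only tools that profile allows."""
    return set(stream_tools) <= PROFILE_TOOLS[profile]

simple_decoder = {"I-slices", "P-slices", "CAVLC"}
print(can_decode(simple_decoder, "baseline"))
print(can_decode(simple_decoder, "main"))
```

An encoder may use any subset of a profile's tools, while a decoder has to cover them all. That asymmetry is why the constraint falls on the decoder.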
The Zexia encoder
Non-interlaced Baseline Profile;
Only open-source encoder available for H.264;
Modular, interface-based;
VHDL configuration-based;
Better suited for Spartan 3 than for Cyclone III: Spartan has more on-chip memory;
Two clock lines;
RAM: an entire image must be loaded by the external controller, and the reference image too;
Smaller requirements if we use intra prediction only [Richardson2007];
Prediction components:
SAD comparison;
Only P-frame prediction (licensing issues);
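The SAD (sum of absolute differences) comparison mentioned above can be sketched in a few lines. This is a software illustration with made-up 2x2 blocks and hypothetical function names; the encoder itself does this in VHDL on larger blocks.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def best_candidate(current, candidates):
    """Return (index, cost) of the candidate prediction with the smallest SAD."""
    costs = [sad(current, c) for c in candidates]
    best = min(range(len(costs)), key=costs.__getitem__)
    return best, costs[best]

current = [[10, 12], [11, 13]]            # block being encoded (illustrative)
candidates = [
    [[0, 0], [0, 0]],                     # poor prediction
    [[10, 12], [11, 14]],                 # close prediction
]
print(best_candidate(current, candidates))
```

SAD avoids multiplications entirely, which keeps the comparison cheap in logic elements compared with a squared-error metric.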
The Zexia encoder
Simulation-based: lack of hardware resources on the Altera DE2 board;
Quartus II with Advanced PowerPlay Early Estimator;
VCD files based on statistical distribution of line transitions (12.5% [Xe2007]).
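The toggle rate matters because switching activity enters the usual first-order dynamic power model linearly. The sketch below assumes the textbook relation P = α · C · V² · f. The capacitance, voltage and frequency values are hypothetical placeholders, not measurements from the thesis, and this is not a description of how the Quartus tools compute their estimate.

```python
def dynamic_power_w(toggle_rate, capacitance_f, voltage_v, clock_hz):
    """First-order dynamic power: activity * C * V^2 * f, in watts."""
    return toggle_rate * capacitance_f * voltage_v ** 2 * clock_hz

# Hypothetical design parameters, for illustration only.
C_SWITCHED = 1e-9     # total switched capacitance, farads
V_CORE = 1.2          # core voltage, volts
F_CLK = 50e6          # clock frequency, hertz

p_assumed = dynamic_power_w(0.125, C_SWITCHED, V_CORE, F_CLK)   # 12.5% toggle rate
p_doubled = dynamic_power_w(0.25, C_SWITCHED, V_CORE, F_CLK)
print(p_assumed, p_doubled / p_assumed)
```

Because the relation is linear in the toggle rate, any error in the assumed 12.5% activity translates directly into the same relative error in the dynamic power estimate.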
Experimental results
[Figure: chip floorplan area distribution, in number of LEs, by functional unit: Buffer, CAVLC, Core transform, DC transform, Dequantise, Header, Intra 4x4, Invtransform, Quantise, Reconstruction, ToBytes]
[Figure: total power distribution by functional unit: Buffer, CAVLC, Core transform, DC transform, Dequantise, Header, Intra 4x4, Intra 8x8, Invtransform, Quantise, Reconstruction, ToBytes]
Experimental results
FPGAs are bad at low-power optimizations
Augmented Cell Phone
A new architecture
Coprocessor-based cell phone;
“Program repository” on Flash;
Marketplace for “programs”;
High power requirements but even higher customer satisfaction:
→ better user experience;
→ more devices sold;
Conclusions
• H.264 is a power-intensive algorithm;
• Less than 30% of it is parallelizable [52];
• ASIC is the best option but is “expensive”;
• FPGAs are not good for low-power techniques:
• Large routing grids;
• A lot of cells stay powered on even when they perform no useful function;
• Good for comparing two similar designs in HDL in terms of simulated performance;
• ACP (Augmented Cell Phone): balance between power requirements and user satisfaction;
• Needs considerable rethinking at the HW level and in the OS;
• Needs skilled developers for “soft-coprocessors”
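The parallelism point in the conclusions can be put in numbers with Amdahl's law. This sketch assumes that "less than 30% is parallel" refers to the fraction of the workload that can be parallelised. That reading is an assumption, and 0.3 is used as the upper bound.

```python
def amdahl_speedup(parallel_fraction, n_units):
    """Overall speedup when only part of the work can be spread over n units."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_units)

P = 0.3   # assumed upper bound on the parallel fraction, from the slide
for n in (2, 4, 16):
    print(n, round(amdahl_speedup(P, n), 3))

# Limit for an unbounded number of processing units.
print("limit", round(1.0 / (1.0 - P), 3))
```

Under this assumption the speedup can never reach 1.43x however much hardware is added, so extra parallel units mostly add static power rather than performance.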