3D Graphics
in Future Mobile Devices
Steve Steele, ARM
Market Trends
Mobile Computing Market Growth
[Chart: 2010 vs. 2015 (axis 0 to 1600), split into Entry Level, Mid Range and Premium]
Mobile Computing Market Trends
• Latest features
• Greatest specs
• Tailored to fit a given budget (from <$150 to >$400)
Smart Mobile Device Shipments
(Smartphones and Tablets)
Source: ARM and Gartner Estimates
Entry-level in 2015
• >750 million devices
• 250% growth since 2012; >10x since 2010
• Nearly 4 times the 2013 Notebook PC forecasts
ARM provides targeted solutions to deliver the best features and specs at every price point
[Chart: volume in millions (0 to 2000) by year, 2010 to 2017, split into Entry Level, Mid Range ($200-$350) and Premium]
Driving Market Change
[Chart: Smart Mobile Device Shipments (Smartphones and Tablets), volume in millions, by price band: <$150, $200-$350, >$400]
Typical GPU: 80-100mm², 50-80mm², 25-40mm²
*Product is based on a published Khronos Specification, and is expected to pass the Khronos Conformance Testing Process.
Current conformance status can be found at www.khronos.org/conformance
ARM Mali™-T760 GPU
- Increased GPU & SoC energy efficiency
- Increased performance scalability
- Advanced memory system scalability
- Reduced bandwidth consumption
ARM Mali-T720 GPU
- Increased graphics area efficiency
- Optimized for Android™
- OpenGL® ES 3.0 support*
- Reduced cost & Time-to-Market
Source: ARM and Gartner Estimates
ARM Mali-T600 Series GPU Overview
Midgard Architecture - the foundation of ARM’s GPU roadmap providing increased performance, flexibility and software compatibility
Innovation - driving 64-bit GPU Compute and bandwidth-saving technologies
Scalability - an ARM Mali-T600 Series GPU to suit every application
GPU Compute - 64-bit double precision, IEEE-754 compliant floating point
Feature-rich - all popular OSs, multiple APIs including DirectX® 11, next generation OpenGL ES, OpenCL™ Full Profile and RenderScript compute
Leading on Lowering System Power
GPUs have a major impact on SoC architecture
Area, memory bandwidth, energy and implementation
ARM focuses on system wide power efficiency not just IP components
Energy Saving Features in the Mali-T62x system:
GPU efficiency – 50% performance increase or less energy/frame in same area
ARM Mali GPUs are leaders in balancing power, area and functionality
[Figure: three "SAVING" callouts]
ASTC* TE: 90% saving
TE**: 50% saving
ARM POP™: 19% saving
*Adaptive Scalable Texture Compression (ASTC)
**Transaction Elimination (TE)
ARM POP IP
• Up to 27% higher frequency
• 27% lower area
Mali Momentum through 2013
The most widely licensed GPU
84 Mali licenses to date
More than 10x Growth in volume in 2 years
152M units shipped in 2012, in more than 230 devices
>300M units shipped YTD 2013
2013 has already seen shipments double from 2012
Strength in key market segments
#1 Android GPU IP supplier
>20% Android Smartphone
#1 in Android tablets (>50%)
#1 in Digital Smart TVs (>70%)
Premium Mobile Devices
Energy efficient 3D Graphics
Mali High-end GPU Solutions
Designed for Graphics and GPU Compute
Full Profile, 64-bit Compute
Closer CPU-GPU links
Efficient use of system resources
Coherent memory links
Cortex®-A15 / Cortex-A53 / Cortex-A57
Protecting partner investments
Common software platform reduces costs and TTM
Multicore scales performance to address multiple form factors
Advanced products get to market early
ARM Mali-T628 GPU silicon shipping now in consumer products
HIGH-END GPU ROADMAP
Mali-T604
- First Midgard architecture product
- OpenGL® ES 3.0 support
- Scalable to 4 cores
Mali-T628
- 50% performance uplift
- OpenGL ES 3.0 support
- Scalable to 8 cores
Mali-T760
- 400% energy efficiency of Mali-T604
- Scalable to 16 cores
- Major bandwidth reduction
Mali-T624
- 50% performance uplift
- OpenGL ES 3.0 support
- Scalable to 4 cores
Mali-T622
- Small Full Profile GPU Compute solution
- 50% more energy efficient than Mali-T604
Mali-T678
- Highest performance Mali-T600 GPU
- OpenGL ES 3.0 support
- Scalable to 8 cores
Premium Mobile Device Requirements
Increasing screen size and content complexity demand increased performance
Increased performance demands advanced memory technologies to achieve greater bandwidth
A mobile computing device's thermal budget doesn’t increase with performance
Rigid thermal constraints force the adoption of advanced silicon process and memory technologies to achieve greater performance
[Chart: bandwidth in GB/sec (0 to 4.5) by display resolution: HD, FHD, QXGA, WQXGA, QSXGA, 4K UHD; annotated "Increased resolution" and "Increased thermal impact"]
Source: ARM data
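To make the link between resolution and bandwidth concrete, here is a rough sketch of how uncompressed frame buffer traffic grows with pixel count. It does not reproduce the ARM data behind the chart: the 32-bit pixel format, the 60 frames per second refresh rate, the single pass per frame, and the use of 1280x720 for "HD" are all assumptions made for illustration, and the function name is hypothetical.

```python
# Rough estimate of uncompressed frame buffer traffic per resolution.
# All constants are illustrative assumptions, not figures from the slides.

RESOLUTIONS = {
    "HD": (1280, 720),       # assumption: "HD" taken as 720p
    "FHD": (1920, 1080),
    "QXGA": (2048, 1536),
    "WQXGA": (2560, 1600),
    "QSXGA": (2560, 2048),
    "4K UHD": (3840, 2160),
}

BYTES_PER_PIXEL = 4      # assumption: 32-bit RGBA
FRAMES_PER_SECOND = 60   # assumption: 60 Hz refresh


def framebuffer_bandwidth_gb_per_s(width, height,
                                   bytes_per_pixel=BYTES_PER_PIXEL,
                                   fps=FRAMES_PER_SECOND,
                                   passes=1):
    """Bytes moved per second for `passes` full-screen reads or writes, in GB/s."""
    return width * height * bytes_per_pixel * fps * passes / 1e9


if __name__ == "__main__":
    for name, (w, h) in RESOLUTIONS.items():
        print(f"{name:7s} {framebuffer_bandwidth_gb_per_s(w, h):.2f} GB/s per pass")
```

Real workloads touch each pixel several times per frame (overdraw, texture reads, composition), so the `passes` parameter is only a crude stand-in for that multiplier.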
ARM Mali-T760 GPU Overview
Increased performance and energy efficiency
~400% increase in energy efficiency over ARM Mali-T604 GPU
Multi-processor scalable graphics performance
Coherently scales up to 16 shader core configurations
3D graphics acceleration and Compute APIs
Khronos*-compliant OpenGL ES 3.0, 2.0 and 1.1
Microsoft Windows compliant Direct3D 11.1
Full Profile OpenCL 1.1
RenderScript/FilterScript
Major reduction in bandwidth and SoC power
Memory system optimizations
ARM Frame Buffer Compression (AFBC)
Smart Composition
Proven software DDK quality and performance
Mali-T760 peak configuration:
# Shader Cores: Up to MP16
L2 Cache Size: 2 x 512kB
Clock Freq: 600MHz
Pixel Fillrate: 9.6 Gpix/Sec
Triangle Rate: 1066.6 MTri/s
Floating Point Performance: 326.4 GFLOPS
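The peak figures above can be broken down into per-core, per-clock rates with simple arithmetic. The sketch below only divides the quoted MP16 numbers by core count and clock frequency; the function name is hypothetical, and treating the triangle rate as a whole-GPU (not per-core) figure is an assumption.

```python
# Break the quoted Mali-T760 MP16 peak figures down per core and per clock.
# Inputs are the slide's numbers; the interpretation is illustrative only.

CORES = 16
CLOCK_HZ = 600e6


def per_core_per_clock(total_rate_per_s, cores, clock_hz):
    """Units of work per shader core per clock cycle."""
    return total_rate_per_s / (cores * clock_hz)


pixels_per_core_clock = per_core_per_clock(9.6e9, CORES, CLOCK_HZ)
flops_per_core_clock = per_core_per_clock(326.4e9, CORES, CLOCK_HZ)
# Assumption: triangle setup is shared across the GPU, so divide by clock only.
triangles_per_clock = 1066.6e6 / CLOCK_HZ

if __name__ == "__main__":
    print(f"pixels/core/clock: {pixels_per_core_clock:.2f}")
    print(f"FLOPs/core/clock:  {flops_per_core_clock:.2f}")
    print(f"triangles/clock:   {triangles_per_clock:.2f}")
```

Under these assumptions the quoted fill rate corresponds to one pixel per core per clock and the floating point figure to 34 FLOPs per core per clock.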
ARM Mali-T760 GPU Scalability
Coherent shader core scalability using an improved L2 Cache interconnect
Scalable up to 16 coherent shader cores
Significant reduction in wire count
Evenly distributed cache utilization
Single or Dual L2 Cache slice each with individual master port
Consistent ACE™-Lite interface
[Diagram: 16 shader cores connected through a Coherent L2 Cache Interconnect to two L2 Cache slices]
AFBC Application in SoC
Employing AFBC throughout the SoC saves significant system bandwidth and power
[Diagram: AFBC data paths across the CPU, GPU and display controller. Labels: SW AFBC ENCODE (TEXTURES), Uncompressed textures, AFBC DECODE, AFBC ENCODE, Frame Buffer Objects (render to texture), AFBC DECODE]
AFBC Raw Compression Rates
Source: ARM data
Smart Composition
Better than 50% reduction of texture read bandwidth on simple Android UI use cases
Significantly reduces read bandwidth and composition work by ignoring repetitive tile data
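The slides do not describe how repetitive tile data is detected. As a generic illustration of the idea only, the sketch below compares a checksum per tile between two frames and reports which tiles actually changed, so that unchanged tiles could be skipped. The CRC32 signature, the 16x16 tile size, the one-byte-per-pixel frame layout and both function names are assumptions for this example, not ARM's mechanism.

```python
import zlib

TILE = 16  # assumption: 16x16 pixel tiles


def tile_signatures(frame, width, height, tile=TILE):
    """Map each tile origin (x, y) to a CRC32 of its pixels.

    `frame` is a bytes-like object, one byte per pixel, row-major.
    """
    signatures = {}
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            rows = [
                bytes(frame[y * width + tx: y * width + min(tx + tile, width)])
                for y in range(ty, min(ty + tile, height))
            ]
            signatures[(tx, ty)] = zlib.crc32(b"".join(rows))
    return signatures


def changed_tiles(previous, current, width, height, tile=TILE):
    """Return origins of tiles whose signature differs between two frames."""
    old = tile_signatures(previous, width, height, tile)
    new = tile_signatures(current, width, height, tile)
    return [origin for origin in new if old[origin] != new[origin]]


if __name__ == "__main__":
    width = height = 32
    previous = bytes(width * height)
    current = bytearray(previous)
    current[3 * width + 20] = 255  # touch one pixel at x=20, y=3
    print(changed_tiles(previous, bytes(current), width, height))
```

In this toy case only one of the four tiles needs to be read and composed again; a mostly static UI would behave similarly.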
Additional Features
Reduce system bandwidth when sending video output
YCrCb framebuffer output
Hardware assisted global illumination
Improved Multiple Render Target
Increased capabilities to support DirectX 11.1
HW super-sampling, TIR
Continuing support for next generation APIs
Direct3D 11.1 Feature Level 11 and next generation OpenGL ES
ARM POP IP for Mali is available from the Physical IP Division
ARM’s EDA tool partners are working with our IP now to prepare for supporting our licensees
28HPM and 16FF process libraries
Entry-level Smart Mobile Devices
Rapid Implementation of 3D Graphics
Mali Mid-range GPU Solutions
Designed for cost-effective Graphics solutions
OpenGL ES, OpenVG
Closer CPU-GPU links
Efficient use of system resources
Cortex-A7 / Cortex-A12 / Cortex-A53
Protecting partner investments
Common software platform reduces costs and TTM
Multicore scales performance to address multiple form factors
Proven solutions
ARM Mali-400 MP and Mali-450 MP GPU silicon shipping in hundreds of millions of consumer products
MID-RANGE GPU ROADMAP
UTGARD ARCHITECTURE
Mali-300
- Entry-level OpenGL® ES 2.0 GPU
Mali-400 MP
- First OpenGL ES 2.0 multi-core GPU
- Scalable to 4 cores
- Leading area-efficiency
Mali-450 MP
- Leading OpenGL ES 2.0 performance
- 2x Mali-400 MP performance
- Scalable to 8 cores
MIDGARD ARCHITECTURE
Mali-T720
- First OpenGL ES 3.0 GPU in Mid-Range Market
- Optimized for Android™
- Reduced Time to Market
Entry Level Smartphone Tomorrow
[Block diagram: Mali-V500, Mali-Display and Mali-T720 GPU (ACE-Lite) and a quad-core Cortex®-A7/Cortex-A53 (ACE), with GIC-400, ADB-400, MMU-500 and NIC-400 (configurable: AXI4/AXI3/AHB/APB) serving coherent and non-coherent I/O devices and other slaves, connected through CoreLink™ CCI-400r1 to a DMC-400 or 3rd party DMC with PHYs for DDR3/2 and LPDDR2/3]
Quad or Dual-core Cortex-A53/Cortex-A7
• Most energy-efficient 32-bit CPU
• 64-bit architecture for new software
Single Channel DDR2 @533 MHz
ARM Mali-T720 GPU
Extends lead in area and energy efficiency
Supports OpenGL ES 3.0*
Area-efficient performance
Time to Market Defines Profit Margin
Time-to-market becomes the key differentiator which protects profit margins
Engineering costs grow as process nodes advance
ASPs fall sharply as competition appears in low-cost markets
[Chart: Basic SoC Silicon and Software Design Costs; design costs ($M, $0 to $140) by process node (90nm, 65nm, 45nm, 32nm, 28nm, 22nm, 14nm), split into SW Design Costs and SoC Design Costs]
[Chart: Unit volume (M) to break even for Basic SoC Designs @28nm; break-even unit volume (M units, 0.0 to 25.0) against Average Selling Price ($5.00 to $30.00)]
- Source: Semico Research, 2011
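The break-even relationship in the second chart can be approximated with a one-line model: the units needed to recover a fixed design cost fall as the selling price rises. The sketch below is not Semico's model; the formula, the function name, and the $50M design cost and 50% gross margin used in the example are all hypothetical values chosen for illustration.

```python
# Simple break-even model: fixed design cost recovered from per-unit margin.
# Every input value here is a hypothetical placeholder, not chart data.


def break_even_units_millions(design_cost_musd, asp_usd, gross_margin):
    """Units (in millions) needed to recover `design_cost_musd` ($M).

    asp_usd: average selling price per unit in dollars.
    gross_margin: fraction of the ASP left after unit costs, in (0, 1].
    """
    if asp_usd <= 0:
        raise ValueError("asp_usd must be positive")
    if not 0 < gross_margin <= 1:
        raise ValueError("gross_margin must be in (0, 1]")
    return design_cost_musd / (asp_usd * gross_margin)


if __name__ == "__main__":
    for asp in (5.0, 10.0, 20.0, 30.0):
        units = break_even_units_millions(50.0, asp, 0.5)
        print(f"ASP ${asp:5.2f}: {units:5.1f}M units to break even")
```

With these placeholder inputs, halving the ASP doubles the volume required, which is the pressure on low-cost designs that the slide describes.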
ARM Mali-T720 GPU Overview
Increased energy efficiency over previous cost-optimized ARM Mali GPUs
~150% energy efficiency increase over previous cost-optimized GPUs
Android optimized version of ARM Mali-T62x
~30% area reduction from previous Midgard generation
Up to 15% less dynamic power
Natural companion to quad core implementations of:
Cortex-A7 / Cortex-A12 / Cortex-A53
Targeted optimizations for Android OS
OpenGL ES 3.0, RenderScript/FilterScript support
Maintain the area efficiency leadership of ARM Mali-4xx GPU
Area efficient OpenGL ES 3.0* GPU
Significantly decreased time-to-market
Ease of implementation & integration in SoC designs
Mali-T720 peak configuration:
# Shader Cores: Up to MP8
L2 Cache Size: 2 x 128kB
Clock Freq: 600MHz
Pixel Fillrate: 4.8 Gpix/Sec
Triangle Rate: 533.2 MTri/s
Floating Point Performance: 81.6 GFLOPS
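The slides quote figures only for the MP8 configuration. As a rough planning aid, the sketch below scales the fill rate and floating point figures down to smaller core counts, assuming throughput is strictly linear in core count at 600MHz. That linearity is an assumption, as is the function name; triangle rate is left out because the slides give no basis for scaling it per core.

```python
# Estimate Mali-T720 peak figures for smaller configurations by linear
# scaling from the quoted MP8 numbers. Linear scaling is an assumption.

MP8_CORES = 8
MP8_FILLRATE_GPIX = 4.8
MP8_GFLOPS = 81.6


def estimate_for_cores(cores):
    """Return (fill rate in Gpix/s, GFLOPS) for `cores` shader cores."""
    if not 1 <= cores <= MP8_CORES:
        raise ValueError("cores must be between 1 and 8")
    scale = cores / MP8_CORES
    return MP8_FILLRATE_GPIX * scale, MP8_GFLOPS * scale


if __name__ == "__main__":
    for cores in (1, 2, 4, 8):
        fill, gflops = estimate_for_cores(cores)
        print(f"MP{cores}: {fill:.1f} Gpix/s, {gflops:.1f} GFLOPS")
```

Measured results for a real SoC would also depend on memory bandwidth and clock frequency, so these numbers are upper-bound estimates at best.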
Decreased Time-to-Market
Increased routing density significantly decreases die area
Constrained routing layers to minimize cost
Crafted reference methodologies to match tool chain
ARM POP IP available from the ARM Physical IP Division
ARM’s EDA tool partners are working with our IP now to prepare for supporting our licensees on 28nm process libraries
Opportunities for optimizing layout utilization
ARM POP IP Exploration Worksheet
Extensive exploration analysis to map the wide range of Vt/L options on a process to evaluate multiple PPA trade-offs.
Determine most effective Vt/L choice for optimizing different design goals.
The dominant effects of Vt/L choice can be easily seen in this post-synthesis dataset.
Use datasheets to estimate GPU PPA
Extensive Mali Ecosystem
Summary
ARM Mali-T760 GPU addresses the needs of high-end mobile and consumer devices by providing:
Scalable graphics and compute performance with up to 16 core configurations capable of twice the performance of previous generations
~400% increase in energy efficiency over ARM Mali-T604 GPU
Included technology capable of greatly reducing SoC-level memory bandwidth utilization and power consumption
ARM Mali-T720 GPU addresses the needs of semiconductor partners focusing on low-cost mobile devices by providing:
Significant reduction in shader core area and increased area efficiency
Greater facilities to help licensees drastically reduce time-to-market
Targeted optimizations for Android OS - OpenGL ES 3.0, RenderScript/FilterScript support
Steve Steele
Senior Product Manager