UplinQ - qualcomm® snapdragon™ processors a super gaming platform

©2013-2014 Qualcomm Technologies, Inc. and/or its affiliated companies. All Rights Reserved.

Page 1: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

Page 2: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

Qualcomm® Snapdragon™ Processors: A Super Gaming Platform

Manish Sirdeshmukh, Product Manager, Staff

Todd LeMoine, Engineer, Principal/Manager

Qualcomm Technologies, Inc.

Qualcomm Snapdragon is a product of Qualcomm Technologies, Inc.

Page 3: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

Source: Gartner, October 2013, “Forecast Video Game Ecosystem Worldwide”

Total mobile gaming revenues (for all platforms) are projected to grow from $13 billion in 2013 to $22 billion in 2015

$22B

Page 4: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

Gaming on mobile today

Image labels: Comparison: PC; Comparison: Mobile

Page 5: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

Gaming on mobile today

Image labels: Desktop PC; Snapdragon 805

“Epic now has brought Unreal Engine 4 to Android with the Snapdragon 800 and 805 chipsets from Qualcomm Technologies,” said Niklas Smedberg, Senior Engine Programmer, Epic Games. “Recently we worked with Qualcomm [QTI] to elevate graphics to the next level on the Snapdragon Adreno GPU hardware, which delivers some of the most power-efficient unified shader capabilities we’ve seen yet for Android smartphones and tablets.”

Image labels: Comparison: PC; Comparison: Mobile

Page 6: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

Image: Modern Combat 5 by Gameloft

What is involved in games?

Gameplay execution (animation): Animation for water movement and anchored boat motion

Gameplay execution (AI): Enemy helicopter controlled by AI

Gameplay execution (physics): Particle physics makes explosions look real

Console-quality graphics: Lens effect on the sunlight breaking through the clouds

Console-quality graphics: Hi-res textures provide rich details to the scene

Console-quality graphics: Bloom glare from gunfire provides an immersive experience

Fast connectivity: Play a mission in multiplayer mode

High-quality video: After completing the level, watch a cut scene transition

Responsive and accurate control: Control the character movement

Multi-screen experience: Mirror your screen to TV

Cinema-quality sound: Hear gunfire, explosions, bullets flying by, and the helicopter’s rotor blades

Page 7: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

Snapdragon processors

Page 8: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

Snapdragon processors

Page 9: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

How is SoC utilized by a game? Heterogeneous hardware blocks and data flow

Diagram labels (data flow between SoC hardware blocks during a frame):

Inputs: Graphics (Textures, Shaders, Geometry); Video Data; Audio Data; Input Signals

Quad Core CPU (CPU #1, CPU #2, CPU #3, CPU #4): Physics, Animation, Game logic, Artificial Intelligence

Hardware blocks: GPU (Graphics Rendering), Video Decoder, Video Encoder, DSP (Audio Decoder), Sensor Engine, Display Engine, Wi-Fi Engine

System Memory traffic: GPU Reads, Graphics Pixel Writes, Video Pixel Writes, Display Reads, Final Frame, Encoded Final Frame

Outputs: To Display Panel; To Wi-Fi Display Panel; To Speakers

Page 10: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

Qualcomm Gobi, Qualcomm Adreno, Qualcomm Hexagon and Krait are products of Qualcomm Technologies, Inc.

Qualcomm® Adreno™ GPU

• Adreno is Qualcomm Technologies, Inc.’s (QTI) integrated GPU

• Adreno 420 is QTI’s latest integrated GPU shipping in Snapdragon 805

• Adreno GPUs are custom designed for mobile use

Snapdragon 805 block diagram labels:

Qualcomm® Krait™ 450 Quad Core CPU

Adreno 420 GPU: OpenGL ES 2.0/3.1*, OpenCL 1.2 Full

Location: GPS, GLONASS, Beidou, Galileo Satellites

Snapdragon Display Engine: 4K, Miracast, picture enhancement

Dual ISPs (Imaging): Up to 55MP, 1.2GPix/s bw

Camera SW

USB 3.0

Multimedia Processing: 4K Decode, HEVC Decode

Snapdragon Voice Activation, Gestures, Studio Access Security

Memory: 2x64 bit LPDDR3

Qualcomm® Hexagon™ DSP: Ultra Low Power Sensor Engine

Fusion 4.5

Qualcomm® Gobi™ 9x35 Modem: 4th gen CAT 6 LTE, Up to 3x20MHz CA

*Product is based on a provisional Khronos Specification, and is designed to pass the Khronos Conformance Testing Process when available. Current conformance status can be found at www.khronos.org/conformance.

Page 11: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

Adreno 420 GPU highlights

• Desktop and console quality graphics on mobile

• Complete DirectX11 FL 11_2 pipeline, supports OpenGL ES 3.1

• Support for dynamic hardware tessellation & geometry shaders

Richer, visually immersive graphics

Image labels: No Tessellation; Tessellation

Page 12: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

Adreno 420 supports the most advanced graphics APIs

Feature/API                                       OpenGL ES 3.0   OpenGL ES 3.1   Android Extension Pack
Compute Shader                                    No              Yes             Yes
Atomics                                           No              Yes             Yes
Image Load/Store                                  No              Yes             Yes
Draw Indirect                                     No              Yes             Yes
Texture Gather                                    No              Yes             Yes
Multisample Textures                              No              Yes             Yes
Stencil Textures                                  No              Yes             Yes
Separate Shader Objects                           No              Yes             Yes
Advanced Blending Modes (Programmable Blending)   No              Yes             Yes
Geometry Shaders                                  No              No              Yes
Tessellation Shaders                              No              No              Yes
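
The feature table above can be turned into a simple runtime check. The sketch below is only an illustration: it assumes the application has already obtained the GL_VERSION string and the extension list from the driver, the helper names are hypothetical, and the feature grouping simply mirrors the table. GL_ANDROID_extension_pack_es31a is the extension string used to advertise the Android Extension Pack.

```python
import re

# Features from the table, grouped by the first column in which they read "Yes".
ES31_FEATURES = {
    "Compute Shader", "Atomics", "Image Load/Store", "Draw Indirect",
    "Texture Gather", "Multisample Textures", "Stencil Textures",
    "Separate Shader Objects", "Advanced Blending Modes",
}
AEP_ONLY_FEATURES = {"Geometry Shaders", "Tessellation Shaders"}
AEP_EXTENSION = "GL_ANDROID_extension_pack_es31a"


def parse_es_version(version_string):
    """Parse a GL_VERSION string of the form 'OpenGL ES <major>.<minor> ...'."""
    match = re.match(r"OpenGL ES (\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"not an OpenGL ES version string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def available_features(version_string, extensions):
    """Return the table features usable with this version and extension list."""
    features = set()
    if parse_es_version(version_string) >= (3, 1):
        features |= ES31_FEATURES
        if AEP_EXTENSION in extensions:
            features |= AEP_ONLY_FEATURES
    return features


# Hypothetical driver strings, for illustration only.
es30 = available_features("OpenGL ES 3.0 (example driver)", [])
es31 = available_features("OpenGL ES 3.1 (example driver)", [])
aep = available_features("OpenGL ES 3.1 (example driver)", [AEP_EXTENSION])
```

A renderer could use the returned set to pick between a tessellated and a non-tessellated code path at startup.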

Page 13: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

Adreno 420 GPU highlights

• Improved architecture for performance & efficiency

• Better performance

• Reduced power consumption

FlexRender™ technology: Dynamic Switching between Direct Rendering (Adreno GPU, System memory) and Tiled Rendering (Adreno GPU, Tile buffer, System memory)

ASTC: Original 24bpp; ASTC Compression 8bpp, 3.56bpp, 2bpp

Unified Shaders: Pixel, Vertex, Compute, Tessellation, Geometry

FlexRender is a product of Qualcomm Technologies, Inc.
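
The ASTC bit rates shown on this slide (8bpp, 3.56bpp, 2bpp against a 24bpp original) follow from how ASTC works: every compressed block occupies 128 bits, and the block footprint determines how many pixels share those bits. The slide does not name the block sizes; the sketch below assumes the standard 4x4, 6x6 and 8x8 footprints, which yield exactly those rates. Function names are illustrative.

```python
ASTC_BLOCK_BITS = 128  # every ASTC block is 128 bits, whatever its footprint


def astc_bits_per_pixel(block_width, block_height):
    """Bit rate for a given ASTC block footprint."""
    return ASTC_BLOCK_BITS / (block_width * block_height)


def astc_size_bytes(width, height, block_width, block_height):
    """Compressed size of one mip level; partial blocks are rounded up."""
    blocks_x = -(-width // block_width)    # ceiling division
    blocks_y = -(-height // block_height)
    return blocks_x * blocks_y * ASTC_BLOCK_BITS // 8


rates = {
    "4x4": astc_bits_per_pixel(4, 4),
    "6x6": round(astc_bits_per_pixel(6, 6), 2),
    "8x8": astc_bits_per_pixel(8, 8),
}
uncompressed_1k = 1024 * 1024 * 24 // 8          # 24bpp RGB texture
compressed_1k = astc_size_bytes(1024, 1024, 8, 8)
```

For a 1024x1024 texture, the 8x8 footprint is 12 times smaller than the 24bpp original, which is where the memory and bandwidth savings come from.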

Page 14: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

Adreno 420 architectural improvements

• DX11.2 3D pipeline

− Hardware tessellation

− Geometry shading

− Stream out from VS, DS, GS

− Programmable blending

• Upgraded compute

− Direct compute, OpenCL 1.2 Full profile

− Faster RenderScript

• Improved texturing

− Improved texture performance

− Support for higher level texture filtering (e.g., Aniso) with less performance impact

− ASTC support, better LOD & filtering quality

− Larger caches: texture cache, L2 cache

• Improved ROPs & Z

− Faster depth rejection

− Designed to achieve peak draw rate more often

Pipeline diagram labels:

Pipeline stages: Command Processor (Input Assembler) -> Vertex Shader -> Hull Shader (LOD, Control Patch) -> Tessellator -> Domain Shader (Vertex Calculation & Displacement) -> Geometry Shader -> Rasterizer -> Pixel Shader -> Render Backend

Hardware Tessellation Pipeline

Unified Shader Processor

System Memory: Index Buffers, Vertex Buffers, Constant Buffers, Unordered Access Resources, Texture Resources, Render Targets, Textures, Buffers, Frame Buffer, Stream Out

Page 15: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

Adreno GPU architecture

Tiled Rendering architecture

Advantages:

• Designed to minimize unnecessary data traffic to host memory

• Designed to minimize power consumption

• Use of transparency / anti-aliasing is inexpensive

Early Z (Depth) Reject feature

Image labels: Objects in background; Objects in foreground

Advantages:

• Designed to prevent unnecessary use of GPU resources in drawing pixels for occluded objects

• Designed to increase overall graphics performance for larger scenes with opaque geometry
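
The effect of early Z rejection can be illustrated with a toy software model. This is not how the Adreno hardware is implemented; it is a minimal sketch that counts how many fragments would be shaded when the depth test runs before shading, assuming opaque geometry only and a smaller depth value meaning closer to the camera. All names are illustrative.

```python
def count_shaded_fragments(draws):
    """Count fragment-shader invocations when depth is tested before shading.

    `draws` is a list of (depth, pixels) in submission order, where `pixels`
    is a set of (x, y) coordinates covered by an opaque object.
    """
    depth_buffer = {}
    shaded = 0
    for depth, pixels in draws:
        for pixel in pixels:
            if depth < depth_buffer.get(pixel, float("inf")):
                depth_buffer[pixel] = depth  # passes the depth test: shade and write
                shaded += 1
            # otherwise the fragment is rejected before any shading work
    return shaded


# A 4x4 screen: the background covers every pixel, the foreground covers half.
screen = {(x, y) for x in range(4) for y in range(4)}
left_half = {(x, y) for x in range(2) for y in range(4)}
background = (0.9, screen)
foreground = (0.1, left_half)

back_to_front = count_shaded_fragments([background, foreground])
front_to_back = count_shaded_fragments([foreground, background])
```

In this model the occluded background fragments are only skipped when the foreground has already been drawn, which is why submission order matters for early Z.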

Page 16: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

Adreno GPU architecture

Dynamic FlexRender technology

Diagram labels: Direct rendering (Adreno GPU, System memory); Tiled rendering (Adreno GPU, GMEM (Tile Buffer), System memory); FlexRender: Dynamic Switching

Advantages:

• Better performance and power for wider range of use cases

• More developer flexibility

Double Rate Half Precision (DRHP) design

Diagram labels: 1X Speed for “highp” Shaders; 2X Speed for “mediump” Shaders

Advantages:

• Use additional/complex shaders without compromising performance

• Better performance with power efficiency

Page 17: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

OpenGL ES optimizations

Page 18: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

Optimization: frame buffer objects

Worst case pattern of FBO usage

Frame rendering:

Frame buffer: Clear, Draw, Store

FBO 0: Load, Draw, Store

Frame buffer: Load, Draw, Store

Page 19: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

Optimization: frame buffer objects

Optimized render order

Frame rendering:

FBO 0: Invalidate Framebuffer, Draw, Store

Frame Buffer: Clear, Draw, Store

Optimal rendering order: FBO0 invalidate, FBO0 draw … FBOn invalidate, FBOn draw, FB clear, FB draw
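
The difference between the two FBO patterns can be counted with a small model. This is a simplified sketch, not driver behavior: it assumes that a pass which does not begin with a clear or an invalidate (for example glClear or glInvalidateFramebuffer) must load its previous contents from system memory into the tile buffer, and that every pass ends with a store. The function and variable names are illustrative.

```python
def count_tile_traffic(passes):
    """Count tile-buffer loads and stores for a sequence of render passes.

    Each pass is (target_name, discards_previous_contents). A pass that
    keeps its previous contents costs one load; every pass costs one store.
    """
    loads = 0
    stores = 0
    for _target, discards_previous_contents in passes:
        if not discards_previous_contents:
            loads += 1
        stores += 1
    return loads, stores


# Worst case: the frame buffer is interrupted by FBO 0 and then resumed.
worst_case = [
    ("frame buffer", True),   # starts with a clear
    ("FBO 0", False),         # no clear or invalidate
    ("frame buffer", False),  # resumed, so earlier contents are needed
]
# Optimized: render each FBO completely first, then the frame buffer.
optimized = [
    ("FBO 0", True),          # starts with an invalidate
    ("frame buffer", True),   # starts with a clear
]

worst_traffic = count_tile_traffic(worst_case)
optimized_traffic = count_tile_traffic(optimized)
```

Under these assumptions the worst case costs two loads and three stores, while the optimized order costs no loads and two stores, matching the Load and Store labels on the two slides.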

Page 20: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

Optimization: dynamic vertex buffer objects

• In the worst case, the complete sequence of VBO updates and draw calls may have to be repeated for each bin

• Even when using glBufferSubData, multiple copies of the entire VBO may need to be maintained by the driver

Worst case pattern of VBO usage

Frame rendering: Update VBO0, Draw, Update VBO0, Draw, Update VBO0, Draw

Page 21: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

Optimization: dynamic vertex buffer objects

Optimized dynamic VBO order

Frame rendering: Update VBO0, Draw VBO0

Or if multiple dynamic VBOs are used

Frame rendering: Update VBO0, Update VBOn, Draw VBO0, Draw VBOn
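
One way to catch the problematic VBO pattern during development is to scan a frame's command list for updates to a buffer that an earlier draw in the same frame already uses. The sketch below is a hypothetical checker, not part of any Adreno tool; it assumes the worst case is the one where updates to the same VBO are interleaved with draws, and it represents commands as simple tuples.

```python
def count_updates_after_draw(commands):
    """Count updates to a VBO that was already drawn from in this frame.

    `commands` is a list of ("update", vbo) or ("draw", vbo) tuples in
    submission order for a single frame.
    """
    drawn = set()
    hazards = 0
    for op, vbo in commands:
        if op == "draw":
            drawn.add(vbo)
        elif op == "update" and vbo in drawn:
            hazards += 1
    return hazards


interleaved = [("update", "VBO0"), ("draw", "VBO0")] * 3
single_update = [("update", "VBO0"), ("draw", "VBO0")]
multiple_vbos = [
    ("update", "VBO0"), ("update", "VBOn"),
    ("draw", "VBO0"), ("draw", "VBOn"),
]

interleaved_hazards = count_updates_after_draw(interleaved)
single_hazards = count_updates_after_draw(single_update)
multiple_hazards = count_updates_after_draw(multiple_vbos)
```

A non-zero count suggests that the frame should be restructured so that all dynamic updates happen before the first draw that reads them.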

Page 22: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

Optimization: sorting

Sorting can reduce both the number of state changes and the amount of overdraw, both of which have a negative impact on GPU performance

• Sort by material

− Reduces shader and texture state changes

• Sort opaque draw calls front-to-back

− Reduces time spent shading fragments which will be overwritten later

− We have observed an improvement of more than 10 ms/frame in some fragment-bound content with just this optimization

• Draw the skybox last

− Typically the skybox is covered by foreground geometry in half or more of the screen
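
The three sorting rules can be combined into a single sort key. The sketch below is one possible arrangement, not a prescribed one: it assumes a hypothetical DrawCall record in which `material` stands in for shader and texture state, and it sorts by material first and front-to-back within each material. An engine might weigh the two criteria differently.

```python
from dataclasses import dataclass


@dataclass
class DrawCall:
    name: str
    material: str            # stands in for shader + texture state
    view_depth: float        # distance from the camera; smaller is closer
    opaque: bool = True
    is_skybox: bool = False


def sort_opaque_draws(draws):
    """Order opaque draws: skybox last, then by material, then front to back."""
    opaque = [d for d in draws if d.opaque]
    return sorted(opaque, key=lambda d: (d.is_skybox, d.material, d.view_depth))


def count_material_changes(draws):
    """Count how often the bound material changes across a draw sequence."""
    changes = 0
    previous = None
    for d in draws:
        if d.material != previous:
            changes += 1
            previous = d.material
    return changes


submitted = [
    DrawCall("sky", "sky", 1000.0, is_skybox=True),
    DrawCall("rock_far", "rock", 30.0),
    DrawCall("tree", "bark", 10.0),
    DrawCall("rock_near", "rock", 5.0),
]
ordered = sort_opaque_draws(submitted)
ordered_names = [d.name for d in ordered]
```

Transparent draws are left out here because they usually need back-to-front ordering for correct blending.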

Page 23: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

Optimization: shader performance

Precision

• Operations on 16 bit floating point (mediump) values are 2x faster than on 32 bit (highp)

− Recommend setting default precision to mediump and promoting only values which require higher precision, e.g.:

precision mediump float; // Set default precision in FS to fp16

out vec2 vSmallTexCoord; // Uses mediump

out highp vec2 vLargeTexCoord; // Uses highp

Scalar architecture

• Adreno 3xx and 4xx GPUs utilize a scalar architecture

• Avoid using components that aren’t needed for the final result

• Wherever possible, re-order operations to execute on as few components as possible
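
Why some values need promoting to highp can be seen by rounding numbers to 16 bit floating point on the CPU. The sketch below is an illustration using Python's standard `struct` module, whose "e" format is IEEE 754 half precision; it assumes mediump behaves like fp16, as the slide states, and the sample coordinate values are made up.

```python
import struct


def to_fp16(value):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


small = to_fp16(0.3)       # a coordinate in the 0..1 range
large = to_fp16(1000.3)    # a large coordinate, e.g. on a heavily tiled surface

small_error = abs(small - 0.3)
large_error = abs(large - 1000.3)
```

Half precision has a 10-bit mantissa, so values between 512 and 1024 can only be represented in steps of 0.5. The small coordinate survives almost unchanged, while the large one is off by about 0.2, which is the kind of value that is better declared highp.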

Page 24: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

Optimization: tessellation

Tessellation allows for incredible levels of detail and can substantially reduce memory bandwidth and CPU cycles by allowing other game sub-systems to operate on low resolution representations of meshes, but …

• High levels of tessellation can generate sub-pixel triangles which cause poor rasterizer utilization

− Very important to utilize distance, screen space size or other adaptive metrics for computing tessellation factors which avoid sub-pixel triangles

Image labels: Full Rasterizer Utilization; Partial Rasterizer Utilization
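
A screen-space metric is one of the adaptive approaches mentioned above. The sketch below shows the arithmetic only, in Python rather than shader code: it assumes the patch edge endpoints have already been projected to pixel coordinates, and the target segment length and maximum factor are illustrative values, not recommendations from the slides.

```python
import math


def edge_tess_factor(p0_px, p1_px, target_edge_px=8.0, max_factor=64.0):
    """Pick a factor so each generated segment spans about target_edge_px pixels.

    p0_px and p1_px are the screen-space endpoints of a patch edge, in pixels.
    """
    edge_length_px = math.dist(p0_px, p1_px)
    factor = edge_length_px / target_edge_px
    return max(1.0, min(max_factor, factor))


near_edge = edge_tess_factor((0.0, 0.0), (160.0, 0.0))    # long edge on screen
far_edge = edge_tess_factor((0.0, 0.0), (3.0, 4.0))       # only 5 pixels long
huge_edge = edge_tess_factor((0.0, 0.0), (4000.0, 0.0))   # clamped to the maximum
```

Because the factor shrinks with the on-screen size of the edge, distant patches are not subdivided into triangles smaller than a pixel.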

Page 25: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

Optimization: tessellation

Culling

• Hardware back-face culling occurs after the tessellation stage, which potentially wastes GPU resources tessellating back-facing primitives

• Back-facing primitives can be identified in the TCS and culled by setting their edge tessellation factors to 0

− A slight “fudge” factor may be needed in this calculation if displacement mapping will be used in the TES as this technique may change the visibility of primitives

General

• Whenever possible disable the TCS and TES stages if the tessellation factor for the mesh would be ~1

− Eliminates the use of unnecessary GPU stages
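
The culling test described above amounts to a dot product between the patch normal and the direction to the eye. The sketch below works through that math in Python rather than in a tessellation control shader; it assumes non-degenerate triangle patches with counter-clockwise front faces, and `fudge` is a hypothetical tolerance for the displacement mapping case.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _normalize(v):
    length = math.sqrt(_dot(v, v))
    return (v[0] / length, v[1] / length, v[2] / length)


def patch_tess_factors(p0, p1, p2, eye, base_factor, fudge=0.0):
    """Return the three edge tessellation factors for a triangle patch.

    A patch whose normal points away from the eye by more than `fudge`
    gets factors of 0, so it is discarded before the tessellator runs.
    """
    normal = _normalize(_cross(_sub(p1, p0), _sub(p2, p0)))
    to_eye = _normalize(_sub(eye, p0))
    facing = _dot(normal, to_eye)  # > 0 front-facing, < 0 back-facing
    if facing < -fudge:
        return (0.0, 0.0, 0.0)
    return (base_factor, base_factor, base_factor)


tri = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))  # normal is +Z
front = patch_tess_factors(*tri, eye=(0.0, 0.0, 5.0), base_factor=4.0)
back = patch_tess_factors(*tri, eye=(0.0, 0.0, -5.0), base_factor=4.0)
grazing = patch_tess_factors(*tri, eye=(0.2, 0.2, -0.05), base_factor=4.0, fudge=0.25)
```

The grazing case is slightly back-facing but is kept because of the tolerance, which leaves room for displacement to bring it back into view.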

Page 26: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

Adreno tools

Page 27: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

Graphics content development & tools

Workflow: Asset Creation -> Compress/Optimize -> Code -> Emulate -> Compile -> Deploy -> Analyze/Debug

Page 28: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

Adreno tools

Adreno SDK

• Support for OpenGL ES 3.1, 3.0 & 2.0, DirectX, and OpenCL

• Supported on Windows, Mac OS X, and Linux

• Comprehensive collection of utilities

• Over 100 samples and tutorials

• Thorough documentation

Adreno Profiler

• Comprehensive profiling tool

• Supported on Windows, Mac OS X, and Linux

• Enables detailed analysis of GPU utilization

• Proven effective and easy to use

• Works with commercial devices & apps

Available on developer.qualcomm.com

Adreno SDK and Adreno Profiler are products of Qualcomm Technologies, Inc.

Page 29: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

Adreno Profiler: introduction

Grapher mode: real-time analysis

Scrubber mode: detailed frame analysis

API call stack

Optimization suggestions

Shader stats

Shader editor

Texture browser

Detailed frame stats

Overrides

Metrics

Frame emulation

Scrubber metrics

Page 30: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

Adreno Profiler demo

Reign of Amira™: Available on Google Play

Reign of Amira is a product of Qualcomm Technologies, Inc.

Page 31: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

Adreno SDK

• Desktop OpenGL ES emulator

− Now supporting OpenGL ES 3.1

• Over 100 samples and tutorials

− Simple tutorials to advanced demos

− Covers OpenGL ES 2.0 and 3.0, DirectX, and OpenCL

• Utilities and libraries

− Texture compression

− Mesh optimization

• Adreno texture tool

• Developer documentation

− Adreno Developer Guide

Shader samples

Animal materials (fur, elephant skin, fish scales, alligators, etc.)

General lighting (ambient, diffuse, specular, Blinn-Phong, parallax, etc.)

Human materials (skin, eye, etc.)

Other effects (environment mapping, warping, glass distortion, god rays, etc.)

Other materials (cloth, wood, plastic, marble, leather, metal, etc.)

Advanced rendering (toon shading, deferred lighting, eye adaptation, etc.)

Page 32: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

Adreno SDK demo

Reign of Amira™: Available on Google Play

Page 33: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

Special thanks

Page 34: UplinQ - qualcomm® snapdragon™ processors a super gaming platform

For more information on Qualcomm, visit us at: www.qualcomm.com & www.qualcomm.com/blog

©2013-2014 Qualcomm Technologies, Inc. and/or its affiliated companies. All Rights Reserved. Qualcomm, Snapdragon, Adreno, Gobi, Hexagon, FlexRender and Reign of Amira are trademarks of Qualcomm Incorporated, registered in the United States and other countries. Krait and Uplinq are trademarks of Qualcomm Incorporated. All Qualcomm Incorporated trademarks are used with permission. Other products and brand names may be trademarks or registered trademarks of their respective owners. References in this presentation to “Qualcomm” may mean Qualcomm Incorporated, Qualcomm Technologies, Inc., and/or other subsidiaries or business units within the Qualcomm corporate structure, as applicable. Qualcomm Incorporated includes Qualcomm’s licensing business, QTL, and the vast majority of its patent portfolio. Qualcomm Technologies, Inc., a wholly-owned subsidiary of Qualcomm Incorporated, operates, along with its subsidiaries, substantially all of Qualcomm’s engineering, research and development functions, and substantially all of its product and services businesses, including its semiconductor business, QCT.

Thank you