© 2013 NVIDIA

Mark Harris, Chief Technologist, GPU Computing Software, NVIDIA

Future Directions for CUDA

http://www.gputechconf.com/page/home.html

Platform for Parallel Computing

The CUDA Platform is a foundation that supports a diverse parallel computing ecosystem.

Platform

[Figure: CUDA platform timeline across releases 1.0, 2.0, 3.0, 4.0, and 5.0, in four rows: Compiler Tool Chain, Programming Languages, Libraries, Developer Tools. Features shown: C, C++, Fortran (PGI), OpenACC, NVCC, LLVM, Device Code Linking, Dynamic Parallelism, C++ new/delete, Virtual functions, Templates, Inheritance, Function pointers, Recursion, UVA, GPUDirect, GPU-Aware MPI, cuBLAS, cuBLAS Device API, cuFFT, cuRand, cuSparse, Thrust, NVPP, 1000+ new NVPP functions, cuda-gdb, cuda-memcheck, Detect Shared Memory Hazards, nvidia-smi, Visual Profiler, New Visual Profiler, Command-Line Profiler, Nsight IDE, Nsight Eclipse Ed.]

Platform

Investing in the Future

Enabling More Programmers

Programming Model

Future Computing Platforms

Platform

Unified Programming Language

Unified Run-Time Interface

[Figure: CUDA Dynamic Parallelism. Labels shown: CPU, GPU, main, A, B, C, X, Y, Z]

CUDA UVM Demo

Simpler, More Integrated Programming

[Figure: DP GFLOPS per Watt (axis 2 to 16) by year (2008, 2010, 2012, 2014) for Tesla, Fermi, Kepler, and Maxwell, annotated with Unified Language, Unified Run-Time, and Unified Virtual Memory]

Diversity of Programming Languages

http://www.ohloh.net

Enabling More Programming Languages

Developers want to build front-ends for Python, Java, R, DSLs …

Target other processors like ARM, FPGAs, GPUs, x86 …

[Figure: CUDA C, C++, Fortran feed the LLVM Compiler For CUDA, which targets NVIDIA GPUs and x86 CPUs; open to New Language Support and New Processor Support]

Halide (http://halide-lang.org/)

Mozilla Rust

Rapid Development

Powerful Libraries

Commercial Support

Large Community

Is Python Fast Enough for HPC?

Python apps often implement performance-critical functions in C/C++.

Compile Python for Parallel Architectures

Anaconda Accelerate from Continuum Analytics

NumbaPro array-oriented compiler for Python & NumPy

Compile for CPUs or GPUs (uses LLVM + NVIDIA Compiler SDK)

Fast Development + Fast Execution: Ideal Combination

http://continuum.io

Free Academic License

1024² Mandelbrot      Time      Speedup v. Pure Python
Pure Python           4.85s     --
NumbaPro (CPU)        0.11s     44x
CUDA Python (K20)     0.004s    1221x

CUDA Python

CUDA Programming, Python Syntax
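
The slides do not show the benchmarked code, so the following is only a minimal sketch of what a pure-Python escape-time Mandelbrot kernel of this kind might look like. The function names (mandel, create_fractal), the plane bounds, the grid size, and the iteration cap are all assumptions chosen for illustration, not the code that was timed above.

```python
def mandel(x, y, max_iters):
    """Return the iteration at which c = x + yi escapes, or max_iters."""
    c = complex(x, y)
    z = 0j
    for i in range(max_iters):
        z = z * z + c
        # Escape test: |z|^2 >= 4 means the point has left the set.
        if z.real * z.real + z.imag * z.imag >= 4.0:
            return i
    return max_iters


def create_fractal(min_x, max_x, min_y, max_y, width, height, max_iters):
    """Build a height x width grid (list of rows) of escape counts."""
    pixel_x = (max_x - min_x) / width
    pixel_y = (max_y - min_y) / height
    image = []
    for row in range(height):
        imag = min_y + row * pixel_y
        image.append([mandel(min_x + col * pixel_x, imag, max_iters)
                      for col in range(width)])
    return image


# Hypothetical small run; the slide's benchmark used a 1024 x 1024 grid.
small = create_fractal(-2.0, 1.0, -1.0, 1.0, 32, 32, 20)
```

Each pixel is computed independently of every other pixel, which is what makes this kind of loop a natural candidate for compiling to a parallel CPU or GPU target.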

CUDA 5 | OpenGL 4.3

Kick-starts ARM + CUDA Ecosystem

NAMD Ported in 2 Days

Kayla Development Platform

Quad ARM + Kepler GPU

Quad ARM + Any CUDA GPU

[Figure: CUDA platform timeline continued from release 5.0, in rows: Compiler Tool Chain, Libraries, Developer Tools. Features shown: JIT Linking, JIT Compilation, C++11, Sparse Solvers, Profiler, Step-by-Step Guidance, Single-GPU Debugging, Multi-GPU Support, ARM Support]

Platform

Ubiquitous parallel programming

Power Aware Programming

Hybrid operating system Enablement

Parallel Compiler Foundation Enablement

Optimizing locality and computation

Task, Thread & Data Parallelism

Today

Easier Parallel Programming

Future Challenges
