Pycon2014 GPU computing


Transcript of Pycon2014 GPU computing

Page 1: Pycon2014 GPU computing

GPU ACCELERATED HIGH PERFORMANCE COMPUTING PRIMER

G RAJA SUMANT ASHOK ASHWIN

//early draft copy

Page 2: Pycon2014 GPU computing

CONTENTS

1. INTRODUCTION
2. WHY USE GPU?
3. CPU ARCHITECTURE
4. GPU ARCHITECTURE
5. WHY PYTHON FOR GPU
6. HOW GPU ACCELERATION WORKS
7. TECHNOLOGIES AVAILABLE TODAY FOR GPU COMPUTING
8. CUDA + PYTHON
9. PyCUDA sample code

Page 3: Pycon2014 GPU computing

INTRODUCTION

GPU-accelerated computing is the use of a graphics processing unit (GPU) together with a CPU to accelerate scientific, engineering, and enterprise applications. Pioneered in 2007 by NVIDIA, GPUs now power energy-efficient datacenters in government labs, universities, enterprises, and small-and-medium businesses around the world.

Page 4: Pycon2014 GPU computing

WHY USE GPU?

GPU-accelerated computing offers unprecedented application performance by offloading compute-intensive portions of the application to the GPU, while the remainder of the code still runs on the CPU. From a user's perspective, applications simply run significantly faster.
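How much faster depends on how much of the run time can be offloaded. The slide does not give a formula; the sketch below uses Amdahl's law as one common way to estimate it. The function name and the sample numbers (a kernel assumed to run 20x faster on the GPU) are illustrative assumptions, not measurements.

```python
def overall_speedup(offloaded_fraction, kernel_speedup):
    """Amdahl's law: overall speedup when only part of the run time is accelerated."""
    if not 0.0 <= offloaded_fraction <= 1.0:
        raise ValueError("offloaded_fraction must be between 0 and 1")
    if kernel_speedup <= 0:
        raise ValueError("kernel_speedup must be positive")
    remaining = 1.0 - offloaded_fraction                # part that still runs on the CPU
    accelerated = offloaded_fraction / kernel_speedup   # part offloaded to the GPU
    return 1.0 / (remaining + accelerated)


# Hypothetical figures: the offloaded portion is assumed to run 20x faster.
for fraction in (0.5, 0.9, 0.99):
    print(fraction, round(overall_speedup(fraction, 20.0), 2))
```

Under these assumptions, offloading half of the run time gives less than a 2x overall gain, so it pays to offload the portions that dominate the run time.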

Page 5: Pycon2014 GPU computing

CPU ARCHITECTURE

Design target for CPUs:

1) Focus on task parallelism
2) Make a single thread very fast
3) Hide latency through large caches
4) Predict, speculate

Page 6: Pycon2014 GPU computing

GPU ARCHITECTURE

The GPU architecture of AMD

Page 7: Pycon2014 GPU computing

GPU ARCHITECTURE

NVIDIA ARCHITECTURE

Page 8: Pycon2014 GPU computing

GPU ARCHITECTURE

Page 9: Pycon2014 GPU computing

WHY PYTHON FOR GPU

Go to a terminal, start python, and type:

>>> import this

Read the output (the Zen of Python).

Page 10: Pycon2014 GPU computing

WHY PYTHON FOR GPU

GPUs are everything that scripting languages are not.
> Highly parallel
> Very architecture-sensitive
> Built for maximum FP/memory throughput
> Complement each other

CPU: largely restricted to control tasks (1000/sec)
> Scripting fast enough

> Python + CUDA = PyCUDA
> Python + OpenCL = PyOpenCL

Page 11: Pycon2014 GPU computing
Page 12: Pycon2014 GPU computing
Page 13: Pycon2014 GPU computing

http://www.nvidia.com/object/what-is-gpu-computing.html

Page 14: Pycon2014 GPU computing

TECHNOLOGIES AVAILABLE TODAY FOR GPU COMPUTING

Open Computing Language (OpenCL)
> Many vendors: AMD, Nvidia, Apple, Intel, IBM...
> Standard CPUs may report themselves as OpenCL capable
> Works on most devices, but the implemented feature set and extensions may vary

Compute Unified Device Architecture (CUDA)
> One vendor: Nvidia (more mature tools)
> Better coherence across a limited set of devices
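One way to see which devices report themselves as OpenCL capable, and how much their extension sets differ, is to enumerate them from Python. This is a minimal sketch that assumes PyOpenCL is installed; the function name is hypothetical, and it returns an empty list when PyOpenCL or an OpenCL platform is not available.

```python
def list_opencl_devices():
    """Return (platform, device, type, extension count) tuples, or [] if unavailable."""
    devices = []
    try:
        import pyopencl as cl  # assumption: PyOpenCL is installed

        for platform in cl.get_platforms():
            for device in platform.get_devices():
                devices.append((
                    platform.name,
                    device.name,
                    cl.device_type.to_string(device.type),  # e.g. CPU or GPU
                    len(device.extensions.split()),         # extensions are space-separated
                ))
    except Exception:
        # PyOpenCL is missing, or no OpenCL platform is registered on this machine.
        return []
    return devices


for entry in list_opencl_devices():
    print(entry)
```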

Page 15: Pycon2014 GPU computing

CUDA + PYTHON

PyCUDA
> You still have to write your kernel in CUDA C
> ... but integrates easily with numpy
> Higher level than CUDA C, but not much higher
> Full CUDA support and performance

gnumpy/CUDAMat/cuBLAS
> gnumpy: numpy-like wrapper for CUDAMat
> CUDAMat: pre-written kernels and partial cuBLAS wrapper
> cuBLAS: (incomplete) CUDA implementation of BLAS
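The numpy integration on the PyCUDA side can be sketched with pycuda.gpuarray, which handles simple elementwise arithmetic without a hand-written kernel. The function name is hypothetical, and the CPU fallback is an assumption added so the sketch still runs on a machine without NumPy, PyCUDA, or a CUDA device.

```python
def double_on_gpu(values):
    """Double each value on the GPU via pycuda.gpuarray; fall back to plain Python."""
    try:
        import numpy as np
        import pycuda.autoinit  # noqa: F401  (creates a CUDA context on import)
        import pycuda.gpuarray as gpuarray
    except Exception:
        # Assumption: no NumPy, PyCUDA, or CUDA device here, so compute on the CPU.
        return [2.0 * v for v in values]

    host = np.asarray(values, dtype=np.float32)
    device = gpuarray.to_gpu(host)   # copy host array to the GPU
    doubled = (2 * device).get()     # multiply on the GPU, copy the result back
    return [float(v) for v in doubled]


print(double_on_gpu([1.0, 2.5, 4.0]))
```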

Page 16: Pycon2014 GPU computing

PyCUDA sample code

Open helloCUDA.py in an editor, then look at and analyse the code.
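helloCUDA.py itself is not part of this transcript. As a stand-in, here is a minimal sketch of the usual PyCUDA pattern: a kernel written in CUDA C, compiled with SourceModule, and called with numpy arrays. The kernel, the function name add_vectors, and the CPU fallback are illustrative assumptions, not the contents of the original file.

```python
KERNEL_SOURCE = """
__global__ void add_vectors(float *out, const float *a, const float *b, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = a[i] + b[i];
}
"""


def add_vectors(a, b):
    """Add two equal-length vectors on the GPU; fall back to plain Python."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    if not a:
        return []
    try:
        import numpy as np
        import pycuda.autoinit  # noqa: F401  (creates a CUDA context on import)
        import pycuda.driver as drv
        from pycuda.compiler import SourceModule
    except Exception:
        # Assumption: no NumPy, PyCUDA, or CUDA device here, so compute on the CPU.
        return [float(x) + float(y) for x, y in zip(a, b)]

    a_host = np.asarray(a, dtype=np.float32)
    b_host = np.asarray(b, dtype=np.float32)
    out_host = np.empty_like(a_host)
    n = a_host.size

    kernel = SourceModule(KERNEL_SOURCE).get_function("add_vectors")
    threads = 256                            # threads per block
    blocks = (n + threads - 1) // threads    # enough blocks to cover n elements
    # drv.In / drv.Out copy the arrays to and from the device around the call.
    kernel(drv.Out(out_host), drv.In(a_host), drv.In(b_host), np.int32(n),
           block=(threads, 1, 1), grid=(blocks, 1))
    return [float(v) for v in out_host]


print(add_vectors([1.0, 2.0, 3.0], [0.5, 0.5, 0.5]))
```

Each GPU thread handles one element, and the `if (i < n)` guard keeps the extra threads in the last block from writing past the end of the arrays.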

Page 17: Pycon2014 GPU computing

CUDAMAT

The aim of the cudamat project is to make it easy to perform basic matrix calculations on CUDA-enabled GPUs from Python. cudamat provides a Python matrix class that performs calculations on a GPU. At present, the operations the GPU matrix class supports include:
> Easy conversion to and from instances of numpy.ndarray
> Limited slicing support
> Matrix multiplication and transpose
> Elementwise addition, subtraction, multiplication, and division

Open the cudamat examples.
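The cudamat examples are not reproduced in this transcript. A minimal sketch of the conversion and matrix multiplication described above might look like the following, assuming cudamat and NumPy are installed. The function name is hypothetical, and the pure-Python fallback is an assumption added so the sketch runs without a GPU.

```python
def matmul(a_rows, b_rows):
    """Multiply two matrices with cudamat; fall back to plain Python."""
    try:
        import numpy as np
        import cudamat as cm
        cm.cublas_init()
    except Exception:
        # Assumption: no NumPy, cudamat, or CUDA device here, so compute on the CPU.
        return [[float(sum(x * y for x, y in zip(row, col)))
                 for col in zip(*b_rows)]
                for row in a_rows]

    try:
        a = cm.CUDAMatrix(np.array(a_rows, dtype=np.float32))  # numpy -> GPU
        b = cm.CUDAMatrix(np.array(b_rows, dtype=np.float32))
        product = cm.dot(a, b)                                 # multiply on the GPU
        return product.asarray().tolist()                      # GPU -> numpy -> lists
    finally:
        cm.cublas_shutdown()


print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```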

Page 18: Pycon2014 GPU computing

GNumpy

Module gnumpy contains class garray, which behaves much like numpy.ndarray

Module gnumpy also contains methods like tile() and rand(), which behave like their numpy counterparts except that they deal with gnumpy.garray instances, instead of numpy.ndarray instances.

gnumpy builds on cudamat
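A minimal sketch of the garray and tile() usage described above, assuming gnumpy and NumPy are installed. The function name is hypothetical, and the pure-Python fallback is an assumption added so the sketch runs where gnumpy cannot be imported.

```python
def tile_columns(rows, repeats):
    """Repeat each row `repeats` times along the column axis, like numpy.tile."""
    try:
        import numpy as np
        import gnumpy as gnp
    except Exception:
        # Assumption: gnumpy (or NumPy) is unavailable, so tile in plain Python.
        return [[float(v) for v in row] * repeats for row in rows]

    g = gnp.garray(np.array(rows, dtype=np.float32))  # numpy.ndarray -> garray
    tiled = gnp.tile(g, (1, repeats))                 # same call shape as numpy.tile
    return tiled.as_numpy_array().tolist()            # garray -> numpy.ndarray -> lists


print(tile_columns([[1.0, 2.0]], 2))
```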

Page 19: Pycon2014 GPU computing

References

● http://documen.tician.de/pycuda/tutorial.html

● http://on-demand.gputechconf.com/gtc/2010/presentations/S12041-PyCUDA-Simpler-GPU-Programming-Python.pdf

● http://www.tsc.uc3m.es/~miguel/MLG/adjuntos/slidesCUDA.pdf

● http://femhub.com/docs/cuda_en.pdf

● http://www.ieap.uni-kiel.de/et/people/kruse/tutorials/cuda/tutorial01p/web01p/tutorial01p.pdf

● http://conference.scipy.org/static/wiki/scipy09-pycuda-tut.pdf