Pycon2014 GPU computing
GPU ACCELERATED HIGH PERFORMANCE COMPUTING PRIMER
G RAJA SUMANT ASHOK ASHWIN
//early draft copy
CONTENTS
1. INTRODUCTION
2. WHY USE GPU?
3. CPU ARCHITECTURE
4. GPU ARCHITECTURE
5. WHY PYTHON FOR GPU
6. HOW GPU ACCELERATION WORKS
7. TECHNOLOGIES AVAILABLE TODAY FOR GPU COMPUTING
8. CUDA + PYTHON
9. PyCUDA sample code
INTRODUCTION
GPU-accelerated computing is the use of a graphics processing unit (GPU) together with a CPU to accelerate scientific, engineering, and enterprise applications. Pioneered in 2007 by NVIDIA, GPUs now power energy-efficient datacenters in government labs, universities, enterprises, and small-and-medium businesses around the world.
WHY USE GPU?
GPU-accelerated computing offers unprecedented application performance by offloading compute-intensive portions of the application to the GPU, while the remainder of the code still runs on the CPU. From a user's perspective, applications simply run significantly faster.
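How much faster an application runs depends on how much of it is offloaded. A rough estimate is Amdahl's law, which the slides do not name and which is brought in here only as an illustration. The function name and the example numbers below are assumptions, not measurements.

```python
def overall_speedup(offloaded_fraction, gpu_speedup):
    """Estimate whole-application speedup with Amdahl's law.

    offloaded_fraction: share of the original run time moved to the GPU (0..1).
    gpu_speedup: how many times faster that share runs on the GPU.
    Both inputs are hypothetical; real values must be measured.
    """
    if not 0.0 <= offloaded_fraction <= 1.0:
        raise ValueError("offloaded_fraction must be between 0 and 1")
    if gpu_speedup <= 0:
        raise ValueError("gpu_speedup must be positive")
    # The CPU part keeps its original cost; the GPU part shrinks by gpu_speedup.
    cpu_part = 1.0 - offloaded_fraction
    gpu_part = offloaded_fraction / gpu_speedup
    return 1.0 / (cpu_part + gpu_part)


if __name__ == "__main__":
    # Offloading 90% of the run time to a GPU that is 10x faster on that part.
    print(round(overall_speedup(0.9, 10.0), 2))
```

The part that stays on the CPU limits the gain: with 90% of the work offloaded, the application can never run more than 10 times faster overall, however fast the GPU is.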
CPU ARCHITECTURE
Design target for CPUs:
1) Focus on task parallelism
2) Make a single thread very fast
3) Hide latency through large caches
4) Predict, speculate
GPU ARCHITECTURE
[Figure: the GPU architecture of AMD]
[Figure: NVIDIA architecture]
WHY PYTHON FOR GPU
Go to a terminal, start python, and at the prompt type: import this
Read the output.
GPUs are everything that scripting languages are not.
> Highly parallel
> Very architecture-sensitive
> Built for maximum FP/memory throughput
> So the two complement each other
CPU: largely restricted to control tasks (1000/sec)
> Scripting is fast enough
> Python + CUDA = PyCUDA
> Python + OpenCL = PyOpenCL
http://www.nvidia.com/object/what-is-gpu-computing.html
TECHNOLOGIES AVAILABLE TODAY FOR GPU COMPUTING
Open computing language (OpenCL)
> Many vendors: AMD, Nvidia, Apple, Intel, IBM...
> Standard CPUs may report themselves as OpenCL capable
> Works on most devices, but the implemented feature set and extensions may vary
Compute unified device architecture (CUDA)
> One vendor: Nvidia (more mature tools)
> Better coherence across a limited set of devices
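To see which devices report themselves as OpenCL capable on a given machine, PyOpenCL can enumerate them. This is a minimal sketch that assumes PyOpenCL is installed; the function name list_opencl_devices is made up for this example, and it returns an empty list when PyOpenCL or an OpenCL driver is missing.

```python
def list_opencl_devices():
    """Return (platform name, device name, device type) for each OpenCL device.

    Returns an empty list if PyOpenCL is not installed or no OpenCL
    platform is available on this machine.
    """
    try:
        import pyopencl as cl
    except ImportError:
        return []

    try:
        platforms = cl.get_platforms()
    except cl.Error:
        # Raised, for example, when no OpenCL driver is installed.
        return []

    devices = []
    for platform in platforms:
        for device in platform.get_devices():
            # device.type is a bit field; to_string gives e.g. "CPU" or "GPU".
            kind = cl.device_type.to_string(device.type)
            devices.append((platform.name, device.name, kind))
    return devices


if __name__ == "__main__":
    for platform_name, device_name, kind in list_opencl_devices():
        print(platform_name, "|", device_name, "|", kind)
```

On a machine with a vendor CPU runtime installed, the CPU itself may appear in this list next to any GPUs.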
CUDA + PYTHON
PyCUDA
> You still have to write your kernel in CUDA C
> ... but it integrates easily with numpy
> Higher level than CUDA C, but not much higher
> Full CUDA support and performance
gnumpy/CUDAMat/cuBLAS
> gnumpy: numpy-like wrapper for CUDAMat
> CUDAMat: pre-written kernels and partial cuBLAS wrapper
> cuBLAS: (incomplete) CUDA implementation of BLAS
PyCUDA sample code
>> Open helloCUDA.py in an editor, then look at and analyse the code.
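helloCUDA.py itself is not reproduced in this transcript. As a stand-in, here is a minimal sketch in the style of the PyCUDA tutorial: the kernel is written in CUDA C and the data lives in a numpy array. The kernel name double_them and the helpers double_on_gpu and double_array are made up for this example. The GPU path needs PyCUDA and an Nvidia device; without them the sketch falls back to plain numpy so it still runs.

```python
import numpy as np

# CUDA C kernel: each thread doubles one element of the array.
KERNEL_SOURCE = """
__global__ void double_them(float *a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        a[i] *= 2.0f;
}
"""


def double_on_gpu(values):
    """Double every element on the GPU (requires PyCUDA and a CUDA device)."""
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context on import)
    import pycuda.driver as cuda
    from pycuda.compiler import SourceModule

    # Copy into a contiguous float32 array so the caller's data is untouched.
    a = np.array(values, dtype=np.float32, order="C")
    kernel = SourceModule(KERNEL_SOURCE).get_function("double_them")

    threads_per_block = 256
    blocks = (a.size + threads_per_block - 1) // threads_per_block
    # InOut copies the array to the device, runs the kernel, and copies it back.
    kernel(cuda.InOut(a), np.int32(a.size),
           block=(threads_per_block, 1, 1), grid=(blocks, 1))
    return a


def double_array(values):
    """Use the GPU when possible, otherwise fall back to numpy on the CPU."""
    try:
        return double_on_gpu(values)
    except Exception:
        # Broad on purpose for a demo: a missing PyCUDA install, a missing
        # device, and a missing compiler all end up on the CPU path.
        return np.asarray(values, dtype=np.float32) * 2


if __name__ == "__main__":
    print(double_array(np.arange(8).reshape(2, 4)))
```

The Python side only allocates data, launches the kernel, and collects the result, which matches the point made earlier that PyCUDA is higher level than CUDA C, but not much higher.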
CUDAMAT
The aim of the cudamat project is to make it easy to perform basic matrix calculations on CUDA-enabled GPUs from Python. cudamat provides a Python matrix class that performs calculations on a GPU. At present, some of the operations the GPU matrix class supports include:
> Easy conversion to and from instances of numpy.ndarray
> Limited slicing support
> Matrix multiplication and transpose
> Elementwise addition, subtraction, multiplication, and division
Open the cudamat examples.
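The cudamat examples are not part of this transcript. The sketch below follows the pattern of the cudamat README: copy numpy arrays to the GPU, multiply there, and copy the result back. The wrapper name matmul is made up for this example, and it falls back to numpy when cudamat is not installed.

```python
import numpy as np


def matmul(a, b):
    """Multiply two matrices with cudamat if available, else with numpy."""
    try:
        import cudamat as cm
    except ImportError:
        return np.dot(a, b)

    cm.cublas_init()
    try:
        # CUDAMatrix copies the numpy data to the GPU.
        gpu_a = cm.CUDAMatrix(np.asarray(a))
        gpu_b = cm.CUDAMatrix(np.asarray(b))
        # The product stays on the GPU until asarray() copies it back.
        return cm.dot(gpu_a, gpu_b).asarray()
    finally:
        cm.cublas_shutdown()


if __name__ == "__main__":
    a = np.array([[1.0, 2.0], [3.0, 4.0]])
    print(matmul(a, np.eye(2)))
```

No CUDA C is written here: the kernels are the pre-written ones that cudamat ships.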
GNumpy
Module gnumpy contains the class garray, which behaves much like numpy.ndarray.
Module gnumpy also contains functions like tile() and rand(), which behave like their numpy counterparts except that they work on gnumpy.garray instances instead of numpy.ndarray instances.
gnumpy builds on cudamat.
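A minimal sketch of rand() and tile() side by side with their numpy counterparts. It assumes gnumpy can be imported; where it cannot, the same steps run in numpy, which is the point of the numpy-like interface. The function name tiled_random is made up for this example.

```python
import numpy as np


def tiled_random(rows, cols, reps):
    """Build a random block and repeat it, returning a numpy.ndarray."""
    try:
        import gnumpy as gnp
    except Exception:
        # gnumpy missing or not importable here: same steps in plain numpy.
        block = np.random.rand(rows, cols)
        return np.tile(block, reps)

    block = gnp.rand(rows, cols)       # a gnumpy.garray
    tiled = gnp.tile(block, reps)      # still a garray
    return tiled.as_numpy_array()      # copy back as numpy.ndarray


if __name__ == "__main__":
    print(tiled_random(2, 3, (2, 1)).shape)
```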
References
● http://documen.tician.de/pycuda/tutorial.html
● http://on-demand.gputechconf.com/gtc/2010/presentations/S12041-PyCUDA-Simpler-GPU-Programming-Python.pdf
● http://www.tsc.uc3m.es/~miguel/MLG/adjuntos/slidesCUDA.pdf
● http://femhub.com/docs/cuda_en.pdf
● http://www.ieap.uni-kiel.de/et/people/kruse/tutorials/cuda/tutorial01p/web01p/tutorial01p.pdf
● http://conference.scipy.org/static/wiki/scipy09-pycuda-tut.pdf