BY: ALI AJORIAN ISFAHAN UNIVERSITY OF TECHNOLOGY 2012 GPU Architecture 1.

19
BY: ALI AJORIAN ISFAHAN UNIVERSITY OF TECHNOLOGY 2012 GPU Architecture 1

Transcript of BY: ALI AJORIAN ISFAHAN UNIVERSITY OF TECHNOLOGY 2012 GPU Architecture 1.

Page 1: BY: ALI AJORIAN ISFAHAN UNIVERSITY OF TECHNOLOGY 2012 GPU Architecture 1.

1

BY: ALI AJORIANISFAHAN UNIVERSITY OF TECHNOLOGY

2012

GPU Architecture

Page 2: BY: ALI AJORIAN ISFAHAN UNIVERSITY OF TECHNOLOGY 2012 GPU Architecture 1.

2

Age of parallelism

Single CPU performance Doubled every 2 years for 30 years until 5 years ago. Marginal improvement in the last 5 years.

2005 year and checking walls Memory Wall Power Wall Processor Design Complexity

Sequential or parallel: this is the problem!!! More cores rather than more clock rate

Page 3: BY: ALI AJORIAN ISFAHAN UNIVERSITY OF TECHNOLOGY 2012 GPU Architecture 1.

3

Early parallel computing

It was not a big idea Main frames and super computers

Page 4: BY: ALI AJORIAN ISFAHAN UNIVERSITY OF TECHNOLOGY 2012 GPU Architecture 1.

4

And now GPUs

Stands for “Graphics Processing Unit”Integration Scheme: a card on the

motherboard with Massively Parallel computing power

Page 5: BY: ALI AJORIAN ISFAHAN UNIVERSITY OF TECHNOLOGY 2012 GPU Architecture 1.

5

A desktop supper computer

Page 6: BY: ALI AJORIAN ISFAHAN UNIVERSITY OF TECHNOLOGY 2012 GPU Architecture 1.

6

History of parallel computing

Page 7: BY: ALI AJORIAN ISFAHAN UNIVERSITY OF TECHNOLOGY 2012 GPU Architecture 1.

7

GPUs: A Brief History

Stage0: graphic accelerators Early VGA cards accelerate 2D GUI Just configurable

Stage1: Fixed Graphics Hardware Graphics-only platform Very limited programmability

Stage2: GPGPU Trick GPU to do general purpose computing Programmable, but requires knowledge on computer graphics

Stream Processing Platforms High-level programming interface No knowledge on Computer Graphics is required Examples: NVIDIA’s CUDA, OpenCL

Page 8: BY: ALI AJORIAN ISFAHAN UNIVERSITY OF TECHNOLOGY 2012 GPU Architecture 1.

8

Stream Processing Characteristics

Fairly simple computation on huge amount of data (streams) Single Program Multiple Data (SPMD)

Data Parallelism e.g., Matrix Operations, Image Processing

Page 9: BY: ALI AJORIAN ISFAHAN UNIVERSITY OF TECHNOLOGY 2012 GPU Architecture 1.

9

Graphic accelerators to CUDA GPUs(cont)

Page 10: BY: ALI AJORIAN ISFAHAN UNIVERSITY OF TECHNOLOGY 2012 GPU Architecture 1.

10

CUDA programming model

CPU + GPU heterogeneous programming Applications with sequential and parallel parts

Host : CPU Sequential threads

Device : GPU Parallel threads in SIMT architecture some kernels that runs on a grid of threads.

Page 11: BY: ALI AJORIAN ISFAHAN UNIVERSITY OF TECHNOLOGY 2012 GPU Architecture 1.

11

CUDA programming model

Page 12: BY: ALI AJORIAN ISFAHAN UNIVERSITY OF TECHNOLOGY 2012 GPU Architecture 1.

12

CUDA programming model(cont)

Page 13: BY: ALI AJORIAN ISFAHAN UNIVERSITY OF TECHNOLOGY 2012 GPU Architecture 1.

13

GPU Architecture (NVIDIA)

Page 14: BY: ALI AJORIAN ISFAHAN UNIVERSITY OF TECHNOLOGY 2012 GPU Architecture 1.

14

GPU Architecture (Fermi)

Page 15: BY: ALI AJORIAN ISFAHAN UNIVERSITY OF TECHNOLOGY 2012 GPU Architecture 1.

15

SM architecture

Page 16: BY: ALI AJORIAN ISFAHAN UNIVERSITY OF TECHNOLOGY 2012 GPU Architecture 1.

16

CUDA programming model

Page 17: BY: ALI AJORIAN ISFAHAN UNIVERSITY OF TECHNOLOGY 2012 GPU Architecture 1.

17

Memory types

Per block registers shared memory

Per thread local memory

Per grid Global memory Constant memory Texture memory

Page 18: BY: ALI AJORIAN ISFAHAN UNIVERSITY OF TECHNOLOGY 2012 GPU Architecture 1.

18

Memory types(cont)

Page 19: BY: ALI AJORIAN ISFAHAN UNIVERSITY OF TECHNOLOGY 2012 GPU Architecture 1.

19

Questions?