
Heterogeneous Computing: New Directions for Efficient and Scalable High-Performance Computing

Dr. Jason D. Bakos

CSCE 190: Computing in the Modern World

Logic Synthesis

• Behavior:
– C = A + B
– Assume A is 2 bits, B is 2 bits, C is 3 bits

A       B       C
00 (0)  00 (0)  000 (0)
00 (0)  01 (1)  001 (1)
00 (0)  10 (2)  010 (2)
00 (0)  11 (3)  011 (3)
01 (1)  00 (0)  001 (1)
01 (1)  01 (1)  010 (2)
01 (1)  10 (2)  011 (3)
01 (1)  11 (3)  100 (4)
10 (2)  00 (0)  010 (2)
10 (2)  01 (1)  011 (3)
10 (2)  10 (2)  100 (4)
10 (2)  11 (3)  101 (5)
11 (3)  00 (0)  011 (3)
11 (3)  01 (1)  100 (4)
11 (3)  10 (2)  101 (5)
11 (3)  11 (3)  110 (6)

Writing C as bits C2 C1 C0, A as A1 A0, and B as B1 B0, the logic synthesized from the truth table (⊕ = XOR, · = AND, + = OR) is:

C0 = A0 ⊕ B0
C1 = A1 ⊕ B1 ⊕ (A0·B0)
C2 = A1·B1 + (A1 ⊕ B1)·(A0·B0)


Logic Gates

[Gate schematics: inverter (Y = A̅), 2-input NAND, 3-input NAND, 2-input NOR; inputs A, B, output Y]


Layout

3-input NAND

CSCE 791, April 2, 2010

Minimum Feature Size

Year  Processor       Speed            Transistors   Process
1982  i286            6–25 MHz         ~134,000      1.5 µm
1986  i386            16–40 MHz        ~270,000      1 µm
1989  i486            16–133 MHz       ~1 million    0.8 µm
1993  Pentium         60–300 MHz       ~3 million    0.6 µm
1995  Pentium Pro     150–200 MHz      ~4 million    0.5 µm
1997  Pentium II      233–450 MHz      ~5 million    0.35 µm
1999  Pentium III     450–1400 MHz     ~10 million   0.25 µm
2000  Pentium 4       1.3–3.8 GHz      ~50 million   0.18 µm
2005  Pentium D       2 cores/package  ~200 million  90 nm
2006  Core 2          2 cores/die      ~300 million  65 nm
2008  Core i7         4 cores/die, 8 threads/die    ~800 million  45 nm
2010  “Sandy Bridge”  8 cores/die, 16 threads/die (?)  ??         32 nm

Computer Architecture Trends

• Multi-core architecture:
– Individual cores are large and heavyweight, designed to extract performance from general-purpose code
– Programmer utilizes multiple cores using OpenMP


[Diagram: CPU connected to memory; the L2 cache occupies ~50% of the chip]

Co-Processors


• Special-purpose (not general-purpose) processor
• Accelerates the CPU

IBM Cell/B.E. Architecture


• 1 PPE, 8 SPEs

• Programmer must manually manage each SPE's 256 KB local memory and thread invocation

• Each SPE includes a 128-bit-wide vector unit, like those on current Intel processors


High-Performance Reconfigurable Computing

• Heterogeneous computing with reconfigurable logic, i.e. FPGAs


Programming FPGAs

Heterogeneous Computing


[Diagram: typical application profile]
– initialization: 49% of code, 0.5% of run time
– “hot” loop: 1% of code, 99% of run time → offloaded to co-processor
– clean up: 49% of code, 0.5% of run time

Kernel speedup  Application speedup  Execution time
50              34                   5.0 hours
100             50                   3.3 hours
200             67                   2.5 hours
500             83                   2.0 hours
1000            91                   1.8 hours

• Example:
– Application requires a week of CPU time
– Offloaded computation consumes 99% of execution time


Heterogeneous Computing with FPGAs

Annapolis Micro Systems WILDSTAR 2 PRO

GiDEL PROCSTAR III

Heterogeneous Computing with FPGAs


Convey HC-1

Heterogeneous Computing with GPUs


NVIDIA Tesla S1070


Heterogeneous Computing now Mainstream: IBM Roadrunner

• Los Alamos; second-fastest computer in the world

• 6,480 AMD Opteron (dual-core) CPUs
• 12,960 PowerXCell 8i processors
• Each blade contains 2 Opterons and 4 Cells
• 296 racks

• First-ever petaflop machine (2008)

• 1.71 petaflops peak (1.71 quadrillion floating-point operations per second)

• 2.35 MW (not including cooling)
– Lake Murray hydroelectric plant produces ~150 MW (peak)
– Lake Murray coal plant (McMeekin Station) produces ~300 MW (peak)
– Catawba Nuclear Station near Rock Hill produces 2,258 MW


“Traditional” Parallel/Multi-Processing

• Large-scale parallel platforms:
– Individual computers connected with a high-speed interconnect

• Upper bound on speedup is n, where n = number of processors
– How much parallelism is in the program?
– What are the system and network overheads?

Acknowledgement

Heterogeneous and Reconfigurable Computing Group
http://herc.cse.sc.edu

Zheming Jin, Tiffany Mintz, Krishna Nagar, Jason Bakos, Yan Zhang
