
Research Day

Univ of Houston

April 2, 2004

The Evolving High-Performance Computing Space and UH

Lennart Johnsson

Director, TLC2

Cullen Professor of Computer Science, Mathematics, and Electrical and Computer Engineering

"The first communication revolution of the 20th Century gave us telephone-based communications. The second gave us computer-based communications like email and the Internet. The 21st Century will bring us a knowledge-based communications revolution. We will be able to use intelligent network software to enhance and expand human knowledge."

- Kenan Sahin, Vice President, Bell Labs, 1999

Moore’s Law

$1 US Purchasing Power circa 2000*

1 candy bar = 1 million transistors

1 newspaper = 1 million transistors

1 cup of coffee = 1 million transistors

*Figures based on US Gov’t CPI

The Transistor

The transistor, 1947

First junction transistor, 1951

Integrated circuits

Jack Kilby, TI, 1958

Robert Noyce, Fairchild, 1959

The microprocessor, Hoff and Faggin, 1971

Intel 4004: 2300 transistors, 1.5 x 3 mm

The PC, 1976 - 1981

Apple I, 1976

$500

TRS-80, 1976

4 KB ROM

16 KB RAM

IBM 5150

Year: 1981

Chip: 4.77 MHz Intel 8088

Memory: 64K

Storage: 160K per 5.25-inch floppy

Price: $3,005

Parallel Computing, 1985 - 1995

TMC CM-5

IBM SP

Clusters

Fiberoptic Communication Milestones

First laser, 1960

First room temperature laser, ~1970

Continuous mode commercial lasers, ~1980

Tunable lasers, ~1990

Commercial fiberoptic WANs, 1985

10 Tbps/strand demonstrated in 2000 (10% of fiber peak capacity). (10 Tbps is enough bandwidth to transmit a million high-definition resolution movies simultaneously, or over 100 million phone calls).

WAN fiberoptic cables often have 384 strands of fiber and would have a capacity of 2 Pbps. Several such cables are typically deployed in the same conduit/right-of-way
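The capacity figures above can be sanity-checked with a few lines of arithmetic. This is only an illustrative sketch: the per-movie, per-call, and per-strand rates are derived from the slide's numbers, not stated on it, and all variable names are made up for the example.

```python
TBPS = 1e12  # bits per second in one terabit per second

strand_bps = 10 * TBPS      # 10 Tbps per strand, demonstrated in 2000 (slide)
movies = 1_000_000          # simultaneous high-definition movies (slide)
calls = 100_000_000         # simultaneous phone calls (slide)
strands_per_cable = 384     # strands in a typical WAN cable (slide)
cable_bps = 2e15            # 2 Pbps quoted cable capacity (slide)

# Bandwidth per stream implied by the slide's comparison.
bps_per_movie = strand_bps / movies
bps_per_call = strand_bps / calls

# Per-strand rate implied by the quoted cable capacity.
implied_strand_bps = cable_bps / strands_per_cable

print(f"per movie: {bps_per_movie / 1e6:.0f} Mbps")
print(f"per call: {bps_per_call / 1e3:.0f} kbps")
print(f"implied per-strand rate: {implied_strand_bps / TBPS:.1f} Tbps")
```

Read this way, the quoted 2 Pbps per cable corresponds to roughly 5 Tbps per strand, about half of the demonstrated 10 Tbps rate.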


Optical Communication costs

Larry Roberts, Caspian Networks

Fiber Optic Communication

In 2010 . . .

A million books can be sent across the Pacific for $1 in 8 seconds

All books in the American Research Libraries can be sent across the Pacific in about 1 hr for $500
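The two claims above are consistent with a single price and transfer rate, as the following sketch suggests. The book size of about 1 MB is an assumption introduced here for illustration; it does not appear on the slide, and the variable names are hypothetical.

```python
BOOK_BYTES = 1_000_000   # ASSUMPTION: about 1 MB of text per book (not from the slide)

books = 1_000_000        # books sent (slide)
seconds = 8              # transfer time (slide)
dollars = 1.0            # transfer cost (slide)

books_per_second = books / seconds
link_bps = books_per_second * BOOK_BYTES * 8   # implied link rate, bits per second
cost_per_hour = dollars * 3600 / seconds       # same price sustained for one hour
books_per_hour = books_per_second * 3600       # books moved in one hour

print(f"implied link rate: {link_bps / 1e12:.1f} Tbps")
print(f"cost for one hour: ${cost_per_hour:.0f}")
print(f"books moved in one hour: {books_per_hour / 1e6:.0f} million")
```

At $1 per 8 seconds, an hour of transfer costs $450, in line with the "about 1 hr for $500" figure.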

Grids, 1995 -

The I-Way @ SC95

UH, 1996 -

Storage Costs

[Figure: Average Price of Storage. Price per MByte in dollars (log scale, 0.001 to 1000) versus year (1980 to 2010) for HDD (3.5", 2.5", 1"), DRAM, Flash, and the range of paper/film, with DataQuest 2000 projections for 1" HDD and Flash. Source: Ed Grochowski at Almaden.]

In 2010, $1 will buy enough disk space to store

10,000 books

35 hrs of CD-quality audio

2 min of DVD-quality video

Optical Networks: the 21st Century Driver

[Figure: Performance per dollar spent versus number of years (0 to 5). Source: Scientific American, January 2001.]

Data Storage (bits per square inch): doubling time 12 months

Optical Fiber (bits per second): doubling time 9 months

Silicon Computer Chips (number of transistors): doubling time 18 months

1986 to 2000
– Computers: x 500
– Networks: x 340,000

2001 to 2010
– Computers: x 60
– Networks: x 4000
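The doubling times and growth factors quoted above are related by a simple exponential model. The sketch below assumes steady exponential growth, which is a simplification, and its function names are illustrative only.

```python
import math

def growth_factor(years, doubling_months):
    """Multiple gained after `years` if capacity doubles every `doubling_months`."""
    return 2 ** (years * 12 / doubling_months)

def implied_doubling_months(factor, years):
    """Doubling time, in months, implied by growing `factor`-fold over `years`."""
    return years * 12 / math.log2(factor)

# Doubling times quoted on the slide, applied over the chart's 5-year span.
for name, months in [("data storage", 12), ("optical fiber", 9), ("silicon chips", 18)]:
    print(f"{name}: x{growth_factor(5, months):.0f} in 5 years")

# Doubling times implied by the 1986 to 2000 factors (14 years).
print(f"computers: {implied_doubling_months(500, 14):.1f} months")
print(f"networks: {implied_doubling_months(340_000, 14):.1f} months")
```

Under this model, the 1986 to 2000 factors correspond to doubling times of roughly 19 months for computers and 9 months for networks, close to the 18 and 9 months quoted for chips and fiber.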

WAN fiberoptic cables often have 384 strands of fiber and would have a capacity of 2 Pbps. Several such cables are typically deployed in the same conduit/right-of-way

Dark Fiber: $0.01 – $4/strand-meter

"The next major revolution in business computing has already arrived, it's called Grid Computing, and will provide access within the palm of your hand to virtually every known electronic resource. The Grid Computing paradigm, the first 'killer app' to finally obsolesce the desktop PC as we know it today, will arrive through the ability to access a world wide network of computing power from virtually any device, including handheld devices and embedded computers. One of the factors that will drive the Grid computing revolution is the limited capacity of handheld devices to support applications."

- Delphi report "Global Grid: The Quiet Revolution," April 2002

"The Internet began as a platform [for] communicating, but going forward it will evolve into a platform for computing -- grid computing. Those vendors not aligned with that community will be on the wrong side of history."

- William Zeitler, IBM, Senior Vice President, LinuxWorld 2002

Grids

"We will perhaps see the spread of 'computer utilities', which, like present electric and telephone utilities, will service individual homes and offices across the country"

(Len Kleinrock, 1969) (An Internet Founder)


E-Science: Data Gathering, Analysis, Simulation, and Collaboration

LHC

CMS

Simulated Higgs Decay

21st Century Science and Engineering

• The three fold way
– theory
– experiment
– computational simulation

• Supported by
– multimodal collaboration systems
– distributed, multi-petabyte data archives
– leading edge computing systems
– distributed experimental facilities
– internationally distributed multidisciplinary teams

[Figure: Theory, Experiment, Simulation]

Courtesy Paul Messina

Driving Applications

Astronomy

Physics: CMS, Atlas, LHCb, ALICE

Weather

Life Sciences

Medicine: BP, blood glucose, heart rate, temp

Engineering

CCSM Development Projections

Current Community Climate System Model (CCSM) Component Resolution

Atmosphere                240 km, 26 levels
Land                      50 km
Ocean                     100 km, 40 levels
Sea Ice                   100 km
Model years/day           8
Tot. Compute Resources    3 TFlops
Storage (TB/century)      1

At current scientific complexity a century simulation requires 12.5 days.

A single researcher transfers 80 Gb/day and generates 30 TB of storage each year.
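The 12.5-day figure follows directly from the model throughput, and the transfer and storage figures can be compared the same way. The sketch below reads the slide's "Gb" as gigabytes, which is an assumption; the variable names are illustrative.

```python
model_years_per_day = 8      # CCSM throughput (slide)
simulated_years = 100        # one century

# Wall-clock days needed for a century simulation.
century_days = simulated_years / model_years_per_day

# ASSUMPTION: "80Gb/day" is read as 80 gigabytes per day (decimal units).
daily_transfer_gb = 80
yearly_transfer_tb = daily_transfer_gb * 365 / 1000

print(f"century simulation: {century_days} days")
print(f"yearly transfer: {yearly_transfer_tb:.1f} TB")
```

If the gigabyte reading is right, a year of daily transfers comes to about 29 TB, which lines up with the 30 TB of storage generated per researcher each year.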

[Figure: Machine and Data Requirements. Tflops and Tbytes (log scale, 1 to 10000) by model development stage: dyn veg, trop chemistry, biogeochem, carbon cycle, strat chem, eddy resolv, cloud resolv. Bar labels, in ascending order: 14.5, 31.9, 36.6, 70.3, 154, 1500, 4000 and 4.8, 10.6, 12.2, 23.4, 51.5, 510, 1500.]

Computational Molecular Biology

Membrane with possible uses for bioremediation

Molecular dynamics of a lipopolysaccharide (LPS)

Nanosecond molecular dynamics of the LPS membrane of Pseudomonas aeruginosa

QM/MM molecular dynamics of membrane plus mineral

Energy transduction across membranes is still out of reach at 1.0 PF

[Figure labels: .25 TF, 11 TF, 400 TF, >1.0 PF; TLC2 Cluster, National Centers; 2003/2004, 2007 - 2010]


Proteomics

http://www.csm.ornl.gov/ghpn/report4.pdf

Coming Floods of Data

Astronomy

• The planned Large Synoptic Survey Telescope will produce over 10 petabytes per year by 2008!
– All-sky survey every few days, so will have fine-grain time series for the first time


LHC Computing Grid (LCG)

LHC: 10 – 15 PB/yr

The coming floods of data ….

150 M searches/day 1.6 M downloads/day

10 TB data transfer/day 1-2 TB data transfer/day

100 servers 15000 servers

Archival Storage

• Data storage
– Estimated growth (Paul Messina)
  • 2000 ~0.5 petabyte
  • 2005 ~10 petabytes
  • 2010 ~100 petabytes
  • 2015 ~1000 petabytes?

• Astronomy (Paul Messina)
– Today, 100+ TB
  • Digital Palomar Observatory survey, 3 TB
  • 2 Micron All Sky Survey, 10 TB
  • MACHO, SDSS, GSC-II, COBE, MAP, NVSS, FIRST, GALEX, ROSAT, OGLE, ...: tens of TB each
– The planned Large Synoptic Survey Telescope will produce over 10 PB per year by 2008!

• Physics
– LHC: 5 – 15 PB/yr

• "Earth Science"
– Earth Resources Observation System, 220 TB
– USGS Emergency response GIS application, 0.5 TB/urban area

• Microscopy
– Electron microscopy, 1 – 10 TB/yr/group @ 2k x 2k resolution; 8k x 8k soon
– 2-Photon confocal laser microscope, ~150 Mpixel images

• Medicine
– Mammography, 10 PB/yr in US (300 TB/yr Sweden?)
– BIRN, 10 TB today, 400 TB
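The estimated archive sizes for 2000 through 2015 imply a compound annual growth rate, which the following sketch works out. It assumes smooth exponential growth between the listed years; the function name is illustrative.

```python
def annual_growth(start_pb, end_pb, years):
    """Compound annual growth multiple between two archive-size estimates."""
    return (end_pb / start_pb) ** (1 / years)

# Estimated archive sizes in petabytes, by year (from the slide).
estimates_pb = {2000: 0.5, 2005: 10, 2010: 100, 2015: 1000}

checkpoints = sorted(estimates_pb)
for start, end in zip(checkpoints, checkpoints[1:]):
    rate = annual_growth(estimates_pb[start], estimates_pb[end], end - start)
    print(f"{start} to {end}: x{rate:.2f} per year")
```

A tenfold increase every five years works out to growth of a little under 60% per year.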

Estimated Bandwidth Needs

Discipline                   Size of Archives   Growth Rate of Archives   Bandwidth Needs

Life Sciences
  Mammography                50 – 100 PB        25 – 60 PB/yr             ~10 Gbps
  Microscopy                 100+ PB            50 – 100 PB/yr            1 – 10 Gbps
  Other imaging              200+ PB            100+ PB/yr                10 – 100 Gbps
  Major medical center       100 – 1,000 PB                               ~100 Gbps

Earth Sciences
  Weather                                                                 1 – 10 Gbps
  Climate                    100+ PB            50 – 100 PB/yr            10 – 100 Gbps
  Environment                                                             ~10 Gbps

High-Energy Physics          100+ PB            20 – 50 PB/yr             10 – 100 Gbps

Astronomy                    100+ PB            20 – 50 PB/yr             10 – 100 Gbps

Telemedicine, Telescience                                                 1 – 10 Gbps

Collaboration                                                             1 Gbps

Remote steering                                                           10 Gbps

Remote Visualization                                                      10 Gbps

Computation                                                               100 – 1,000 Gbps
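One way to relate the archive growth rates to the bandwidth figures is to convert petabytes per year into a sustained bit rate. The sketch below assumes decimal petabytes and a perfectly even transfer over the whole year, so it gives a lower bound rather than a peak requirement; the function name is illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

def sustained_gbps(petabytes_per_year):
    """Average rate in Gbps needed to move the given volume evenly over one year."""
    bits_per_year = petabytes_per_year * 1e15 * 8   # decimal petabytes to bits
    return bits_per_year / SECONDS_PER_YEAR / 1e9

# Growth rates taken from the table (PB/yr).
for label, pb_per_year in [("mammography, low", 25), ("mammography, high", 60),
                           ("climate, low", 50), ("climate, high", 100)]:
    print(f"{label}: {sustained_gbps(pb_per_year):.1f} Gbps")
```

For mammography, 25 to 60 PB/yr averages out to roughly 6 to 15 Gbps, consistent with the table's ~10 Gbps.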

• What is UH doing to
– contribute to the technology (r)evolution,
– and benefit from it?

TLC2 – PGH 200

48 seats with
- 15” flat panel displays
- 1.4 GHz AMD Athlon PCs
- individual teacher – student interaction facilities

AV for video conferencing

Dual projection capability

TLC2 – PGH 232

126 seats with……..

Video conference facilities

Stereographic projection capabilities

TLC2 Collaboration Technologies

ETF Management Meeting

Seminar

Seminar

SC Global Workshop

Performance Art

2003

Access Grid


TLC2 – Visualization Laboratory


TLC2 40 TB Distributed Storage Facility

2002/2003

TLC2 1.6 TFlops Linux Cluster

Second most powerful computer in a Texas educational institution

Ranked 102 among the world’s 500 most powerful computers

Acquired as part of an Intel/HP/UH partnership


In the Works – 10 GigE Houston Research and Education Network Infrastructure


High Performance Computing Across Texas (HiPCAT) — http://www.hipcat.net


A Science Based Case for Large-Scale Simulation: DoE Office of Science, Vol I, June 2003.

• Computer science
• Managing exponential complexities

GrADSoft Architecture

[Diagram: Program Preparation System and Execution Environment. Components shown: Source Application, Software Components, Libraries, Whole-Program Compiler, Configurable Object Program, Binder, Scheduler, Resource Negotiator, Grid Runtime System, Real-time Performance Monitor. Flows shown: Negotiation, Performance Problem, Performance Feedback.]

ScaLAPACK Across 3 Clusters

[Figure: Time (seconds, 0 to 3500) versus matrix size (0 to 20000) on the OPUS, TORC, and CYPHER clusters. Point labels: 5 OPUS; 8 OPUS; 8 OPUS; 8 OPUS, 6 CYPHER; 8 OPUS, 2 TORC, 6 CYPHER; 6 OPUS, 5 CYPHER; 2 OPUS, 4 TORC, 6 CYPHER; 8 OPUS, 4 TORC, 4 CYPHER. Region labels: OPUS; OPUS, CYPHER; OPUS, TORC, CYPHER.]

PDSYEVX – Timing Breakdown

[Figure: PDSYEVX on torcs and cyphers. Stacked time (s, 0 to 6000) by matrix size and processor count (N-nproc): 1000-1, 2000-1, 3000-1, 4000-2, 5000-4, 7000-5, 10000-10. Components: tridiagonal reduction, compute eigenvalues, compute eigenvectors, back transformation, pdsyevx_driver_overhead, mds, nws, perf_modeling_time, other_grid_overhead.]

Uses torcs only

Uses 5 torcs and 5 cyphers


Cactus – Migration

UHFFT Architecture

UHFFT Library
- Library of FFT Modules
- Initialization Routines
- Execution Routines
- Utilities

FFT Code Generator
- Initializer (Algorithm Abstraction)
- Optimizer
- Scheduler
- Unparser

Algorithms
- Mixed-Radix (Cooley-Tukey)
- Prime Factor Algorithm
- Split-Radix Algorithm
- Rader's Algorithm

Key: Fixed library code, Generated code, Code generator

Funded in part by the Alliance (NSF) and LACSI (DoE)

Performance Tuning Methodology

Installation
- Input parameters: system specifics, user options
- UHFFT code generator
- Library of FFT modules
- Performance database

Run-time
- Input parameters: size, dim., …
- Initialization: select best plan
- Execution: calculate one or more FFTs
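The run-time half of this methodology (pick a plan once at initialization, then reuse it for execution) can be illustrated with a small sketch. This is not UHFFT code: it stands in a plain-Python mixed-radix Cooley-Tukey FFT for the generated FFT modules, treats each ordering of the prime factors as a candidate "plan", and uses a dictionary as a toy performance database. All function and variable names are hypothetical.

```python
import cmath
import itertools
import time

def dft(x):
    """Direct O(n^2) DFT, used as the base case and as a reference."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def fft_plan(x, plan):
    """Mixed-radix Cooley-Tukey FFT; `plan` lists radices whose product is len(x)."""
    n = len(x)
    if len(plan) <= 1:
        return dft(x)
    r, m = plan[0], n // plan[0]
    # Transform the r interleaved sub-sequences of length m (decimation in time).
    subs = [fft_plan(x[q::r], plan[1:]) for q in range(r)]
    # Combine the sub-transforms with twiddle factors.
    return [sum(subs[q][k % m] * cmath.exp(-2j * cmath.pi * q * k / n)
                for q in range(r))
            for k in range(n)]

def prime_factors(n):
    """Prime factors of n in ascending order, with multiplicity."""
    factors, p = [], 2
    while n > 1:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    return factors

def select_best_plan(n, perf_db):
    """Initialization: time every factor ordering once and cache the fastest."""
    if n not in perf_db:
        sample = [complex(i % 7, 0) for i in range(n)]
        timings = {}
        for plan in set(itertools.permutations(prime_factors(n))):
            start = time.perf_counter()
            fft_plan(sample, plan)
            timings[plan] = time.perf_counter() - start
        perf_db[n] = min(timings, key=timings.get)
    return perf_db[n]

# Run-time use: choose the plan once, then execute one or more FFTs with it.
perf_db = {}
n = 60
plan = select_best_plan(n, perf_db)
signal = [complex(i, -i) for i in range(n)]
result = fft_plan(signal, plan)
print("selected plan:", plan)
```

Timing a single run per plan is noisy, so a real tuner would repeat measurements; the point here is only the split between a one-time selection step and repeated execution.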


UHFFT Codelet Performance


UHFFT Codelet Performance


VGrADS