CHEP 2012 Computing in High Energy and Nuclear Physics

CHEP 2012Computing in High Energy and Nuclear Physics

Forrest Norrod

Vice President and General Manager, Servers

Advancing human progress through basic knowledgeadvances technological innovations

Or whether, in an urge to provide better communication, one might have found electromagnetic waves.

They weren't found that way.

One might ask … whether basic circuits in computers might have been found by people who wanted to build computers.

As it happens, they were discovered in the thirties by physicists

dealing with the counting of nuclear particles because they were interested in nuclear physics. 1

They were found by Hertz who emphasized the beauty of physics

and who based his work on the theoretical considerations of Maxwell. 1

1: C.H. Llewellyn Smith, former Director-General of CERN

Physicists’ needs are outpacing Engineers delivery2007 goal was to achieve Exascale by 2015So far … off by 100x

Major challenges Power Reliability/

Resiliency Concurrency &

locality (drives efficiency)

COSTSource: www.top500.org

Performance Trajectories

Source: ExaScale Computing Study: Technology Challenges in Achieving Exascale Systems

What Levers do We Have?• Challenge: Sustain performance trajectory without massive

increases in cost, power, real estate, and unreliability

• Solutions: No single answer, must intelligently turn “Architectural Knobs” = ( Freq ) × ( ) × ( # of Sockets ) × ( × ( Efficiency )

Hardware Performance What you can really get

Architecture Knob #1: frequency

Frequency stagnant for a decade

Unlikely to change anytime soon

• Thermal density challenges

• Macro power & cooling concerns

• No longer a main focus of IC industry

Moore’s law is fine …

–130nm -> 22nm LOTS of Transistors

Cores/socket growth in standard CPUs tapering off …

Physical “uncore” problem: very costly to feed the powerful beast

Mismatch with DRAM speeds

Huge growth in CPU skt size, pins, memory (500 to 2000 pins in last 10 years, 3500+ pin beasts on horizon)

Architecture Knob #2: cores / socket

Architecture knob #3: # of sockets

Heaviest recent growth Despite accelerators“Easiest” knob to turn

But, Massive implicationsNetwork performanceResiliency & StabilityDensityPower/cooling costs

Architecture knob #5: efficiency

Large compute networks generally = lower system efficiency

Particularly as more CPUs & infrastructure are added

Efficiency driven by Locality, storage, network, CPU,…..and code quality

Typical values range: 0.3 – 0.9

More Sockets

Architecture knob #4: Instructions per core/clock

IPC & FLOPS growth continues

uArch , replication of pipeline subunits, # of FPUs, FMA , BLAS libraries, others

ISA, AVX, SSEx, FPU Vector Width

Direct FPGA-implementation of algorithms

Challenges: continual modification of software in order to take advantage of new instructions, new features, etc.

So what can we do to make this possible and affordable?

Driving IPC

Tomorrow Accelerator hybrids FPGA, SOC …

Today Hybrid of CPU:GPU

2008: NVIDIA GPU2012 / 13: AMD APU & INTEL MIC

FPGAs: performance and perf/watt advantages

•> 4k DSP cores in 28nmSystem-on-Chip implementations

OpenCL: ease programming modelEnablement on top of CPUs, GPUS, FPGAs - hiding the complexity of silicon designCoupled with FPGAs - could change computational landscape

And Open programming

Scaling Sockets

Disruptive integrated designs

Innovative packagingShared Infrastructure evolving

Hybrid nodesConcurrent serviceabilityHighest efficiency for power and coolingExtending design to facility

Modularized compute/storage/facility optimization

2000 nodes, 30 PB storage, 600 kW in 22 m2

Ultimate in power efficiency, PUE as low as 1.02

TomorrowNew CPU Core Contender ARM

ARM: potential to disrupt perf/$$, perf/watt model

Standard ISA and Open ecosystemARM+Native acceleration SOCs possible

Efficient Coordination

Reinventing Memory Hierarchy

Redefining NetworkingFlatten & UnLock Network Topologies

• Software Defined (OpenFlow)• Flat L2 networks • East-West optimization• 10G -> 40G -> 100G

Memory Access a principal driver of “stalls”• CPU/DRAM stacking• Flash as a new tier (1/2 way in between DRAM and Disk)•Memory virtualization coming soon…

Accelerating collaborative usage and models

• Transparent Infrastructure stacks• Leverage “Cloud computing”

investments• Cloud usage for low-duty-cycle usage• Openflow, OpenStack, Hadoop,…

open collaboration

CPU regsDRAM

DiskFlash

We’ll get there….but not in 2015

CHEP 2012 Computing in High Energy and Nuclear Physics

Documents

Transcript of CHEP 2012 Computing in High Energy and Nuclear Physics

The Evolution of Cloud Computing in ATLASheprcdocs.phys.uvic.ca/presentations/chep-taylor-2015.pdfCHEP 2015 Evolution of Cloud Computing in ATLAS 10 T2 & Remote Cloud Performance Comparison

2015 CHEP Hypertension Recommendations

CHEP Capabilities Brochure

Annual Report 2020 - CHEP

: adding and removing zones from Federal University of Rio de Janeiro CHEP’09 International Conference on Computing in High Energy and Nuclear Physics.

CHEP – Mumbai, February 2006 State of Readiness of LHC Computing Infrastructure Jamie Shiers, CERN.

HEP@Home A Distributed Computing System Based on BOINC September - CHEP 2004 Pedro Andrade António Amorim Jaime Villate.

CHEP Cares

2012 CHEP Recommendations for CHEP Management of ...

Chep Fd Food

Introduction - CHEP

CHEP’04 Opening

HEP GRID Workshop @ CHEP, KNU 11/9/2002 Youngjoon Kwon (Yonsei Univ.) 1 Belle Computing / Data Handling What is Belle and why we need large-scale computing?

CHEP Members (Geographical) May 2018 First Name … · CHEP Members (Geographical) May 2018 First Name Last Name City State Certification Cert # Dana Kamal Abu Dhabi Abu Dhabi CHEP

ASU‐CHEP STUDENT GUIDEmct.asu.edu.eg/uploads/1/4/0/8/14081679/asu-chep-studentguide-2014-2015.pdf · ASU-CHEP seeks to offer high quality, student-centered learning environment

Steve Pawlowski Intel Senior Fellow GM, Architecture and Planning CTO, Digital Enterprise Group Intel Corporation CHEP ‘09 March 24, 2009 More Computing.

Presentation at CHEP 2006

17-Oct-2013 1 CHEP conference, Amsterdam Oct 2013 Next generation database relational solutions for ATLAS Distributed Computing Gancho Dimitrov (CERN)

Nuclear Force General principles Examples: quantitative nuclear theory; predictive capability Realities of high- performance computing Nuclear structure.

CHEP test folder