Performance Analysis of Computer Systems...– Architecture and performance analysis of High...


Page 1: Performance Analysis of Computer Systems...– Architecture and performance analysis of High Performance Computers – Programming methods and techniques for HPC systems – Software

Holger Brunst

Matthias S. Mueller

Center for Information Services and High Performance Computing (ZIH)

Performance Analysis of Computer Systems

Introduction

Page 2:

Organization

Lecture: Every Wednesday in INF E001 from 13:00 to 14:30

Labs: Every Thursday in INF E046 from 13:00 to 14:30

First Exercise: October 21st, guided tour through all machine rooms at ZIH

– Meeting point: Trefftz-Bau, below the overbridge

All slides will be in English

Ten-minute summary of the last lecture at the beginning of each lecture

List of attendees

Slide 2 LARS: Introduction and Motivation

Page 3:

Class Material on the Web

Slides will be put on the web prior to or shortly after each class

The slides from last year are still online.

– http://tu-dresden.de/die_tu_dresden/zentrale_einrichtungen/zih/lehre/ws0910/lars

Be aware of updates for this term.

– http://tu-dresden.de/die_tu_dresden/zentrale_einrichtungen/zih/lehre/ws1011/lars

Page 4:

Class Outline

15 lectures with 14 corresponding exercises

Class structure

– Introduction and motivation

– Performance requirements, metrics, and common evaluation mistakes

– Workload types, selection, and characterization

– Commonly used benchmarks

– Monitoring techniques

– Capacity planning for future systems

– Performance data presentation

– Summarizing measured data

– Regression models

– Experimental design

– Performance simulation and prediction

– Introduction to queuing theory

Page 5:

Literature

Raj Jain: The Art of Computer Systems Performance Analysis, John Wiley & Sons, Inc., 1991 (ISBN: 0-471-50336-3)

Rainer Klar, Peter Dauphin, Franz Hartleb, Richard Hofmann, Bernd Mohr, Andreas Quick, Markus Siegle: Messung und Modellierung paralleler und verteilter Rechensysteme, B.G. Teubner Verlag, Stuttgart, 1995 (ISBN: 3-519-02144-7)

Dongarra, Gentzsch (Eds.): Computer Benchmarks, Advances in Parallel Computing 8, North Holland, 1993 (ISBN: 0-444-81518-X)

Page 6:

Introduction and Motivation

Why is Performance Analysis Important?

Page 7:

Overview

Development of hardware performance

Implications on application performance

Compute power at Technische Universität Dresden

Research at ZIH

Some advertising

Page 8:

Moore’s Law: 2X Transistors / “year”

“Cramming More Components onto Integrated Circuits”

Gordon Moore, Electronics, 1965

Number of transistors per cost-effective integrated circuit doubles every N months (18 ≤ N ≤ 24)
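To get a feeling for what the doubling period means, the growth factor over a given time span can be computed directly. The following is a minimal sketch; the function name `growth_factor` is chosen here for illustration and is not part of the lecture material.

```python
def growth_factor(years: float, doubling_months: float) -> float:
    """Factor by which the transistor count grows after `years`
    if it doubles every `doubling_months` months."""
    return 2.0 ** (years * 12.0 / doubling_months)


if __name__ == "__main__":
    # Compare both ends of the 18 to 24 month range over one decade.
    for n in (18, 24):
        print(f"N = {n} months: factor {growth_factor(10, n):.1f} per decade")
```

The two ends of the range differ considerably over a decade: a 24-month period gives a factor of 32, while an 18-month period gives a factor of roughly 100.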

Page 9:

Performance Development in TOP500

Page 10:

John Shalf (NERSC, LBNL)

Page 11:

Number of Cores per System is Increasing Rapidly

[Figure: Total number of cores (processors) in the Top15 systems, June 1993 to December 2008; vertical axis from 0 to 1,200,000]

Page 12:

Number of Cores per System is Increasing Rapidly

Page 13:

Cray XT5 (Jaguar) at Oak Ridge National Laboratory

Page 14:

Dawning Nebulae at NSCS

Number two in TOP 500 (June 2010)

Installed at National Supercomputing Centre in Shenzhen (China)

Specification not published

Hybrid architecture

Presumably 4640 nodes, each with

– Two Intel Xeon X5650 processors (10.64 GFLOPS per core)

– One Nvidia C2050 GPU

Total number of cores

– 4640 nodes * (12 processor cores + 14 shader clusters) = 120,640 cores
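The core count above can be reproduced with a few lines of arithmetic. This is a sketch only; the constant and function names are illustrative, and the node count is the presumed value from the slide, not a published specification.

```python
NODES = 4640                   # presumed node count (specification not published)
CPU_CORES_PER_NODE = 2 * 6     # two six-core Xeon X5650 processors per node
SHADER_CLUSTERS_PER_NODE = 14  # one Nvidia C2050 GPU per node


def total_cores(nodes: int, cpu_cores: int, gpu_units: int) -> int:
    """Count CPU cores and GPU shader clusters together, as the slide does."""
    return nodes * (cpu_cores + gpu_units)


if __name__ == "__main__":
    print(total_cores(NODES, CPU_CORES_PER_NODE, SHADER_CLUSTERS_PER_NODE))
```

Note that this counts a GPU shader cluster as one "core", which is a convention rather than a like-for-like comparison with a CPU core.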

Page 15:

IBM Roadrunner at Los Alamos National Laboratory

First computer to surpass the 1 petaflop/s (10^15 FLOPS, roughly 2^50) barrier

Installed at Los Alamos National Laboratory

Hybrid Architecture

13,824 AMD Opteron cores

116,640 IBM PowerXCell 8i cores

Cost: $120 million

Page 16:

IBM BlueGene/P (JUGENE) at Research Centre Jülich

Number five in TOP 500

Installed at Forschungszentrum Jülich

72 racks, each with 32 node cards x 32 compute cards (73,728 compute cards in total)

294,912 PowerPC 450 cores at 850 MHz

144 TB main memory
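The JUGENE figures are consistent with each other, which a short calculation shows. In this sketch the variable names are illustrative, and 1 TB is assumed to mean 1024 GB; the number of cores per compute card is derived from the slide's totals rather than stated on it.

```python
RACKS = 72
NODE_CARDS_PER_RACK = 32
COMPUTE_CARDS_PER_NODE_CARD = 32
CORES = 294_912
MEMORY_TB = 144

# Total compute cards in the machine.
compute_cards = RACKS * NODE_CARDS_PER_RACK * COMPUTE_CARDS_PER_NODE_CARD

# Cores per compute card, derived from the two totals.
cores_per_card = CORES // compute_cards

# Main memory per core in GB, assuming 1 TB = 1024 GB.
memory_per_core_gb = MEMORY_TB * 1024 / CORES

if __name__ == "__main__":
    print(compute_cards, cores_per_card, memory_per_core_gb)
```

Under these assumptions each core has only half a gigabyte of main memory, which is one reason why applications need to scale well to make use of such a system.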

Page 17:

What Kind of Know-How is Required for HPC?

Algorithms and methods

Performance Analysis

Programming (Paradigms and details of implementations)

Operation of supercomputers (network, infrastructure, service, support)

Page 18:

Challenges

Languages

– Fortran95, C/C++, Java

– Also scripting languages!

Parallelization:

– MPI, OpenMP

Network

– Ethernet, Infiniband, Myrinet, …

Scheduling

– Distributed components, job scheduling, process scheduling

System architecture

– Processors, memory hierarchy

Page 19:

Application Performance

Page 20:

From Modeling to Execution

Page 21:

Short History of X86 CPUs

CPU          Year  Bit width  Transistors  Clock        Feature size  L1 / L2 / L3 cache
4004         1971  4          2,300        740 kHz      10 µm
8008         1972  8          3,500        500 kHz      10 µm
8086         1978  16         29,000       10 MHz       3 µm
80286        1982  16         134,000      25 MHz       1.5 µm
80386        1985  32         275,000      33 MHz       1 µm
80486        1989  32         1,200,000    50 MHz       0.8 µm        8K
Pentium I    1994  32         3,100,000    66 MHz       0.8 µm        8K
Pentium II   1997  32         7,500,000    300 MHz      0.35 µm       16K/512K*
Pentium III  1999  32         9,500,000    600 MHz      0.25 µm       16K/512K*
Pentium IV   2000  32         42,000,000   1.5 GHz      0.18 µm       8K/256K
P IV F       2005  64                      2.8-3.8 GHz  90 nm         16K/2MB
Core i7      2008  64         781,000,000  3.2 GHz      45 nm         32K/256K/8MB

Page 22:

Intel Nehalem

Released 2008

4 cores

781,000,000 transistors

45 nm technology

32 KB L1 data cache, 32 KB L1 instruction cache

256 KB L2 cache

8 MB shared L3 cache

Hyper-Threading

3.2 GHz * 4 cores * 4 FLOPS/cycle = 51.2 GFLOP/s peak

Integrated memory controller

QPI between processors
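The peak performance quoted above is simply the product of clock rate, core count, and floating point operations per cycle. A minimal sketch of that formula follows; the function name `peak_gflops` is illustrative.

```python
def peak_gflops(clock_ghz: float, cores: int, flops_per_cycle: int) -> float:
    """Theoretical peak in GFLOP/s: clock rate x cores x FLOPS per cycle."""
    return clock_ghz * cores * flops_per_cycle


if __name__ == "__main__":
    # Values from the slide: 3.2 GHz, 4 cores, 4 FLOPS per cycle.
    print(f"{peak_gflops(3.2, 4, 4):.1f} GFLOP/s")
```

This is an upper bound only. It assumes every core issues the maximum number of floating point operations in every cycle, which real applications rarely achieve.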

Page 23:

Nehalem Core

[Figure: Nehalem core layout with labeled blocks: Execution Units; Out-of-Order Scheduling & Retirement; L2 Cache & Interrupt Servicing; Instruction Fetch & L1 Cache; Branch Prediction; Instruction Decode & Microcode; Paging; L1 Data Cache; Memory Ordering & Execution]

Page 24:

Potential factors limiting performance

“Peak performance”

Floating point units

Integer units

… any other feature of the microarchitecture

Bandwidth (L1,L2,L3, main memory, other cores, other nodes)

Latency (L1,L2,L3, main memory, other cores, other nodes)

Page 25:

Performance development in TOP500

Page 26:

Does the rest of the system develop at CPU speed?

[Figure: Processor-DRAM memory gap (latency); performance on a logarithmic scale from 1 to 1000 over time, 1980 to 2000]

– CPU (“Moore’s Law”): μProc 60%/yr. (2X/1.5 yr)

– DRAM: 9%/yr. (2X/10 yrs)

– Processor-memory performance gap: grows 50% / year
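The gap figure follows from the two growth rates: if processor performance grows by 60% per year and DRAM performance by 9% per year, their ratio grows by a factor of 1.60 / 1.09 each year. The sketch below works this out; the names are illustrative, and the rates are taken from the slide.

```python
CPU_RATE = 0.60   # processor performance growth per year
DRAM_RATE = 0.09  # DRAM performance growth per year


def relative_gap(years: float) -> float:
    """Ratio of processor to DRAM performance after `years`,
    normalized to 1.0 at the starting year."""
    return ((1.0 + CPU_RATE) / (1.0 + DRAM_RATE)) ** years


if __name__ == "__main__":
    print(f"gap growth per year: {(relative_gap(1) - 1.0) * 100:.0f}%")
    print(f"gap after 20 years:  factor {relative_gap(20):.0f}")
```

The yearly growth of the gap comes out slightly below 50%, which the slide rounds up. Compounded over the two decades shown in the figure, the ratio exceeds three orders of magnitude.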

Page 27:

Performance Trends measured by SPECint

Source: Hennessy, Patterson: “Computer Architecture: A Quantitative Approach”.

Page 28:

CPUint2006 development 2005 - 2009

Page 29:

Performance Trends measured by SPECint

[Figure annotations: 2009; 23%]

Page 30:

CPUfp2006 development 1991 - 2009

CPU 95

Released 1995

602 results between 3/1991 and 1/2001

CPUfp2000

Released 2000

1385 results between 10/1996 and 2/2007

CPUfp2006

Released 2006

1217 results between 4/1997 and 4/2009

[Figure annotations: 42%, 33%, 30%]

Page 31:

Performance Trends over a 20-year life cycle

Page 32:

Performance Trends over a 20-year life cycle

Where is your application?

Page 33:

Center for Information Services and HPC

A short introduction

Page 34:

HPC in Germany

Page 35:

Responsibilities of ZIH

Providing infrastructure and qualified service for TU Dresden and Saxony

Research topics

– Architecture and performance analysis of High Performance Computers

– Programming methods and techniques for HPC systems

– Software tools to support programming and optimization

– Modeling algorithms of biological processes

– Mathematical models, algorithms, and efficient implementations

Role of mediator between vendors, developers, and users

Adoption and preparation of new concepts, methods, and techniques

Teaching and Education

Page 36:

Compute Server Infrastructure

[Figure: diagram labels: HPC component (main memory: 6.5 TB); PC farm; HPC-SAN (disk capacity: 68 TB); PC-SAN (disk capacity: 68 TB); petabyte tape archive (capacity: 1 PB); link bandwidths: 8 GB/s, 4 GB/s, 4 GB/s, 1.8 GB/s]

HPC component

– SGI® Altix® 4700

– 2048 Montecito cores

– 6.5 TByte main memory

PC farm

– System from Linux Networx

– AMD Opteron CPUs (dual core, 2.6 GHz)

– 728 boards with 2592 cores

– Infiniband networks between the nodes

Page 37:

HPC-System: SGI Altix 4700 (Mars)

32 x 42U racks

1024 sockets with Itanium2 Montecito dual-core CPUs (1.6 GHz, 9 MB L3 cache)

13 TFlop/s peak performance

11.9 TFlop/s Linpack

6.5 TB shared memory
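The peak figure and the Linpack result can be related to each other as an efficiency. The following sketch assumes 4 floating point operations per cycle and core, which is not stated on the slide but reproduces the quoted peak of about 13 TFlop/s; all names are illustrative.

```python
SOCKETS = 1024
CORES_PER_SOCKET = 2   # dual-core Montecito CPUs
CLOCK_GHZ = 1.6
FLOPS_PER_CYCLE = 4    # assumption, not stated on the slide
LINPACK_TFLOPS = 11.9  # measured value from the slide

# Theoretical peak in TFlop/s.
peak_tflops = SOCKETS * CORES_PER_SOCKET * CLOCK_GHZ * FLOPS_PER_CYCLE / 1000.0

# Fraction of the theoretical peak reached by the Linpack benchmark.
efficiency = LINPACK_TFLOPS / peak_tflops

if __name__ == "__main__":
    print(f"peak: {peak_tflops:.1f} TFlop/s, Linpack efficiency: {efficiency:.0%}")
```

Linpack is a dense linear algebra benchmark and typically comes close to peak. Applications limited by memory bandwidth or latency usually reach a much smaller fraction.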

Page 38:

Linux Networx PC-Farm (Deimos)

– 26 water-cooled racks (Knürr)

– 1296 AMD Opteron x85 dual-core CPUs (2.6 GHz)

– 728 compute nodes with 2 (384), 4 (232), or 8 (112) cores

– 2 master and 11 Lustre servers

– 2 GB memory per core

– 68 TB SAN disk (RAID 6)

– Local scratch disks (70, 150, 290 GB)

– 2 4x Infiniband fabrics (MPI + I/O)

– OS: SuSE SLES 10

– Batch system: LSF

– Compilers: Pathscale, PGI, Intel, GNU

– ISV codes: Ansys100, CFX, Fluent, Gaussian, LS-DYNA, Matlab, MSC
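The node mix listed for Deimos can be cross-checked against the totals for nodes, cores, and CPUs. This is a small sketch with illustrative names; the total memory is derived from the 2 GB per core figure and is not stated on the slide.

```python
# Cores per node -> number of nodes of that type, as listed on the slide.
NODE_MIX = {2: 384, 4: 232, 8: 112}
MEMORY_PER_CORE_GB = 2

nodes = sum(NODE_MIX.values())
cores = sum(cores_per_node * count for cores_per_node, count in NODE_MIX.items())
dual_core_cpus = cores // 2
total_memory_gb = cores * MEMORY_PER_CORE_GB  # derived, not from the slide

if __name__ == "__main__":
    print(nodes, cores, dual_core_cpus, total_memory_gb)
```

The result matches the 728 compute nodes, 2592 cores, and 1296 dual-core CPUs quoted for the system.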

Page 39:

Computer Rooms – Extension to the Building

Page 40:

Performance of Supercomputers at ZIH

Page 41:

Research at ZIH

Selected Projects and Activities

Page 42:

Research Areas at ZIH

Software tools to support programming and optimization

Programming methods and techniques for high-performance computers

Grid computing

Mathematical methods, algorithms, and efficient implementations

Architecture and performance analysis of high-performance computers

Algorithms and methods for modeling biological processes

Page 43:

Software Tools …

Vampir

– Visualization and analysis of parallel applications

Marmot

– Detection of incorrect use of the MPI communication library

ParBench

– Analysis of multiprogramming properties

BenchIT

– Execution, archiving, and presentation of benchmarks and their results

Screenshots: Marmot for Windows

Page 44:

Vampir: Framework

Page 45:

Vampir: Timelines

Page 46:

Vampir: Summaries

Page 47:

BenchIT

BenchIT measurement core

Command line interface

GUI

Website

Page 48:

Cluster Challenge 2008

Challenge:

– 6 students

– 44 hours

– 1 (self-assembled) cluster with a maximum power consumption of 3.1 kW

– 5 scientific applications

Goal:

– Maximum job throughput within the competition time

Participants:

Purdue University with SiCortex, University of Alberta with SGI, TUD/IU with IBM & Myricom, Taiwan with HP, Arizona State with Cray/MS, Colorado with Aspen Systems, MIT with Dell

Page 49:

Cluster Challenge 2008

Page 50:

Cluster Challenge 2008

Hardware optimizations

– 10G Myrinet interconnect (~120 W for switch + host adapter)

– Optimal DIMM configuration for the applications (16 GB per node)

– Booting from USB sticks and using the local disks only when necessary

– Determining the power consumption profiles of the applications in order to choose the “right” total number of nodes

Software optimizations

– Use of commercial compilers where it made sense (significant effort)

– Tracing the applications to understand and optimize communication

Throughput optimizations

– Using the power consumption and runtime estimates to achieve optimal utilization of the cluster

Result: 1st place
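Choosing the node count under a fixed power limit comes down to a simple budget calculation. The sketch below uses the 3.1 kW limit and the roughly 120 W for the interconnect from the slides; the per-node power draw of 250 W is a purely hypothetical value for illustration, as is the function name `max_nodes`.

```python
import math


def max_nodes(budget_w: float, interconnect_w: float, node_w: float) -> int:
    """Largest number of nodes that fits into the power budget
    once the interconnect's share has been subtracted."""
    if node_w <= 0:
        raise ValueError("node_w must be positive")
    return math.floor((budget_w - interconnect_w) / node_w)


if __name__ == "__main__":
    # 250 W per node under load is a hypothetical figure, not from the slides.
    print(max_nodes(budget_w=3100, interconnect_w=120, node_w=250))
```

Since the power draw of a node depends on the application it runs, the same calculation gives a different answer per application. That is why measured power profiles matter for this kind of competition.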

Page 51:

Cluster Challenge 2008

Page 52:

ZIH as an Employer

Page 53:

Infrastructure

High-performance computers:

Workplaces:

Page 54:

International Collaboration

Tracing

ParMA

VI HPS

Open MPI

Page 55:

Future Prospects

Parallel computing is becoming increasingly important in the many-core era

Contacts with international partners

Industry contacts: IBM, SUN, Cray, SGI, NEC, Intel, AMD, …

Possible stays abroad or industry internships

– Examples of stays abroad:

• LLNL, CA, U.S.A.

• BSC, Barcelona, Spain

• Eugene, OR, U.S.A.

– Examples of internships:

• Cray

• IBM

Page 56:

Diploma Theses at ZIH

Page 57:

Evaluation of the GCC Plug-In Interface

Topic: Evaluation of the new GCC plug-in interface with regard to the instrumentation of HPC programs

Research questions:

– What new features and advantages does the plug-in mechanism offer?

– How can GCC plug-ins be used to instrument HPC programs?

– Is efficient filtering at runtime possible?

– Comparison with conventional instrumentation

Supervisor: Bert Wesarg

Page 58:

Program Trace Analysis with Signal Processing

Topic: Evaluation of analysis methods from signal processing with regard to program traces

Research questions:

– How can program traces be mapped onto signals in a meaningful way?

– To what extent are signal processing methods (sampling, wavelet transformation, correlation) suitable for processing performance data from program traces more efficiently?

– Is automatic pattern recognition and data grouping possible?

Supervisor: Matthias Weber

Page 59:

Performance Analysis for SpeedStep Architectures

Topic: “Improving performance analysis for multicore architectures and systems with SpeedStep capabilities”

Research questions:

– Investigating the options under Linux for determining the CPU core that executes a process

– Integrating this information into program traces

– Finding a portable and non-intrusive way to record clock frequency changes of CPU cores

– Based on this, normalizing time intervals in program traces

Supervisor: Jens Doleschal

Page 60:

Performance Analysis and Software Development

Topic: Performance analysis as an integral part of software development

Research questions:

– Integration of performance analysis (VAMPIR) into an IDE (Eclipse)

– Suitable abstraction and presentation of “performance summaries”

– Integration of parallel performance analysis into the software development process

Supervisors: Matthias Mueller, Andreas Knüpfer

Page 61:

Thank you!

Hope to see you next time…