Gem5 in a nutshellpeople.rennes.inria.fr/Christophe.Huriaux/static/huriaux...7 Simulation modes...

18
1 Gem5 in a nutshell Christophe Huriaux, Post-doc Inria, IRISA — CAIRN Project-Team CAIRN project-team — SAV 2016 June 30th-July 1st 2016 - 1

Transcript of Gem5 in a nutshellpeople.rennes.inria.fr/Christophe.Huriaux/static/huriaux...7 Simulation modes...

Page 1: Gem5 in a nutshellpeople.rennes.inria.fr/Christophe.Huriaux/static/huriaux...7 Simulation modes Full-system (FS) Models bare-metalhardware Includes the various specified devices, caches,

1

Gem5 in a nutshell

Christophe Huriaux, Post-docInria, IRISA — CAIRN Project-Team

CAIRN project-team — SAV 2016 June 30th-July 1st 2016 - 1

Page 2: Gem5 in a nutshellpeople.rennes.inria.fr/Christophe.Huriaux/static/huriaux...7 Simulation modes Full-system (FS) Models bare-metalhardware Includes the various specified devices, caches,

2

Spoiler alert

§ Not a research report…

§ ... but a (quick) overview of how Gem5 works and what it can offer

June 30th-July 1st 2016CAIRN project-team — SAV 2016 - 2

Page 3: Gem5 in a nutshellpeople.rennes.inria.fr/Christophe.Huriaux/static/huriaux...7 Simulation modes Full-system (FS) Models bare-metalhardware Includes the various specified devices, caches,

3

Outline§ Introduction

§ What is Gem5 useful for ?(or rather: what you should not use it for)

§ Overview of the system simulator§ Simulation modes§ Behind the scene of a simulation§ What’s under the hood ?§ Memory system

§ Running example

§ Conclusion

June 30th-July 1st 2016CAIRN project-team — SAV 2016 - 3

Page 4: Gem5 in a nutshellpeople.rennes.inria.fr/Christophe.Huriaux/static/huriaux...7 Simulation modes Full-system (FS) Models bare-metalhardware Includes the various specified devices, caches,

4

Introduction§ Gem5 is the fusion of two projects

§ GEMS : simulation of multi-processor systems§ M5 : simulation of networked systems

§ System simulator§ Accurate simulation of complex components

interactions (OS / CPU / Caches / Devices / …)§ Accuracy depends on the model completeness

§ All-in-one simulation framework§ Don’t rely on other software

§ But we can plug them in easily…§ Lot of components available out-of-the-box

§ (CPUs, memories, I/Os, …)June 30th-July 1st 2016CAIRN project-team — SAV 2016 - 4

Page 5: Gem5 in a nutshellpeople.rennes.inria.fr/Christophe.Huriaux/static/huriaux...7 Simulation modes Full-system (FS) Models bare-metalhardware Includes the various specified devices, caches,

5

What is Gem5 useful for ?

June 30th-July 1st 2016CAIRN project-team — SAV 2016 - 5

Architectural exploration ?Yes !👍Gem5 provides a fast and easy frameworkto interconnect hardware components and evaluate them !

Hardware/software performance evaluation ?Yes !👍Gem5 have a good support of various ISA and allows forrealistic HW/SW performance evaluation.

Page 6: Gem5 in a nutshellpeople.rennes.inria.fr/Christophe.Huriaux/static/huriaux...7 Simulation modes Full-system (FS) Models bare-metalhardware Includes the various specified devices, caches,

6

What is Gem5 useful for ?

June 30th-July 1st 2016CAIRN project-team — SAV 2016 - 6

Hardware/software verification ?No …👎RTL functional verification is much more mature and accurate !

Software development and verification ?Ugh.. Please stop!👎👎👎Faster technologies are available through binary-translation(e.g. QEMU, OVP)

Page 7: Gem5 in a nutshellpeople.rennes.inria.fr/Christophe.Huriaux/static/huriaux...7 Simulation modes Full-system (FS) Models bare-metalhardware Includes the various specified devices, caches,

7

Simulation modes

§ Full-system (FS)§ Models bare-metal hardware

§ Includes the various specified devices, caches, …§ Boots an entire OS from scratch

§ Gem5 can boot Linux (several variants) or Android out-of-the-box

§ Syscall Emulation (SE)§ Runs a single static application § System calls are emulated or forwarded to the

host OS§ Lot of simplifications (address translation,

scheduling, no pthread …)June 30th-July 1st 2016CAIRN project-team — SAV 2016 - 7

Page 8: Gem5 in a nutshellpeople.rennes.inria.fr/Christophe.Huriaux/static/huriaux...7 Simulation modes Full-system (FS) Models bare-metalhardware Includes the various specified devices, caches,

8

Behind the scene of a simulation

June 30th-July 1st 2016CAIRN project-team — SAV 2016 - 8

Collection ofcomponents

Simulatorinternals

C++ C++ / Python

Gem5 binary

Compilation of the simulator

Page 9: Gem5 in a nutshellpeople.rennes.inria.fr/Christophe.Huriaux/static/huriaux...7 Simulation modes Full-system (FS) Models bare-metalhardware Includes the various specified devices, caches,

9

Behind the scene of a simulation

June 30th-July 1st 2016CAIRN project-team — SAV 2016 - 9

Python script instanciating the componenthierarchy and defining simulation parameters

Gem5 binary

Simulation !

Python interpreter

Collection ofcomponentinterfacesAssembled C++

objects

Simulation

Output(statistics, traces…)

Page 10: Gem5 in a nutshellpeople.rennes.inria.fr/Christophe.Huriaux/static/huriaux...7 Simulation modes Full-system (FS) Models bare-metalhardware Includes the various specified devices, caches,

10

What’s under the hood ?

June 30th-July 1st 2016CAIRN project-team — SAV 2016 - 10

Collection ofcomponents

C++ / Python

.cc .py

1 component = 1 simulation object

C++ functionalmodel

(for simulation)

Python interface(for instanciation)

+

§ SimObjects follow a strict C++ class hierarchyfor easier extension with code reuse

SimObject

BaseCPU BaseTimingCPU

BaseO3CPU

……

Simulation objects

ClockedObject …

Page 11: Gem5 in a nutshellpeople.rennes.inria.fr/Christophe.Huriaux/static/huriaux...7 Simulation modes Full-system (FS) Models bare-metalhardware Includes the various specified devices, caches,

11

What’s under the hood ?

§ Gem5 is event-driven§ Discrete event timing model§ Not related to real time whatsoever§ The real time duration of 1 tick can be user-

defined

§ Simulation objects schedule events for the next cycle of after a specific time elapsed§ The Gem5 simulation scheduler takes care of the

rest !

June 30th-July 1st 2016CAIRN project-team — SAV 2016 - 11

Events

Page 12: Gem5 in a nutshellpeople.rennes.inria.fr/Christophe.Huriaux/static/huriaux...7 Simulation modes Full-system (FS) Models bare-metalhardware Includes the various specified devices, caches,

12

What’s under the hood ?§ Memory ports are present on every

MemObject§ They model physical memory connections§ You interconnect them during the hierarchy

instanciation§ E.g. a CPU data bus to a L1 cache

§ Work by pairs: 1 master port always connectto 1 slave port

§ Data is exchanged atomically as packets

June 30th-July 1st 2016CAIRN project-team — SAV 2016 - 12

Memory ports

CPUInst.L1$

DataL1$

Xbar

DDR3

Flash

M

M

M M

M M

S

S

S

S

S

S

Page 13: Gem5 in a nutshellpeople.rennes.inria.fr/Christophe.Huriaux/static/huriaux...7 Simulation modes Full-system (FS) Models bare-metalhardware Includes the various specified devices, caches,

13

What’s under the hood ?

§ 3 types of transport interfaces for packets§ Functional

§ Instantaneous in a single function call§ Caches and memories are updated automagically at

once§ Atomic

§ Instantaneous§ Memory model updated (caches, coherence …)§ Approximate latency, but no contention nor delay

§ Timing§ Transaction split into multiple phases§ Models all timing in the memory system

§ The transport interface depends on the SimObject implementation

June 30th-July 1st 2016CAIRN project-team — SAV 2016 - 13

Memory ports

Page 14: Gem5 in a nutshellpeople.rennes.inria.fr/Christophe.Huriaux/static/huriaux...7 Simulation modes Full-system (FS) Models bare-metalhardware Includes the various specified devices, caches,

14

Memory system

§ Models a system running heterogeneousapplications…

§ … running on heterogeneous processing tiles§ ... using heterogeneous memories and

interconnect

June 30th-July 1st 2016CAIRN project-team — SAV 2016 - 14

One memory system to rule them all…

CPUCPUCPUCPUCPUGPU

CPUCPUAccelerators

DDR3 SRAM Flash …Interconnect

Page 15: Gem5 in a nutshellpeople.rennes.inria.fr/Christophe.Huriaux/static/huriaux...7 Simulation modes Full-system (FS) Models bare-metalhardware Includes the various specified devices, caches,

15

Memory system

§ Two memory systems in Gem5§ Classic

§ All components instanciated in a hierarchy along withCPUs, etc.

§ MOESI coherence protocol only

§ Ruby§ Detailed simulation model of various cache hierarchies§ Various cache coherence protocols (MESI, MOESI, …)§ Interconnection networks

§ Classic is faster but less detailed

June 30th-July 1st 2016CAIRN project-team — SAV 2016 - 15

…and in the simulation, interconnect them

Page 16: Gem5 in a nutshellpeople.rennes.inria.fr/Christophe.Huriaux/static/huriaux...7 Simulation modes Full-system (FS) Models bare-metalhardware Includes the various specified devices, caches,

16

Running example

§ FFT kernel from the SPLASH2 benchmark suite§ ARM Instruction Set Architecture§ 1 or 4 out-of-order detailed CPUs§ Caches

§ L1: 64kb data, 32kb instruction§ L2: 2Mb, shared

June 30th-July 1st 2016CAIRN project-team — SAV 2016 - 16

Is my FFT faster with more processors ?Hardware/software performance evaluation !👍Let’s evaluate !

Page 17: Gem5 in a nutshellpeople.rennes.inria.fr/Christophe.Huriaux/static/huriaux...7 Simulation modes Full-system (FS) Models bare-metalhardware Includes the various specified devices, caches,

17

Conclusion

§ Quick introduction to Gem5

§ Much more things to explore in the framework !§ Integration of power models in development§ Memory / CPU traces generation§ Statistics output for performance evaluation§ SystemC co-simulation§ Automatic benchmark run§ Checkpointing, fast-forwarding

June 30th-July 1st 2016CAIRN project-team — SAV 2016 - 17

Page 18: Gem5 in a nutshellpeople.rennes.inria.fr/Christophe.Huriaux/static/huriaux...7 Simulation modes Full-system (FS) Models bare-metalhardware Includes the various specified devices, caches,

18

Thank you for yourattention J

CAIRN project-team — SAV 2016 June 30th-July 1st 2016 - 18