Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven...

31
Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach [1] , Dominic Marcello [2][3] PI: Hartmut Kaiser [1][2] , Co-PI: Geoffrey Clayton [3] [1]: Center for Computation and Technology [2]: LSU Department of Computer Science [3]: LSU Department of Physics

Transcript of Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven...

Page 1: Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach[1], ... •HPX

Message-driven AMR Techniques for Binary Star Simulations

Bryce Adelstein-Lelbach[1], Dominic Marcello[2][3]

PI: Hartmut Kaiser[1][2], Co-PI: Geoffrey Clayton[3]

[1]: Center for Computation and Technology [2]: LSU Department of Computer Science [3]: LSU Department of Physics

Page 2: Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach[1], ... •HPX

PHYSICS

5/17/2012 stellar.cct.lsu.edu 2

Page 3: Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach[1], ... •HPX

5/17/2012 stellar.cct.lsu.edu 3

Page 4: Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach[1], ... •HPX

Double White Dwarfs

• White dwarfs (WDs) are the most common end stage of stellar evolution. – Composed of the end products of fusion (He, C, O, Mg, Ne). – ~Earth sized.

• Binary star systems are the most common type of star system(Abt (1983), Tokovini (1987)).

• There are lots of double white dwarfs (DWDs). – On theoretical grounds, about 300,000,000 in our own galaxy

(Nelemans et. al. (2001)).

• DWDs are driven closer together over time by loss of angular momentum due to gravitational radiation. – Eventually many systems are driven close enough to undergo “Roche

lobe over-flow” (RLOF), in which less massive, larger WD “donor” begins to lose mass to the more massive, smaller, WD “accretor”, due to the force of gravity.

5/17/2012 stellar.cct.lsu.edu 4

Page 5: Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach[1], ... •HPX

Unstable Mass Transfer

5/17/2012 stellar.cct.lsu.edu 5

Page 6: Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach[1], ... •HPX

Unstable vs Stable Mass Transfer

• When unstable mass transfer occurs, the donor is tidally disrupted and most of it falls onto the accretor. Two possible cases:

1) If total mass < 1.4 solar masses, the two WDs combine to form a hydrogen-deficient star, such as R Coronae Borealis type stars (Clayton et. al. (2012)).

2) When the total mass of the merged object > 1.4 solar masses, a Type Ia supernova (Webbink (1884), Schaefer and Pagnotta (2012)). • The new object is too massive to be supported by degeneracy pressure, and it

collapses. This triggers a runaway nuclear detonation.

• Type 1a supernovae are “standard candles”, critical to our understanding of the expansion of the Universe.

• Stable mass transfer leads to AM CVn systems (Nelemans et. al. (2001)).

5/17/2012 stellar.cct.lsu.edu 6

Page 7: Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach[1], ... •HPX

Stable Mass Transfer (maybe?)

5/17/2012 stellar.cct.lsu.edu 7

Page 8: Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach[1], ... •HPX

AMR Allows for a Large Range of Spatial Scales

5/17/2012 stellar.cct.lsu.edu 8

Page 9: Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach[1], ... •HPX

Modeled Physics

• Ideal inviscid fluid dynamics. – Angular, radial, and vertical momenta are evolved on a Cartesian mesh (Byerly

et. al. submitted) semi-discrete Kurganov & Tadmor (1999) central scheme. – 3rd order TVD Runge Kutta time stepping. – Dual energy formalism (Bryan et al. 1995). – Cold white dwarf equation of state added to ideal gas equation of state

(Segretain et. al. (1997)). – Piecewise Parabolic Method (PPM) reconstruction (Colella and Woodward

(1984)). – “E*” scheme to conserve total energy (Marcello (2012)).

• Newtonian self-gravity – Geometric multigrid relaxation method for self-gravitation (Martin and

Cartwright (1996)). – The Cartesian Fast Multiple Method used the falcON code (Dehnen (2001))

• We have modified this method to conserve angular momentum and energy to truncation error.

5/17/2012 stellar.cct.lsu.edu 9

Page 10: Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach[1], ... •HPX

Total Energy Conservation

• The K-T scheme achieves numerical stability by application of numerical viscosity at discontinuities and local extrema.

• In the presence of a potential, this viscosity will cause matter to creep up the potential well at no cost to the gas energy, resulting in spontaneous, unphysical energy gains.

• The difference in energy gained (or lost) by unphysical movement of mass up (or down) the potential is removed (added) to the total gas energy, effectively removing the difference from the internal gas energy.

• Stable equilibrium configurations can form.

5/17/2012 stellar.cct.lsu.edu 10

Page 11: Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach[1], ... •HPX

COMPUTER SCIENCE

5/17/2012 stellar.cct.lsu.edu 11

Page 12: Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach[1], ... •HPX

HPX

• HPX - a general-purpose C++ runtime system for applications of any scale. – A runtime system manages certain aspects of a program’s

execution environment.

• Focus on asynchronous/message-driven programming. • Our colleagues at FAU, Thomas Heller and Andreas

Schäfer, are performing petascale N-body simulations using LibGeoDecomp/HPX (Heller et. al. (2013)): – ~0.35 sustained PFLOPS on 16k x86 cores, reaching 79% of

theoretical peak performance. – ~19.5 sustained TFLOPs on 16 Xeon Phis (3.8k threads),

reaching 73% theoretical peak performance.

12 5/17/2012 stellar.cct.lsu.edu

Page 13: Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach[1], ... •HPX

HPX

• The HPX paradigm prefers:

– Asynchronous communication to hide latencies and contention instead of avoiding them.

– Fine-grained parallelism and an active global address space (AGAS) to enable dynamic and heuristic load balancing instead of statically partitioning work.

– Local, dependency-driven synchronization instead of explicit global barriers.

– Sending work to data instead of sending data to work.

5/17/2012 stellar.cct.lsu.edu 13

Page 14: Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach[1], ... •HPX

HPX

• Fine-grained parallelism: M -> N threading model. – Map M user-level threads onto N operating

system threads.

• AGAS: global address space that maintains referential integrity when moving global objects between nodes.

• Asynchronous methodology: future and dataflow models.

14 5/17/2012 stellar.cct.lsu.edu

Page 15: Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach[1], ... •HPX

Threads and Futures

• In HPX, a user-level thread encapsulates the asynchronous execution of a function.

• A future represents a value that exists now or will exist in the future.

– The value of the future will (usually) be produced by a function.

– The producer function is (usually) executed asynchronously in a new HPX thread.

5/17/2012 stellar.cct.lsu.edu 15

Page 16: Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach[1], ... •HPX

The Future Pull Model

5/17/2012 stellar.cct.lsu.edu 16

Node 1

We need V’s value; suspend function A

Execute another function

V’s value is available; resume function A

Node 2

X::M executes

Future V

The result of method M is sent back; it will be set as the value of V.

A remove invocation of the function X::M, which will generate the value. X, an object of a class C, is the target of M, bound to the invocation by a global identifier. Notice we are moving work to data.

Object X

// Let M be a member of X hpx::future<T> V; // Asynchronously invoke // M on X. We refer to X // via a global // identifier V = hpx::async(M, X_id); // Calculate stuff // that does not depend // on V's value // Now we need the value // of V; if V is not // ready, we suspend, and // another function is // executed. T t = V.get();

Page 17: Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach[1], ... •HPX

Octopus

• Adaptive mesh refinement hydrocode built on top of HPX – Uses octree-based AMR

(refining in space). – Uses parallel recursion to

traverse the tree structure for almost all operations.

• Runs at medium scale – Production runs simulating 3D

massless tori with polytropic EOS (Papaloizou & Pringle (1984)) on 512 x86 cores with 5 to 7 levels of refinement.

5/17/2012 stellar.cct.lsu.edu 17

Page 18: Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach[1], ... •HPX

Octopus

• Adaptive mesh refinement hydrocode built on top of HPX – Uses octree-based AMR

(refining in space). – Uses parallel recursion to

traverse the tree structure for almost all operations.

• Runs at medium scale – Production runs simulating 3D

massless tori with polytropic EOS (Papaloizou & Pringle (1984)) on 512 x86 cores with 5 to 7 levels of refinement.

5/17/2012 stellar.cct.lsu.edu 18

Page 19: Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach[1], ... •HPX

Parallel Recursion

// Simple top-down traversal // Order of F invocations not guranteed template <typename F> void octree::apply() { std::vector<hpx::future<void> > futures; // Children is an std::array of global identifiers. for (boost::uint8_t i : children) futures.emplace_back(hpx::async(octree::apply, children[i])); // Invoke f on ourselves ... F(*this); // ... and wait while our children compute. hpx::wait(futures); }

5/17/2012 stellar.cct.lsu.edu 19

Page 20: Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach[1], ... •HPX

Overlapping RK Substeps

• Ghost zones are communicated at the beginning of each RK substep – Neighbor-to-neighbor (unless

it’s an AMR boundary), instead of top-down or bottom-up.

• We want to be able to send ghost zone data out even if our neighbors are not ready to use it. – So, we use futures – one for

each direction. • Actually, we want three for each

direction – one for each RK substep.

5/17/2012 stellar.cct.lsu.edu 20

Future[0][0] Future[0][1] Future[0][2]

Future[2][0] Future[2][1] Future[2][2]

Subgrid Future[1][0] Future[1][1] Future[1][2]

Future[3][0] Future[3][1] Future[3][2]

Page 21: Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach[1], ... •HPX

Overlapping RK Substeps

5/17/2012 stellar.cct.lsu.edu 21

Future[0][0] Future[0][1] Future[0][2]

D start

Future[3][0] Future[3][1] Future[3][2]

Future[0][0] Future[0][1] Future[0][2]

C start

Future[1][0] Future[1][1] Future[1][2]

Future[2][0] Future[2][1] Future[2][2]

B start

Future[3][0] Future[3][1] Future[3][2]

Future[2][0] Future[2][1] Future[2][2]

A start

Future[1][0] Future[1][1] Future[1][2]

Page 22: Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach[1], ... •HPX

Overlapping RK Substeps

5/17/2012 stellar.cct.lsu.edu 22

Future[0][0] Future[0][1] Future[0][2]

D computing

Future[3][0] Future[3][1] Future[3][2]

Future[0][0] Future[0][1] Future[0][2]

C computing

Future[1][0] Future[1][1] Future[1][2]

Future[2][0] Future[2][1] Future[2][2]

B computing

Future[3][0] Future[3][1] Future[3][2]

Future[2][0] Future[2][1] Future[2][2]

A computing

Future[1][0] Future[1][1] Future[1][2]

Page 23: Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach[1], ... •HPX

Overlapping RK Substeps

5/17/2012 stellar.cct.lsu.edu 23

Future[0][0] Future[0][1] Future[0][2]

D computing

Future[3][0] Future[3][1] Future[3][2]

Future[0][0] Future[0][1] Future[0][2]

C computing

Future[1][0] Future[1][1] Future[1][2]

Future[2][0] Future[2][1] Future[2][2]

B wait for A,D

Future[3][0] Future[3][1] Future[3][2]

Future[2][0] Future[2][1] Future[2][2]

A computing

Future[1][0] Future[1][1] Future[1][2]

Page 24: Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach[1], ... •HPX

Overlapping RK Substeps

5/17/2012 stellar.cct.lsu.edu 24

Future[0][0] Future[0][1] Future[0][2]

D wait for D

Future[3][0] Future[3][1] Future[3][2]

Future[0][0] Future[0][1] Future[0][2]

C computing

Future[1][0] Future[1][1] Future[1][2]

Future[2][0] Future[2][1] Future[2][2]

B computing

Future[3][0] Future[3][1] Future[3][2]

Future[2][0] Future[2][1] Future[2][2]

A wait for C

Future[1][0] Future[1][1] Future[1][2]

Page 25: Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach[1], ... •HPX

Overlapping Timesteps

• Our 3D torus simulation has a static gravitational force. – Without a need to solve for gravity each timestep, the only

calculation that depends on the entire grid and needs to be performed each timestep is the computation of the CFL condition.

– The CFL condition is a necessary (but not sufficient) condition for convergence; it dictates how large of a timestep we can take.

– We determine the CFL condition for every discrete point in our simulation, and select the smallest computed timestep size.

– The CFL condition causes all points at timestep T+1 to depend on every point at timestep T.

• How can we break this global barrier?

5/17/2012 stellar.cct.lsu.edu 25

Page 26: Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach[1], ... •HPX

Overlapping Timesteps

• Answer: Cheat. • Use some heuristic to estimate the

timestep size at T+N based on the timestep size used at timestep T, where N is some parameter of the simulation. – For the first N timesteps we compute

the timestep size normally.

• Originally we relied on HPX’s reference counting to manage timestep lifetime. – This was inefficient, so now we use a

distributed circular buffer; each element in the circular buffer is an octree.

5/17/2012 stellar.cct.lsu.edu 26

T

T+1

T+N

Page 27: Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach[1], ... •HPX

Overlapping Timesteps

5/17/2012 stellar.cct.lsu.edu 27

Page 28: Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach[1], ... •HPX

Overlapping Timesteps

5/17/2012 stellar.cct.lsu.edu 28

Page 29: Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach[1], ... •HPX

Future Work/Challenges

• We must develop a strong understanding of our current performance and efficiency. – We don’t have a theoretical performance model and our

legacy codes are known to have performance issues. • So, we have no clear picture of how efficiently we are using HPC

resources.

• Our DWD simulations use dynamic gravity. – Currently solving Poisson’s equations with multigrid

methods. – It is harder for me to cheat here. – Currently working on FMM gravity solver.

• PPM imposes very large ghost zones.

5/17/2012 stellar.cct.lsu.edu 29

Page 30: Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach[1], ... •HPX

[email protected]:STEllAR-GROUP/octopus.git

Page 31: Message-driven AMR Techniquesblogs.bu.edu/simac3/files/2013/11/10_Lelbach_AMR.pdf · Message-driven AMR Techniques for Binary Star Simulations Bryce Adelstein-Lelbach[1], ... •HPX

BACKUP SLIDES

5/17/2012 stellar.cct.lsu.edu 31