Numerical Solutions for Hyperbolic Systems of Conservation ......5th JETSET School Romain Teyssier2...
Transcript of Numerical Solutions for Hyperbolic Systems of Conservation ......5th JETSET School Romain Teyssier2...
1Romain Teyssier5th JETSET School
Numerical Solutions for
Hyperbolic Systems of
Conservation Laws:
from Godunov Method to
Adaptive Mesh Refinement
Romain Teyssier
CEA Saclay
2Romain Teyssier5th JETSET School
- Euler equations, MHD, waves, hyperbolic systems of conservationlaws, primitive form, conservative form, integral form
- Advection equation, exact solution, characteristic curve, Riemanninvariant, finite difference scheme, modified equation, Von Neumananalysis, upwind scheme, Courant condition, Second order scheme
- Finite volume scheme, Godunov method, Riemann problem,approximate Riemann solver, Second order scheme, Slope limiters,Characteristic tracing
- Multidimensional scheme, directional splitting, Godunov, Runge-Kutta, CTU, 3D slope limiting
- AMR, patch-based versus cell-based, octree structure, gradedoctree, flux correction, EMF correction, restriction and prolongation,divB conserving interpolation
- Parallel computing with the RAMSES code
Outline
3Romain Teyssier5th JETSET School
Adaptive Mesh Refinement
4Romain Teyssier5th JETSET School
Patch-based versus tree-based
5Romain Teyssier5th JETSET School
A few AMR codes in astrophysics
ENZO: Greg Bryan, Michael Norman…
ART: Andrey Kravtsov, Anatoly Klypin
RAMSES: Romain Teyssier
NIRVANA: Udo Ziegler
AMRVAC: Gabor Thot and Rony Keppens
FLASH: The Flash group (PARAMESH lib)
ORION: Richard Klein, Chris McKee, Phil Colella
PLUTO: Andrea Mignone (CHOMBO lib, Phil Colella)
CHARM: Francesco Miniati (CHOMBO lib, Phil Colella)
ASTROBear: Adam Frank…
Any other code in the audience ?
6Romain Teyssier5th JETSET School
Cell-centered variables are updated level by level using linked lists.
Cost = 2 integer per cell.
Optimize mesh adaptation to complex flow geometries, but CPU overheadcompared to unigrid can be as large as 50%.
2 type of cell: “leaf” or active cell“split” or inactive cell
Graded Octree structure
Fully Threaded Tree (Khokhlov 98).Cartesian mesh refined on a cell by cell basis.octs: small grid of 8 cellsPointers (arrays of index)• 1 parent cell• 6 neighboring parent cells• 8 children octs• 2 linked list indices
7Romain Teyssier5th JETSET School
Step 1: mesh consistency
if a split cell contains at least one split ormarked cell, then mark the cell with flag = 1and mark its 26 neighbors
Step 2: physical criteria
quasi-Lagrangian evolution, Jeans mass
geometrical constraints (zoom)
Truncation errors, density gradients…
Step 3: mesh smoothing
apply a dilatation operator (mathematicalmorphology) to regions marked forrefinement → convex hull
Compute the refinement map: flag = 0 or 1
Refinement rules for graded octree
8Romain Teyssier5th JETSET School
Berger & Oliger (84), Berger & Collela (89)
Prolongation (interpolation) to finer levels
- fill buffer cells (boundary conditions)
- create new cells (refinements)
Restriction (averaging) to coarser levels
- destroy old cells (de-refinements)
Flux correction at level boundary
Godunov schemes and AMR
Careful choice of interpolation variables (conservative or not ?)
Several interpolation strategies (with RT P = I) :
- straight injection
- tri-linear, tri-parabolic reconstruction
9Romain Teyssier5th JETSET School
Godunov schemes and AMR
Buffer cells provide boundary conditions for the underlying numericalscheme. The number of required buffer cells depends on the kernel ofthe chosen numerical method. The kernel is the ensemble of cells onthe grid on which the solution depends.
- First Order Godunov: 1 cell in each direction
- Second order MUSCL: 2 cells in each direction
- Runge-Kutta or PPM: 3 cells in each direction
Simple octree AMR requires 2 cells maximum. For higher-orderschemes (WENO), we need to have a different data structure (patch-based AMR or augmented octree AMR).
10Romain Teyssier5th JETSET School
Time integration: single time step or recursive sub-cycling
- froze coarse level during fine level solves (one order of accuracy down !)
- average fluxes in time at coarse fine boundaries
Adaptive Time Stepping
11Romain Teyssier5th JETSET School
The AMR catastrophe
First order scheme:
Second order scheme:
At level boundary, we loose one order of accuracy in the modified equation.
First order scheme: the AMR extension is not consistent at level boundary.
Second order scheme: for α=1.5, AMR is unstable at level boundary.
Solutions: 1- refine gradients, 2- enforce first order, 3- add artificial diffusion
Assume a and C>0.
12Romain Teyssier5th JETSET School
Shock wave propagating through level boundary
13Romain Teyssier5th JETSET School
Shock-tube test with regular grid
14Romain Teyssier5th JETSET School
Shock-tube test with AMR
15Romain Teyssier5th JETSET School
Complex geometry with AMR
16Romain Teyssier5th JETSET School
Quasi-Lagrangian mesh evolution:roughly constant number of particlesper cell
Trigger new refinement when n > 10-40 particles. The fractal dimension isclose to 1.5 at large scale (filaments)and is less than 1 at small scales(clumps).
Particle-Mesh on AMR grids:Cloud size equal to the local meshspacing
Poisson solver on the AMR gridMultigrid or Conjugate GradientInterpolation to get Dirichlet boundaryconditions (one way interface)
Cosmology with AMR
17Romain Teyssier5th JETSET School
Code is freely available http://www-dapnia.cea.fr/Projets/COAST
RAMSES: a parallel graded octree AMR
18Romain Teyssier5th JETSET School
Parallel computing using the MPI library with adomain decomposition based on the Peano-Hilbertcurve.
Algorithm inspired by gravity solvers (tree codes).Use locally essential tree.
Tested and operational up to 6144 core.Scaling depends on problem size and complexity.
Domain decomposition for parallel computing
Salmon, J.K. and Warren, M.S., "Parallel out-of-coremethods for N-body simulation", In Eighth SIAMConference on Parallel Processing for ScientificComputing, SIAM,1997.
Peter MacNeice, Kevin M. Olsonb, Clark Mobarryc,Rosalinda de Fainchteind and Charles Packer,« PARAMESH: A parallel adaptive mesh refinementcommunity toolkit, », 2000, Computer PhysicsCommunications, 126, 3.
19Romain Teyssier5th JETSET School
Domain decomposition over 3 processors
Locally essential treein processor #2
Locally essential treein processor #1
Locally essential treein processor #3
Salmon 90,Warren 92,Dubinsky 96
Each processor octree issurrounded by ghost cells(local copy of distantprocessor octrees) so thatthe resulting local octreecontains all the necessaryinformation.
Locally essential trees
20Romain Teyssier5th JETSET School
Dynamic partitioning in RAMSES
Several cell ordering methods:1- column, row or plane major2- Hilbert or Morton3- User defined (angular, column+Hilbert…)
Dynamic partioning is performed every N steps by sorting each cell alongchosen ordering and redistributing the mesh across processors. Usually, agood choice is N=10 (10% overhead).
21Romain Teyssier5th JETSET School
Are there any optimal load balancing strategy ?
Column major Angular
Hilbert
22Romain Teyssier5th JETSET School
Are there any optimal load balancing strategy ?
Recursive bisection
23Romain Teyssier5th JETSET School
Are there any optimal load balancing strategy ?
Hilbert ordering
24Romain Teyssier5th JETSET School
Are there any optimal load balancing strategy ?
Angular ordering
25Romain Teyssier5th JETSET School
Ideal MHD equations in conservative form
Mass conservation
Linear momentum conservation
Total energy conservation
Magnetic flux conservation
Total pressure
Total energy
26Romain Teyssier5th JETSET School
Euler equations using finite volumes: decades of experience in robustadvection and shock-capturing schemes (Godunov, MUSCL, PPM)Ideal MHD : Euler system augmented by the induction equation
Finite volume and cell-centered schemes Crockett et al. 2005• div B cleaning using Poisson solver• div B waves (Powell’s 8 waves formulation)• div B damping
Constrained Transport and staggered grid (Yee 66; Evans & Hawley 88)• 1D Godunov fluxes to compute the EMF Balsara, Spicer 99• 2D Riemann solver to get the EMF Londrillo, DelZanna 01,05, Ziegler 04,05• High-order extension of Balsara’s scheme Gardiner & Stone 05
Constrained Transport for tree-based AMR Teyssier, Fromang & Dormy 2006 Fromang, Hennebelle & Teyssier 2006
Godunov method and MHD
27Romain Teyssier5th JETSET School
For piecewise constant initial data, theflux function is self-similar at corner points
The induction equation in 2D
Finite-surface approximation (Constrained Transport)
Integral form using Stoke’s theorem
For pure induction, the 2D Riemann problem hasthe following exact (upwind) solution:
Numerical diffusivity andInduction Riemann problem
28Romain Teyssier5th JETSET School
AMR and Constrained Transport
« Divergence-free preserving » restriction and prolongation operatorsBalsara (2001) Toth & Roe (2002)
Flux conserving interpolation and averaging within cell faces using TVD slopes in 2dimensions
EMF correction for conservative update at coarse-fine boundaries
?
?? ?
29Romain Teyssier5th JETSET School
n=400
Compound wave (Torrilhon 2004)
n=800
n=20000
neff=106
α=π: 2 solutions: 2 shocks or 1 c.w.α≠π: 2 shocks onlyDissipation properties are crucial.Only AMR can resolve scales smallenough within reasonable CPU time.
30Romain Teyssier5th JETSET School
Field loop advection (Gardiner & Stone 2005)
31Romain Teyssier5th JETSET School
Current sheet and magnetic reconnection
32Romain Teyssier5th JETSET School
The Horizon Simulation
We report on a 70 billions particles N-body simulation for a 2 Gpc/h periodicbox in a LCDM universe.
We use a new French supercomputer BULL Novascale 3045 recentlycommissioned at CCRT (Centre de Calcul Recherche et Technologie, CEA).
We ran RAMSES in pure N-body mode with 6144 processors for 2 months.
We have simulated half of the observable universe, with “resolved” dark matterhalos as small as the Milky Way.
Using our light cone, we have computed a full sky convergence map forsimulating future weak-lensing surveys like DUNE or LSST.
More details in http://www.projet-horizon.fr
Extreme computing with RAMSES
33Romain Teyssier5th JETSET School
The Horizon Simulation
Courtesy of V. Springel
34Romain Teyssier5th JETSET School
The Horizon Simulation
Initial conditions on a 40963 have been generated using mpgrafic and FFTW.
MPI communication routines in RAMSES have been optimized (reduce andbroadcast) with BULL R&D teams.
140 billions AMR cells to compute the gravitational acceleration.
Starting with a base grid of 40963 cells, we used 6 additional level ofrefinements for a formal resolution of 2621443 (7 kpc/h).
104 time steps for 70 billion particle trajectories.
737 main steps (1 hour of elapsed time) for which 3 Tb of data were dumped(checkpoint restart) in 5 minutes using the LUSTRE parallel file system.
We have generated data for 5 snapshots and 2 light cones (50 Tb).
- 1 light cone up to redshift 1 for 4Π.
- 1 light cone up to redshift 7 for 500 square degree.