
Interdisciplinary Symposium on Modern Density Functional Theory, Jun. 19 - Jun. 23, 2017

Okochi Hall, RIKEN

Large-Scale First-Principles Electronic Structure Calculations in Petascale and Exascale Supercomputers: A Real-Space Density Functional Theory Code

J.-I. Iwata, Department of Applied Physics, The University of Tokyo

http://github.com/j-iwata/RSDFT

Density Functional Theory: relatively low computational cost, O(N³), with reasonable accuracy

Used in industrial R&D: realistic materials → large-scale systems need to be handled

Many program packages: VASP, QUANTUM ESPRESSO, ABINIT, CPMD, PHASE, STATE, xTAPP, QMAS, CASTEP, DACAPO, etc. (only plane-wave pseudopotential programs are listed here)

FIRST-PRINCIPLES ELECTRONIC STRUCTURE CALCULATIONS SPREAD IN A WIDE RANGE OF FIELDS WITH SOPHISTICATED PROGRAM PACKAGES

Other basis functions and representative packages:
atomic orbital (SIESTA)
pseudo-atomic orbital (OpenMX)
wavelet (BigDFT)
finite element (FEMTECK)
finite difference (PARSEC, GPAW, RSDFT)

Drawbacks of the plane-wave method:
Periodic boundary condition: supercell models must be used for 0-D to 2-D systems
Delocalized in real space: incompatible with order-N methods
Fast Fourier Transform: communication burden in massively parallel computing

VARIOUS METHODS OTHER THAN THE PLANE-WAVE METHOD

Top seven supercomputers as of Nov. 2016

https://www.top500.org/

FROM K COMPUTER TO POST-K COMPUTER

Oakforest-PACS: Intel Xeon Phi 7250 (Knights Landing), 68 cores/CPU (3.0464 TFLOPS), 1 CPU/node, 8208 nodes (25 PFLOPS)

FX100: SPARC64 XIfx, 32 + 2 cores/CPU (>1 TFLOPS)

K computer: SPARC64 VIIIfx, 8 cores/CPU (128 GFLOPS)

The next flagship machine is being developed (~2021)

RIKEN AICS and FUJITSU

ARM architecture

Multi-core and many-core processors; Core-Memory Group (CMG): several cores are grouped together, and MPI processes run on each CMG

POST-K COMPUTER

REAL-SPACE FINITE-DIFFERENCE PSEUDOPOTENTIAL METHOD

$$\frac{\partial^2}{\partial x^2}\,\phi(\mathbf{r}) \;\approx\; \sum_{m=-6}^{6} C_m\,\phi(x + m\Delta,\, y,\, z)$$

$$\int \phi_m^*(\mathbf{r})\,\phi_n(\mathbf{r})\,d\mathbf{r} \;\approx\; \sum_{i=1}^{N_{\mathrm{grid}}} \phi_m^*(\mathbf{r}_i)\,\phi_n(\mathbf{r}_i)\,\Delta V$$

Derivatives → (higher-order) finite differences

Kohn-Sham equation is solved as a finite-difference equation in discretized space

Integrals → summation over grid points

Ionic potentials → Pseudopotentials

N. Troullier and J. L. Martins, Phys. Rev. B 43, 1993 (1991)

( N: number of orbitals, N << # of grid points )
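To make the discretization concrete, here is a minimal NumPy sketch (illustrative only, not RSDFT's Fortran implementation): a 6th-order stencil is shown for brevity, whereas the slide's sum runs over m = -6…6, and np.roll implies periodic boundaries.

```python
import numpy as np

# 6th-order central-difference coefficients C_m for d^2/dx^2 (m = 0, 1, 2, 3)
C = np.array([-49.0/18.0, 3.0/2.0, -3.0/20.0, 1.0/90.0])

def laplacian(phi, h):
    """Finite-difference Laplacian of phi(x,y,z) on a uniform grid with
    spacing h; np.roll imposes periodic boundaries (a simplification)."""
    lap = 3.0 * C[0] * phi
    for m in range(1, 4):
        for axis in range(3):
            lap = lap + C[m] * (np.roll(phi, m, axis=axis)
                                + np.roll(phi, -m, axis=axis))
    return lap / h**2

def inner_product(phi_m, phi_n, h):
    """<phi_m|phi_n>: summation over grid points times the volume
    element dV = h^3, as in the second equation above."""
    return np.vdot(phi_m, phi_n) * h**3
```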

Time-dependent local-density approximation in real time, K. Yabana and G. F. Bertsch, Phys. Rev. B 54, 4484 (1996).

Three-dimensional time-dependent Hartree-Fock calculation: application to 16O + 16O collisions, H. Flocard, S. E. Koonin, and M. S. Weiss, Phys. Rev. C 17, 1682 (1978).

Higher-order finite-difference pseudopotential method: an application to diatomic molecules, J. R. Chelikowsky, N. Troullier, K. Wu, and Y. Saad, Phys. Rev. B 50, 11355 (1994).

HISTORY OF RSDFT

Real-space, real-time method for the dielectric function, G. F. Bertsch, J.-I. Iwata, A. Rubio, and K. Yabana, Phys. Rev. B 62, 7998 (2000).

J.-I. Iwata et al., J. Comp. Phys. 229, 2339 (2010). http://github.com/j-iwata/RSDFT

Y. Hasegawa et al., SC'11: Proceedings of the 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (2011).

AWARD: ACM Gordon Bell Prize (2011)

PARALLELIZATION IN RSDFT

[Figure: the real-space grid is partitioned into sub-domains, each assigned to one MPI process (CPU0-CPU8): real-space grids mapped onto CPU space]

MPI (Message-Passing Interface) library:
MPI_ISEND, MPI_IRECV → finite-difference calculation
MPI_ALLREDUCE → global summation

OpenMP: further grid parallelization (within each node, CPU, or CMG) is performed by thread parallelization

Spin, k-point, and band-index parallelization (orbital parallelization) is also implemented

J.-I. Iwata et al., J. Comp. Phys. 229, 2339 (2010).
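A hedged sketch of the communication pattern just described, written with mpi4py (RSDFT itself is Fortran + MPI). The 1-D slab decomposition, the periodic neighbor wrap-around, and all sizes are simplifications chosen for illustration.

```python
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
up, down = (rank + 1) % size, (rank - 1) % size  # 1-D decomposition, periodic

n = 32                                  # local planes along the split axis
phi = np.random.rand(n + 2, 16, 16)     # local slab plus one halo plane per side

# Halo exchange for the finite-difference stencil (MPI_ISEND / MPI_IRECV)
reqs = [comm.Isend(phi[1].copy(), dest=down, tag=0),    # my bottom interior plane
        comm.Isend(phi[n].copy(), dest=up,   tag=1),    # my top interior plane
        comm.Irecv(phi[n + 1], source=up,   tag=0),     # neighbor's bottom -> my top halo
        comm.Irecv(phi[0],     source=down, tag=1)]     # neighbor's top -> my bottom halo
MPI.Request.Waitall(reqs)

# Global summation (MPI_ALLREDUCE), e.g. for a norm or inner product
local = float(np.sum(phi[1:n + 1] ** 2))
total = comm.allreduce(local, op=MPI.SUM)
```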

FLOW CHART OF THE STATIC DFT CALCULATION

1. (CG) Conjugate-gradient method
2. (GS) Orthonormalization by the Gram-Schmidt method
3. (SD) Subspace diagonalization
4. Density and potential update

(A toy code skeleton of this loop is sketched below.)
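A minimal, self-contained Python sketch of the four-step loop; the stand-ins are labeled and illustrative only (plain gradient descent instead of CG, QR instead of Gram-Schmidt, and a fixed dense matrix instead of the real-space Hamiltonian with density/potential updates).

```python
import numpy as np

rng = np.random.default_rng(0)
Ng, Nb = 200, 8                           # toy grid size and number of orbitals
H = rng.standard_normal((Ng, Ng))
H = 0.5 * (H + H.T)                       # fixed symmetric toy Hamiltonian
Phi = rng.standard_normal((Ng, Nb))       # trial orbitals, columns of Phi

for it in range(20):
    # 1. (CG) improve orbitals; gradient descent stands in for CG here
    Phi = Phi - 0.05 * (H @ Phi)
    # 2. (GS) orthonormalize; QR stands in for the Gram-Schmidt procedure
    Phi, _ = np.linalg.qr(Phi)
    # 3. (SD) subspace diagonalization (Rayleigh-Ritz)
    Hsub = Phi.T @ H @ Phi
    eps, U = np.linalg.eigh(Hsub)
    Phi = Phi @ U
    # 4. Density and potential update: omitted, H is kept fixed in this toy

print("lowest Ritz values:", eps[:3])
```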

Subspace-iteration method for the large, sparse eigenvalue problem: minimize the energy with respect to each orbital (the CG step), then diagonalize the Hamiltonian within the small-dimensional space spanned by the current orbitals (the SD step); the equations are written out below.
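In symbols (notation introduced here: Φ is the N_grid × N matrix whose columns are the grid-sampled orbitals), the subspace diagonalization reads

$$H_{\mathrm{sub}} = \Phi^{\dagger} H \Phi, \qquad H_{\mathrm{sub}}\, U = U\, \mathrm{diag}(\varepsilon_1, \dots, \varepsilon_N), \qquad \Phi \leftarrow \Phi\, U$$

Forming $H_{\mathrm{sub}}$ and applying the rotation $\Phi U$ are the matrix-matrix products; the dense eigensolve acts only on the N × N subspace matrix.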

Computational cost of each step: CG O(N²); GS O(N³); SD O(N³); density and potential update O(N)

Almost all O(N³) computations can be performed with Level-3 BLAS (matrix × matrix products); the Gram-Schmidt procedure is cast into matrix products by a recursive algorithm (a generic blocked variant is sketched below).
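The following sketch shows how a blocked Gram-Schmidt orthonormalization turns most of its work into matrix-matrix products (Level-3 BLAS), which is the point made above; it is a generic blocked variant, not RSDFT's exact recursive algorithm.

```python
import numpy as np

def blocked_gram_schmidt(Phi, b=64):
    """Orthonormalize the columns of Phi block by block; the projections
    against already-finished columns are matrix-matrix products (GEMM)."""
    n = Phi.shape[1]
    for j in range(0, n, b):
        blk = slice(j, min(j + b, n))
        if j > 0:
            done = Phi[:, :j]
            # Two GEMMs dominate the cost (Level-3 BLAS):
            Phi[:, blk] -= done @ (done.T @ Phi[:, blk])
        # Orthonormalize within the narrow block; np.linalg.qr stands in
        # for the recursive step applied to the block itself.
        Phi[:, blk], _ = np.linalg.qr(Phi[:, blk])
    return Phi

Q = blocked_gram_schmidt(np.random.default_rng(1).standard_normal((4096, 256)))
print(np.allclose(Q.T @ Q, np.eye(256), atol=1e-8))  # True: columns orthonormal
```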

        Execution time (sec)   Computation time (sec)   Performance (PFLOPS)
SCF     5456                   4417                     3.08 (43.6 % of peak)
DIAG    3710                   3218                     2.72 (38.5 %)
CG      209.3                  57.66                    0.05 (0.7 %)
GS      1537                   1141                     4.37 (61.8 %)

(110) SiNW (Si nanowire; cross section ~6 nm × 20 nm)
# of atoms: 107,292
# of grid points: 576 × 576 × 192 = 63,700,992
# of orbitals: 229,824

# of CPUs: 55,296 (442,368 cores) = 18,432 (grid parallelization) × 3 (orbital parallelization)
Theoretical peak: 7.07 PFLOPS

BENCHMARK TEST @K : 100,000-ATOM Si NANOWIRE

Computational time of one step of SCF iteration

FULL-NODE TEST @K

          SCF     CG O(N²)   GS O(N³)   SD O(N³)   SD (mate)   SD (Eigen)   SD (rotation)
time (s)  2931    160        946        1797       525         493          779
PFLOPS    5.48    0.06       6.70       5.32       6.15        0.01         8.14
peak %    51.67   0.60       63.10      50.17      57.93       1.03         76.70

System: Si nanowire (107,292 atoms)
Grid: 576 × 576 × 180 = 59,719,680 points
Bands: 230,400

Nodes: 82,944 (10.6 PFLOPS peak)
Cores: 663,552

Elapsed time for 1 step of SCF iteration

Y. Hasegawa et al., Int. J. High Perform. Comput. Appl. 28, 335 (2014)


TWISTED BILAYER GRAPHENE
K. Uchida, S. Furuya, J.-I. Iwata, and A. Oshiyama, Phys. Rev. B (2014)

atoms     SCF    CG    GS    SD
22,708    2980   446   710   1452

Elapsed time (sec) for one step of the SCF loop on the K computer (841 nodes)

Poster no. 11: Hirofumi Nishi

RSDFT: a first-principles electronic structure calculation code based on the real-space finite-difference pseudopotential method; more suitable for massively parallel computing than the plane-wave basis

K computer (computable systems): several thousand atoms with atomic structure optimization (~4,000); a few tens of thousands of atoms for single-point calculations (~40,000)

Post-K computer (~1 EFLOPS): tens of thousands of atoms with atomic structure optimization; a system that is feasible only with the entire K computer is expected to be computable with a small fraction of the post-K computer's resources

SUMMARY

http://github.com/j-iwata/RSDFT