RPA: LaFeAsO, κ-(ET)2Cu(SCN)2, EtMe3Sb[Pd(dmit)...

Post on 11-May-2018


RPA :

LaFeAsO, κ-(ET)2Cu(SCN)2, EtMe3Sb[Pd(dmit)2]2

Kazuma Nakamura (School of Engineering, Univ. of Tokyo, A03-9)

Post constrained RPA Project: Reduction of spatial dimension

KAZUMA NAKAMURA (A03-9) YOSHIHIDE YOSHIMOTO (A02-5) MASATOSHI IMADA (A03-9)

Acknowledgment: YOSHIRO NOHARA (Max Planck Institute)

KN-Yoshimoto-Nohara-Imada, J. Phys. Soc. Jpn. 79, 123708 (2010)

[Figure: diagram with vertices z1, z2, z3, z4]

Aim and Background

Treat strong correlation and quantum fluctuations from first principles, and predict new phases and functions of correlated materials

Ab initio construction of an effective model describing low-energy properties

Analysis of the derived model, treating strong correlation and quantum fluctuations with high accuracy

LDA+dynamical mean-field theory: V. I. Anisimov et al., J. Phys. Cond. Mat. 9, 767 (1997)

LDA+path-integral renormalization group: Y. Imai, I. V. Solovyev, M. Imada, PRL 95, 176405 (2005)

Feasibility Studies (2006-present)

(1) Iron-based superconductors:
- KN-Arita-Imada, JPSJ 77, 093711 (2008)
- Miyake-KN-Arita-Imada, JPSJ 79, 044705 (2010)
- Misawa-KN-Imada, JPSJ 80, 023704 (2011)
- KN-Yoshimoto-Nohara-Imada, JPSJ 79, 123708 (2010)

(2) Organic compounds:
- KN-Yoshimoto-Kosugi-Arita-Imada, JPSJ 78, 083710 (2009)
- Shinaoka-Misawa-KN-Imada, in preparation

(3) Alkali-cluster-in-zeolite systems:
- KN-Koretsune-Arita, PRB 80, 043941 (2009)

(4) Transition metal and its oxides:
- KN-Arita-Yoshimoto-Tsuneyuki, PRB 74, 235113 (2006)
- Miyake-Aryasetiawan-Imada, PRB 80, 155134 (2009)

(5) Excited states of semiconductors:
- KN-Yoshimoto-Arita-Tsuneyuki-Imada, PRB 77, 195126 (2008)

(6) Review:
- Imada-Miyake, JPSJ 79, 112001 (2010)

Low-energy Hamiltonian

1) Basis function: maximally localized Wannier functions (Marzari-Vanderbilt 1997, Souza-Marzari-Vanderbilt 2002)

2) Transfer integral: matrix elements of the DFT Kohn-Sham Hamiltonian

3) Screened Coulomb, screened exchange: constrained RPA. Original idea: Aryasetiawan et al. 2004, Solovyev-Imada 2005. Practical details: KN-Arita-Imada, JPSJ 77, 093711 (2008)

[Figure: bands divided into occupied (O), target (T), and virtual (V) groups around the Fermi level Ef; the RPA polarizability is split into four contributions labeled (1)-(4)]
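The constraint in step 3) can be illustrated with a minimal sketch. It assumes a single k-point, integer occupations, and a static toy polarization; the function name `toy_polarization`, the band energies, and the unit matrix elements are all hypothetical, and the prefactor is only a convention chosen for the example. The only point is which band pairs are kept: transitions with both bands in the target set are skipped, everything else is summed as in full RPA.

```python
import numpy as np

def toy_polarization(energies, occupations, m, target=()):
    """Static toy polarization summed over occupied -> empty band pairs.

    Pairs with both bands in `target` are skipped (the cRPA constraint).
    An empty `target` gives the unconstrained (full RPA) sum.
    """
    target = set(target)
    chi = 0.0
    for i, (e_i, f_i) in enumerate(zip(energies, occupations)):
        for j, (e_j, f_j) in enumerate(zip(energies, occupations)):
            weight = f_i * (1.0 - f_j)       # nonzero only for occupied -> empty
            if i == j or weight == 0.0:
                continue
            if i in target and j in target:
                continue                     # target -> target: left to the model solver
            chi += 2.0 * weight * abs(m[i, j]) ** 2 / (e_i - e_j)
    return chi

# Hypothetical four-band example: bands 1 and 2 straddle the Fermi level.
energies = np.array([-5.0, -0.5, 0.5, 6.0])
occupations = np.array([1.0, 1.0, 0.0, 0.0])
m = np.ones((4, 4))

chi_full = toy_polarization(energies, occupations, m)
chi_crpa = toy_polarization(energies, occupations, m, target={1, 2})
```

In this toy case the two results differ only by the single target-to-target transition between bands 1 and 2, which is the low-energy screening channel that the effective model is meant to handle itself.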

LaFeAsO: constrained RPA

[Figure: interaction (eV) versus r (Angstrom) for LaFeAsO; curves: bare, constrained RPA, full RPA; guide curves 1/r and 1/(6.7r)]

KN-Arita-Imada, JPSJ 77, 093711 (2008)

For LaFeAsO, the cRPA interaction is a 3D interaction with a long-range tail that decays as a power law.

What's the problem? We derive ab initio parameters for a 3D model, but we solve a 2D model at the analysis stage.

Treating strong quantum-fluctuation effects with high accuracy is considerably difficult for the 3D model.

We have a serious problem of "dimensional inconsistency": LaFeAsO is a quasi-2D system, yet the derived model is 3D while the analyzed model is 2D.

[Figure: LaFeAsO structure with alternating FeAs and LaO layers]

Reducing 3D to 2D

[Figure: 3D model reduced to a 2D model]

KEY IDEA: renormalize the spatial dimension ("Dimensional Downfolding")

We extend the cRPA idea to the real-space degrees of freedom.

[Figure: the interlayer interaction is deleted, and its effect (interlayer screening) is folded into a renormalized intralayer interaction]

Computational details:

[Equations for steps 1-6 are not preserved in the transcript]

LaFeAsO: Band & Wannier

[Figure: band structure and Wannier functions (xy, yz, z2, x2-y2, zx) of LaFeAsO, with the FeAs and LaO layers marked; t ~ 300 meV, t ~ 10 meV]

Typical quasi-2D system, a good target for the present study

LaFeAsO: 2D downfolded

[Figure: interaction (eV) versus r (Angstrom); curves: bare, 3D-cRPA, 2D-cRPA, full RPA]

Summary

We developed a new ab initio downfolding scheme for deriving effective low-energy models in low dimensions.

It justifies 2D short-ranged Hubbard models as effective models from first principles.

Nakamura-Yoshimoto-Nohara-Imada, J. Phys. Soc. Jpn. 79, 123708 (2010)

[Figure: two panels, interaction (eV) versus r (Angstrom)]

Performance Report for the Massively Parallel Project: Constrained-RPA Code

KAZUMA NAKAMURA (A03-9) YOSHIHIDE YOSHIMOTO (A02-5)

Acknowledgment: YOSHIRO NOHARA (Max Planck Institute) YUICHIRO MATSUSHITA (OSHIYAMA Lab) HIROAKI ISHIZUKA (MOTOME Lab)

Computational cost

Factors: Nk x Nb x Nb x NPW x Nk

cost $\propto (N_k)^2 (N_b)^2 N_{PW}$

$\dfrac{(N_k)^2 (N_b)^2 N_{PW}}{N_k N_b N_{PW} \cdot O(10)} = \dfrac{N_k N_b}{O(10)} = 10{,}000$ (if $N_k = 100$, $N_b = 1{,}000$)

We need to develop a "distributed-memory RPA code"

Need: distributed-memory code

Memory size of ~400 GB with Nband = 2000, Nk = 125, NPW = 100,000

The data cannot be stored on a single node
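The quoted memory size can be checked with a short estimate. This sketch assumes that the stored data are the plane-wave coefficients of all bands at all k-points, held as double-precision complex numbers (16 bytes each); that reading is an assumption, not something stated on the slide, and the helper name is hypothetical.

```python
def wavefunction_memory_bytes(n_band, n_k, n_pw, bytes_per_coeff=16):
    """Bytes needed for n_band x n_k x n_pw complex coefficients.

    bytes_per_coeff=16 assumes double-precision complex storage (assumption).
    """
    return n_band * n_k * n_pw * bytes_per_coeff

total_bytes = wavefunction_memory_bytes(n_band=2000, n_k=125, n_pw=100_000)
total_gb = total_bytes / 1e9            # 400.0 GB, matching the quoted size
per_process_gb = total_gb / 128         # share per process if split evenly over 128 processes
```

Under this assumption the slide's figure is reproduced exactly, and an even split over 128 processes would leave about 3 GB per process.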

EtMe3Sb[Pd(dmit)2]2

Massive parallelization I

[Figure: Steps 1-4. The occupied (occ) and unoccupied (unocc) data are divided into blocks 1-10 (division of data: data split, data sent to the MPI processes), each process calculates with its blocks (Calc), and the partial results are summed, 1 + 2 + ... + 10]

Proposed by YOSHIHIDE YOSHIMOTO

Massive parallelization II

[Figure: Steps 6-9. The blocks 1-10 held by the MPI processes are rotated cyclically with MPI_SENDRECV (1 2 ... 10, then 10 1 2 ... 9, then 9 10 1 ... 8); after each data rotation every process calculates with the block it currently holds (Calc), and the partial results are summed, 1 + 2 + ... + 10]
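The data rotation can be mimicked in a single process to see why every occupied/unoccupied block pair is visited exactly once. This is a minimal sketch, not the authors' code: the MPI_SENDRECV exchange is replaced by a cyclic shift of a Python list, the per-pair work is a placeholder product sum, and the names `ring_pair_sum`, `occ_blocks`, and `unocc_blocks` are hypothetical.

```python
import numpy as np

def ring_pair_sum(occ_blocks, unocc_blocks):
    """Sum occ*unocc over all pairs, with blocks spread over simulated ranks."""
    n_rank = len(occ_blocks)
    held = list(unocc_blocks)             # unocc block currently held by each rank
    partial = np.zeros(n_rank)
    for _ in range(n_rank):
        for rank in range(n_rank):        # "Calc": each rank uses local data only
            partial[rank] += np.sum(np.outer(occ_blocks[rank], held[rank]))
        held = held[-1:] + held[:-1]      # rotation: 1 2 ... 10 -> 10 1 2 ... 9
    return partial.sum()                  # final sum over ranks (1 + 2 + ... + 10)

# Hypothetical data: 20 "occupied" and 30 "unoccupied" values over 10 ranks.
occ_all = np.arange(1.0, 21.0)
unocc_all = np.arange(21.0, 51.0)
occ_blocks = np.array_split(occ_all, 10)
unocc_blocks = np.array_split(unocc_all, 10)

ring_result = ring_pair_sum(occ_blocks, unocc_blocks)
direct_result = occ_all.sum() * unocc_all.sum()
```

After as many rotations as there are ranks, each rank has seen every unoccupied block once, so no process ever needs to hold the full data set.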

core = 128
- 4 comm (q-point groups)
- 8 MPI x 4 OMP per comm

[Figure: MPI_COMM_SPLIT divides the processes into four communicators that handle q1, q2, q3, and q4]
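The slide names MPI_COMM_SPLIT but not how the color and key are chosen. A minimal sketch of one possible layout follows, assuming that consecutive world ranks are grouped into the same communicator; the function `split_layout` is hypothetical and runs without MPI, it only computes the two integers that would be passed to the split call.

```python
def split_layout(world_rank, mpi_per_comm=8):
    """Return (color, key) for a contiguous-block communicator split.

    color selects the q-point group, key orders ranks inside the group.
    Grouping consecutive ranks together is an assumption of this sketch.
    """
    color = world_rank // mpi_per_comm
    key = world_rank % mpi_per_comm
    return color, key

# 4 comm x 8 MPI = 32 MPI processes (each running 4 OpenMP threads -> 128 cores).
layout = [split_layout(rank) for rank in range(32)]
groups = {}
for rank, (color, key) in enumerate(layout):
    groups.setdefault(color, []).append(rank)
```

With mpi4py, for example, the pair would be used as `comm.Split(color, key)`, after which each sub-communicator works on its own subset of q-points.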

Performance of our code: Benchmark for small system: SrVO3

SrVO3@kashiwa, 2q (n = MPI x OMP)

n          time (s)   parallel fraction (%)   efficiency (%)   speedup
1          341.4      -                       -                -
4 (1x4)    89.9       98.2                    94.9             3.8
8 (2x4)    49.5       97.7                    86.2             6.9
12 (3x4)   33.8       98.3                    84.1             10.1
16 (4x4)   27.3       98.1                    78.2             12.5

SrVO3@kashiwa, 20q (n = COMM x MPI x OMP)

n               time (s)   parallel fraction (%)   efficiency (%)   speedup
1               3590       -                       -                -
8 (1x2x4)       492        98.6                    91.3             7.3
32 (4x2x4)      126        99.6                    89.3             28.6
80 (10x2x4)     51         99.8                    87.6             70.1
160 (20x2x4)    26         99.9                    85.8             137.3
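The three unlabeled columns after the timing are consistent with the Amdahl parallel fraction (%), the parallel efficiency (%), and the speedup. That reading is an inference from the numbers, not a label taken from the slide. The following sketch recomputes them for the 4-core SrVO3 row; the function name is hypothetical.

```python
def scaling_metrics(t_serial, t_parallel, n_cores):
    """Speedup, efficiency, and Amdahl parallel fraction from two timings."""
    speedup = t_serial / t_parallel
    efficiency = speedup / n_cores
    # Amdahl's law: 1/speedup = (1 - p) + p / n  ->  solve for p
    parallel_fraction = (1.0 - 1.0 / speedup) / (1.0 - 1.0 / n_cores)
    return speedup, efficiency, parallel_fraction

speedup, efficiency, fraction = scaling_metrics(341.4, 89.9, 4)
row = (round(100 * fraction, 1), round(100 * efficiency, 1), round(speedup, 1))
```

For this row the sketch gives (98.2, 94.9, 3.8), which matches the transcript. Rows quoted against a multi-core baseline, such as the C60 32q table, would need the baseline core count folded in.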

Test run on 2011/1/14: 2048-core calculation

Performance of our code: Benchmark for large system: C60

C60@kashiwa, 1q (n = MPI x OMP)

n           time (s)   parallel fraction (%)   efficiency (%)   speedup
1           15639.1    -                       -                -
4 (1x4)     4077.0     98.6                    95.9             3.8
8 (2x4)     2108.3     98.9                    92.7             7.4
16 (4x4)    1071.3     99.4                    91.2             14.6
32 (8x4)    542.6      99.6                    90.1             28.8
64 (16x4)   297.9      99.7                    82.0             52.5

C60@kashiwa, 32q (n = COMM x MPI x OMP)

n                 time (s)   parallel fraction (%)   efficiency (%)   speedup
64 (1x16x4)       9202.83    -                       -                -
128 (2x16x4)      4657.06    99.98                   98.81            126.72
256 (4x16x4)      2352.64    99.99                   97.79            250.24
512 (8x16x4)      1166.69    100.00                  98.60            504.96
1024 (16x16x4)    589.33     100.00                  97.62            999.04
2048 (32x16x4)    305.81     100.00                  94.04            1925.76

Production run on 2011/2/11: 4096-core calculation

Constrained RPA for dmit

Conditions of the production run:
- Nk = 75 (5 x 5 x 3)
- Nband = 1000 (Nocc = 464, Npocc = 4, Nvir = 532)
- Ecut (wave function) = 36 Ry (100,000 PWs)
- Ecut (polarization) = 4.0 Ry (3,200 PWs)

Architecture and performance:
- SGI Altix ICE 8400EX system
- X5570 (4 cores) x 2
- ifort 11.1, SGI-oriented MPI, InfiniBand
- 4096 cores (4 comm x 128 MPI x 8 OMP)
- Total time = 43 h 19 min

Dielectric function: dmit and κ-bedt

[Figure: macroscopic dielectric function εM(q+G) versus |q+G| (a.u.), one panel each for dmit and κ-bedt]

dmit:
- 4096 cores
- 43 h 19 min
- kashiwa

κ-bedt:
- 128 cores
- 384 h (16 days)
- SR11000@ITC
- 1/6 of dmit

Convergence:

[Figure: εM(q+G) versus |q+G| (a.u.) for dmit and κ-bedt, with curves labeled 12.5 eV and 20.0 eV; energy axis in eV]

3D-cRPA Interaction: dmit and κ-bedt

[Figure: interaction (eV) versus r (Angstrom), one panel each for dmit and κ-bedt; curves: bare, 3D-cRPA]

Unfortunately, the dmit calculation has yet to converge.

APPENDIX


Computational data:

κ-Cu(SCN)2: Band & Wannier

[Figure: crystal geometry, band structure, and Wannier function of κ-Cu(SCN)2; t ~ 65 meV, t ~ 0.1 meV]

κ-Cu(SCN)2: 2D downfolded

[Figure: interaction (eV) versus r (Angstrom); curves: bare, 3D-cRPA, 2D-cRPA, full RPA]

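Screening length

[Figure: interaction (eV) versus r (Å) for κ-Cu(SCN)2 and LaFeAsO; the panels are marked "zero at 8.4 Å" and "zero at 16.4 Å", next to the lattice constants c = 8.4 Å and c = 16.4 Å]

Thus, the screening length of the interlayer screening corresponds to the lattice constant c.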

Feynman Diagram for Screened interaction

[Figure: diagram with vertices z1, z2, z3, z4]

The Coulomb interaction between electrons at z1 and z4 is screened by the RPA polarization at (z2, z3).

Interlayer screening

[Figure: diagram with z1 and z4 in the target layer and z2 and z3 in another layer]

The electrons at z1 and z4 are in the target layer, while the screening electrons at z2 and z3 are in another layer.

Other types of interlayer screening:

Computational details:

(0) The steps below are applied after cRPA

(1) Target-band RPA

(2) Fourier transform of the polarization (wave vector in BZ; reciprocal lattice vector; in-plane and out-of-plane components)

[Figure: diagrams for Layer 1, Layer 2 = target, and Layer 3, showing the possible placements of the polarization vertices z2 and z3 relative to z1 and z4]

(3) Polarization cutting: we have to cut this polarization to avoid double counting it (CUT: set to 0)

(4) Inverse FT of the cut polarization

(5) 2D dielectric function (g, g': reciprocal lattice vectors of the superlattice)

(6) 2D screened Coulomb

(7) 2D screened exchange
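The effect of the polarization cut in step (3) can be seen in a toy layer model. The sketch below is not the authors' implementation: it assumes one site per layer, a made-up bare interaction 1/(1 + |z - z'|), and a layer-local polarization of -0.5, and it writes the screened interaction in matrix form as W = (1 - v chi)^(-1) v. All names and numbers are hypothetical.

```python
import numpy as np

def screened_interaction(v, chi):
    """W = (1 - v chi)^(-1) v for matrices indexed by layer."""
    return np.linalg.solve(np.eye(len(v)) - v @ chi, v)

n_layer, target = 5, 2
z = np.arange(n_layer)
v = 1.0 / (1.0 + np.abs(z[:, None] - z[None, :]))   # toy bare interaction
chi = -0.5 * np.eye(n_layer)                         # toy layer-local polarization

chi_cut = chi.copy()
chi_cut[target, target] = 0.0    # cut: polarization with both vertices in the target layer

bare = v[target, target]
w_2d = screened_interaction(v, chi_cut)[target, target]   # screened by the other layers only
w_full = screened_interaction(v, chi)[target, target]     # screened by all layers
```

In this toy model the interaction inside the target layer satisfies bare > w_2d > w_full: the other layers reduce the bare value, while the target layer's own screening is left for the 2D model solver.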

LaFeAsO: cRPA (previous slide)

[Figure: interaction (eV) versus r (Angstrom); curves: bare, 3D-cRPA, full RPA]

Program structure: Polarization

do q = 1, Nk
  do pair = 1, Npair
    do k = 1, Nk
      call FFT module to calculate …
    enddo
    call TETRAHEDRON module to calculate …
    do G = 1, NPW
      do G' = 1, NPW
        do … = 1, N…
          do k = 1, Nk
          enddo
        enddo
      enddo
    enddo
  enddo
enddo
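The nested accumulation over band pairs, plane waves G and G', and k-points can be sketched for a single q-point as follows. This is an illustration under stated assumptions, not the program itself: the matrix elements `m[pair, k, G]` and the k-space weights `w[pair, k]` are random placeholders standing in for the outputs of the FFT and tetrahedron modules, and the innermost loop whose index is lost in the transcript is omitted.

```python
import numpy as np

def polarization_loops(m, w):
    """chi[G, G'] = sum over pairs and k of w * m[G] * conj(m[G']), explicit loops."""
    n_pair, n_k, n_pw = m.shape
    chi = np.zeros((n_pw, n_pw), dtype=complex)
    for pair in range(n_pair):
        for g1 in range(n_pw):
            for g2 in range(n_pw):
                for k in range(n_k):
                    chi[g1, g2] += w[pair, k] * m[pair, k, g1] * np.conj(m[pair, k, g2])
    return chi

def polarization_einsum(m, w):
    """Same contraction written as a single einsum call."""
    return np.einsum("pk,pkg,pkh->gh", w, m, np.conj(m))

# Placeholder inputs: 3 band pairs, 4 k-points, 5 plane waves.
rng = np.random.default_rng(0)
m = rng.normal(size=(3, 4, 5)) + 1j * rng.normal(size=(3, 4, 5))
w = rng.normal(size=(3, 4))

chi_loops = polarization_loops(m, w)
chi_fast = polarization_einsum(m, w)
```

Both versions give the same matrix; the explicit loops mirror the program structure, while the contraction form makes the scaling with the numbers of pairs, k-points, and plane waves easier to read off.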