"The BG collaboration, Past, Present, Future. The new available resources". Prof. Dr. Paul Whitford...


The Blue Gene Collaboration: Past, Present and Future

April 17th, 2015

Paul Whitford ([email protected])
Liaison to USP for HPC, Rice University
Assistant Professor of Physics, Northeastern University

usp.rice.edu

Discussion Points
•  Blue Gene/P specifications
•  Considerations for running and optimizing calculations
•  Status of operations
•  Additional considerations and resources
•  Newest resource: BG/Q!


IBM Blue Gene/P at Rice University

•  Ranked #377 on the Top500 supercomputer list of June 2012

•  84 teraflops of performance

•  24,576 cores


USP-Rice Blue Gene Supercomputer

•  Low-power processors, but massively parallel
   –  PowerPC 450 processors: 850 MHz
   –  High-bandwidth and low-latency connections

•  Was ranked in the top 500 supercomputers in the world

•  30% dedicated to USP

6 racks, 192 node cards, 6,144 compute cards, 24,576 cores

Technical Consideration 1: Distribution of Calculations

Each chip is a 4-way symmetric multiprocessor (SMP) with 4 GB of memory, roughly equivalent to one 3-GHz Intel core.

Each node can be run in:
•  SMP mode (shared memory, threading): one MPI process with the full memory
•  DUAL (dual node) mode: 2 MPI processes with 2 threads each, 1/2 memory per MPI process
•  VN (virtual node) mode: 4 independent MPI processes, 1/4 memory per MPI process; computation and communication loads are shared

Jobs are allocated in sets of 4 node-card units (512 cores), so less-scalable applications should go elsewhere.

General suggestion: for high I/O use VN; for heavy computation use SMP (use whatever is fastest).
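The mode choice is easiest to see empirically. Below is a minimal illustrative MPI program (a sketch added for this write-up, not from the slides; the file name and compiler wrapper are placeholders that vary by installation) that reports which node each rank lands on: in VN mode you should see four ranks per node name, in DUAL two, and in SMP one.

```c
/* mode_probe.c: report which node each MPI rank runs on.
 * Illustrative sketch; compile with the site's MPI wrapper,
 * e.g. "mpixlc mode_probe.c -o mode_probe" on the frontend. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, len;
    char node[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(node, &len);

    /* In VN mode, 4 ranks share each node (and its 4 GB);
     * in DUAL, 2 ranks; in SMP, 1 rank with the full memory. */
    printf("rank %d of %d on node %s\n", rank, size, node);

    MPI_Finalize();
    return 0;
}
```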

Technical Consideration 2: Cross Compilation

•  Frontend: SUSE Linux

•  Backend: ppc450 compute cores
   –  Minimal OS, lacking many system calls

•  Code is compiled on the frontend and executed on the backend.

•  BG/Q also requires cross compilation

•  Can be very difficult for users new to BG
   –  There is support staff to help with compiling
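As a concrete illustration of the "minimal OS" point (a sketch added here, not from the slides): a binary can compile cleanly on the frontend yet fail at runtime on the backend if it relies on calls the compute-node kernel omits, such as fork()-based process spawning.

```c
/* syscall_probe.c: check whether the runtime environment can spawn
 * a child process. system() relies on fork(), which minimal
 * compute-node kernels typically do not provide. */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* On a full Linux frontend this runs "hostname"; on a minimal
     * backend OS it will typically fail and return -1. */
    if (system("hostname") == -1)
        printf("system() failed: child processes are unavailable here\n");
    return 0;
}
```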

Technical Consideration 3: Scalability

•  You should always start with scalability tests.
   –  They frequently reveal computational bottlenecks.

•  User feedback is incredibly valuable!

•  Please share your performance stats with us
   –  They can help users apply for time on other resources (e.g. XSEDE, INCITE, PRACE)
   –  They will facilitate more efficient calculations
   –  They will help us optimize builds

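A scaling test can be as simple as timing a fixed-size run at several core counts and comparing: speedup(N) = T(smallest count) / T(N), and parallel efficiency = speedup(N) / N. The sketch below (illustrative, with a hypothetical stand-in workload; not from the slides) shows the basic pattern using MPI_Wtime:

```c
/* scaling_probe.c: time a fixed-size workload; run at several rank
 * counts and compare wall times to measure strong scaling. */
#include <mpi.h>
#include <stdio.h>

/* Hypothetical stand-in for the real application workload. */
static double do_work(long n, int rank, int size)
{
    double s = 0.0;
    long i;
    for (i = rank; i < n; i += size)  /* split iterations across ranks */
        s += 1.0 / (double)(i + 1);
    return s;
}

int main(int argc, char **argv)
{
    int rank, size;
    double t0, t1, local, total;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    MPI_Barrier(MPI_COMM_WORLD);      /* start everyone together */
    t0 = MPI_Wtime();
    local = do_work(100000000L, rank, size);
    MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("%d ranks: %.3f s (checksum %f)\n", size, t1 - t0, total);

    MPI_Finalize();
    return 0;
}
```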

Examples of scaling data
•  100k-atom explicit-solvent simulation using NAMD on BG/P (performed by Filipe Lima)
•  200k–2.5M-atom simulations using an implicit-solvent model in Gromacs on Stampede

Many considerations impact scalability

•  Is your specific calculation well suited for parallelization?
   –  For example, molecular dynamics simulations of large systems tend to scale to higher numbers of cores (weak vs. strong scaling)

•  Have you enabled all appropriate flags at runtime?
   –  Software packages often guess… you need to test various settings
   –  For example, in MD simulations, you must tell the code how to distribute atoms across nodes

•  Is MPI, OpenMP, or hybrid MPI+OpenMP enabled in the code? (see the sketch after this list)
   –  MPI launches many processes, each with its own memory and instructions
   –  OpenMP launches threads that share memory
   –  Hybrid: each MPI process has its own memory, which is then shared among its own OpenMP threads

•  Are you trying the most appropriate combination of calculation, parallelization and flags for the machine you are using?

•  Does your calculation take long enough to complete that scalability is noticeable?
   –  Sometimes, initializing parallelization may take minutes, or longer.

•  Is the performance reproducible?

•  How well is the code written?

With so many factors impacting performance, it is essential that you perform scaling tests before doing production calculations.
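To make the three models concrete, here is a minimal hybrid MPI+OpenMP sketch (illustrative, added for this write-up; compiler flags vary, e.g. "mpicc -fopenmp" with GCC-based wrappers): each MPI rank owns its own address space, and the OpenMP threads it spawns share that space.

```c
/* hybrid_hello.c: each MPI process has private memory; its OpenMP
 * threads share that process's memory. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided, rank;

    /* Request threaded MPI; FUNNELED means only the main thread
     * makes MPI calls. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    #pragma omp parallel
    {
        /* Threads within one rank share the rank's address space. */
        printf("rank %d, thread %d of %d\n",
               rank, omp_get_thread_num(), omp_get_num_threads());
    }

    MPI_Finalize();
    return 0;
}
```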

For when the code needs improvement: high-performance tuning with HPCToolkit

HPCToolkit functionality
•  Accurate performance measurement
•  Effective performance analysis
•  Pinpointing scalability bottlenecks
•  Scalability bottlenecks on large-scale parallel systems
•  Scaling on multicore processors
•  Assessing process variability
•  Understanding temporal behavior

HPCToolkit was developed at Rice (Mellor-Crummey) and is available on the BG/P.

Experts are available to help you profile your code, even if you think it works well!

Software on BG/P

Programs: Quantum ESPRESSO, ABINIT, CP-PAW, DL_POLY, FHI-aims, Gromacs, Repast, SIESTA, Yambo, LAMMPS, GAMESS, mpiBLAST, NAMD, NWChem, SPECFEM3D-GLOBE, VASP, WRF, CP2K

Installed libraries: boost, fftw, lapack, scalapack, essl, netcdf, zlib, szip, HDF5, PnetCDF, Jasper, Java, libpng, tcl


Current Activities

•  First phase (May–August 2012, pre-contract)
   –  Material Science and Quantum Chemistry
      •  15 users from 5 research groups
      •  Installed 10 packages for QC/MS calculations so far
   –  Weekly teleconference meetings with Rice-HPC and USP users and faculty

•  Second phase (August–December 2012)
   –  USP-Rice contract went into effect
   –  Assessed the demands of additional scientific communities
   –  Collaboration webpage established (usp.rice.edu)
   –  First HPC Workshop at USP: December 2012


Current Activities

•  Continued growth (2013–2014)
   –  Hired additional technical staff: Xiaoqin Huang
   –  Regular teleconference meetings

•  Open for users
   –  Opened the DAVinCI resource for testing
   –  2WHPC workshop – May 2014

•  Current/ongoing activities
   –  BG/Q went online (March 2015)
   –  Possible competitive allocation process
      •  Scaling data important!
   –  Student exchanges
      •  To Rice and to USP
   –  Building scientific collaborations


Utilization of BG/P by USP users: >100 users from USP, USP São Carlos, USP Ribeirão Preto and FMUSP, across many departments

The newest resource: BG/Q

•  1-rack Blue Gene/Q

•  Already-installed software
   –  Applications: ABINIT, CP2K, Quantum ESPRESSO, FHI-aims, GAMESS, Gromacs, LAMMPS, mpiBLAST, NAMD, OpenFOAM, Repast, SIESTA, VASP, WPS, WRF
   –  Libraries: BLACS, BLAS, cmake, fftw, flex, HDF5, Jasper, lapack, libint, libpng, libxc, PnetCDF, scalapack, szip, tcl, zlib

•  Qualifies for Top500 ranking (~293)

•  Accepting USP users
   –  Get a BG/P account
   –  Email a request to LCCA for BG/Q account activation
   –  Tech support available
   –  Details will be provided at usp.rice.edu

Differences between BG/P and BG/Q

Attribute                  | BG/P                      | BG/Q
Maximum performance        | 84 TFLOPS                 | 209 TFLOPS
Power efficiency           | 371 MFLOPS/W              | 2.1 GFLOPS/W
Total number of cores      | 24,576                    | 16,384
Number of nodes            | 6,144                     | 1,024
Cores per node             | 4                         | 16 + 1 spare + 1 for O/S
Memory per node            | 4 GB                      | 16 GB
Possible threads per core  | 1                         | 4
CPU clock speed            | 850 MHz                   | 1.6 GHz
Minimum nodes/job          | 128 (512 physical cores)  | 32 (512 physical cores)
Scratch disk space         | 245 TB                    | 120 TB

BG/Q has 4x the cores and memory per node. Multi-level parallelization can lead to large performance increases with BG/Q, and it enables larger-memory applications.


Additional Resource: DAVinCI

•  25 TeraFLOPS x86 IBM iDataPlex
•  ~20M CPU-hours/year
•  2.8 GHz Intel Westmere (hex-core)
•  192 nodes (>2,300 cores & >4,600 threads)
•  192 MPI nodes
•  16 nodes with 32 NVIDIA Fermi GPGPUs
•  9.2 TB memory (48 GB per node)
•  ~450 TB storage
•  4x QDR InfiniBand interconnect (compute/storage)
•  Fully operational: 7/2011
•  Available to USP users on a testing basis

Additional Resource: PowerOmics

•  10 TeraFLOPS IBM POWER8 S822
•  3.0 GHz POWER8 12-core processors
•  8 nodes (192 cores, 1,536 threads)
•  3.5 TiB memory (mix of 256 GiB and 1 TiB nodes)
•  361 TiB storage
•  FDR InfiniBand interconnect
•  To be commissioned in 2015
•  For applications with very high memory demands
•  If interested in testing, contact [email protected]

XSEDE (national major supercomputing resource, formerly known as TeraGrid)

•  Extreme Science and Engineering Discovery Environment
•  The name changed, but the song (and mission) remains the same
•  Rice Campus Champions:
   –  Roger Moye
   –  Qiyou Jiang
   –  Jan Odegard
•  If you need MILLIONS of CPU hours, XSEDE is for you
•  The Campus Champions are your advocates for getting XSEDE accounts and CPU time, so let us help you with your biggest science
•  XSEDE requires a US researcher to be part of the team, but many people would be enthusiastic about collaborations
•  In Europe, there is PRACE, which provides similar opportunities
•  Worldwide, there are many internationally competitive resources


Getting Help, Organizing Collaborative Efforts

•  Top-caliber support and service for research computing

•  A thriving community of HPC scientists sharing research and building collaborations, with a significant presence in national cyberinfrastructure issues


USP-Rice Blue Gene Collaboration Webpage

A "one-stop shop" with information on:
•  Getting accounts
•  Logging on
•  Compiling code
•  Running jobs
•  Getting help
•  Updates and activities
•  Transferring data
•  Acknowledgements

•  Will be updated soon with BG/Q information

For more information

•  To request access to BG/P and BG/Q, send a request to LCCA ([email protected]). Include:
   –  Your name
   –  Email
   –  Research group, department, PI
   –  Research area
   –  Code to be run
   –  Scalability information (if available)

•  For additional questions, email me:
   –  Paul Whitford ([email protected])


Any questions?

Feel free to contact any of us. We are here to help!

LCCA: [email protected]
Paul Whitford: [email protected]
Xiaoqin Huang: [email protected]
Franco Bladilo: [email protected]