How to Deploy OpenStack on Tianhe-2 Supercomputer

Yusong Tan, Associate Professor
National University of Defense Technology
November 5, 2013


OpenStack Summit HK 2013

Contents

- Background
- Deployment
- Optimization
- Preliminary Evaluation
- Contributions to Community
- Cases
- Next Steps


Background

- TH-2 Supercomputer
  - Sponsored by the 863 High-Tech Program of the Ministry of Science and Technology of China, the Government of Guangdong Province, and the Government of Guangzhou City
  - Built by the National University of Defense Technology


Background

- No. 1 on the 41st TOP500 list (June 2013)
- Rpeak 54.9 PFlops, Rmax 33.9 PFlops


Background

- TH-2 Supercomputer
  - Neo-heterogeneous architecture
    - Xeon CPU & Xeon Phi @ x86 ISA

Items          Configuration
Nodes          16000
Processors     32000 Intel Xeon CPUs + 48000 Xeon Phis + 4096 FT CPUs
Interconnect   Proprietary high-speed interconnection network TH-NI
Memory         1 PB in total
Storage        Global shared parallel storage system, 12.4 PB
Cabinets       162 = 125 compute + 13 communication + 24 storage cabinets
Cooling        Closed air cooling system
Performance    Peak performance 54.9 PFlops, max performance 33.9 PFlops


Background

- TH-2 Supercomputer
  - Installed in the National Supercomputer Center in Guangzhou (NSCC-GZ)
    - Open platform for research and education
      - To provide HPC service
    - Public information infrastructure
      - To accelerate the industry and economy

More than HPC!


Background

- TH-2 Supercomputer
  - How to use it as a public information infrastructure: hybrid cloud infrastructure
    - Virtualization-based IaaS cloud computing!
    - Different environments
      - Red Hat, CentOS, Windows, Ubuntu, SUSE…
    - Different scales
      - From several nodes to thousands of nodes


Background

- Hardware for OpenStack deployment
  - 6400 nodes on TH-2
    - 50 cabinets
    - 128 nodes in each cabinet
  - Each node:
    - Intel Xeon Ivy Bridge (12 cores) × 2
    - 96 GB RAM
    - 1 TB local disk
    - GE NIC × 2
    - TH-NI high-speed network (160 Gbps)
    - Intel Xeon Phi × 3

Background

- Software
  - OpenStack Grizzly
  - Ubuntu Server 12.04.2 LTS
  - v0.56.6
  - v2.7.11


Deployment

- Architecture
  - 256 controller nodes
    - 2 cabinets
    - 124 API nodes: nova-api, nova-cells, glance-*, cinder-api, cinder-scheduler, neutron-server, keystone
    - 4 LVS + Keepalived nodes
    - 96 network nodes
    - 32 Ganglia + Nagios server nodes
  - 3840 compute nodes
    - 30 cells, one per cabinet
    - 2 cell servers in each cabinet
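The node counts on the hardware and architecture slides can be cross-checked with a little arithmetic. The sketch below only restates the slide figures; the dictionary keys are labels chosen for illustration, and it does not model whether the 2 cell servers per cabinet are counted among the 128 nodes, since the slides do not say.

```python
# Figures taken from the hardware and architecture slides.
NODES_PER_CABINET = 128
TOTAL_CABINETS = 50

# Controller roles; the key names are illustrative labels only.
controller_roles = {
    "api": 124,
    "lvs_keepalived": 4,
    "network": 96,
    "ganglia_nagios": 32,
}

controller_nodes = sum(controller_roles.values())
controller_cabinets = controller_nodes // NODES_PER_CABINET

compute_cells = 30  # one cell per cabinet
compute_nodes = compute_cells * NODES_PER_CABINET

total_nodes = TOTAL_CABINETS * NODES_PER_CABINET
assigned_nodes = controller_nodes + compute_nodes

print("controller nodes:", controller_nodes, "in", controller_cabinets, "cabinets")
print("compute nodes:", compute_nodes)
print("nodes without a role on these slides:", total_nodes - assigned_nodes)
```

The role counts add up to the stated 256 controller nodes in 2 cabinets, and 30 cabinets of 128 nodes give the stated 3840 compute nodes.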


Deployment

- Network Topology
  - 1 GE for management
  - 1 GE for Ceph storage
    - Higher performance with TH-NI virtualized Ethernet
  - TH-NI virtualized Ethernet for VM communication


Deployment

- Storage: Ceph as "all-in-one"
  - RBD for Glance image storage
  - RBD for volume backend
  - FS for instance storage


Deployment

- Puppet
  - Different manifests for different roles
  - Each node reads its role from the same conf file
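One way to picture "each node reads its role from the same conf file" is a shared file that maps node-ID ranges to roles. This is a minimal sketch under stated assumptions: the file format, the ID ranges, and the role names are hypothetical and not taken from the slides. Only the role sizes (124/4/96/32 controller nodes, 3840 compute nodes) follow the architecture slide.

```python
import configparser

# Hypothetical shared conf file: node-ID ranges mapped to role names.
SHARED_CONF = """
[roles]
0-123 = api
124-127 = lvs
128-223 = network
224-255 = monitor
256-4095 = compute
"""


def role_for(nid, conf_text=SHARED_CONF):
    """Return the role configured for node `nid` in the shared conf text."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    for span, role in parser["roles"].items():
        low, high = (int(part) for part in span.split("-"))
        if low <= nid <= high:
            return role
    raise LookupError("no role configured for node %d" % nid)


def manifest_for(nid):
    """Name of the manifest a node would apply (naming is an assumption)."""
    return role_for(nid) + ".pp"


print(manifest_for(130))
```

Because every node evaluates the same file, changing a range re-roles the affected nodes on their next boot without touching per-node state.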


Deployment

- Implementation
  - Diskless system based on Ubuntu Server 12.04.2 LTS
  - Puppet configures the role of each node at startup
  - Images are pulled over TH-NI to accelerate booting
    - With native TH-NI RDMA driver support
  - Other configurations
    - Partition hard disks at first boot
    - Passwordless SSH
    - Decide IP and hostname dynamically based on the TH-NI nid
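Deriving the IP and hostname from the TH-NI nid means a diskless node needs no per-node configuration. The sketch below shows one possible mapping. The subnet, the hostname pattern, and the rule that the cabinet index is nid // 128 are all assumptions made for illustration; the slides do not describe the actual scheme.

```python
import ipaddress

# Assumptions for illustration only (not from the slides):
# the management subnet, the hostname pattern, and the nid layout.
MGMT_NET = ipaddress.ip_network("10.0.0.0/16")
NODES_PER_CABINET = 128


def identity_from_nid(nid):
    """Map a node ID to a (hostname, ip) pair deterministically."""
    if not 0 <= nid < MGMT_NET.num_addresses - 2:
        raise ValueError("nid %d does not fit in %s" % (nid, MGMT_NET))
    cabinet, slot = divmod(nid, NODES_PER_CABINET)
    hostname = "cab%02d-node%03d" % (cabinet, slot)
    # Offset by one so nid 0 does not receive the network address itself.
    address = MGMT_NET.network_address + nid + 1
    return hostname, str(address)


print(identity_from_nid(130))
```

The mapping is a pure function of the nid, so every boot of the same node yields the same identity.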


Optimization

- More nodes
  - Load balance for *-apis
  - Separated Keystone API service for Neutron


Optimization

- More workers
  - Multi-processed
    - Multi-processed neutron-server
      - api_workers
    - Multi-processed nova-api
      - ec2_workers
      - osapi_compute_workers
      - metadata_workers
    - Multi-processed nova-conductor
      - workers
    - Multi-processed glance-api
      - workers
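The worker options above can be derived from the core count and rendered as config fragments. A sketch, assuming one worker per core (the slides give no values) and assuming the section each option belongs to; section placement differs between releases and should be checked against the documentation for the release in use.

```python
import configparser
import io


def worker_settings(cpu_count):
    """Per-file worker options, one worker per core (an assumption)."""
    n = max(1, cpu_count)
    return {
        "nova.conf": {
            "DEFAULT": {
                "osapi_compute_workers": n,
                "ec2_workers": n,
                "metadata_workers": n,
            },
            # Section placement for nova-conductor is an assumption.
            "conductor": {"workers": n},
        },
        "neutron.conf": {"DEFAULT": {"api_workers": n}},
        "glance-api.conf": {"DEFAULT": {"workers": n}},
    }


def render(sections):
    """Render {section: {option: value}} as INI text."""
    parser = configparser.ConfigParser()
    for section, options in sections.items():
        parser[section] = options
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


# Each node has 2 x 12 cores according to the hardware slide.
for filename, sections in worker_settings(24).items():
    print("# " + filename)
    print(render(sections))
```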


Optimization

- More workers
  - Apache hosting
    - Keystone
      - Supported by its design and implementation


Optimization

- More powerful
  - Nova
    - Eliminate API rate limit
      - api_rate_limit=false
    - Use a large DB pool size
      - sql_dbpool_enable
      - sql_min_pool_size
      - sql_max_pool_size
      - sql_max_overflow


Optimization

- More powerful
  - Neutron
    - Use a larger DB pool size
      - sqlalchemy_pool_size
    - Increase agent down time
      - agent_down_time


Optimization

- More powerful
  - RabbitMQ
    - Set a high memory watermark
      - rabbitmqctl set_vm_memory_high_watermark 0.6
    - Set a high socket limit
      - ulimit -n 102400
    - RabbitMQ cluster
      - Run more rabbitmq-servers as a cluster to distribute the workload
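To put the two RabbitMQ limits in concrete terms: the watermark is a fraction of installed RAM, and the socket limit is bounded by the process file-descriptor limit. The sketch below assumes the broker runs on a standard 96 GB node (the slides do not say which nodes host RabbitMQ) and simply checks the current process limit with the Unix-only `resource` module.

```python
import resource

NODE_RAM_GB = 96          # per-node RAM from the hardware slide
WATERMARK_FRACTION = 0.6  # value given to set_vm_memory_high_watermark
WANTED_FDS = 102400       # value given to ulimit -n


def watermark_gb(ram_gb=NODE_RAM_GB, fraction=WATERMARK_FRACTION):
    """Memory use (GB) at which the high watermark would be reached."""
    return round(ram_gb * fraction, 1)


def fd_limit_ok(wanted=WANTED_FDS):
    """True if this process's soft open-file limit is at least `wanted`."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft == resource.RLIM_INFINITY or soft >= wanted


print("watermark at about", watermark_gb(), "GB")
print("fd limit sufficient:", fd_limit_ok())
```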


Optimization

- More powerful
  - MySQL
    - Use a larger maximum number of connections
      - max_connections=102400
    - Galera MySQL cluster
      - Some API servers are chosen to each run one MySQL master
      - LVS provides load balancing
      - The DB is accessed via a single entry point, a virtual IP


Optimization

- More powerful
  - KVM
    - Huge pages enabled
    - Using vhost-net
      - Provides better latency and greater throughput
    - Using virtio_blk
      - Higher performance of storage devices
    - Kernel Samepage Merging (KSM) enabled
      - Because we want to oversell
    - Kernel I/O scheduler set to "Deadline"
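Several of these host settings can be read back from standard Linux interfaces: `/sys/kernel/mm/ksm/run`, `/sys/block/<dev>/queue/scheduler`, and the `HugePages_Total` line of `/proc/meminfo`. The following is a small read-only check, not the authors' tooling; the device name `sda` is an assumption, and any file that is missing is reported as `None`.

```python
from pathlib import Path


def active_scheduler(text):
    """The scheduler file lists all options; the active one is in brackets."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def hugepages_total(meminfo_text):
    """Extract the HugePages_Total count from /proc/meminfo contents."""
    for line in meminfo_text.splitlines():
        if line.startswith("HugePages_Total:"):
            return int(line.split()[1])
    return 0


def read_or_none(path):
    try:
        return Path(path).read_text()
    except OSError:
        return None


def host_report(device="sda"):  # device name is an assumption
    ksm = read_or_none("/sys/kernel/mm/ksm/run")
    sched = read_or_none("/sys/block/%s/queue/scheduler" % device)
    meminfo = read_or_none("/proc/meminfo")
    return {
        "ksm_running": None if ksm is None else ksm.strip() == "1",
        "io_scheduler": None if sched is None else active_scheduler(sched),
        "hugepages_total": None if meminfo is None else hugepages_total(meminfo),
    }


print(host_report())
```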


Optimization

- Services HA/LB
  - LVS + Keepalived for all API servers
  - Increased rpc_response_timeout
  - Increased quantum_url_timeout


Optimization

- HA for instances


Optimization

- Ceph
  - Inline data support for small file I/O
    - Typical file read/write traffic
      - Client consults the MDS for file metadata
      - Client communicates with OSDs for file data
    - With inline data support, for small files
      - Client consults the MDS for both
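The effect of inline data on small-file reads can be shown with a toy model that counts round trips. This is not Ceph code: the class, the 4096-byte threshold, and the dictionaries standing in for the MDS and OSDs are all hypothetical, chosen only to illustrate the two read paths described above.

```python
INLINE_LIMIT = 4096  # bytes; hypothetical threshold for "small" files


class ToyCluster:
    """In-memory stand-in for an MDS (metadata) and OSDs (data)."""

    def __init__(self):
        self.mds = {}  # path -> {"size": int, "inline": bytes or None}
        self.osd = {}  # path -> bytes

    def write(self, path, data, inline_enabled):
        if inline_enabled and len(data) <= INLINE_LIMIT:
            # Small file: data is stored alongside the metadata.
            self.mds[path] = {"size": len(data), "inline": data}
            self.osd.pop(path, None)
        else:
            self.mds[path] = {"size": len(data), "inline": None}
            self.osd[path] = data

    def read(self, path):
        """Return (data, round_trips)."""
        meta = self.mds[path]          # round trip 1: ask the MDS
        if meta["inline"] is not None:
            return meta["inline"], 1   # data came back with the metadata
        return self.osd[path], 2       # round trip 2: fetch from an OSD


cluster = ToyCluster()
cluster.write("/small", b"x" * 100, inline_enabled=True)
cluster.write("/large", b"x" * 100000, inline_enabled=True)
print(cluster.read("/small")[1], cluster.read("/large")[1])
```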


Optimization

- Ceph
  - Punch hole support
    - Enables sparse file support
    - Improves space efficiency in virtualization scenarios
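Why hole punching saves space for VM images can be seen in a toy block store: punching a hole deallocates fully covered blocks while the logical size stays the same and the range reads back as zeros. The class below is an in-memory illustration with a hypothetical 4096-byte block size, not Ceph's implementation.

```python
BLOCK = 4096  # hypothetical allocation unit


class SparseObject:
    """Toy sparse object: only written blocks consume space."""

    def __init__(self):
        self.blocks = {}  # block index -> bytes of length BLOCK
        self.size = 0     # logical size in bytes

    def _pieces(self, offset, end):
        """Yield (block index, start in block, length) covering [offset, end)."""
        pos = offset
        while pos < end:
            idx, start = divmod(pos, BLOCK)
            length = min(BLOCK - start, end - pos)
            yield idx, start, length
            pos += length

    def write(self, offset, data):
        done = 0
        for idx, start, length in self._pieces(offset, offset + len(data)):
            block = bytearray(self.blocks.get(idx, bytes(BLOCK)))
            block[start:start + length] = data[done:done + length]
            self.blocks[idx] = bytes(block)
            done += length
        self.size = max(self.size, offset + len(data))

    def punch_hole(self, offset, length):
        """Free whole blocks in the range; zero partially covered ones."""
        for idx, start, n in self._pieces(offset, min(offset + length, self.size)):
            if n == BLOCK:
                self.blocks.pop(idx, None)
            elif idx in self.blocks:
                block = bytearray(self.blocks[idx])
                block[start:start + n] = bytes(n)
                self.blocks[idx] = bytes(block)

    def read(self, offset, length):
        out = bytearray()
        for idx, start, n in self._pieces(offset, min(offset + length, self.size)):
            out += self.blocks.get(idx, bytes(BLOCK))[start:start + n]
        return bytes(out)

    def allocated_bytes(self):
        return len(self.blocks) * BLOCK


image = SparseObject()
image.write(0, b"x" * (4 * BLOCK))
image.punch_hole(BLOCK, 2 * BLOCK)
print(image.size, image.allocated_bytes())
```

A guest that deletes files inside its disk image can hand the freed ranges back this way, so the backing store shrinks even though the image keeps its logical size.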


Preliminary Evaluation

- Maximum number of VMs started in one round
  - cirros test image, m1.tiny
  - ~5300
  - The scheduler costs the most time
- Concurrent requests handled by an API server per second
  - ~1100 list-instances operations (number of active instances = 1000)
  - Bottlenecks
    - GE: use TH-NI instead
    - nova-api (if TH-NI is used): use more API servers
      - Time spent on database queries takes less than 40%


Contributions to Community

- For Havana
  - 5 blueprints and 10 bug fixes merged (3976 LoC, up to 2013.10.12)
  - Ranked 1 (BP) / 6 (LoC) / 9 (Commit) among independent developers


Contributions to Community

- Quota
  - per-project-user-quotas-support
    - Administrators can set quotas on a per-project-user basis, rather than just a per-project basis.
  - editable-default-quotas-support
    - Administrators can edit the default quotas for nova or cinder through horizon or the CLI.
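The two quota blueprints suggest a lookup order: a per-user limit within a project, then the project limit, then an editable default. The sketch below illustrates that order only. The class, the method names, and the rule that a user limit may not exceed the project limit are assumptions for illustration, not the merged Nova or Cinder code.

```python
class QuotaTable:
    """Toy quota store with default, per-project, and per-project-user limits."""

    def __init__(self, defaults):
        self.defaults = dict(defaults)  # resource -> limit (editable)
        self.project = {}               # (project, resource) -> limit
        self.user = {}                  # (project, user, resource) -> limit

    def set_default(self, resource, limit):
        self.defaults[resource] = limit

    def set_project(self, project, resource, limit):
        self.project[(project, resource)] = limit

    def set_user(self, project, user, resource, limit):
        # Assumption: a user's limit cannot exceed the project's limit.
        ceiling = self.limit(project, None, resource)
        if limit > ceiling:
            raise ValueError("user limit %d exceeds project limit %d" % (limit, ceiling))
        self.user[(project, user, resource)] = limit

    def limit(self, project, user, resource):
        """Most specific limit wins: user, then project, then default."""
        if user is not None and (project, user, resource) in self.user:
            return self.user[(project, user, resource)]
        if (project, resource) in self.project:
            return self.project[(project, resource)]
        return self.defaults[resource]


quotas = QuotaTable({"instances": 10})
quotas.set_project("egov", "instances", 50)
quotas.set_user("egov", "alice", "instances", 5)
print(quotas.limit("egov", "alice", "instances"), quotas.limit("egov", "bob", "instances"))
```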


Contributions to Community

- Access VMs via port mapping
  - Existing approaches to access VMs
    - Floating IP: limited by the number of available IPs
    - VPN: relatively complex to configure
  - Our solution
    - Unused ports of the public IP can be dynamically mapped to different VMs to supply an access path
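The bookkeeping behind port mapping can be sketched as an allocator that hands out unused ports of one public IP and remembers which VM endpoint each port leads to. The class name, the port range, and the example addresses are hypothetical; the actual packet forwarding is not modeled here and is not described in the slides.

```python
class PortMapper:
    """Toy allocator: unused public ports -> (vm_ip, vm_port) endpoints."""

    def __init__(self, public_ip, first_port=20000, last_port=20999):
        self.public_ip = public_ip
        # Reverse order so that pop() hands out the lowest free port first.
        self.free = list(range(last_port, first_port - 1, -1))
        self.mappings = {}  # public port -> (vm_ip, vm_port)

    def add_mapping(self, vm_ip, vm_port):
        """Reserve an unused public port for a VM endpoint."""
        if not self.free:
            raise RuntimeError("no unused public ports left")
        public_port = self.free.pop()
        self.mappings[public_port] = (vm_ip, vm_port)
        return self.public_ip, public_port

    def remove_mapping(self, public_port):
        """Release a public port so it can be reused."""
        del self.mappings[public_port]
        self.free.append(public_port)

    def resolve(self, public_port):
        """Which VM endpoint does traffic to this public port belong to?"""
        return self.mappings[public_port]


mapper = PortMapper("203.0.113.10")
print(mapper.add_mapping("10.0.0.5", 22))  # SSH to the first VM
print(mapper.add_mapping("10.0.0.6", 22))  # SSH to the second VM
```

Unlike floating IPs, many VMs can share the single public address; the limit becomes the size of the port range rather than the number of available IPs.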


Contributions to Community

- New UI design & feature enhancements for Horizon


Contributions to Community

- For Ceph
  - Two blueprints (inline data support for small file I/O + punch hole support) presented at the first Ceph Developer Summit
  - Both have been pushed upstream
  - 20 commits, 1000+ LoC, including 600+ for the Linux kernel


Contributions to Community

- And More
  - Add multi-level user management to Keystone via project nesting (in development)
  - Multiple-level user quota management (in development)
  - More bugs fixed
  - …


Cases

- E-Gov for Guangzhou City
  - Websites of the Guangzhou government
    - Requirements: Web/Application/Database servers
    - Efforts:
      - Load balance
      - High availability


Cases

- E-Gov for Guangzhou City
  - E-Gov Data Management System
    - Requirements: JBoss/WebLogic/Oracle servers
    - Efforts:
      - Oracle @ Ceph RBD
      - VM optimization for Oracle & Java


Next Steps

For Supercomputer Center
- More performance tests
  - Performance of cell
  - Neutron
  - Glance
- Automatic operation for large-scale system
  - How to operate 6400 nodes
- Upgrade to Havana
- IaaS services
  - More E-Government applications from Guangzhou
  - IaaS resource lease
- XaaS service
  - Database, Hadoop, HA and more…


Next Steps

For Community
- Automatic operation tools for large-scale system
  - Deployment
  - Monitor
  - Metering & Billing
- Elastic resource management
  - QoS-based VM cluster scale-in/out
- Ceph
  - Read-ahead optimization
  - SSD-based writeback cache support


This is our Team!


Q&A