perfSONAR · Measurement Measuring network performance and monitoring network components are a...

perfSONAR John Hicks Internet2 [email protected]

Page 1: perfSONAR · Measurement Measuring network performance and monitoring network components are a critical part of any high-performance network deployed today.

perfSONAR

John Hicks Internet2

[email protected]

Page 2:

Agenda
• Motivation
• What is perfSONAR?
• Suggested Deployment for Campus/Regional

Page 3:

Motivation
• Networks are an essential part of data-intensive science
  – Connect data sources to data analysis
  – Connect collaborators to each other
  – Enable machine-consumable interfaces to data and analysis resources (e.g. portals), automation, scale
• Performance is critical
  – Exponential data growth
  – Constant human factors
  – Technology changes/improvements/paradigm shifts
  – Data movement and data analysis must keep up
• Effective use of wide area (long-haul) networks by scientists has historically been difficult

Page 4:

Measurement

Measuring network performance and monitoring network components are a critical part of any high-performance network deployed today. In-depth network measurement and monitoring services are key components that give researchers and engineers views into application performance and help them troubleshoot network problems.

Page 5:

Network Monitoring
• All networks do some form of monitoring.
• Addresses needs of local staff for understanding state of the network
  o Would this information be useful to external users?
  o Can these tools function on a multi-domain basis?
• Beyond passive methods, there are active tools.
  o E.g. often we want a ‘throughput’ number. Can we automate that idea?
  o Wouldn’t it be nice to get some sort of plot of performance over the course of a day? Week? Year? Multiple endpoints?
• Where is the “Measurement Middleware”? Something to allow for the easy exchange of metrics that are collected locally, on a global scale?

Page 6:

Soft Failures
• Soft failures are where basic connectivity functions, but high performance is not possible.
• TCP was intentionally designed to hide all transmission errors from the user:
  – “As long as the TCPs continue to function properly and the internet system does not become completely partitioned, no transmission errors will affect the users.” (From IEN 129, RFC 761)
• Some soft failures only affect high bandwidth long RTT flows.
• Hard failures are easy to detect & fix
• Soft failures can lie hidden for years!
• Soft failures can be present on the host, protocol, application, or network
• One network problem can often mask others – this is common

Page 7:

Where Are The Problems?
[Diagram: path from Source Campus (S) through Regional and NREN Backbone networks to Destination Campus (D), annotated with the following problem locations]
• Congested or faulty links between domains
• Congested intra-campus links
• Latency dependent problems inside domains with small RTT

Page 8:

Local Testing Will Not Find Everything
[Diagram: Source Campus (S), Regional, R&E Backbone, Regional, Destination Campus (D), with a small-buffer switch on the path]
• Performance is good when RTT is < ~10 ms
• Performance is poor when RTT exceeds ~10 ms
• Switch with small buffers
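The RTT dependence on this slide can be illustrated with the widely used Mathis et al. approximation for steady-state TCP throughput, rate ≤ (MSS / RTT) · (C / √p), where p is the packet loss rate. The sketch below is only an illustration: the MSS, the loss rate, and the RTT values are assumptions chosen for the example, not measurements from this deck.

```python
import math


def mathis_throughput_bps(mss_bytes: float, rtt_s: float, loss_rate: float,
                          c: float = math.sqrt(1.5)) -> float:
    """Approximate upper bound on TCP throughput in bits per second.

    Mathis et al. approximation: rate <= (MSS / RTT) * (C / sqrt(p)).
    """
    if rtt_s <= 0 or not 0 < loss_rate <= 1:
        raise ValueError("rtt_s must be > 0 and loss_rate in (0, 1]")
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))


if __name__ == "__main__":
    mss = 1460    # bytes; assumes a 1500-byte MTU
    loss = 1e-4   # assumed loss rate caused by a small-buffer device
    for rtt_ms in (1, 10, 50, 100):
        bps = mathis_throughput_bps(mss, rtt_ms / 1000, loss)
        print(f"RTT {rtt_ms:>3} ms -> at most ~{bps / 1e6:,.0f} Mbit/s")
```

With the loss rate held fixed, the bound falls in direct proportion to RTT, which is why a test against a nearby host can look healthy while a long-distance transfer through the same device performs poorly.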

Page 9:

Agenda
• Motivation
• What is perfSONAR?
• Suggested Deployment for Campus/Regional

Page 10:

What is perfSONAR?
• perfSONAR is a tool to:
  • Set network performance expectations for a variety of use cases
  • Find network problems (“soft failures”) & help fix these problems
  • Mitigate the risks that are associated with the R&E environment (e.g. get out in front of problems before it’s too late)
• All in multi-domain environments
  • These problems are all harder when multiple networks are involved – need a mechanism to stop ‘finger pointing’ and get real work done
• perfSONAR provides a standard way to publish active and passive monitoring data
  – This data is interesting to network researchers as well as network operators
  – This is the measurement middleware – a way to tie together local and end-to-end measurements
  – A way to separate a network problem from that of an application or host

• 10 – ESnet Science Engagement ([email protected]) - 5/6/14

Page 11:

What is perfSONAR (cont.)

•  perfSONAR is an infrastructure for network performance monitoring.

•  It is a services oriented architecture delivering performance measurements in a federated environment.

•  It is an intermediate layer between the performance measurement tools and the diagnostic or visualization applications.

•  A methodology for monitoring network connections that span multiple administrative domains.

•  Partners include: GEANT2, ESNET, I2, RNP

•  http://www.perfsonar.net/

Page 12:

perfSONAR Present

Page 13:

Lookup Service Directory Search: http://stats.es.net/ServicesDirectory/

Page 14:

Lookup Service

•  Services register their existence and capabilities with a LS.

•  Clients discover services by querying the LS.

•  LS are found by multicast, well-known servers, local configuration, or other LSs.

•  The LS are queried on attributes (service type, authentication) and more complex constructs (network location), not simply name-based.
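The register-then-query pattern described for the Lookup Service can be sketched as a small in-memory registry. This is a toy illustration only: `ServiceRecord`, `LookupService`, and the attribute names are hypothetical and do not reflect the actual perfSONAR LS protocol or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ServiceRecord:
    """Hypothetical record a service would register with the LS."""
    name: str
    service_type: str
    attributes: Dict[str, str] = field(default_factory=dict)


class LookupService:
    """Toy in-memory registry: services register, clients query by attribute."""

    def __init__(self) -> None:
        self._records: List[ServiceRecord] = []

    def register(self, record: ServiceRecord) -> None:
        self._records.append(record)

    def query(self, **criteria: str) -> List[ServiceRecord]:
        """Return records whose service type and attributes match every criterion."""
        matches = []
        for record in self._records:
            searchable = {"service_type": record.service_type, **record.attributes}
            if all(searchable.get(key) == value for key, value in criteria.items()):
                matches.append(record)
        return matches


if __name__ == "__main__":
    ls = LookupService()
    ls.register(ServiceRecord("ps-latency", "owamp", {"domain": "campus-a"}))
    ls.register(ServiceRecord("ps-bandwidth", "bwctl", {"domain": "campus-a"}))
    for hit in ls.query(service_type="owamp"):
        print(hit.name, hit.attributes)
```

Querying on attributes rather than on a fixed name is what lets a client find, for example, every latency tester in a given domain without knowing host names in advance.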

Page 15:

Measurement Archive Service

•  Measurement Archives store data in databases and publish data produced by MPS (or TSs).

•  They also provide a historical record of analysis.

•  Reduces queries to the MPS by publishing to multiple clients.

•  As a server, it accepts and stores setup and publication requests.

•  As a client, it registers with an LS and subscribes to a MPS, other MAS and publishes data to subscribers.

Page 16:

Topology Service

•  The ToS is a specific example of a TS used to make topological information available to the framework.

•  Understanding topology is necessary for the measurement system to optimize its operations (closest nodes).

•  ToS may also be used for overviews/maps clients to present measurement data.

Page 17:

Measurement Point Service

•  MPS creates and publishes data by initiating active measurements or querying passive devices.

•  A setup protocol allows users to request measurements and publish the results.

•  As a server, the MPS accepts requests and publishes the data (client subscriber handle must be known in advance).

•  As a client, the MPS registers with the LS and publishes to subscribers.
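The publish/subscribe relationship between a measurement point and a measurement archive can be sketched as follows. All class and method names here are hypothetical, and the probe is a stand-in callable rather than a real active test; the actual perfSONAR services exchange messages over the network, which this sketch does not attempt to model.

```python
import time
from typing import Callable, Dict, List

Result = Dict[str, float]
Handler = Callable[[str, Result], None]


class MeasurementPoint:
    """Toy MP: runs a probe and publishes each result to its subscribers."""

    def __init__(self, name: str, probe: Callable[[], float]) -> None:
        self.name = name
        self._probe = probe
        self._subscribers: List[Handler] = []

    def subscribe(self, handler: Handler) -> None:
        # The subscriber handle must be known before results are published.
        self._subscribers.append(handler)

    def measure(self) -> Result:
        result = {"timestamp": time.time(), "value": self._probe()}
        for handler in self._subscribers:
            handler(self.name, result)
        return result


class MeasurementArchive:
    """Toy MA: stores published results and serves the history to clients."""

    def __init__(self) -> None:
        self._store: Dict[str, List[Result]] = {}

    def store(self, source: str, result: Result) -> None:
        self._store.setdefault(source, []).append(result)

    def history(self, source: str) -> List[Result]:
        return list(self._store.get(source, []))


if __name__ == "__main__":
    archive = MeasurementArchive()
    point = MeasurementPoint("mp-campus-a", probe=lambda: 9.4)  # fake probe value
    point.subscribe(archive.store)
    point.measure()
    print(archive.history("mp-campus-a"))
```

Because clients read history from the archive rather than asking the measurement point again, repeated queries do not trigger repeated tests.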

Page 18:

perfSONAR-PS Services

•  Focus on development of major perfSONAR components
  –  SNMP Based MP/MA
  –  Lookup Service
  –  Topology
  –  Link Status

•  New additions
  –  OWAMP/BWCTL
  –  Traceroute
  –  Pinger (SLAC+Fermilab)
  –  Visualization (perfSONAR UI plugins + meter)

Page 19:

SNMP Based MP/MA

•  Deployed
  –  Internet2 Network
  –  ESNet
  –  Georgia Tech/SLAC/University of Delaware
  –  All over

•  Compatible with perfSONAR-UI

•  CPAN package in development

Page 20:

Pinger Based MP/MA

•  Joint effort between Fermilab and SLAC

•  Present views of historic Pinger data

•  Expose interface to schedule live tests

•  Development and integration into perfSONAR-PS based on LHC-OPN requirements

Page 21:

Visualization

•  Utilizing the plugin architecture of perfSONAR-UI

•  Data visualization beyond network utilization

•  Google Maps

•  Utilization by physical location

•  'Weather Map' of Internet2 Network

•  Web based speedometer to interact directly with MA code

•  Maddash

Page 22:

Other services in development

•  Topology/LS service

•  UNIS development (Indiana University)

•  Maddash//mesh

•  Ease full mesh deployment

•  OWAMP MA

•  Coordinate regular scheduled tests with BWCTL

•  BWCTL MA

•  Coordinate regular scheduled tests with OWAMP

Page 23:

Agenda
• Motivation
• What is perfSONAR?
• Suggested Deployment for Campus/Regional

Page 24:

perfSONAR Toolkit
• The “perfSONAR Toolkit” is an open source implementation and packaging of the perfSONAR measurement infrastructure and protocols – everything you (or your scientists) need to get a baseline and start addressing true problems
• http://psps.perfsonar.net/toolkit
• All components are available as RPMs, and bundled into a CentOS 6-based “netinstall” and a “Live CD”
• perfSONAR tools are much more accurate if run on a dedicated perfSONAR host, not on the DTN.
• Very easy to install and configure
• Usually takes less than 30 minutes

Page 25:

Importance of Regular Testing
• We can’t wait for users to report problems and then fix them (soft failures can go unreported for years!)
• Things just break sometimes
  – Failing optics
  – Somebody messed around in a patch panel and kinked a fiber
  – Hardware goes bad
• Problems that get fixed have a way of coming back
  – System defaults come back after hardware/software upgrades
  – New employees may not know why the previous employee set things up a certain way and back out fixes
• Important to continually collect, archive, and alert on active throughput test results
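Alerting on archived throughput results can be as simple as comparing each new test against a rolling baseline. The sketch below is one possible approach, not the perfSONAR or Nagios mechanism: the function name, the five-sample window, the 80% threshold, and the sample values are all assumptions made for illustration.

```python
from statistics import median
from typing import List, Sequence, Tuple


def throughput_alerts(samples_mbps: Sequence[float], window: int = 5,
                      threshold: float = 0.8) -> List[Tuple[int, float, float]]:
    """Flag samples that fall below `threshold` times the rolling median.

    Returns (index, value, baseline) for each sample that is lower than
    threshold * median of the `window` samples immediately before it.
    """
    alerts = []
    for i in range(window, len(samples_mbps)):
        baseline = median(samples_mbps[i - window:i])
        if samples_mbps[i] < threshold * baseline:
            alerts.append((i, samples_mbps[i], baseline))
    return alerts


if __name__ == "__main__":
    # Made-up archive of regular throughput tests, in Mbit/s.
    history = [9400, 9350, 9420, 9380, 9410, 9390, 4200, 9400]
    for index, value, baseline in throughput_alerts(history):
        print(f"test #{index}: {value} Mbit/s is below 80% of baseline {baseline}")
```

Using the median of recent tests as the baseline keeps a single bad result from dragging the baseline down and hiding the next one.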

Page 26:

Regular perfSONAR Tests
• We run regular tests to check for two things
  – TCP throughput
  – One way delay and packet loss
• perfSONAR has mechanisms for managing regular testing between perfSONAR hosts
  – Statistics collection and archiving
  – Graphs
  – Dashboard display
  – Integrate with NAGIOS
• This infrastructure is deployed now – perfSONAR hosts at facilities can take advantage of it
• At-a-glance health check for data infrastructure

Page 27:

• perfSONAR Dashboard: http://ps-dashboard.es.net

Page 28:

Develop a Test Plan
• What are you going to measure?
  – Achievable bandwidth
    • 2-3 regional destinations
    • 4-8 important collaborators
    • 4-8 (more if you are willing, especially to start) times per day to each destination
    • 20-30 second tests within a region, longer across oceans and continents
  – Loss/Availability/Latency
    • OWAMP: ~10-20 collaborators over diverse paths
  – Interface Utilization & Errors (via SNMP)
• Guidance on servers to buy:
  • http://psps.perfsonar.net/toolkit/hardware.html
  • Virtualization is tricky; dedicated hardware is recommended.
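The throughput numbers in this test plan translate into a small daily load, which a few lines of arithmetic make concrete. The helper below is a hypothetical sketch: it assumes one test per destination per run, in one direction only, and uses the low and high ends of the ranges given on the slide.

```python
from typing import Tuple


def daily_test_load(destinations: int, tests_per_day: int,
                    test_seconds: int) -> Tuple[int, int, float]:
    """Return (tests per day, busy seconds per day, fraction of the day busy)."""
    tests = destinations * tests_per_day
    busy_seconds = tests * test_seconds
    return tests, busy_seconds, busy_seconds / 86_400


if __name__ == "__main__":
    # Low end: 2 regional + 4 collaborators, 4 tests/day, 20 s each.
    # High end: 3 regional + 8 collaborators, 8 tests/day, 30 s each.
    for label, args in (("low", (2 + 4, 4, 20)), ("high", (3 + 8, 8, 30))):
        tests, busy, fraction = daily_test_load(*args)
        print(f"{label}: {tests} tests/day, {busy} s of testing ({fraction:.1%} of the day)")
```

Even at the high end the throughput tests occupy the host for well under an hour a day, which leaves room to add destinations or run tests more often.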

Page 29:

perfSONAR Deployment Locations
• Critical to deploy such that you can test with useful semantics
• perfSONAR hosts allow parts of the path to be tested separately
  – Reduced visibility for devices between perfSONAR hosts
  – Must rely on counters or other means where perfSONAR can’t go
• Effective test methodology derived from protocol behavior
  – TCP suffers much more from packet loss as latency increases
  – TCP is more likely to cause loss as latency increases
  – Testing should leverage this in two ways
    • Design tests so that they are likely to fail if there is a problem
    • Mimic the behavior of production traffic as much as possible
  – Note: don’t design your tests to succeed
    • The point is not to “be green” even if there are problems
    • The point is to find problems when they come up so that the problems are fixed quickly
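One reason TCP is more likely to cause loss as latency grows is that the amount of data it must keep in flight to fill a path, the bandwidth-delay product, scales linearly with RTT. The sketch below computes that product; the 10 Gbit/s link speed and the RTT values are illustrative assumptions, not figures from this deck.

```python
def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes in flight needed to fill the path."""
    return bandwidth_bps * rtt_s / 8


if __name__ == "__main__":
    link_bps = 10e9  # assumed 10 Gbit/s path
    for rtt_ms in (1, 10, 50, 100):
        window = bdp_bytes(link_bps, rtt_ms / 1000)
        print(f"RTT {rtt_ms:>3} ms -> ~{window / 1e6:.2f} MB in flight to fill the path")
```

A larger window means larger bursts arriving at every device on the path, so a test between distant hosts is more likely to expose a shallow buffer than a test between neighbors. That is the sense in which a long-RTT test is "likely to fail if there is a problem".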

Page 30:

Sample Site Deployment

Page 31:

Questions or Comments

John Hicks Internet2

[email protected]