Evolution of IT Infrastructure For Fusion Control...

Evolution of IT Infrastructure For Fusion Control Systems. Presentation to 14th International Conference on Accelerator & Large Experimental Physics Control Systems (ICALEPCS), October 6-11, 2013. Tim Frazier, Chief Information Officer, NIF & Photon Science. LLNL-PRES-644303


Page 1: Evolution of IT Infrastructure For Fusion Control Systemsaccelconf.web.cern.ch/AccelConf/ICALEPCS2013/talks/thcoba04_tal… · NIF’s IT architecture is based on four principles

Evolution of IT Infrastructure For Fusion Control Systems

Presentation to

14th International Conference on Accelerator & Large Experimental Physics Control Systems (ICALEPCS)

October 6-11, 2013

Tim Frazier Chief Information Officer

NIF & Photon Science

LLNL-PRES-644303


NIF’s IT architecture is based on four principles

• Individual component failure should not cause infrastructure failure
  - Separate workloads & have more than one running at all times

• Technology should be easily replaceable
  - Avoid an attachment to physical things

• Achieving self-similarity should guide the selection of technology
  - Like model numbers wherever possible

• Use data to forecast resource consumption
  - Create repositories of long-term metrics for analysis
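The fourth principle can be illustrated with a small sketch. The slides do not say how NIF forecasts consumption; the least-squares trend line below is only one simple option, and the function name and sample values are hypothetical.

```python
from statistics import mean


def linear_forecast(samples, periods_ahead):
    """Fit y = a + b*t to equally spaced samples and extrapolate.

    samples: historical readings (for example, monthly storage use),
             oldest first. Illustrative only, not NIF data.
    periods_ahead: how many sampling periods past the last reading
                   to project.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")

    t = list(range(n))
    t_bar = mean(t)
    y_bar = mean(samples)

    # Ordinary least-squares slope and intercept.
    slope = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, samples)) / sum(
        (ti - t_bar) ** 2 for ti in t
    )
    intercept = y_bar - slope * t_bar

    return intercept + slope * (n - 1 + periods_ahead)
```

With the made-up series [10, 12, 14, 16], which grows by 2 per period, linear_forecast(samples, 3) projects 22.0 three periods after the last reading. A long-term metrics repository supplies the history such a fit needs.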

Frazier - ICALEPCS Conference San Francisco, October 6-11, 2013 2 NIF-0000-00000s2.ppt


NIF has consolidated its server footprint by 40%


Partnership with key technology providers has been integral to our success

[Figure labels: 330 servers; 423 servers; 228 blade servers; 123 blade servers]


Our physical footprint has been reduced by 50%


SPARC-to-Intel migration made possible by port from Ada to Java

High-density, virtualized servers

Single-purpose to multi-purpose servers made possible by Virtualization (Xen)


We have kept up with customer demand despite the 40% consolidation in footprint


Virtual machines with a low cost of ownership enable single-purpose hosts


Virtualization of our Integrated Computer Control System (ICCS) is nearly complete

[Diagram: ICCS architecture and virtualization status]

Layers: Supervisory Layer; Device Control Layer

Shot Director

Bundle 1, Bundle 2 (each): Collaboration Server; Subsystem Shot Supervisors; Injection Laser; Beam Control; Laser Diagnostics

Bundles 3-24: . . .

NIF: Collaboration Server; Subsystem Shot Supervisors; Target Diagnostics; Alignment; LPOM; Industrial Controls

Other labels: Common; Framework Servers; Analysis Servers; Inspection Systems; (Consoles)

Counts shown: [1]; [50]; [450]; [1300]

Legend: Complete; In Progress


To build an infrastructure, you need building blocks


Ethernet Switch Cisco 6509

Ethernet Switch Cisco 5548

Filer with disks NetApp 3250

Diskless Blade Servers HP BL460c

Fiber DCX Switch


Prototype a single environment


Ethernet Switch Cisco 6509

Ethernet Switch Cisco 5548

Filer with disks NetApp 3250

Blade Servers HP BL460c

Fiber DCX Switch

8 Gb Fiber network

10 Gb Ethernet network

Diskless blades


Create a segmented infrastructure


Ethernet Switch Cisco 6509

Fiber DCX Switch

Segments: Sandbox; Dev-Int-QA; Production; Controls; Services


Computational workload is very densely packed onto hypervisors


Chart labels:

6-to-1 @ 27% utilization

~3-to-1 @ 17% utilization

9-to-1 @ 33% utilization

4-to-1 @ 17% utilization

Legend:

Controls2: Control system virtual machines
ShotProd2: Production, non-control system virtual machines
General02: Development/Integration/QA virtual machines
Production02: Production, non-control system virtual machines


Memory limits the packing factor for hypervisors


Controls2: Control system virtual machines
ShotProd2: Production, non-control system virtual machines
General02: Development/Integration/QA virtual machines
Production02: Production, non-control system virtual machines

Non-controls environments run with less margin
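A rough sketch of why memory tends to set the packing factor, assuming guest memory is not overcommitted while virtual CPUs can be. The function name, the overcommit ratio, the memory reserve and all sizes are hypothetical; none of them come from the slides.

```python
def packing_factor(host_mem_gb, host_cores, vm_mem_gb, vm_vcpus,
                   vcpu_overcommit=4.0, mem_reserve_gb=8.0):
    """Return (max VMs per hypervisor, limiting resource).

    Assumptions (illustrative, not NIF's configuration):
      - each VM needs its full memory allocation on the host;
      - vCPUs may be oversubscribed by `vcpu_overcommit` per core;
      - `mem_reserve_gb` is held back for the hypervisor itself.
    """
    by_memory = int((host_mem_gb - mem_reserve_gb) // vm_mem_gb)
    by_cpu = int((host_cores * vcpu_overcommit) // vm_vcpus)
    limit = "memory" if by_memory <= by_cpu else "cpu"
    return min(by_memory, by_cpu), limit


# Hypothetical blade: 128 GB RAM, 16 cores; hypothetical VM: 16 GB, 2 vCPUs.
vms, limit = packing_factor(128, 16, 16, 2)
print(f"{vms} VMs per hypervisor, limited by {limit}")
```

In this made-up case CPU would allow 32 virtual machines but memory allows only 7, so memory sets the packing factor. Running with less margin means placing a VM count closer to that ceiling.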


We rely on many tools to manage our infrastructure

Tools: 3PAR desk manager; AssetDB; DNS; Active Directory; F5 Manager; NetApp Manager; Splunk; Enterprise Manager; IPAM (netping); Brocade Manager; Statseeker

Categories: Asset; Storage Infrastructure; Network Infrastructure; Server Infrastructure


We have developed metrics to measure discrepancies between tools & performance outliers

Systemic problems can be revealed by trending metrics over time
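The slides do not describe how these metrics are computed. As a hedged illustration, the sketch below compares the host lists reported by two tools and flags performance outliers with a modified z-score based on the median absolute deviation. The function names, host names and the 3.5 threshold are assumptions chosen for the example.

```python
from statistics import median


def inventory_discrepancies(tool_a, tool_b):
    """Hosts known to one tool but not the other (hypothetical inputs)."""
    a, b = set(tool_a), set(tool_b)
    return {"only_in_a": sorted(a - b), "only_in_b": sorted(b - a)}


def outliers(metric_by_host, threshold=3.5):
    """Flag hosts whose metric is far from the fleet median.

    Uses the modified z-score 0.6745 * |x - median| / MAD, a common
    robust rule of thumb; it is not taken from the presentation.
    """
    values = list(metric_by_host.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread, so nothing can be ranked as an outlier
    return sorted(
        host
        for host, v in metric_by_host.items()
        if 0.6745 * abs(v - med) / mad > threshold
    )


# Made-up data: two tools disagree on the inventory, one host runs hot.
print(inventory_discrepancies(["a", "b", "c"], ["b", "c", "d"]))
print(outliers({"h1": 10, "h2": 11, "h3": 12, "h4": 11, "h5": 90}))
```

Storing the discrepancy count and the outlier list from each run gives the kind of time series in which a systemic problem shows up as a trend rather than a one-off alert.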



Two tools are used to monitor & manage our server infrastructure


• Oracle Enterprise Manager: Performance, Configuration Management & Incident Management

• Splunk: Log file mining & alerting

See John Fisher’s poster “Monitoring of the National Ignition Facility Control System” (THPCC082)


Agent-based monitoring provides a wealth of information beyond host performance management


Management Service

Management Database

Management Agent

Database; Application; Host

Database metrics; Application metrics

Configuration Management

User-Defined Metrics
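To make the idea of an agent with user-defined metrics concrete, here is a generic sketch. It is not the Oracle Enterprise Manager agent or its API; the class, method and metric names are hypothetical.

```python
import time
from typing import Callable, Dict


class MetricAgent:
    """Toy host-side agent that runs registered probes on request."""

    def __init__(self, host: str) -> None:
        self.host = host
        self._probes: Dict[str, Callable[[], float]] = {}

    def register(self, name: str, probe: Callable[[], float]) -> None:
        """Add a user-defined metric: any callable returning a number."""
        self._probes[name] = probe

    def collect(self) -> dict:
        """Run every probe once and return a sample for upload."""
        sample = {"host": self.host, "timestamp": time.time(),
                  "metrics": {}, "errors": {}}
        for name, probe in self._probes.items():
            try:
                sample["metrics"][name] = float(probe())
            except Exception as exc:  # one failing probe must not stop the rest
                sample["errors"][name] = str(exc)
        return sample


agent = MetricAgent("host-a")                   # hypothetical host name
agent.register("queue_depth", lambda: 3)        # hypothetical application metric
agent.register("broken_probe", lambda: 1 / 0)   # shows error isolation
print(agent.collect())
```

In a real deployment the sample would be sent to a management service and stored in a management database. Because probes are ordinary callables, the same agent can report database, application and host values alongside basic performance counters.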


Final thoughts (informed by hindsight)

• Self-similarity, more than any other quality, has enabled us to grow our infrastructure by 100% without a corresponding increase in staff

• Segmentation, more than any other design principle, has enabled us to increase reliability by isolating performance degradation & component failures

• Virtualization, more than any other technology, has enabled us to grow our infrastructure by 100% while at the same time consolidating our physical footprint by 50%

• Data provided by tools, specifically agent-based collection, more than any other asset, have provided the knowledge needed to manage our infrastructure

• It is time for your questions!

