Looking Ahead: A New PSU Research Cloud Architecture Chuck Gilbert - Systems Architect and Systems Team Lead Research CI Coordinating Committee Meeting July 31, 2014

Page 1: Looking Ahead: A New PSU Research Cloud Architecture

Chuck Gilbert - Systems Architect and Systems Team Lead

Research CI Coordinating Committee Meeting, July 31, 2014

Page 2: What we have

● ITS implemented a traditional HPC infrastructure based upon:
  ○ Fairshare model
    ■ Priority bump in the queues
  ○ No guaranteed runtimes
    ■ Wasted research time waiting for “turn” in the system
  ○ Segregated clusters
    ■ Limited re-configuration options
    ■ No sharing of High-Speed interconnects

One model for all computing needs!
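The “priority bump” of a fairshare model can be illustrated with a minimal sketch. The slides do not say which scheduler or formula ITS used; the exponential form below (a factor of 2^(-usage/share), of the kind some batch schedulers use) and every name in it are assumptions for illustration only.

```python
def fairshare_factor(usage_fraction: float, share_fraction: float) -> float:
    """Return a priority factor in (0, 1].

    Hypothetical formula: 2 ** (-usage / share). A group that has used
    nothing gets 1.0; a group that has used exactly its share gets 0.5.
    """
    if share_fraction <= 0:
        raise ValueError("share_fraction must be positive")
    if usage_fraction < 0:
        raise ValueError("usage_fraction must not be negative")
    return 2.0 ** (-usage_fraction / share_fraction)


def rank_jobs(jobs):
    """Order (name, usage_fraction, share_fraction) tuples, highest factor first."""
    return sorted(jobs, key=lambda j: fairshare_factor(j[1], j[2]), reverse=True)


if __name__ == "__main__":
    # Hypothetical groups, each entitled to 25% of the machine.
    queue = [("heavy_user", 0.50, 0.25), ("light_user", 0.05, 0.25)]
    for name, usage, share in rank_jobs(queue):
        print(name, round(fairshare_factor(usage, share), 3))
```

A higher factor only moves a job forward in the queue; it says nothing about when the job will start, which is the “no guaranteed runtimes” limitation listed on the slide.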

Page 3: What current research needs

● Research Community needs are bigger!
  ■ Guaranteed response times
  ■ Self-service access – Needed for empowering the research community to consume a model that works for their needs
  ■ Root level access – Needed for enabling customized environments
  ■ Virtualization in addition to HPC resources – flexible configurations and online maintenance of hardware
  ■ Accelerator cards (GPU and Phi)
  ■ Big Data platforms
  ■ Fast Data transfer rates
● Old ITS approach revolved around segmented and fractured “clusters” without flexibility and expandability

Old model cannot keep pace with current and future computing needs!

Page 4: What is the solution? Advanced CyberInfrastructure for Innovation (ICS-CI2)

● ICS-CI2 (High-Performance Research Cloud)
  ○ Fundamentally new approach to engineering, deploying, and managing research computing resources
  ○ On-Premises High-Performance Cloud allows for full customization and control of the software and hardware stacks
    ■ Flexible configurations
    ■ Guaranteed run times
    ■ Secure data storage
    ■ High-speed network bandwidth
    ■ Multiple possible Service Level Agreement (SLA) models
    ■ On-demand storage purchasing capacity
  ○ Bursting to public clouds and national Labs (Hybrid Cloud Model)
    ■ Compute bursting for large-scale (10k+ core) jobs
    ■ Participation in XSEDE
  ○ Model used at CERN and other Research Computing Centers
● Stable computing platform
  ○ Tested and verified software catalogs, including operating systems
    ■ Linux, Windows
    ■ C, C++, Java, .NET, Scripting Languages, etc.
  ○ Self-service portals
  ○ Science gateways
  ○ Seamless maintenance
  ○ Enable choice of consumption of resources

Page 5: Where is ICS-CI2 in the Big Picture?

[Diagram; surviving label: Customized Environments]

To be aligned at a later date to conform to governance structures recommended by Research CI Governance Taskforce

Page 6: ICS-CI2 Envisioned End User Experience (Future)

[Diagram; surviving labels: Penn State Research Cloud, Resource Request, ICS-CI2 (On-Premises Research Cloud), Regional and National Labs, Public Cloud Resources]
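The hybrid model, in which a resource request lands on the on-premises cloud, on regional and national labs, or on a public cloud, could be sketched as a simple routing rule. Only the three destinations and the “10k+ core” bursting figure come from the slides; the function, its parameters, and the decision order are hypothetical.

```python
from enum import Enum


class Target(Enum):
    ON_PREMISES = "ICS-CI2 (On-Premises Research Cloud)"
    NATIONAL_LAB = "Regional and National Labs"
    PUBLIC_CLOUD = "Public Cloud Resources"


# The slides mention compute bursting for "10k+ core" jobs.
BURST_THRESHOLD_CORES = 10_000


def route_request(cores_requested: int,
                  on_prem_free_cores: int,
                  lab_allocation_available: bool = False) -> Target:
    """Pick a destination for a resource request (assumed policy, not the author's)."""
    if cores_requested <= 0:
        raise ValueError("cores_requested must be positive")
    # Assumption: anything below the burst threshold that fits stays on premises.
    if cores_requested < BURST_THRESHOLD_CORES and cores_requested <= on_prem_free_cores:
        return Target.ON_PREMISES
    # Assumption: prefer an existing lab allocation over a public cloud.
    if lab_allocation_available:
        return Target.NATIONAL_LAB
    return Target.PUBLIC_CLOUD


if __name__ == "__main__":
    print(route_request(64, on_prem_free_cores=480).value)
    print(route_request(12_000, on_prem_free_cores=480).value)
```

A real policy would also weigh cost, data locality, and SLA terms, none of which the slides specify.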

Page 7: ICS-CI2 Overview

● Compute
  ○ ICS-CI2 compute can be “re-provisioned” as needed to accommodate multiple models
  ○ Utilizing GPU enabled, large memory, and blade servers deployed through each proposed phase
  ○ N number of CI-Cores are built on top of converged compute, segmented by security boundaries, networks, firewalls
● Storage
  ○ ICS-CI2 Cloud Storage offers choice of provisioning, backup, and retention models
  ○ ICS-CI2 Storage Automation, Metering, Metrics allow for methodical expansion based on usage and trends
  ○ ICS-CI2 Storage scales to multiple Petabytes
● Network
  ○ Direct integration into the PSU Research Network for fast access and data transfers
  ○ Limited single points of failure to minimize downtime/maintenance windows

Page 8: ICS-CI2 Proposed Network / Infrastructure Plan

[Diagram] Direct integration into research network core

Page 9: What we have already started to implement

The ITS interactive cluster Hammer was at a breaking point!

Hardware Issue

● 24 compute nodes
  ○ Slow network
  ○ Slow IO
  ○ Inadequate Memory
  ○ Old, outdated operating system

Operational Issue

● Software stack not unified
● Memory cannot support number of users requesting resources
  ○ Processes denied running
● Hardware near end-of-life

Page 10: What we have already started to implement

ICS is installing a new interactive cluster with the following enhancements!

Hardware Specifications

● 24 compute nodes
  ○ Dual 10 core processors
  ○ 256 GB of RAM
  ○ NVIDIA K4000 Graphics Card
  ○ 10G Ethernet
● Public 10G ethernet access (10X increase)
● Research Network Ten-Gigabit ethernet access
  ○ Available late fall
● Unified software stack with batch clusters
● Re-usable hardware platform
● Interactive processing
● 5X improvement on processing power
● 5X improvement on memory

* Hardware will be available for 2014 fall semester
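As a rough check on what these specifications add up to, the aggregate figures can be worked out directly from the per-node numbers above. The variable names are illustrative, and reading “10X increase” as a comparison against the previous public link speed is an assumption.

```python
# Per-node figures taken from the hardware specification slide.
NODES = 24
SOCKETS_PER_NODE = 2       # "Dual ... processors"
CORES_PER_SOCKET = 10      # "10 core"
RAM_GB_PER_NODE = 256
PUBLIC_LINK_GBPS = 10      # "Public 10G ethernet access"
PUBLIC_LINK_SPEEDUP = 10   # "(10X increase)"

total_cores = NODES * SOCKETS_PER_NODE * CORES_PER_SOCKET
total_ram_gb = NODES * RAM_GB_PER_NODE
# Assumption: the 10X figure compares against the old public link speed.
implied_old_link_gbps = PUBLIC_LINK_GBPS / PUBLIC_LINK_SPEEDUP

print(f"Total cores: {total_cores}")                 # Total cores: 480
print(f"Total RAM:   {total_ram_gb} GB")             # Total RAM:   6144 GB
print(f"Implied previous public link: {implied_old_link_gbps} Gb/s")
```

The “5X” processing and memory figures are not derived here, because the slides do not state the old cluster's per-node specifications.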

Page 11: Questions?