Penguin Computing and Indiana University partner for “above campus” and campus bridging services to the community

Craig A. Stewart (stewart@iu.edu)
Executive Director, Pervasive Technology Institute
Associate Dean, Research Technologies
Associate Director, CREST

Matthew Link
Director, Systems, Research Technologies
Associate Director, CREST

George Turner
Manager, High Performance Systems

William K. Barnett
Director, National Center for Genome Analysis Support
Associate Director, Center for Applied Cybersecurity Research, PTI

Indiana University - pti.iu.edu
Presented SC11 Exhibits Hall, Nov 14-17, IEEE/ACM SC11 conference, Seattle, WA


License terms

• Please cite this presentation as: Stewart, C.A., M.R. Link, G. Turner, W. K. Barnett, 2011. Penguin Computing and Indiana University partner for “above campus” and campus bridging services to the community. Presented SC11 Exhibits Hall, Nov 14-17, IEEE/ACM SC11 conference, Seattle, WA. http://hdl.handle.net/2022/13880

• Portions of this document that originated from sources outside IU are shown here and used by permission or under licenses indicated within this document.

• Items indicated with a © are under copyright and used here with permission. Such items may not be reused without permission from the holder of copyright except where license terms noted on a slide permit reuse.

• Except where otherwise noted, the contents of this presentation are copyright 2011 by the Trustees of Indiana University. This content is released under the Creative Commons Attribution 3.0 Unported license (http://creativecommons.org/licenses/by/3.0/). This license includes the following terms: You are free to share – to copy, distribute and transmit the work and to remix – to adapt the work under the following conditions: attribution – you must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work). For any reuse or distribution, you must make clear to others the license terms of this work.


Some CI resources available to science and engineering researchers in US (March 2011)

[Figure: bar chart of aggregate capacity in TFLOPS (axis from 0 to 12,000) for six categories: Commercial cloud (IaaS and PaaS); Volunteer computing; Workstations at Carnegie research universities; Campus HPC / Tier 3 systems; Track 2 and other major facilities; NSF Track 1.]

Based on: Welch, V.; Sheppard, R.; Lingwall, M.J.; Stewart, C. A. 2011. Current structure and past history of US cyberinfrastructure (data set and figures). http://hdl.handle.net/2022/13136


Adequacy of research CI

• Never (10.6%)
• Some of the time (20.2%)
• Most of the time (40.2%)
• All of the time (29%)

Stewart, C.A., D.S. Katz, D.L. Hart, D. Lantrip, D.S. McCaulay and R.L. Moore. Technical Report: Survey of cyberinfrastructure needs and interests of NSF-funded principal investigators. 2011. http://hdl.handle.net/2022/9917

Responses to a question asking whether researchers had sufficient access to cyberinfrastructure resources. The survey was sent to 5,000 researchers selected randomly from the 34,623 researchers funded by NSF as Principal Investigators in 2005-2009; results are based on 1,028 responses.
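As a rough illustration of what these survey figures imply, the sketch below computes the response rate and an approximate margin of error for each answer. It uses only the numbers quoted on this slide. The function and constant names are hypothetical, and the margin of error rests on assumptions the slide does not state: that all 1,028 respondents answered this question, and that a simple normal approximation with no nonresponse bias is adequate.

```python
import math

# Figures quoted on the slide.
POPULATION = 34_623  # researchers funded by NSF as PIs, 2005-2009
SAMPLED = 5_000      # researchers who were sent the survey
RESPONSES = 1_028    # responses received

# Reported answer shares, in percent.
SHARES = {
    "Never": 10.6,
    "Some of the time": 20.2,
    "Most of the time": 40.2,
    "All of the time": 29.0,
}


def response_rate(responses: int, sampled: int) -> float:
    """Fraction of sampled researchers who responded."""
    return responses / sampled


def margin_of_error(share_percent: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error, in percentage points.

    Assumption (not from the slide): normal approximation to a binomial
    proportion, simple random sampling, and no nonresponse bias.
    """
    p = share_percent / 100.0
    return 100.0 * z * math.sqrt(p * (1.0 - p) / n)


def share_without_full_access(shares: dict) -> float:
    """Percent of respondents who did not answer 'All of the time'."""
    return sum(v for k, v in shares.items() if k != "All of the time")


if __name__ == "__main__":
    print(f"Response rate: {response_rate(RESPONSES, SAMPLED):.2%}")
    print(f"Sampled share of population: {SAMPLED / POPULATION:.2%}")
    for answer, share in SHARES.items():
        moe = margin_of_error(share, RESPONSES)
        print(f"{answer}: {share:.1f}% (+/- {moe:.1f} points)")
    print(f"Not always sufficient: {share_without_full_access(SHARES):.1f}%")
```

Read this way, roughly one in five sampled researchers responded, and about 71% of respondents reported that their access to cyberinfrastructure was not sufficient all of the time.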


Photo by http://www.flickr.com/photos/mnsc/
Source: http://www.flickr.com/photos/mnsc/2768391365/sizes/z/in/photostream/
License: http://creativecommons.org/licenses/by/2.0/

Clouds look serene enough


But is ignorance bliss?

• In the cloud, do you know:
– Where your data are?
– What laws prevail over the physical location of your data?
– What license you really agreed to?
– What is the security (electronic / physical) around your data?
– And how exactly do you get to that cloud, or get things out of it?
– How secure your provider is financially? (The fact that something seems unimaginable, like such-and-such cloud provider going out of business abruptly, does not mean it is impossible!)


Cloud computing - NIST

• Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.

• This cloud model promotes availability and is composed of five essential characteristics (On-demand self-service, Broad network access, Resource pooling, Rapid elasticity, Measured Service); three service models (Cloud Software as a Service (SaaS), Cloud Platform as a Service (PaaS), Cloud Infrastructure as a Service (IaaS)); and, four deployment models (Private cloud, Community cloud, Public cloud, Hybrid cloud).

• Key enabling technologies include: (1) fast wide-area networks, (2) powerful, inexpensive server computers, and (3) high-performance virtualization for commodity hardware.

• http://www.nist.gov/itl/cloud/index.cfm


Above Campus services

• “We are seeing the early emergence of a meta-university – a transcendent, accessible, empowering, dynamic, communally constructed framework of open materials and platforms on which much of higher education worldwide can be constructed or enhanced.” Charles Vest, president emeritus of MIT, 2006
• Goal: achieve economy of scale and retain a reasonable measure of control
• See: Brad Wheeler and Shelton Waggener. 2009. Above-Campus Services: Shaping the Promise of Cloud Computing for Higher Education. EDUCAUSE Review, vol. 44, no. 6 (November/December 2009): 52-67.
• www.educause.edu/EDUCAUSE+ReviewEDUCAUSEReviewMagazineVolume44AboveCampusServicesShapingtheP/185222


Penguin Computing and IU partner for “Cluster as a Service”

• Just what it says: Cluster as a Service
• Cluster physically located on IU’s campus, in IU’s Data Center
• Available to anyone at a .edu or FFRDC (Federally Funded Research and Development Center)
• To use it:
– Go to podiu.penguincomputing.com
– Fill out registration form
– Verify via your email
– Get out your credit card
– Go computing
• This builds on Penguin’s experience: Penguin currently hosts Life Technologies' BioScope and LifeScope in the cloud (http://lifescopecloud.com)


We know where the data are … and they are secure


POD IU (Rockhopper) specifications

Server Information

Architecture: Penguin Computing Altus 1804
TFLOPS: 4.4
Clock Speed: 2.1GHz
Nodes: 11 compute; 2 login; 4 management; 3 servers
CPUs: 4 x 2.1GHz 12-core AMD Opteron 6172 processors per compute node
Memory Type: Distributed and Shared
Total Memory: 1408 GB
Memory per Node: 128GB 1333MHz DDR3 ECC
Local Scratch Storage: 6TB locally attached SATA2
Cluster Scratch: 100TB Lustre

Further Details

OS: CentOS 5
Network: QDR (40Gb/s) InfiniBand, 1Gb/s Ethernet
Job Management Software: SGE
Job Scheduling Software: SGE
Job Scheduling Policy: Fair Share
Access: key-based ssh login to head nodes; remote job control via Penguin's PODShell
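The headline figures in this table can be cross-checked from the per-node numbers. The sketch below is a back-of-the-envelope calculation, not a statement of how the 4.4 TFLOPS figure was actually derived. It assumes that only the 11 compute nodes count toward the totals, and that each Opteron 6172 core retires 4 double-precision floating-point operations per clock cycle. The second assumption does not come from the slide. All names are hypothetical.

```python
# Values taken from the POD IU (Rockhopper) specification table.
COMPUTE_NODES = 11
CPUS_PER_NODE = 4
CORES_PER_CPU = 12
CLOCK_GHZ = 2.1
MEMORY_PER_NODE_GB = 128

# Assumption (not stated on the slide): double-precision floating-point
# operations per core per clock cycle for this processor generation.
FLOPS_PER_CYCLE = 4


def total_cores(nodes: int, cpus_per_node: int, cores_per_cpu: int) -> int:
    """Number of compute cores across all compute nodes."""
    return nodes * cpus_per_node * cores_per_cpu


def peak_tflops(cores: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak: cores x GHz x FLOPs/cycle gives GFLOPS; /1000 gives TFLOPS."""
    return cores * clock_ghz * flops_per_cycle / 1000.0


def total_memory_gb(nodes: int, memory_per_node_gb: int) -> int:
    """Aggregate memory across the compute nodes."""
    return nodes * memory_per_node_gb


if __name__ == "__main__":
    cores = total_cores(COMPUTE_NODES, CPUS_PER_NODE, CORES_PER_CPU)
    print(f"Compute cores: {cores}")
    print(f"Theoretical peak: {peak_tflops(cores, CLOCK_GHZ, FLOPS_PER_CYCLE):.2f} TFLOPS")
    print(f"Total memory: {total_memory_gb(COMPUTE_NODES, MEMORY_PER_NODE_GB)} GB")
```

Under these assumptions the 528 compute cores give a theoretical peak of about 4.4 TFLOPS and 1408 GB of aggregate memory, which agrees with the table.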


Available applications at POD IU (Rockhopper)

Package name: Summary

COAMPS: Coupled ocean / atmosphere mesoscale prediction system

Desmond: Desmond is a software package developed at D. E. Shaw Research to perform high-speed molecular dynamics simulations of biological systems on conventional commodity clusters.

GAMESS: GAMESS is a program for ab initio molecular quantum chemistry.

Galaxy: Galaxy is an open, web-based platform for data intensive biomedical research.

GROMACS: GROMACS is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles.

HMMER: HMMER is used for searching sequence databases for homologs of protein sequences, and for making protein sequence alignments.

Intel compilers and libraries

LAMMPS: LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator.

MM5: The PSU/NCAR mesoscale model (known as MM5) is a limited-area, nonhydrostatic, terrain-following sigma-coordinate model designed to simulate or predict mesoscale atmospheric circulation. The model is supported by several pre- and post-processing programs, which are referred to collectively as the MM5 modeling system.

mpiBLAST: mpiBLAST is a freely available, open-source, parallel implementation of NCBI BLAST.

NAMD: NAMD is a parallel molecular dynamics code for large biomolecular systems.


Available applications at POD IU (Rockhopper)

Package name: Summary

NCBI-Blast: The Basic Local Alignment Search Tool (BLAST) finds regions of local similarity between sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches.

OpenAtom: OpenAtom is a highly scalable and portable parallel application for molecular dynamics simulations at the quantum level. It implements the Car-Parrinello ab-initio Molecular Dynamics (CPAIMD) method.

OpenFOAM: The OpenFOAM® (Open Field Operation and Manipulation) CFD Toolbox is a free, open source CFD software package produced by OpenCFD Ltd. It has a large user base across most areas of engineering and science, from both commercial and academic organisations. OpenFOAM has an extensive range of features to solve anything from complex fluid flows involving chemical reactions, turbulence and heat transfer, to solid dynamics and electromagnetics.

OpenMPI: InfiniBand-based Message Passing Interface - 2 (MPI-2) implementation

POP: POP is an ocean circulation model derived from earlier models of Bryan, Cox, Semtner and Chervin in which depth is used as the vertical coordinate. The model solves the three-dimensional primitive equations for fluid motions on the sphere under hydrostatic and Boussinesq approximations.

Portland Group compilers

R: R is a language and environment for statistical computing and graphics.

WRF: The Weather Research and Forecasting (WRF) Model is a next-generation mesoscale numerical weather prediction system designed to serve both operational forecasting and atmospheric research needs. It features multiple dynamical cores, a 3-dimensional variational (3DVAR) data assimilation system, and a software architecture allowing for computational parallelism and system extensibility.


IU / POD as an example of effective campus bridging

• The goal of campus bridging is to enable the seamlessly integrated use of a scientist or engineer’s personal cyberinfrastructure; cyberinfrastructure on the scientist’s campus; cyberinfrastructure at other campuses; and cyberinfrastructure at the regional, national, and international levels; as if they were proximate to the scientist.
– Short form: The goal of campus bridging is to make local, regional, and national cyberinfrastructure facilities appear as if they were peripherals to your laptop
• We remember that the speed of light is fixed, but latency is not the biggest problem!
• The biggest problems:
– Not enough CI resources available to most researchers
– When you go from your campus to the national cyberinfrastructure it can feel like you are falling off a cliff! That’s why you need bridging.
• More info on campus bridging at http://pti.iu.edu/campusbridging/
• IU is collaborating with Penguin Computing to support the national research community in general and, in particular, two NSF-funded projects:
– eXtreme Science and Engineering Discovery Environment (XSEDE)
– National Center for Genome Analysis Support


XSEDE and Penguin – part 1

• XSEDE (eXtreme Science and Engineering Discovery Environment) is a project, an institution, and a set of services.
– As a project, XSEDE is a five-year, $121 million grant award made by the National Science Foundation (NSF) to the National Center for Supercomputing Applications (NCSA) at the University of Illinois and its partners via program solicitation NSF 08-571. XSEDE is a successor to the NSF-funded TeraGrid project, which itself succeeded the NSF supercomputer center program that began in the 1980s.
– As an institution, XSEDE is a collaboration led by NCSA and 18 partner organizations to deliver a series of instantiations of services, each instantiation being developed through a formal systems engineering process.
– As a set of services, XSEDE integrates supercomputers, visualization and data analysis resources, data collections, and software into a single virtual system for enhancing the productivity of scientists, engineers, social scientists, and humanities experts.


XSEDE and Penguin – part 2

• Under TeraGrid, it was never possible to buy “TeraGrid-like” cycles, and many people viewed the allocation process as very slow
• XSEDE is speeding up the allocation process considerably
• IU is working with Penguin Computing to install the basic open source XSEDE software environment on Rockhopper
• It will be possible to buy “XSEDE-like” cycles in a matter of minutes using a credit card
• In some circumstances this will be a much better way to meet peak needs, or use startup funds, than buying and installing “clusters in a closet.”


NCGAS, POD IU, and campus bridging

• The National Center for Genome Analysis Support
• A Cyberinfrastructure Service Center affiliated with the Pervasive Technology Institute at Indiana University (http://pti.iu.edu)
• Dedicated to supporting life science researchers who need computational support for genomics analysis
• Initially funded by the National Science Foundation Advances in Biological Informatics (ABI) program, grant # 1062432
• Provides access to genomics analysis software on supercomputers customized for genomics studies INCLUDING POD IU
• Particularly focused on supporting genome assembly codes such as:
– de Bruijn graph methods: SOAPdeNovo, Velvet, ABySS
– consensus methods: Celera, Newbler, Arachne 2
• For more information, see http://ncgas.org


Summary

• IU and its partners are collaborating with Penguin Computing Inc. to implement a new model of ‘above campus’ services that provides many of the advantages of “cloud” services, while avoiding many of the drawbacks.

• The service provided is “Cluster as a Service” – a real, high performance supercomputer cluster

• Access is simple – if you are at a .edu or an FFRDC, get out your credit card and go computing
• As examples of effective campus bridging:
– This service is being supported by the IU National Center for Genome Analysis Support
– IU is providing the open source components of the XSEDE software environment to provide a “run-like” XSEDE environment that you can access in minutes with a credit card

• Establishing this partnership is possible through the involvement of our key academic partners: University of California Berkeley, University of Virginia, University of Michigan


Absolutely Shameless Plugs

• XSEDE12: Bridging from the eXtreme to the campus and beyond
• July 16-20, 2012 | Chicago
• The XSEDE12 Conference will be held at the beautiful Intercontinental Chicago (Magnificent Mile) at 505 N. Michigan Ave. The hotel is in the heart of Chicago's most interesting tourist destinations and best shopping.

• Watch for Calls for Participation – coming early January

• And please visit the XSEDE and IU displays in the SC11 Exhibition Hallway!


For more information…

• https://podiu.penguincomputing.com/
• http://pti.iu.edu/ci/systems/rockhopper


Thanks

• Penguin Computing, Inc. for their willingness to forge new paths with IU
• Staff of the Research Technologies Division of University Information Technology Services, affiliated with the Pervasive Technology Institute, who were involved in the implementation of Rockhopper: George Turner, Robert Henschel, David Y. Hancock, Matthew R. Link, Richard Knepper
• Those involved in campus bridging activities: Guy Almes, Von Welch, Patrick Dreher, Jim Pepin, Dave Jent, Stan Ahalt, Bill Barnett, Therese Miller, Malinda Lingwall, Maria Morris, Gabrielle Allen, Jennifer Schopf, Ed Seidel
• All of the IU Research Technologies and Pervasive Technology Institute staff who have contributed to the development of IU’s advanced cyberinfrastructure and its support
• NSF for funding support (Awards 040777, 1059812, 0948142, 1002526, 0829462, 1062432, OCI-1053575 – which supports the Extreme Science and Engineering Discovery Environment)
• Lilly Endowment, Inc. and the Indiana University Pervasive Technology Institute
• Any opinions presented here are those of the presenter and do not necessarily represent the opinions of the National Science Foundation or any other funding agencies