Jetstream: Accessible cloud computing for the national science and engineering communities
Funded by the National Science Foundation, Award #ACI-1445604
Matthew Vaughn (@mattdotvaughn)
ORCID 0000-0002-1384-4283
Director, Life Science Computing
Texas Advanced Computing Center
PI @ Jetstream | Cyverse | Araport | CODE@TACC
“Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.”
NIST Definition of Cloud Computing
http://www.nist.gov/itl/csd/cloud-102511.cfm
SCIENTIFIC COMPUTING THEN
• C/C++/FORTRAN/PERL/SHELL
• MPI
• LAPACK/BLAS/PETSC
• GRID ENGINE
• UNIX*
• X86/PPC/SPARC
• COPROCESSORS
SCIENTIFIC COMPUTING NOW
LANGUAGES
• Python 2 & 3
• R
• Julia
• Perl
• MATLAB
• Java
• Scala, Clojure, etc.
• .NET
• C/C++
• Swift
• Haskell
• Go
• JavaScript
FRAMEWORKS
• MapReduce: Hadoop, Storm, Pachyderm, Cloudera
• Event & Streaming: Kinesis, Azure Stream Analytics, Camel, Streambase
• Deep/Machine Learning: Watson, Azure BI, TensorFlow, Caffe
• In-memory parsing: Kognitio, Apache Spark
• Containers: Docker, Rocket, MESOS, Kubernetes
• Cloud: AWS, GCE, OpenStack, vCloud, Azure
HARDWARE
• Many-core computing: 50-100 threads/node*
• Xeon / Xeon Phi
• GPU
• OpenPower
• ARM
• ShenWei
• Google TPU
• Multi-level memory architecture
• Hierarchical storage
• FPGAs
• Quantum-like systems
http://jetstream-cloud.org/
What is Jetstream?
A production cloud platform for NSF-sponsored researchers
• Provides on-demand interactive computing and analysis
• Enables configurable environments and architectures
• Supports computational reproducibility and sharing
• Democratizes access to cloud-native software
• Focused on ease of use for all adopters
Expands the community of users who benefit from NSF investment in shared cyberinfrastructure
Leadership class machines
Traditional HPC, HTC systems
Jetstream serves the Long Tail of Science, Engineering, and Education
Researchers, developers, and scientists who
– Need between 1 and a few hundred cores
  • RIGHT NOW
  • For the foreseeable future
  • Not forever
– Want to fully customize the OS and configuration for their research computing environment
– Are working with cloud-native applications & workflows
– Use interactive mode for their computing & analytics
Jetstream is also useful to
– Science gateway operators
  • Run web applications, databases, and services (front-end)
  • Use it for on-demand provisioned compute capacity
– STEM educators teaching a variety of subjects
  • Create a reference VM appliance
  • Provision an entire classroom
  • Minimize need for local IT support
  • Onboard students into XSEDE ecosystem
21st Century Workforce Development in the Cloud
• Specialized virtual Linux desktops and applications enable research and research education at small colleges and universities
  – HBCUs (Historically Black Colleges and Universities)
  – MSIs (Minority Serving Institutions)
  – Tribal colleges
  – Higher-education institutions in EPSCoR States
• Democratize access to cloud-native technologies and approaches – no credit card or PO needed
User Perspectives
Role: Laboratory scientist
• Learned how to use the shell and how to work with Linux
• Mastered using R to develop plots for his manuscript
• Published VM to IU ScholarWorks to allow reproducible analysis
Role: Informatics specialist
• Launches an instance with full sudo access to customize it
• Developed software with R and Python library dependencies
• Updates it regularly by creating VM image snapshots
Role: Core facility staff
• Linked several Jetstream instances with Apache Hadoop
• Worked with XSEDE ECSS to import existing Amazon image
• Built a simple science gateway to allow others to use his tools
System Overview
Flavor      vCPUs   RAM (GB)   Storage (GB)   Per Node
m.tiny          1          2             20         46
m.small         2          4             40         23
m.medium        6         16            130          7
m.large        10         30            230          4
m.xlarge       22         60            460          2
m.xxlarge      44        120            920          1
VM Host Configuration
• Dual Intel E5-2680 v3 “Haswell”
• 24 physical cores/node @ 2.5 GHz (Hyperthreading on)
• 128 GB RAM
• Dual 1 TB local disks
• 10 Gb/s dual uplink NIC
• Running CentOS 7 + KVM hypervisor
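The flavor table and host configuration above can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes RAM and storage are in GB, that hyperthreading gives 2 hardware threads per physical core (48 per host), and the helper names `smallest_flavor` and `fits_on_host` are made up for this example rather than part of any Jetstream tool.

```python
from typing import NamedTuple, Optional


class Flavor(NamedTuple):
    name: str
    vcpus: int
    ram_gb: int    # assumed unit: GB
    disk_gb: int   # assumed unit: GB
    per_node: int  # instances of this flavor per VM host


# Values copied from the flavor table, ordered smallest to largest.
FLAVORS = [
    Flavor("m.tiny", 1, 2, 20, 46),
    Flavor("m.small", 2, 4, 40, 23),
    Flavor("m.medium", 6, 16, 130, 7),
    Flavor("m.large", 10, 30, 230, 4),
    Flavor("m.xlarge", 22, 60, 460, 2),
    Flavor("m.xxlarge", 44, 120, 920, 1),
]

HOST_THREADS = 24 * 2  # 24 physical cores, hyperthreading on (assumed 2 threads/core)
HOST_RAM_GB = 128


def smallest_flavor(vcpus: int, ram_gb: int, disk_gb: int) -> Optional[Flavor]:
    """Return the first (smallest) flavor that meets all three requirements."""
    for flavor in FLAVORS:
        if (flavor.vcpus >= vcpus and flavor.ram_gb >= ram_gb
                and flavor.disk_gb >= disk_gb):
            return flavor
    return None  # nothing in the table is big enough


def fits_on_host(flavor: Flavor) -> bool:
    """Check that per_node instances do not exceed host threads or RAM."""
    return (flavor.vcpus * flavor.per_node <= HOST_THREADS
            and flavor.ram_gb * flavor.per_node <= HOST_RAM_GB)


if __name__ == "__main__":
    choice = smallest_flavor(vcpus=4, ram_gb=8, disk_gb=100)
    print(choice.name if choice else "no flavor fits")
    print(all(fits_on_host(f) for f in FLAVORS))
```

Under these assumptions every row of the table stays within one host's threads and memory, which suggests the "Per Node" column is a packing count rather than an oversubscription ratio.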
Hardware Specifics
CEPH Storage
• 20x Dell 730xd per cloud
• 2x10 Gb/s bonded NIC per 730xd
• Running CEPH 0.94.5 Hammer
• Configured as OpenStack Storage
• Storage is XSEDE-allocated
• Implemented on backend as OpenStack Volumes
• Each user gets 10 volumes, up to 500 GB total storage
• Exploring object storage as well, but that’s in the future
Platform Overview
[Architecture diagram. Labeled components: Web App; Atmosphere API; Globus Auth; Atmo Services; XSEDE Services; OpenStack + CEPH clouds at Indiana University, TACC, and potentially others]
Key components: What is OpenStack?
• Free and open source software for creating private and public clouds
• Started in 2010 by Rackspace and NASA, now managed by OpenStack Foundation
• Widespread adoption across industry and public sector
• Modular architecture with a 6-month update cycle
https://www.openstack.org/
Key components: Everything else
Component: Function
• Atmosphere: Web application + middleware for providing user-friendly provider-agnostic IaaS
• Globus Auth: Provides powerful identity and access management functions that are easily integrated into web and mobile applications
• XSEDE Services: Centralized account management, allocation, and reporting
• CEPH: Distributed object store and file system designed to provide excellent performance, reliability and scalability
Integration with XSEDE via Globus Auth
• The Atmosphere Web App uses, and Globus Auth implements, industry-standard OAuth2
• Leaves us flexibility on identity and access
• Globus Auth implements (in beta) the OAuth2 password grant flow, which means Jetstream access is 100% scriptable with your XSEDE credentials
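As a rough illustration of what "scriptable" means here, the sketch below builds (but does not send) a generic OAuth2 password-grant token request as described in RFC 6749, section 4.3. The token URL, client ID, and client secret are placeholders invented for this example; the slides do not give the real Globus Auth endpoint or any registration details, so treat everything service-specific as an assumption.

```python
import base64
import urllib.parse
import urllib.request

# Hypothetical endpoint: the slides do not state the real token URL.
TOKEN_URL = "https://auth.example.org/oauth2/token"


def build_password_grant_request(username, password, client_id, client_secret,
                                 scope=None, token_url=TOKEN_URL):
    """Build a POST request for the OAuth2 resource-owner password grant."""
    fields = {
        "grant_type": "password",
        "username": username,
        "password": password,
    }
    if scope:
        fields["scope"] = scope
    body = urllib.parse.urlencode(fields).encode("ascii")

    # The client authenticates itself with HTTP Basic auth.
    pair = f"{client_id}:{client_secret}".encode("utf-8")
    basic = base64.b64encode(pair).decode("ascii")

    return urllib.request.Request(
        token_url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


if __name__ == "__main__":
    # Placeholder credentials; nothing is sent over the network here.
    req = build_password_grant_request("alice", "s3cret", "my-client", "my-secret")
    print(req.get_method(), req.full_url)
    print(req.data.decode("ascii"))
```

Passing the request to `urllib.request.urlopen` would perform the exchange against a real endpoint; a successful OAuth2 response is a JSON document containing an `access_token` field.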
GateOne Web Shell
• Zero-install SSH client
• Lets tablet and Windows users reach the system easily
• Supports screen-casting and terminal sharing
Can also use native SSH client with SSH keys
Jetstream is programmable cyberinfrastructure
Web Service APIs
• OpenStack: official and unofficial clients + libs (e.g. boto)
• EC2*: integration with AWS-specific code
• Atmosphere: alpha. Getting language libraries “soon”
  • Preview @ http://docs.atmospherev2.apiary.io/
Finicky
Automation, Orchestration, and Workflow
• Marathon/MESOS
  • https://www.youtube.com/v/VzZfwHLmcL0
• Docker Machine + Docker Swarm
• Kubernetes
• Apache Airavata
• CloudMan & Elasticluster
• OpenStack Magnum
Configuration Management Tools
• Vagrant / Terraform
• Chef
• Ansible
• Puppet
In Dev
OpenStack API client
Jetstream is like any other OpenStack cloud
• Scriptable access to low-level APIs
• Use language libraries (Python boto, etc.)
• Works* with 3rd party configuration and orchestration tools
openstack image list
“Show all VM images on this cloud”
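The same command can be driven from a script. The sketch below is one possible wrapper, not an official Jetstream client: it assumes the `openstack` command-line client is installed and already configured with credentials, and that its JSON formatter (`-f json`) emits records with `ID`, `Name`, and `Status` keys. The image IDs and names in the sample data are invented for illustration.

```python
import json
import subprocess


def parse_image_list(json_text):
    """Return (id, name) pairs for images whose status is 'active'."""
    images = json.loads(json_text)
    return [(img["ID"], img["Name"])
            for img in images
            if img.get("Status") == "active"]


def list_images():
    """Run `openstack image list -f json` and parse its output.

    Requires the openstack CLI on PATH and valid cloud credentials.
    """
    result = subprocess.run(
        ["openstack", "image", "list", "-f", "json"],
        capture_output=True, text=True, check=True,
    )
    return parse_image_list(result.stdout)


if __name__ == "__main__":
    # Made-up sample output, so the parser can be tried without a cloud.
    sample = json.dumps([
        {"ID": "0001", "Name": "example-centos", "Status": "active"},
        {"ID": "0002", "Name": "example-ubuntu", "Status": "queued"},
    ])
    for image_id, name in parse_image_list(sample):
        print(image_id, name)
```

Keeping the parsing separate from the subprocess call makes the logic testable offline and leaves room to swap the CLI for a language library later.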
Example: Elasticluster
Status: In Development
elasticluster start slurm-js-iu
“Launch a 3-node SLURM cluster with Gluster storage”
Cluster in a box
• Slurm, SGE, Torque
• Centos, Ubuntu
• Hadoop
• Gluster, Ceph, NFS
• Ansible -> XSEDE XNIT-compatible clusters
Example: Marathon, MESOS, Docker, Unidata
What comes next?
• Passed acceptance (handily) in May
• OpenStack clouds, CEPH storage, and software components functional
  – Substantial software integration and development required post-acceptance. Terra incognita!
• Running in “Early operations mode”
  – Extra maintenance days
  – A few rough edges (especially for power users)
• Full production: Sep 1, 2016
• More features and capabilities will come
• Public roadmap soon
Free tier makes it really easy to get started on public cloud$
Our idea: Let any user with an active XSEDE User Portal account use a small (but functional) slice of Jetstream
• Get an XSEDE account
• Sign in to XSEDE User Portal
• Click the “Trial Jetstream Access” button
• Get access to Jetstream in about 30 minutes
Jetstream in context
COMET
• HTC system
• GPGPUs
• Virtual Clusters
• Gateways
WRANGLER
• Data Intensive
• Portal-based configuration
• Services
• Data sets
BRIDGES
• Large Memory
• Data Intensive Computing
• Managed VM
JETSTREAM
• Self-service VM
• Gateways
• Minimal disk
• Extensible
Jetstream in context (2)
Blue Waters & Stampede 1/2
Comet, Bridges, Regional HPC, Public Cloud$
Summary
Jetstream is a public, production cloud platform
• Offers on-demand interactive shell + VNC
• Supports configurable software environments
• Enables configurable computing resources
• Encourages work with cloud-native software
• Empowers novice, intermediate, & expert users
Partners
• Construction
• Application / Community Leads
• Management & Operations
• Vendors
@mattdotvaughn
www.slideshare.net/mattdotvaughn
@jetstream_cloud
http://use.jetstream-cloud.org/
http://portal.xsede.org
https://github.com/jetstream-cloud
How can I use Jetstream?
• An XSEDE User Portal (XUP) account is required. They are free! Get one at https://portal.xsede.org
• Read the Allocations Overview: https://portal.xsede.org/allocations-overview
• Write a successful allocation request, starting with a Startup or Education request: https://portal.xsede.org/successful-requests
Where can I get help or learn more?
• Production:
  – Web app: http://use.jetstream-cloud.org/
  – User guides: https://portal.xsede.org/jetstream
  – XSEDE KB: https://portal.xsede.org/knowledge-base
  – Email: [email protected]
  – Campus Champions: https://www.xsede.org/campus-champions
  – Training Videos / Virtual Workshops (TBD)
Expanding NSF XD’s reach and impact
Around 299,000 researchers, educators, & learners received NSF support in 2012-2013
– Only 1.5% completed a computation, data analysis, or visualization task on XSEDE program resources
– Less than 3% had an XSEDE Portal account
– 70% of researchers surveyed* claimed to be resource constrained
Why don’t they use XSEDE systems?
– Activation energy is pretty high
– HPC resources are scarce and not well-matched to their needs
– They just don’t need that much capability
* https://www.xsede.org/xsede-nsf-release-cloud-survey-report