EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research
Rich Wolski
Chris Grzegorczyk, Dan Nurmi, Graziano Obertelli, Shriram Rajagopalan, Sunil
Soman, Lamia Youseff, Dmitrii Zagorodnov
Computer Science Department
University of California, Santa Barbara
Exciting Weather Forecasts
Commercial Cloud Formation
What is a Cloud?
SLAs
Web Services
Virtualization
How do they work?
• What can and cannot easily be hosted in a cloud?
• What extensions or modifications are required to support a wider variety of services and applications?
  - Scientific computing
  - Data assimilation
  - Multiplayer gaming
• How can cloud computing be coupled with other distributed software systems and infrastructure?
  - How should clouds and mobile devices (e.g. cell phones) interact?
• Open Source Cloud
  - Simple
  - Extensible
  - Based on widely available and popular technologies
  - Easy to install and maintain
The Skies are Opening
• Nimbus (Freeman and Keahey, University of Chicago)
  - Client-side cloud-computing interface to Globus-enabled TeraPort cluster at U of C
  - Based on GT4 and the Globus Virtual Workspace Service
    - Lots of cool features
    - Great if local resources are GT4 proficient
    - Tutorials and documentation in “grid space”
• Enomalism
  - Start-up company distributing open source
  - REST APIs
  - User “dashboard”
  - Multi-virtualization support
  - Lots of extended cloud services
  - Beta version now available for download from SourceForge
• Elastic Utility Computing Architecture Linking Your Programs To Useful Systems
• Web services based implementation of elastic/utility/cloud computing infrastructure
  - Linux image hosting à la Amazon
• How do we know if it is a cloud?
  - Try to emulate an existing cloud: EC2 + S3
  - Works with command-line tools from Amazon without modification
  - Enables leverage of emerging EC2 value-added service venues (e.g. RightScale)
• Functions as a software overlay
  - Existing installation should not be violated (too much)
• “One-button” install using Rocks
  - “System Administrators are people too.”
Goals for Eucalyptus
• Foster research in elastic/cloud/utility computing
  - Models of service provisioning, scheduling, SLA formulation, hypervisor portability and feature enhancement, etc.
• Experimentation vehicle prior to buying commercial services
  - “Tech Preview” using local machines with local system administration support
• Provide a debugging and development platform for EC2 (and other clouds)
  - Allow the environment to be set up and tested before it is instantiated in a for-fee environment
• Provide a basic software development platform for the open source community
  - E.g. the “Linux Experience”
• Not designed as a replacement technology for EC2 or any other cloud service
Challenges
• Extensibility
  - Simple architecture and open internal APIs
• Client-side interface
  - Amazon’s EC2 interface and functionality (familiar and testable)
• Networking
  - Virtual private network per cloud
  - Must function as an overlay => cannot supplant local networking
• Security
  - Must be compatible with local security policies
• Packaging, installation, maintenance
  - System administration staff is an important constituency for uptake
Eucalyptus Architecture: WS-Cloud
[Figure: architecture diagram. Labeled components: Client-side API Translator, Amazon EC2 Interface, Cloud Controller, Database, Cluster Controller, Node Controller]
EC2 Compatibility
• Interface is based on Amazon’s published WSDL
  - 2008 compliant except for
    - Static IP address assignment
    - Security groups
  - “Availability” zones correspond to individual clusters
  - Uses the EC2 command-line tools downloaded from Amazon
  - REST interface
• S3 support/emulation: not yet, but on its way
  - Images accessed by file system name instead of S3 handle for the moment
    - Unless user wants to use the actual S3 and pay for the egress charges
• System administration is different
  - Eucalyptus defines its own Cloud Admin. tool set for user accounting and cloud management
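The mapping of availability zones onto clusters can be illustrated with a small sketch. This is not Eucalyptus code: the class names, the `free_slots` counter, and the instance-id format are all hypothetical, and the only point taken from the slide is that a zone name selects exactly one cluster.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """One cluster, exposed to clients as an availability zone (hypothetical model)."""
    name: str
    free_slots: int  # assumed capacity counter, not a Eucalyptus concept
    instances: list = field(default_factory=list)


class CloudController:
    """Hypothetical front end that resolves a zone name to a single cluster."""

    def __init__(self, clusters):
        # One zone per cluster: the zone name is simply the cluster name.
        self.zones = {cluster.name: cluster for cluster in clusters}

    def describe_availability_zones(self):
        return sorted(self.zones)

    def run_instances(self, image, count, zone):
        cluster = self.zones.get(zone)
        if cluster is None:
            raise KeyError(f"unknown availability zone: {zone}")
        if count > cluster.free_slots:
            raise RuntimeError(f"zone {zone} has only {cluster.free_slots} free slots")
        cluster.free_slots -= count
        # Made-up instance ids, for illustration only.
        first = len(cluster.instances)
        started = [f"{zone}/{image}/{first + i}" for i in range(count)]
        cluster.instances.extend(started)
        return started
```

A request that names a zone is therefore served entirely by one cluster, and a request that exceeds that cluster's capacity fails rather than spilling into another zone.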
Networking
• Eucalyptus does not assume that all worker nodes will have publicly routable IP addresses
  - Each cloud allocation will have one or more public IP addresses
  - All cloud images have access to a private network interface
• Two types of networks internal to a cloud allocation
  - Virtual private network
    - Uses VDE interfaced to Xen that is set up dynamically
    - Substantial performance hit within a cluster
    - Allows a cloud allocation to span clusters
  - High-performance private network (availability zone)
    - Bypasses VDE and uses local cluster network for each allocation
    - Runs at “native” network speed (i.e. with Xen)
    - Cloud allocations cannot span clusters
• Availability zone approach fits with Amazon’s high-level semantics
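The trade-off between the two network types can be written down as a tiny decision rule. The sketch below is an assumption about how one might choose between them, not the selection logic Eucalyptus uses. It relies only on the stated properties: the VDE overlay can span clusters but is slower, and the local cluster network is fast but confined to one cluster. The names `NetworkMode` and `choose_network_mode` are hypothetical.

```python
from enum import Enum


class NetworkMode(Enum):
    VDE_OVERLAY = "virtual private network (VDE)"
    LOCAL_CLUSTER = "high-performance private network"


def choose_network_mode(node_clusters):
    """Pick a network type for one cloud allocation.

    node_clusters holds the cluster name hosting each requested instance.
    Assumed policy: prefer the fast local network whenever it is possible.
    """
    clusters = set(node_clusters)
    if not clusters:
        raise ValueError("an allocation needs at least one instance")
    if len(clusters) == 1:
        # Everything sits in one cluster, so VDE can be bypassed.
        return NetworkMode.LOCAL_CLUSTER
    # Only the overlay lets an allocation span clusters.
    return NetworkMode.VDE_OVERLAY
```

For example, `choose_network_mode(["a", "a", "b"])` selects the VDE overlay, because the allocation touches two clusters.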
Virtual Network: Ethernet Overlay
[Figure: Ethernet overlay diagram. Nodes are labeled “vde”; one link is labeled “ssl”]
Performance of the Virtual Network
Security
• All Eucalyptus components use WS-Security for authentication
  - Encryption of inter-component communication is not enabled by default
    - Configuration option
• SSH key generation and installation à la EC2 is implemented
  - Cloud controller generates the public/private key pairs and installs them
• User sign-up is web based
  - User specifies a password and submits sign-up request
  - Cert is generated but withheld until admin. approves request
  - User gains access to cert. through password-protected web page
    - Similar to EC2 model without the credit cards
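The sign-up flow above amounts to a small state machine: request, approval, then password-protected retrieval. The following is a minimal sketch of that flow under several assumptions. The class and method names are hypothetical, the "cert" is a random placeholder string rather than a real certificate, and storing the password as a PBKDF2 hash is this example's choice, not something the slides describe.

```python
import hashlib
import hmac
import secrets


class SignupRegistry:
    """Hypothetical model of the web sign-up flow (not Eucalyptus code)."""

    _ITERATIONS = 100_000  # assumed PBKDF2 work factor

    def __init__(self):
        self._accounts = {}

    def _hash(self, password, salt):
        return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, self._ITERATIONS)

    def request_signup(self, user, password):
        salt = secrets.token_bytes(16)
        self._accounts[user] = {
            "salt": salt,
            "digest": self._hash(password, salt),
            # Generated right away, but withheld until approval.
            "cert": secrets.token_hex(16),  # placeholder, not an X.509 cert
            "approved": False,
        }

    def approve(self, user):
        """Administrator approves the pending request."""
        self._accounts[user]["approved"] = True

    def fetch_cert(self, user, password):
        account = self._accounts.get(user)
        if account is None:
            raise KeyError(f"no sign-up request for {user}")
        candidate = self._hash(password, account["salt"])
        if not hmac.compare_digest(candidate, account["digest"]):
            raise PermissionError("wrong password")
        if not account["approved"]:
            raise PermissionError("request not yet approved by the administrator")
        return account["cert"]
```

The credential exists from the moment of sign-up, but both the administrator's approval and the user's password are needed before it is handed out.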
Packaging, Installation, and Deployment
• Rocks
  - “One-button” install per cluster
  - Requires Rocks V (the most current release) for Xen support
  - If you know what you are doing, RPMs can be extracted and installed manually
  - Multiple clusters require a configuration file
    - Multi-cluster configuration tools à la Rocks not readily available
• Build-from-source
  - “Many-button” install
    - Instructions, scripts, rsync, and perseverance
• Single-machine “cloud”
  - All components run in dom0
  - Need to resolve port conflicts by hand
What’s it Made Out Of?
• Axis2 and Axis2c version 1.4.0
• Hibernate 3.2.2
• HSQLDB 1.8.0
• jetty 6.1.9
• JiBX (March 30th sourceforge)
• Mule 2.0.1
• Rampart version 1.3
• libvirt version 0.4.2
• socat-1.6.0
• VDE version 2.2.0-pre2
Eucalyptus Public Cloud
• Free, time-limited access to a Eucalyptus installation at UCSB
  - Only installed images can be run (i.e. no image uploading)
  - 4 VM limit
  - 6 hour limit
  - Reverse firewall
• Configuration
  - 8 Pentium Xeon processors (3.2 GHz)
  - 2.5 GB of memory per image
  - 36 GB of disk space
  - 1 Gb Ethernet interconnect
  - Local availability zone only (i.e. no VDE)
  - Debian 4.0, Linux v2.6.18-xen-3.1
  - Xen 3.2
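The public cloud's usage limits (4 VMs, 6 hours) are simple enough to express as a quota check. The sketch below assumes that both limits apply per user and that the time limit is measured from each VM's start. The slides do not say either way, and none of these names come from Eucalyptus.

```python
from datetime import datetime, timedelta

MAX_VMS_PER_USER = 4               # "4 VM limit" (assumed to be per user)
MAX_LIFETIME = timedelta(hours=6)  # "6 hour limit" (assumed to be per VM)


class PublicCloudQuota:
    """Hypothetical bookkeeping for the public cloud limits."""

    def __init__(self):
        self._running = {}  # user -> list of VM start times

    def expire(self, now):
        """Drop every VM that has reached the lifetime limit."""
        for starts in self._running.values():
            starts[:] = [t for t in starts if now - t < MAX_LIFETIME]

    def start_vm(self, user, now):
        """Return True if the user may start another VM at time `now`."""
        self.expire(now)
        starts = self._running.setdefault(user, [])
        if len(starts) >= MAX_VMS_PER_USER:
            return False
        starts.append(now)
        return True

    def running(self, user):
        return len(self._running.get(user, []))
```

Passing the clock in as `now` keeps the sketch deterministic. A real service would read the current time itself and also terminate the expired VMs.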
Demo
EC2 and EPC Throughput
EC2 and EPC RTT
Single Instance
Four Instances
Eight Instances
Version History
• Eucalyptus version 1.0 became available for public release 5/28/08 (Rocks binary only)
• Version 1.1 shipped 7/1/2008
  - Bug fixes
  - Decent WS-Security implementation
  - REST interface
  - Source code release
  - Build-from-source “guidance” scripts and instructions
• Version 1.2 shipped 8/1/2008
  - Primarily a bug-fix release
  - Upgrade mechanism (instead of re-install)
• Version 1.3 shipped 8/23/2008
  - Amazon changed their client-side tools
Next Releases
• Version 1.4 (expected 11/5/2008)
  - S3 support uses local file system
  - Administrator definable SLAs
  - Cross cluster layer 2 networking
  - Elastic IPs and security groups, metadata service
  - User-defined image management and registration
• Version 1.5 (expected 1/1/09)
  - Elastic Block Store (EBS)
  - VLAN safe layer 3 networking
  - Credential federation support
  - DB managed configuration support
  - Distributed DB state management (maybe)
• Should be fully 2008 interface compatible in Release 1.5
Next Generation Eucalyptus Networking
• Multiple networking implementations
  - Open Source + academic environment == overlay or nothing
  - Some sites are willing to tolerate a more invasive networking approach in exchange for performance and scalability
  - Three different approaches
    - Exploit Xen network interface isolation and VLANs
      (+) software only approach
      (-) will make Eucalyptus more Xen dependent
    - IP-tables and NATs
      (+) high-level software only approach
      (-) possible conflicts with existing IP-tables configuration(s)
    - Hardware-supported VLANs and trunking
      (+) fast and scalable
      (-) requires on-line access to VLAN configuration interface
More Plans
• Hypervisor religiosity and secularism
  - Current implementation uses a subset of the libvirt interface
    - Xen, VMware, KVM
  - Eucalyptus + Xen + VMware “works” but is clearly not the right answer
  - Hyper-V
    - Initial study makes it look quite doable for virtualization support
    - Understanding the networking is next on the list
    - Port of the Eucalyptus components to .NET
• UCSB Campus Cloud(s)
  - UC Cyberinfrastructure pilot
  - Test installation up at California Nanosystems Institute (CNSI)
  - Leverage UCSB VMware installation and Eucalyptus installation at SDSC
  - Requires a very rich user accounting system
Ancillary Projects
• Google App Engine
  - AppDrop will run App Engine inside EC2
  - Port AppDrop to Eucalyptus
  - Port App Engine to HBase and/or Hypertable
  - Should provide an interesting research vehicle
• RightScale
  - Local enterprise focused on providing Ruby-on-Rails infrastructure for EC2
  - “Turing Test” for Eucalyptus
    - Can RightScale “tell” that it isn’t talking to EC2?
  - Requires that the REST interface be solid
  - Testing now against the EPC
Clouds Versus Grids
• Clouds and Grids are distinct
• Cloud
  - Full private cluster is provisioned
  - Individual user can only get a tiny fraction of the total resource pool
  - No support for cloud federation except through the client interface
  - Opaque with respect to resources
• Grid
  - Built so that individual users can get most, if not all, of the resources in a single request
  - Middleware approach takes federation as a first principle
  - Resources are exposed, often as bare metal
• These differences mandate different architectures for each
Lessons Learned so Far
• Open source for cloud computing constrains design more than we thought it would
  - More of the technical challenge centers on dealing with local configuration choices
  - Multi-cluster service ensemble really isn’t a typical open source tool
    - Do we really need a laptop edition?
• Administrators in the “real world” still build clusters by hand
  - We thought the use of Rocks early on would make us heroes -- it hasn’t
  - In HPC space, admin time is *really* expensive
• There are few, if any, cloud configuration tools available
  - Red Hat, Debian, CentOS, Ubuntu => Linux packaging and deployment
  - Rocks => cluster packaging and deployment
  - ??? => cloud packaging and deployment?
Thanks, More Information, and Help!
• National Science Foundation
  - VGrADS Project
• SDSC, CNSI, IU, Rice University
• RightScale.com
• The Eucalyptus Development Team at UCSB is
  - Chris Grzegorczyk
  - Dan Nurmi
  - Graziano Obertelli
  - Shriram Rajagopalan
  - Sunil Soman
  - Lamia Youseff
  - Dmitrii Zagorodnov
• http://eucalyptus.cs.ucsb.edu