Magellan: Experiences from a Science Cloud Lavanya Ramakrishnan.
Magellan: Experiences from a Science Cloud
Lavanya Ramakrishnan
Magellan Overview
• Mission
Determine the appropriate role for private cloud computing for mid-range tightly coupled computational models
Layout
• Describe experiences with cloud software stack
– Eucalyptus 1.6.2
– MapReduce: Hadoop
• Early science use cases and impact on application design and development
• Detail specific requirements for scientific use
Experience with Private Cloud Software
• Eucalyptus (1.6.2)
– open source IaaS (infrastructure as a service) software
– API compatible with Amazon
– support for Elastic Block Store, Elastic IP addresses
Experiences with Eucalyptus
• Scalability
– all VM network traffic is routed through a single cluster controller node
* pro: good for security
* con: network bottleneck, restricts scalability
– limited to 750-800 concurrent VMs due to a messaging size limit
• Image Management
– need system administration skills
– need to create, manage and upload correct images
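The scalability point above (all VM traffic passing through one cluster controller node) can be made concrete with a little arithmetic. The following is a minimal sketch, not a measurement from Magellan: the function name and the 10 Gb/s link capacity are assumptions chosen for illustration, and it assumes every VM sends at the same time and the link is shared evenly.

```python
def per_vm_bandwidth_mbps(link_capacity_mbps: float, active_vms: int) -> float:
    """Even share of one controller link among VMs that all send at once."""
    if active_vms <= 0:
        raise ValueError("active_vms must be positive")
    return link_capacity_mbps / active_vms


# Hypothetical 10 Gb/s uplink on the cluster controller (not a Magellan figure).
LINK_MBPS = 10_000.0

for vms in (50, 200, 800):
    share = per_vm_bandwidth_mbps(LINK_MBPS, vms)
    print(f"{vms:4d} VMs -> {share:7.1f} Mb/s each")
```

Under these assumptions the per-VM share falls in direct proportion to the VM count, which is why a single routing node restricts scalability for tightly coupled applications.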
Experiences with Eucalyptus
• Co-existence with other services
– Eucalyptus uses a number of system services and assumes it has complete control of the system
• Allocation and Accounting
– hard to ensure fairness since allocation is first come, first served
• Logging and Monitoring
– limited support
– recovery: loss of IP address assignments requires restarting all running instances
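To illustrate the fairness concern under Allocation and Accounting, here is a small sketch that contrasts first-come-first-served allocation with a simple round-robin fair share. Neither function is part of Eucalyptus; the names `fcfs` and `max_min_fair`, the user names, and the capacity of 100 VMs are all hypothetical.

```python
def fcfs(capacity: int, requests: list[tuple[str, int]]) -> dict[str, int]:
    """Grant requests in arrival order until capacity runs out."""
    grants = {}
    remaining = capacity
    for user, wanted in requests:
        granted = min(wanted, remaining)
        grants[user] = granted
        remaining -= granted
    return grants


def max_min_fair(capacity: int, requests: list[tuple[str, int]]) -> dict[str, int]:
    """Hand out one VM at a time, round-robin, to users who still want more."""
    wanted = dict(requests)  # assumes each user appears once
    grants = {user: 0 for user in wanted}
    remaining = capacity
    while remaining > 0:
        unsatisfied = [u for u in wanted if grants[u] < wanted[u]]
        if not unsatisfied:
            break
        for user in unsatisfied:
            if remaining == 0:
                break
            grants[user] += 1
            remaining -= 1
    return grants


# Hypothetical requests, listed in arrival order.
requests = [("alice", 90), ("bob", 40), ("carol", 20)]
print(fcfs(100, requests))          # {'alice': 90, 'bob': 10, 'carol': 0}
print(max_min_fair(100, requests))  # {'alice': 40, 'bob': 40, 'carol': 20}
```

With first come, first served, the earliest large request starves later users; the round-robin variant is one possible site policy, shown here only to make the contrast visible.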
Experiences with Hadoop
• File System Access
(1) Hadoop considers only the data locality of a single file and does not handle applications that might have multiple input sets.
(2) HDFS also does not expose a POSIX interface, which makes it difficult for legacy applications to leverage the file system directly.
• Configuration
(1) Hadoop has a number of site-specific and job-specific parameters that are hard to tune to achieve optimal performance.
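As a sketch of what locality across multiple input sets could mean, the snippet below picks the node that already holds the most bytes summed over all of a task's inputs. This is not how Hadoop schedules tasks and uses no Hadoop API; the function `best_node`, the file names, and the node names are hypothetical.

```python
from collections import defaultdict


def best_node(inputs: dict[str, dict[str, int]]) -> tuple[str, int]:
    """Return the node holding the most bytes across all of a task's inputs.

    `inputs` maps input name -> {node name: bytes of that input stored there}.
    """
    local_bytes = defaultdict(int)
    for replicas in inputs.values():
        for node, nbytes in replicas.items():
            local_bytes[node] += nbytes
    # Iterate nodes in sorted order so ties resolve deterministically.
    node = max(sorted(local_bytes), key=lambda n: local_bytes[n])
    return node, local_bytes[node]


# Hypothetical task that reads two input sets with different replica placement.
task_inputs = {
    "events.dat": {"node1": 640, "node2": 640},
    "calibration.dat": {"node2": 128, "node3": 128},
}
print(best_node(task_inputs))  # ('node2', 768)
```

A scheduler that looked only at `events.dat` would treat node1 and node2 as equally good; summing over both inputs shows node2 avoids the extra remote read.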
Application Case Studies
STAR – Streamed real-time data analysis
Details
• STAR performed real-time analysis of data coming from Brookhaven National Laboratory
• Need on-demand access to computing resources to process real-time data
• Clouds as a platform for this application
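The on-demand requirement above amounts to sizing a VM pool to the incoming data rate. The following is a rough sketch under stated assumptions, not the STAR workflow: `vms_needed`, the data rates, and the 25% headroom are hypothetical values picked for illustration.

```python
import math


def vms_needed(incoming_mb_per_s: float, per_vm_mb_per_s: float,
               headroom: float = 0.25) -> int:
    """Smallest VM count whose combined throughput covers the stream plus headroom."""
    if per_vm_mb_per_s <= 0:
        raise ValueError("per_vm_mb_per_s must be positive")
    return math.ceil(incoming_mb_per_s * (1 + headroom) / per_vm_mb_per_s)


# Hypothetical rates: a 100 MB/s stream, each VM processing 8 MB/s.
print(vms_needed(100, 8))  # 16
```

Re-evaluating this as the stream rate changes is what makes on-demand provisioning attractive compared with a fixed batch allocation.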
Application Design and Development
• Image creation and management
– system administration skills
– determining what goes on the image, etc.
• Data management
– need to manage data storage properly
• Performance and reliability need to be considered
Unique Needs and Features of a Science Cloud
• Science clouds need access to legacy data sets in HPC centers
• Science clouds need MapReduce implementations that account for characteristics of scientific data and analysis methods
• Science clouds need preinstalled, pre-tuned application software stacks.
• Science clouds need customizations for site-specific policies.
Conclusions
• Current-day cloud computing solutions have gaps for science
– performance, reliability, stability
– programming models are difficult for legacy apps
• HPC centers can adopt some of the technologies and mechanisms
– support for data-intensive workloads
– allow custom software environments
– provide different levels of service