Deploying Galaxy on the Cloud
Enis Afgan, Dannon Baker, Nate Coraor, Anton Nekrutenko, James Taylor
Discovery of human heteroplasmic sites enabled by an accessible interface to cloud-computing infrastructure
Enis Afgan, Hiroki Goto, Ian Paul, Francesca Chiaromonte, Kateryna Makova, Anton Nekrutenko, James Taylor
Permissions: you are free to blog or live-blog about this presentation as long as you attribute the work to its authors
Bioinformatics Open Source Conference, July 9, 2010, Boston, MA
Galaxy: accessible analysis system
• Easily integrate new tools
• Consistent tool user interfaces automatically generated
• History system facilitates and tracks multistep analyses
• Exact parameters of a step can always be inspected, and easily rerun
• Workflow system
Enable accessible, transparent, and reproducible research
http://usegalaxy.org/
[Figure: diagram with the labels Workstation, Cluster, Galaxy, Galaxy Jobs, and Job]
Galaxy on the Cloud
• Ideal for small labs and individual researchers
• Labs do not have to house compute resources
• Support variable volume of analysis data and computation requirements
• Ready deployment with pre-configured reference genomes and tools
• Goal is to keep Galaxy use unchanged but deliver flexibility and job performance improvement
Current Status
• Deployment of Galaxy on Amazon Web Services Cloud
• Requires no computational expertise, no infrastructure, no software
• Support for dynamic resource scaling
• Support for dynamic storage
• Automated configuration of the Galaxy Cloud machine image
• Deploy a Galaxy cluster in minutes!
Deploying Galaxy on the AWS Cloud
1. Create an AWS account and sign up for EC2 and S3 services
2. Use the AWS Management Console to start a master EC2 instance
3. Use the Galaxy Cloud web interface on the master instance to manage the cluster size
2. Start an EC2 Instance
3. Configure Your Cluster
(Starting Workers)
4. Grow and Shrink
Grow Storage
1. Stop services
2. Detach volume
3. Snapshot
4. New volume
5. Grow file system
6. Resume services
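The six steps above can be sketched as a single ordered routine. This is a minimal illustration, not the Galaxy Cloud implementation: `FakeCloud` and every method on it are hypothetical in-memory stand-ins that only record what would be done, and the volume and snapshot identifiers are made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FakeCloud:
    """Hypothetical in-memory stand-in for a cloud API; it only logs calls."""
    log: List[str] = field(default_factory=list)

    def stop_services(self) -> None:
        self.log.append("stop services")

    def detach_volume(self, volume_id: str) -> None:
        self.log.append(f"detach {volume_id}")

    def snapshot(self, volume_id: str) -> str:
        self.log.append(f"snapshot {volume_id}")
        return f"snap-of-{volume_id}"

    def create_volume(self, snapshot_id: str, size_gb: int) -> str:
        self.log.append(f"create {size_gb} GB volume from {snapshot_id}")
        return f"vol-from-{snapshot_id}"

    def grow_file_system(self, volume_id: str) -> None:
        self.log.append(f"grow file system on {volume_id}")

    def resume_services(self) -> None:
        self.log.append("resume services")


def grow_storage(cloud: FakeCloud, volume_id: str,
                 old_size_gb: int, new_size_gb: int) -> str:
    """Run the six steps in the order listed on the slide."""
    if new_size_gb <= old_size_gb:
        raise ValueError("new size must exceed the current size")
    cloud.stop_services()                                           # 1. Stop services
    cloud.detach_volume(volume_id)                                  # 2. Detach volume
    snapshot_id = cloud.snapshot(volume_id)                         # 3. Snapshot
    new_volume_id = cloud.create_volume(snapshot_id, new_size_gb)   # 4. New volume
    cloud.grow_file_system(new_volume_id)                           # 5. Grow file system
    cloud.resume_services()                                         # 6. Resume services
    return new_volume_id
```

Calling `grow_storage(FakeCloud(), "vol-1", 100, 200)` returns the identifier of the new, larger volume and leaves six entries in the log, one per step. Services stay stopped for the whole sequence so that nothing writes to the volume while it is being copied.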
Clean Up
• Once the need for a given cluster subsides, shut it down
- you can always start it back up
• Data is preserved while a cluster is down
• Complete the shut down process by terminating the master instance from the AWS console
What is Coming
• Automatic cluster scaling
- Based on workload customization
• Automatic job splitting/parallelization
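One way to picture workload-based scaling is a rule that maps the number of queued jobs to a target worker count. The sketch below is purely illustrative and is not the planned Galaxy Cloud design: the function names, the `jobs_per_worker` parameter, and the default limits are all assumptions.

```python
import math


def target_workers(queued_jobs: int, jobs_per_worker: int,
                   min_workers: int = 0, max_workers: int = 10) -> int:
    """Return how many workers a simple queue-length rule would ask for.

    Hypothetical policy: one worker per `jobs_per_worker` queued jobs,
    clamped to the range [min_workers, max_workers].
    """
    if jobs_per_worker <= 0:
        raise ValueError("jobs_per_worker must be positive")
    needed = math.ceil(queued_jobs / jobs_per_worker)
    return max(min_workers, min(max_workers, needed))


def scaling_action(current_workers: int, queued_jobs: int,
                   jobs_per_worker: int, min_workers: int = 0,
                   max_workers: int = 10) -> int:
    """Positive result: workers to add. Negative result: workers to remove."""
    target = target_workers(queued_jobs, jobs_per_worker,
                            min_workers, max_workers)
    return target - current_workers
```

For example, with 9 queued jobs and 4 jobs per worker the rule asks for 3 workers, so a cluster currently running 5 workers would shrink by 2. A real policy would also need to account for instance start-up time and billing granularity, which this sketch ignores.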
Questions & Comments
Try your own cluster; it takes only 5 minutes and less than $1.
Complete instructions available at http://usegalaxy.org/cloud
A Little More GC Details
[Figure: architecture diagram with numbered steps 1 to 11. Labels: Management Console, Master instance, Galaxy Application, Galaxy Controller (GC), Setup services, GC-w, Galaxy Image, Persistent data repository, Persistent storage]
Cloud or No Cloud?
Pros
• Consumption-based cost: cost reduction?
• Better utilization of resources
• Management done by cloud provider
• Faster deployment time
• Dynamic scalability
Cons
• Not a silver bullet
• Expensive for 24/7 use
• Offers scalability in terms of infrastructure, but applications are still sequential
• The data transfer problem?
• Security?
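The tension between consumption-based cost and the expense of 24/7 use comes down to a break-even calculation. The sketch below shows the arithmetic only. The rates used in the usage note are hypothetical placeholders, not actual AWS prices, and the function names are illustrative.

```python
def on_demand_cost(hours_used: float, hourly_rate: float) -> float:
    """Cost of paying per hour of use (consumption-based pricing)."""
    return hours_used * hourly_rate


def break_even_hours(fixed_monthly_cost: float, hourly_rate: float) -> float:
    """Monthly usage at which pay-per-hour cost equals a fixed monthly cost.

    Below this many hours the cloud is cheaper; above it, owning or
    reserving the resource is cheaper.
    """
    if hourly_rate <= 0:
        raise ValueError("hourly_rate must be positive")
    return fixed_monthly_cost / hourly_rate
```

With a hypothetical rate of 0.50 per hour and a hypothetical fixed cost of 180 per month for equivalent local hardware, `break_even_hours(180, 0.5)` gives 360 hours. Running around the clock (about 720 hours a month) would cost `on_demand_cost(720, 0.5)`, which is 360, twice the fixed cost in this made-up scenario.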
Enabling Persistence
[Figure: persistence diagram. Labels: Public EBS snapshots (Galaxy Tools, Galaxy Indices); User A Cluster 1, User A Cluster 2, and User B Cluster 1, each with Galaxy Tools, Galaxy Indices, and User Data; On terminate; Private EBS volume]
Enabling Versioning
[Figure: versioning diagram. Labels: Public S3 buckets containing GC-default, GC source (latest, prev. versions), and GC-snaps with public snap IDs (latest, prev. versions); Private S3 bucket containing GC-User A, Cluster1 and GC-User A, Cluster2, each recording latest GC used and snaps IDs]