Post on 26-Feb-2020
[Figure: which layers are managed for you under Standalone Servers, IaaS, PaaS, and SaaS. Layers: Applications, Runtimes, Database, Operating System, Virtualization, Server, Storage, Networking. Windows Azure is placed on an axis running between Control + Cost and Efficiency.]
[Figure: creating an application on VMs. Labels: Library, VM Images, Developer, Application, Data, Load Balancer.]
1) Choose image, then create VM for DBMS and configure DBMS
2) Choose image, then create and configure VM(s) for application
3) Provision database, then create tables and add data
4) Install application
5) Configure load balancer
6) Manage VMs and DBMS (e.g., deploying new OS images in VMs)
[Figure: creating an application on a managed platform. Labels: Developer, Application, Data, Load Balancer.]
1) Provision database, then create tables and add data
2) Deploy application
Windows Azure
Networking
“Red Dog” Front End (RDFE)
DLA Architecture (Old) Quantum10 Architecture (New)
[Figure: DLA (old) and Quantum10 (new) datacenter network architectures. Labels: DC Routers (DCR), Access Routers, Aggregation (AGG) + LB, Spine, TOR switches; racks of 40 nodes, each with a TOR switch, Digi, and APC unit; racks grouped as 20 racks.]
[Figure labels: Datacenter; Server, Word, SQL Server, Exchange Online, SQL Azure; TOR switches, AGG, LB.]
[Figure: Fabric Controller and node provisioning. Labels: Fabric Controller, Image Repository (role images), PXE Server, Windows Deployment Server, Maintenance OS, Parent OS, Node, Windows Azure OS, FC Host Agent, Windows Azure Hypervisor.]
[Figure: service deployment. Role B: Worker Role, Count: 2, Update Domains: 2, Size: Medium. Load Balancer. www.mycloudapp.net]
[Figure: Fabric Controller (Primary) and Fabric Controller (Replica) instances managing a Physical Node. Labels: Host Partition with FC Host Agent; Guest Partitions, each with a Guest Agent and a Role Instance; Trust boundary; Image Repository (OS VHDs, role ZIP files).]
• Blobs
• Tables
• Queues
Account / Container / Blob
• sally
  • pictures: IMG001.JPG, IMG002.JPG
  • movies: MOV1.AVI
Account / Table / Entity
• sally
  • users: (Name = …, Email = …), (Name = …, Email = …)
  • photo index: (Photo ID = …, Date = …), (Photo ID = …, Date = …)
Account / Queue / Message
• sally
  • thumbnail jobs: (128x128, http://…), (256x256, http://…)
  • photo processing jobs: http://…, http://…
Account
Container Blobs
Table Entities
Queue Messages
http://<account>.blob.core.windows.net/<container>
http://<account>.table.core.windows.net/<table>
http://<account>.queue.core.windows.net/<queue>
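The three URL patterns above can be sketched as a small helper. This is only an illustration of the naming scheme shown on the slide; the function name `storage_endpoint` is hypothetical and not part of any Azure SDK.

```python
def storage_endpoint(account: str, service: str, resource: str) -> str:
    """Build a storage URL following the three patterns listed on the slide.

    `service` is one of "blob", "table", or "queue"; `resource` is the
    container, table, or queue name respectively.
    """
    if service not in ("blob", "table", "queue"):
        raise ValueError(f"unknown storage service: {service!r}")
    return f"http://{account}.{service}.core.windows.net/{resource}"


# Example using the account and container names from the slide.
print(storage_endpoint("sally", "blob", "pictures"))
```

The account name selects the host and the service name selects the subdomain, so the same account can own containers, tables, and queues side by side.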
[Figure: two placements of the role instances Front-End-1, Front-End-2, Middle Tier-1, Middle Tier-2, and Middle Tier-3.]
[Figure: Guest Agent and Role Instance heartbeats and timeouts]
• Guest Agent Connect Timeout: 25 min
• Guest Agent Heartbeat: 5s
• Guest Agent Heartbeat Timeout: 10 min
• Role Instance Launch: Indefinite
• Role Instance Start
• Role Instance Ready (for updates only): 15 min
• Role Instance Heartbeat: 15s
• Role Instance “Unresponsive” Timeout: 30s
• Load Balancer Heartbeat: 15s
• Load Balancer Timeout: 30s
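A heartbeat paired with a timeout, as in the role instance values on this slide (heartbeat every 15s, "unresponsive" after 30s), can be sketched as follows. This is a minimal illustration of the general mechanism, not the Fabric Controller's actual logic; the class `HeartbeatMonitor` and its methods are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Tracks the last heartbeat and reports when the timeout has elapsed."""
    interval_s: float          # how often the monitored party is expected to beat
    timeout_s: float           # silence longer than this counts as unresponsive
    last_seen_s: float = 0.0   # time of the most recent heartbeat

    def beat(self, now_s: float) -> None:
        self.last_seen_s = now_s

    def timed_out(self, now_s: float) -> bool:
        return now_s - self.last_seen_s > self.timeout_s


# Values taken from the slide: 15 s heartbeat, 30 s "unresponsive" timeout.
role_instance = HeartbeatMonitor(interval_s=15, timeout_s=30)
role_instance.beat(now_s=0)
role_instance.beat(now_s=15)
print(role_instance.timed_out(now_s=40))  # 25 s of silence
print(role_instance.timed_out(now_s=46))  # 31 s of silence
```

With a timeout of twice the heartbeat interval, a single missed heartbeat is tolerated and a second consecutive miss marks the instance as unresponsive.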
[Figure labels: Load Balancer, Front-End (two instances), Queue, Middle-Tier, Windows Azure Storage, SQL Azure.]
Using queues
Web Role
ASP.NET, WCF, etc.
Worker Role
main(){ … }
1) Receive work
2) Put work in queue
3) Get work from queue
4) Do work
To scale, add more of either
Queues are the application glue
• Queues decouple different parts of the application, making it easier to scale app parts independently
• Flexible resource allocation: different priority queues, and separate backend servers to process different queues
• Queues mask faults in worker roles
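The four steps above (receive work, put work in queue, get work from queue, do work) can be sketched with Python's standard `queue` and `threading` modules standing in for an Azure queue and role instances. The names `web_role` and `worker_role` are illustrative, and squaring a number is a placeholder for real work.

```python
import queue
import threading

work_queue = queue.Queue()      # stands in for the cloud queue
results = []
results_lock = threading.Lock()


def web_role(requests):
    """Steps 1 and 2: receive work and put it in the queue."""
    for item in requests:
        work_queue.put(item)


def worker_role():
    """Steps 3 and 4: get work from the queue and do it."""
    while True:
        item = work_queue.get()
        if item is None:        # sentinel: no more work for this worker
            break
        with results_lock:
            results.append(item * item)   # placeholder for real work


# To scale, add more of either role; here, three workers share one queue.
workers = [threading.Thread(target=worker_role) for _ in range(3)]
for w in workers:
    w.start()
web_role(range(10))
for _ in workers:
    work_queue.put(None)
for w in workers:
    w.join()
print(sorted(results))
```

Because the producer and the consumers only share the queue, the number of workers can change without touching the web role code.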
BLAST (Basic Local Alignment Search Tool)
• Most important software in bioinformatics
• Identify similarity between bio-sequences
Computationally intensive
• Large number of pairwise alignment operations
• A single BLAST run can take 700 to 1000 CPU hours
• Sequence databases growing exponentially
• GenBank doubles in size every 15 months.
It is easy to parallelize BLAST
• Segment the input
• Segment processing (querying) is pleasingly parallel
• Segment the database
• Needs special result reduction processing
Large volume data
• A typical BLAST database can be as large as 10 GB
• With 100 nodes each loading the database, peak storage traffic could reach 1 TB
• The output of BLAST is usually 10-100x larger than the input
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern
• split the input sequences
• query partitions in parallel
• merge results together when done
• Follows the general PaaS application model
• Web Role + Queue + Worker
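The query-segmentation pattern (split the input sequences, query partitions in parallel, merge results) can be sketched as below. This is not AzureBLAST's code: the `run_partition` function uses exact substring matching as a stand-in for a real BLAST search, and all function names and sample sequences are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_queries(sequences, partition_size):
    """Split the input sequences into fixed-size partitions."""
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]


def run_partition(partition, database):
    """Stand-in for one BLAST task: exact substring search, not alignment."""
    return [(query, [name for name, seq in database.items() if query in seq])
            for query in partition]


def merge(partial_results):
    """Concatenate per-partition results in partition order."""
    merged = []
    for part in partial_results:
        merged.extend(part)
    return merged


database = {"seqA": "ACGTACGT", "seqB": "TTGACA"}
queries = ["ACG", "GAC", "TTT", "CGT", "ACA"]

partitions = split_queries(queries, partition_size=2)
with ThreadPoolExecutor(max_workers=3) as pool:
    partials = list(pool.map(lambda p: run_partition(p, database), partitions))
hits = merge(partials)
print(hits)
```

Each partition is independent of the others, which is what makes the query side of the problem data-parallel; only the final merge needs to see all results.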
Wei Lu, Jared Jackson, and Roger Barga, AzureBlast: A Case Study of Developing Science Applications on the Cloud, in Proceedings of the 1st Workshop on
Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010
A simple Split/Join pattern
Leverage multi-core of one instance
• Argument “-a” of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
Task granularity
• Large partitions: load imbalance
• Small partitions: unnecessary overheads
  • NCBI-BLAST overhead
  • Data transfer overhead
Best practice: use test runs to profile, then set the partition size to mitigate the overhead
Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: an unnecessarily long wait in case of instance failure
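The effect of a visibility timeout that is too small can be illustrated with a toy queue. `VisibilityQueue` and `executions_for` below are hypothetical names for a simplified in-memory model, not the Azure queue API: a dequeued message is hidden for `visibility_timeout` time units and reappears unless it has been deleted by then.

```python
class VisibilityQueue:
    """Toy queue: a dequeued message stays hidden until its timeout expires."""

    def __init__(self):
        self._messages = {}   # message id -> (body, time it becomes visible)
        self._next_id = 0

    def put(self, body, now=0):
        self._messages[self._next_id] = (body, now)
        self._next_id += 1

    def get(self, now, visibility_timeout):
        for msg_id, (body, visible_at) in self._messages.items():
            if visible_at <= now:
                self._messages[msg_id] = (body, now + visibility_timeout)
                return msg_id, body
        return None

    def delete(self, msg_id):
        self._messages.pop(msg_id, None)


def executions_for(task_minutes, visibility_timeout):
    """Count how often one task is started when idle workers poll every minute."""
    q = VisibilityQueue()
    q.put("blast-task")
    executions = 0
    in_flight = []            # (finish time, message id) of running executions
    for now in range(2 * task_minutes + 1):
        for done_at, msg_id in list(in_flight):
            if done_at <= now:
                q.delete(msg_id)              # delete only after the work is done
                in_flight.remove((done_at, msg_id))
        got = q.get(now, visibility_timeout)
        if got is not None:
            executions += 1
            in_flight.append((now + task_minutes, got[0]))
    return executions


print(executions_for(task_minutes=20, visibility_timeout=30))
print(executions_for(task_minutes=20, visibility_timeout=5))
```

In this model a 20-minute task with a 30-minute timeout runs once, while a 5-minute timeout lets the message reappear and be picked up again before the first execution finishes. The opposite failure mode follows from the same rule: if the worker crashes, nobody can retry the task until the timeout expires.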
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
…
Merging Task
Task size vs. performance
• Benefit of the warm cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance
• Super-linear speedup with larger worker instances
• Primarily due to the memory capability
Task size/instance size vs. cost
• Extra-large instances generated the best and most economical throughput
• Fully utilize the resource
[Figure: AzureBLAST architecture. Labels: Web Role (Web Portal, Web Service, job registration); Job Management Role (Job Scheduler, Scaling Engine); Database updating Role; Global dispatch queue; Worker instances; Azure Table (Job Registry); Azure Blob (NCBI databases, BLAST databases, temporary data, etc.).]
Discovering homologs
• Discover the interrelationships of known protein sequences
“All against all” query
• The database is also the input query
• The protein database is large (4.2 GB)
• 9,865,668 sequences in total to be queried
• Theoretically, about 100 trillion sequence comparisons (9,865,668² ≈ 9.7 × 10^13)
Performance estimation
• Based on sampling runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
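The conversion from the estimated minutes to years can be checked with a few lines. The helper name `minutes_to_years` is illustrative, and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 60 * 24 * 365   # assumes a 365-day year


def minutes_to_years(minutes: float) -> float:
    """Convert a run-time estimate in minutes to years."""
    return minutes / MINUTES_PER_YEAR


estimated_minutes = 3_216_731      # figure quoted on the slide
print(f"{minutes_to_years(estimated_minutes):.1f} years")
```

This reproduces the 6.1 years quoted on the slide.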
One of the largest BLAST jobs
• Experiments at this scale are usually infeasible for most scientists
• Allocated a total of ~4000 instances
  • 475 extra-large VMs (8 cores per VM) in four datacenters: US (2), West Europe, and North Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When the load becomes imbalanced, redistribute it
[Figure labels: 50, 62, 62, 62, 62, 62, 50, 62]
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
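In the log excerpt above, one task has a start line but no matching completion line. A small parser can find such tasks automatically. This is a sketch that assumes the two message formats seen in the excerpt; the function name `unfinished_tasks` is hypothetical.

```python
import re

LOG = """\
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
"""

START = re.compile(r"Executing the task (\d+)")
DONE = re.compile(r"Execution of task (\d+) is done")


def unfinished_tasks(log_text):
    """Return task ids that were started but never reported as done."""
    started, finished = [], set()
    for line in log_text.splitlines():
        if (m := DONE.search(line)):
            finished.add(m.group(1))
        elif (m := START.search(line)):
            started.append(m.group(1))
    return [task for task in started if task not in finished]


print(unfinished_tasks(LOG))
```

For this excerpt the only unfinished task is 251774, which started at 8:22 and was followed by a new task at 9:50 with no completion in between.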
North Europe datacenter: 34,256 tasks processed in total.
All 62 compute nodes lost tasks and then came back in a group. This is an update domain (~30 mins, ~6 nodes in one group).
35 nodes experienced blob writing failures at the same time.
West Europe datacenter: 30,976 tasks were completed before the job was killed.
A reasonable guess: the fault domain is working.
[Figure, built up over several slides: cloud platform landscape. Columns: Computing (IaaS, PaaS) and Storage (Relational, NoSQL, Blobs). Rows: Public and Private. Key: Cloud Platform Service, Cloud Platform Software.]
• Amazon: Elastic Compute Cloud (EC2), Elastic Beanstalk, Relational Database Service (RDS), SimpleDB, DynamoDB, Simple Storage Service (S3)
• Microsoft: Persistent VM Roles, Web/Worker Roles, SQL Azure, Tables, Blobs; Microsoft Private Cloud, also offered for hosters (core cloud technologies implemented in System Center 2012)
• VMware: vCloud, vCloud DataCenter (for hosters), Cloud Foundry (core cloud technologies implemented in vSphere and vCloud Director)
• OpenStack: OpenStack Compute, OpenStack Object Storage, both also offered for hosters
• Eucalyptus
Interesting to people building applications: Developers
Interesting to people running applications: Operations
IaaS PaaS
• Running Existing Web Apps/Sites
• Running Standard Packaged Apps
• Running a Standard DBMS
• High Performance Computing and Big Data
• VMs for a Dev/Test Lab
• Running New Cloud-Native Apps
• Disaster Recovery
• Virtual Data Center (VMs for On-Demand Use)
Developer Operations
Best Suited For
Reliability Provided By
Main Benefits
Typical Buyer
Examples
Typical Management Tools
A much bigger market today
Leaders in Gartner Magic Quadrant for Public Cloud IaaS
Vendor | Hypervisor | IaaS Type | Offering
CSC | VMware | Operations | CloudCompute
Microsoft | Hyper-V | Developer | Windows Azure Persistent VM Role
Amazon | Xen | Developer | Elastic Compute Cloud (EC2)
HP | KVM | Developer | Cloud Compute
Rackspace | Xen | Developer | Cloud Servers
IBM | KVM | Developer | SmartCloud Enterprise
Terremark | VMware | Operations, Developer | Enterprise Cloud, vCloud Express
Savvis | VMware | Operations | Symphony VPDC
Bluelock | VMware | Operations | Bluelock Virtual Datacenters
GoGrid | Xen | Developer | Cloud Servers
Vendor | Offering | Languages/Frameworks | Storage | Comments
Heroku | Heroku | Ruby/Rails, JavaScript/Node.js, Java, … | Relational (MySQL, Postgres, …), NoSQL (Redis, …) | Heroku runs on EC2 and is owned by Salesforce
Amazon | Elastic Beanstalk | Java/Servlets | Relational (RDS), NoSQL (SimpleDB, DynamoDB), … | Beanstalk is a simple extension to EC2
Google | App Engine | Java, Python, Go | Relational (CloudSQL), NoSQL (Datastore), Blobs | App Engine has undergone many recent changes
Salesforce | AppForce | Apex/AppForce Framework | NoSQL (Database.com) | Pricing is per user, not based on resources used
Oracle | Oracle Public Cloud | Java/Java EE (WebLogic) | Relational (Oracle DBMS) | Announced October 2011
Microsoft | Windows Azure Web/Worker Roles | C# and VB/.NET, PHP, JavaScript/Node.js, … | Relational (SQL Azure), NoSQL (Tables), Blobs | Designed to be a fully PaaS platform
Engine Yard | EngineYard Cloud, Orchestra PHP | Ruby/Rails, PHP | Relational (MySQL), NoSQL (Redis) | Runs on EC2; enterprise version runs on Terremark
LongJump | LongJump Cloud Applications Platform | Java and JavaScript/LongJump Framework | NoSQL (Proprietary) | Runs on Rackspace; also sells PaaS software separately
IBM | IBM SmartCloud Application Services | None; focused on tools for deploying/managing apps | Relational (DB2) | Announced October 2011; not really a PaaS platform