Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR...
-
Upload
lesley-tilton -
Category
Documents
-
view
225 -
download
1
Transcript of Data Laden Clouds Trends and Insights Roger Barga, PhD Architect eXtreme Computing Group, MSR...
Data Laden CloudsTrends and Insights
Roger Barga, PhDArchitect
eXtreme Computing Group, MSR
[email protected] http://research.microsoft.com/barga
• Technology trends• Riding the exponentials
• Convergent invisibility• NUIs and computing on behalf
• Client+Cloud experiences• Opportunity for Data and Analytics
• Cloud infrastructure challenges• Packaging, hardware, software, security
• Thoughts on the future
Presentation Outline
It’s Easy To Forget That Not Very Long Ago …
• There were few or no experiences with …• Web sites, email, spam, phishing, computer
viruses• e-commerce, digital photography or telephony
• Cell phones were rare and expensive• A portable cassette player was still cool• HiFi was more common than WiFi• A “friend” was someone you actually knew
The future depends on vision and context …
Pre-PC Era(1980)
PC Era(1995)
Internet Era(2000)
Consumer Era(Today+)
21st century implicit and natural computing• Increasingly natural interfaces• Embedded intelligence in everyday objects• Ubiquitous network access and cloud
services
Computing Eras: Paucity To Plethora
MainframeEra
What Has Changed? • System on a chip designs• Powerful mobile devices
• Graphics processing units • High quality graphics
• Explosive data growth• Ubiquitous sensors and
media
• Inexpensive embedded computing• Everyday smart objects, …
• Wireless spectrum pressure• Mobile device growth
• New software models• Social networks, clients+clouds
…LPIA x86
LPIA x86
DRAM ctlr
DRAM ctlr
OoO x86
LPIA x86
LPIA x86
1 MB cache
1 MB cache
LPIA x86
LPIA x86
1 MB cacheGPU GPU
PCIe ctlr
PCIe ctlr NoC NoC 1 MB
cache1 MB cache
LPIA x86
LPIA x86
1 MB cacheGPU GPU
LPIA x86
LPIA x86
1 MB cache
1 MB cache
LPIA x86
LPIA x86
DRAM ctlr
DRAM ctlr
OoO x86
LPIA x86
LPIA x86
DRAM ctlr
DRAM ctlr
DRAM ctlr
DRAM ctlr
LPIA x86
LPIA x86
LPIA x86
LPIA x86
1 MB cache
1 MB cache
1 MB cache
1 MB cache
LPIA x86
LPIA x86
LPIA x86
LPIA x86
1 MB cache
1 MB cache
1 MB cache
1 MB cache
LPIA x86
LPIA x86
LPIA x86
LPIA x86
1 MB cache
1 MB cache
1 MB cache
1 MB cache
LPIA x86
LPIA x86
PCIe ctlr NoC NoC NoC NoC NoC NoC PCIe
ctlrLPIA x86
LPIA x86
1 MB cache
1 MB cache
1 MB cache
1 MB cache
LPIA x86
LPIA x86
LPIA x86
LPIA x86
1 MB cache
1 MB cache
1 MB cache
1 MB cache
LPIA x86
LPIA x86
LPIA x86
LPIA x86
1 MB cache
1 MB cache
1 MB cache
1 MB cache
LPIA x86
LPIA x86
LPIA x86
LPIA x86 Custom accelerationLPIA
x86LPIA x86
Server
Desktop
MobileLPIA
x861 MB cache
1 MB cache
DRAM ctlr
LPIA x86
1 MB 1 MB cache
PCIe ctlr
GPU
GPUcache
• Multidisciplinary challenges are the present and future… and the tools must empower, not frustrate
• A computation task has four characteristic demands:• Networking – delivering questions and answers• Computation – transforming information to produce new information• Data access – access to information needed by the computation• Data storage – long term storage of information
• The ratios among these and their costs are critical
New applications and systems will arise… if we create the right environment
Orders of Magnitude Always Matter
• Your car drives and navigates for you… and also parks the car (already a feature on some cars)
• Your sound system only plays music you love… because it knows about every song you’ve ever heard
• Your phone only rings when you want to answer… because it knows your emotional state and social context
• All your family memories are recorded automatically… via MEMS-based sensors and solid state storage
• Your body calls an ambulance when you’re ill… via implanted, biologically powered diagnostic sensors
• Your DNA sample and lifestyle determine personalized treatment… because genotype-phenotype models are specific
• Your office adjusts its behavior to your needs… because it knows what you want to do
Imagine a Future Where …
8
Successful Technologies Are Invisible
SpreadsheetsWord processors
DOS1981-1995
NUI2012-…
AnticipatoryHuman-centric
TO
DA
Y
Location-based appsSocial networks
CLIENT+CLOUD2006-present
INTERNET1993-present
EmailWeb browsers
GUI1985-present
Desktop publishingMultimedia
Enhanced GUI NUI
Gestures
Voice
Environment
ContextTasks
Expressions
Multi-touch
Speech
Handwriting
Single Touch
Versus
An FFT?• No, it’s an algorithm
A rendering pipeline?• No, it’s a software library
A feature recognition system?• No, it’s a building block
Our notion of “application” is increasingly complex• Many integrated and interoperating components
Our tools must enable creativity accordingly, creating experiences
What Is An Application?Microsoft Kinect
Working at your command
Working on your behalf
Fixed Portable Specialty Mobile
Create the Experience
The CloudThe Clients
Intelligent Objects
15
Client
Cloud
HybridExperiences
AugmentedInteraction
ContextAwareness
EnvironmentAwareness
AnticipatoryProcessing
AdaptiveBehavior
Public DataServices
Trust & SecurityServices
Private DataServices
SensoryInputs
The Future of Experiences
Computer Rooms: Cloud COGS MatterWhat’s A Cloud?
17
2005 2006 2007 2008 2009 2010
Chicago and Dublin Generation 3
Data Center Co-Location Generation
1
Modular Data Center Generation 4
Quincy and San Antonio Generation 2
ContainersServer IT PACRack
Facility PAC
Microsoft’s Data Center Evolution And Economics
Deployment Scale Unit
Time to MarketLower TCO
Scalability & Sustainability
Density & Deployment
Capacity
Discovery and Innovation in 2020
In the last two decades advances in computing technology, from processing speed to network capacity and the Internet, have revolutionized the way scientists work.
From sequencing genomes to monitoring the Earth's climate, many recent scientific advances would not have been possible without a parallel increase in computing power –
and with revolutionary technologies such as the quantum computer edging towards reality, what will the relationship between computing and science bring us over the next 15 years?
http://research.microsoft.com/towards2020science
”
“
19
70M
1M
14M
High Performance Data-intensive Capacity
80%
20%14M
1M
Scientists & Engineers
55M Little to no access to high performance data-intensivecapacity
Lack of Broad Access
New Bytes of Information in 2010Source: IDC, as reported in The Economist, Feb 25, 2010
20
1.2 x 1021
21Sources: The Economist, Feb ‘10; IDC
By 2016 the New Large Synoptic Survey Telescope in Chile will acquire 140 terabytes in 5 days - more than Sloan acquired in 10 years
In 2000 the Sloan Digital Sky Survey collected more data in its 1st week than was collected in the entire history of Astronomy
The Large Hadron Collider at CERN generates 40 terabytes of data every second
22
Economics of Storage
22Source: Wired Magazine April 2010; Figures represented in USD
2000
Disk Storage(per gigabyte)
Web Storage (per gigabyte)
2001200220032004200520062007200820092010
$44.56 $1,250$0.07 $0.15
But remember,… free storage is like free puppies
• Hypothesis-driven• “I have an idea, let me verify it.”
• Exploratory• “What correlations can I glean?”
• Different tools and techniques• Rapid exploration of alternatives• Data volume and complexity are assets• … and challenges
• Simplicity really matters
Social Implications of the Data Deluge
Let Researchers Be Researchers…
Most researchers do not want to be system administrators
They don’t want to learn to use supercomputers
They want to focus on their research
They use standard tools: spreadsheets, statistical packages, desktop visualizationProgramming = modifying a few parameters in a trusted scripting language
Let Researchers Be Researchers…
BUT …
The data deluge means they must solve problems 10000 times the capacity of their desktop
Research is now interdisciplinarySharing access to large data collections and analysis tools is the future
A paradigm shift is coming
• Magnify client power via the cloud• Data analysis and computation
• Persist and share data in the cloud• Multidisciplinary data fusion and scale
• Leverage via client tools/metaphors• Analysis acceleration (Excel, toolkits, codes)• Remote rendering and client visualization• Data provenance, collaboration …
Seamless Client Plus CloudCompute Blob
Storage …Table Storage
NCBI BLAST
BLAST (Basic Local Alignment Search Tool) • One of the most important software in bioinformatics• Identify similarity between bio-sequences
Computationally intensive• Large number of pairwise alignment operations• A normal BLAST running could take 700 ~ 1000 CPU hours
For most biologists, two choices to run large jobs• Build a local cluster • Submit jobs to NCBI or EBI (long job queue times)
R. palustris as a platform for H2 productionIdentify key drivers for producing hydrogen, promising
alternative fuel – understand R. palustris well enough to be able to improve its H2 production;
Characterize a population of strains and use integrative genomics approaches to dissect the molecular networks of H2 production;
BLAST to query 16 strains to sort out genetic relationships• Each strain, estimated ~5,000 proteins • Jobs kicked off NCBI clusters before completion• Against NCBI non-redundant proteins in ~30 min• Against ~5,000 proteins from another strain < 30 sec• Publishable result in one day for roughly $150.
Eric Schadt, Pac Bio and Sam Phattarasukol Harwood Lab, UW
NCBI BLAST on Windows Azure• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern• split the input sequences• query partitions in parallel• merge results together when done
• Follows the general suggested application model for Window Azure • Web Role + Queue + Worker
• With three special considerations• Batch job management• Task parallelism on an elastic Cloud• Large data-set management
AzureBLAST Task-Flow
A simple split/Join pattern
Leverage multi-core of one instance • argument “–num_threads” of NCBI-BLAST
Task granularity • Large partition load imbalance • Small partition unnecessary overheads• NCBI-BLAST overhead• Data transferring overhead.
Best Practice: test runs to profile and set size to mitigate the overhead
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
…
Merging Task
Micro-Benchmarks Inform DesignTask size vs. Performance• Benefit of the warm cache effect• 100 sequences per partition is the best
choice
Instance size vs. Performance• Super-linear speedup with larger size
worker instances• Primarily due to the memory capability.
Task Size/Instance Size vs. Cost• Extra-large instance generated the best
and the most economical throughput• Fully utilize the resource
All-Against-All ExperimentDiscovering Homologs • BLAST Uniref100, non-redundant protein sequence database• Discover the interrelationships of known protein sequences
“All against All” query• The database is also the input query• The protein database is large (4.2 GB size)
• Total of 9,865,668 sequences to be queried• Theoretically, 100 billion sequence comparisons!
Performance estimation• Estimated completion, 3,216,731 minutes (6.1 years) on 8 core VM
One of biggest BLAST jobs as far as we know• This scale of experiment is usually infeasible to most researchers
Our Approach• Allocated a total of ~4000 instances • 475 extra-large VMs (8 cores per VM)
• 8 deployments of AzureBLAST• Each deployment has its own co-located storage service
• Divide 10 million sequences into multiple segments• Each will be submitted to one deployment as one job for execution• 300,000 tasks on 3500 cores on Azure (70,000 bp or 35 sequences per
task)
Cloud System Upgrades
North Europe Data Center, totally 34,256 tasks processed
All 62 nodes lost tasks and then came back together. This is an update domain
~30 mins
~ 6 nodes in one group
35 Nodes experience blob
writing failure at the same time
Failures HappenWest Europe Datacenter; 30,976 tasks are completed, and job was killed
Reasonable guess: Fault Domain is
working
Release of BLAST on Windows Azure• Open source release of NCBI BLAST on Windows Azure
(CTP)• Installation guide and users guide;• Developers guide in preparation; • Support the release for the next year – feature requests, fixes,
…• Free access to NCBI reference data sets on Windows Azure,
auto update; http://research.microsoft.com/azure
• Software can be installed and used immediately, customized for your institution (logos, private database, group databases), extend source
• Releasing result data from “all-against-all” run• BLAST Uniref100, non-redundant protein sequence database• Discover the interrelationships of known protein sequences• Available Dec. 1st, 2010.
Microsoft Client+Cloud Partnership
• Azure cloud services• Storage and computing
• Tier one support• Hardware and Azure software
• Hosted data sets• Multidisciplinary data analysis
• Technical engagement team• Community collaborations• Application support
One step of a worldwide program
HPC and Clouds: Twins Separated At Birth
• Similar technology issues• Node and system architectures • Communication fabrics• Storage systems and analytics• Physical plant and operations• Programming models• Reliability and resilience
• Differing culture and sociology• Design and operations• Management and philosophy
Cloud/HPC Hardware Comparison
Predominate differences today• Network architecture and SAN storage
Dan Reed’s hypotheses• Convergence is coming
Attribute HPC Cloud
Processor High-end x86 x86
Memory/Node 1-8 GB 8 GB+
Local Disk Scratch only Permanent storage
SAN Storage Common Rare
Tertiary Storage Common Rare
Interconnect Infiniband or 10 GigE 1 GigE/10GigE
Network Flat Hierarchical
• Environmental responsibility• Managing under a 100 MW envelope• Adaptive systems management
• Provisioning 100,000 servers• Hardware: at most one week after delivery• Software: at most a few hours
• Resilience during a blackout/disaster• Data center failure• Service rollover for 20M customers
• Programming the entire facility• Power, environmentals, provisioning• Component tracking, resilience, …
Cloud Scaling: Lessons for HPC Exascale
Consistency• Weak consistency is
goodComponent failure• Failure as a first class
objectSystemic resilience• Upgrade during
operation• Never go down
Rethinking Node Architecture
Windows AzureLive Services
Applications
Applications
SQL Azure
OthersWindowsMobile
WindowsVista/XP
WindowsServer
.NET Services
Fabric
Storage
Config
Compute
Application
Windows Azure
• Break the LAN hierarchy• Multiple paths, commodity components• High bisection bandwidth
• We build WAN islands, not continents• Isolated facilities with limited connectivity
• Change the landscape• Serious, multiple terabit WANs• Many lambdas entering a facility• Fused node/LAN/WAN infrastructure
Rethinking LAN/WAN Networking
• People and hardware need not mix• Hardware cooling standards are conservative• Reliable at high temperature/humidity
• Optimize for efficiency• Cooling is (often) unnecessary• Design for ambient environments
• Energy reliability is (often) unnecessary• Design for power outages
• Use larger building blocks• Accept component failures
Rethinking Packaging and Cooling
Temperature
Hum
idity
Truncated Life Models
Performance andFailure Data
MarkovPerformability
Models
TCO andProvisioning
DesiredLifetime
UtilityThreshold
Elapsed Time
Perf
orm
ance
• Factory sealed units (FRUs)• Over-provisioned for failure• Dynamic reconfiguration• Real-time, adaptive control
Rethinking Reliability: Fail In Place
• Power redundancy is a major cost• Batteries to supply up to 15 minutes
• Use multiple sites, based on energy cost and carbon footprint• Electrical grid, solar, wind, fuel cell, …• Workload dispatching based on models
• Real-time optimization and prediction• Workload demand• Weather and seasonal models• Auction-based energy pricing• Infrastructure
• UPS, optical fiber and computing
Rethinking Energy Provisioning
• Draw the right bounding box• It defines the problem you solve
• Understand your workload• Use only the hardware you need
• Metrics reward and punish• Choose carefully what you measure
• Embrace component failure• Hardware is cheap and readily recyclable
• Machines and people do not mix well• Consider sealing hardware at the factory
• Engage multidisciplinary solutions• Mechanical, electrical, economic, social …
• Culture shapes behavior• Implicit versus explicit costs
Some Research/Design Thoughts
47
END-TO-END TRUST
INTELLIGENT MANAGEMENT
GLOBAL POLICY FRAMEWORK
HOLISTIC DESIGN
NEW EXPERIENCES
End-to-End Perspective
© 2010 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions,
it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
Questions?...