Making Infrastructure Invisible

34
Making Infrastructure Invisible Dan Reed Corporate Vice President

Transcript of Making Infrastructure Invisible

Making Infrastructure Invisible

Dan Reed

Corporate Vice President

2

• Multidisciplinary infrastructure challenges

• Simplify, simplify, simplify

• One perspective on the future

Presentation Outline

“The future is here, it is just not evenly distributed.”

William Gibson

3

Rethink the nature of computing at extreme scale, from

alternative, quantum computing models, through the

transformative effects of manycore parallelism on programming

systems and architectures, to massive cloud computing designs

that each drive consumer, business and social applications and

create value for Microsoft.

MSR eXtreme Computing Group (XCG)

extreme: (ik-ˈstrēm) • either of the two limits of a scale or range • of a high or the highest degree or intensity • a maximum or minimum value of a function

4

Pre-PC Era (1980) PC Era

(1995)

Internet Era (2000)

Consumer Era (Today+)

• Implicit computing (21st century)

• Natural interfaces and computing on behalf

• Embedded intelligence

• Adaptive, mobile context and client plus cloud

• Number of cores/person infinity

The Many Computing Eras

Mainframe Era

5

Successful Technologies Are Invisible

6

Intuitive Discovery: Invisible Infrastructure

Scholarly communications Domain-specific

services

Instant messaging

Identity

Document store

Blogs & social networking

Mail

Notification

Search books citations

Visualization and analysis services

Storage/data services

Compute Services and virtualization

Project management

Reference management

Knowledge management

Knowledge discovery

Source: Tony Hey

7

• Bulk computing is almost free

• … but software and power are not

• Inexpensive sensors are ubiquitous

• … but scientific data fusion remains difficult

• Moving lots of data is {still} hard

• … because we’re missing trans-terabit/second networks

• People are really expensive!

• … and robust software remains extremely labor intensive

• Research challenges are changing

• … and social engineering is not our forte

Today’s Truisms (2009)

8

• Complex models

• Multidisciplinary interactions

• Wide temporal and spatial scales

• Large multidisciplinary data

• Real-time steams

• Structured and unstructured

• Distributed communities

• Virtual organizations

• Socialization and management

• Diverse expectations

• Client-centric and infrastructure-centric

Changing Nature of Research

http://research.microsoft.com/en-us/collaboration/fourthparadigm/

MSR External Research

9

• Easy to use software

• Low entry cost and high productive

• Insatiable demand

• Cycles, storage, support

• Distributed acquisition/deployment

• Duplicative, non-shared infrastructure

• Distributed cost structures

• Power, space, staff, staff, hardware

• Long-term sustainability

• Decades rather than months/years

Research Infrastructure Challenges

10

• Dynamic adaptive, on-demand system

• Changes in response to weather

• Responds to user inputs

• Steers remote observing technologies

LEAD: Adaptive Weather Prediction Fetch data

Prepare data

Post-process data

Decision?

Ensemble details

User initiated

Arriving data

Need more runs

Need more data

Ensemble runs …

Source: LEAD Team

11

Kepler

Triana

BPEL

Ptolemy II

So Many Choices …

12

• Many workflows with deadlines

• Severe weather events

• Disaster response, …

• Complex workflow sensitivity

• Faults are the norm

• Distributed systems

• Services and resources

• Largely unsuccessful

• Research solutions

• Data consolidation

• Retry/over-provisioning

Scientific Workflow Reliability

13

• Programming in the large

• Issues do not scale linearly

• Eventual consistency wins

• CAP theorem

• Component failure

• Failure as a first class object

• Systemic resilience

• Upgrade during operation

• Never go down

Rethinking Software

MBS Online

Office Labs

Sharepoint.Microsoft.com

14

.NET Services

Windows Azure Live Services

Applications

Applications

SQL Azure

Others Windows Mobile

Windows Vista/XP

Windows Server

Fabric

Storage

Config

Compute

Application

Windows Azure

15

Switches

Windows Azure Fabric Controller

Highly-available Fabric Controller

Out-of-band communication – hardware control In-band communication

– software control

WS08 Hypervisor

VM VM

VM

Control VM

Service Roles Control

Agent

WS08

Node can be a VM or a physical machine

Load-balancers

16

The Data Explosion Experiments Archives Literature Simulations

Petabytes Doubling every

2 years

Consumer

The Challenge Enable Discovery. Deliver the capability to mine, search and analyze this data in near real time

The Response A massive private sector build-out of data centers

17

• Hypothesis-driven

• “I have an idea, let me verify it.”

• Exploratory

• “What correlations can I glean?”

• Different tools and techniques

• Rapid exploration of alternatives

• Testing a single idea

Social Implications of the Data Deluge

18

The Data “Pipeline”: Creating Invisibility

Data Gathering Discovery and

Browsing

Science Exploration Domain-Specific

Analyses

Scientific Output

“Raw” data includes sensor

output, data downloaded

from agency or

collaboration web sites,

papers (especially for

ancillary data)

“Raw” data browsing for

discovery (do I have

enough data in the right

places?), cleaning (does

the data look obviously

wrong?), and light weight

science via browsing

“Science variables” and data

summaries for early science

exploration and hypothesis

testing. Similar to discovery

and browsing, but with

science variables computed

via gap filling, units

conversions, or simple

equation.

“Science variables”

combined with models,

other specialized code, or

statistics for deep science

understanding.

Scientific results via

packages such as MatLab

or R2. Special rendering

package such as ArcGIS.

Paper preparation.

Source: Catherine Van Ingen, MSR

19

Query plan LINQ query

• LINQ: .NET Language Integrated Query

• Declarative SQL-like programming with C# and Visual Studio

• Easy expression of data parallelism

DryadLINQ: From LINQ to Dryad

Dryad

select

where

logs

Automatic query plan generation

Distributed query execution by Dryad

var logentries = from line in logs where !line.StartsWith("#") select new LogEntry(line);

Source: Yuan Yu et al

20

• Storage is cheap (<$1K/TB)

• Storage management is not

• opX > 100 capX

• Goal: opX << capX

Free Storage: Like Free Puppies

21

Fabric

Compute Storage

Application

Blobs Queues HTTP

• SQL Azure also

• SQL in the cloud

Windows Azure Storage Service

Tables

22

• Similar technology issues

• Node and system architectures

• Communication fabrics

• Storage systems and analytics

• Physical plant and operations

• Reliability and resilience

• Differing culture and sociology

• Programming models

• Design and operations

• Management and philosophy

HPC and Clouds: Twins Separated At Birth

23

Exascale Exponentials

User Base

Pe

ak P

erf

orm

ance

Desktop/mobile Exascale

lim ( ) 0Perf

User Perf

Danger, danger

24

Technical Computing: Creating Invisibility

Petascale/Exascale/…

Mobile/desktop computing

Laboratory clusters

University infrastructure

National infrastructure

Data, data, data

Data, data, data

102

109

Cap

able

Use

rs

25

• High value does not imply high utilization

• Rapid response is often more important

• Relaxing utilization enables new opportunities

• High utilization is unnecessary

• Remember Little’s Law

n = XR

High Performance: Not The Only Story

26

• Three groups with differing needs • Heroes (Einstein)

• Mainstream (Joe)

• Entry/Novice (Elvis)

• Each with differing needs

• Parallel heroes “Neurosurgery? No problem!

Hand me that whiskey bottle

and a screwdriver.”

• Mainstream • Typical user or scientist

• Entry/novice • Casual developer or new researcher

Expectations and Needs Vary Widely

27

• $/teraflop-year

• Declining rapidly

• $/terabyte-year

• Also declining rapidly

• $/developer-year

• Rising (even for Elvis)

• Applications

• Outlive systems by many years

• Are rising in complexity

Economic Divergence/Optimization

Time

Cost

Hardware

People

28

• Very few users love technology itself

• Clusters and parallel programming

• Distributed services, grids or clouds

• Data models and databases

• Desktop/mobile acceleration

• Seamlessly accessible

• Standard metaphors/tools

• Intentional not imperative

Research Empowerment

29

• High level toolboxes

• Domain-specific capabilities

• Scripting languages

• Rapid exploration

• Declarative specification

• Intent, not mechanism

• User-centric design

• Human optimized

Simplify, Simplify: Thoreau Was Right

30

Windows Azure: Client Plus Cloud

Windows Azure

Applications

.NET Services

Live Services

SQL Azure

Applications

Others Windows Mobile

Windows Vista/XP/7

Windows Server

www.azure.com

31

Compute

Blob Storage

… Table

Storage

• Seamless integration

• Local and remote data access

• Computation acceleration and scheduling

• Collaboration and sharing

Client to Cloud: Accelerating Desktop Tools

Azure table access via Excel plugin

32

Some Observations …

• Co-locate when possible

• (Usually) higher reliability

• (Usually) lower cost

• Lower access latency

• Robust trumps experimental

• Infrastructure stability

• Success is sustainability

• Human productivity dominates

• Not resource utilization

• Declarative beats imperative

• Focus on the goal

• Domain-specific benefits

• Leverages user expertise

• Success is invisibility

33

Convergence: It’s Happening Now

Manycore

HPC

Clouds

34

© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.

The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to

be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation.

MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.