Accelerating Discovery in Science and Engineering

Accelerating Discovery in Science and Engineering Fabrizio Gagliardi Director – EMEA & LATAM Technical Computing Microsoft Corporation


Transcript of Accelerating Discovery in Science and Engineering

Page 1: Accelerating Discovery in Science and Engineering

Accelerating Discovery in Science and Engineering

Fabrizio Gagliardi
Director – EMEA & LATAM
Technical Computing
Microsoft Corporation

Page 2: Accelerating Discovery in Science and Engineering

Introduction

• Some personal introductory remarks
• Progress in grid computing
• Microsoft progress in HPC
• Microsoft technology for science
• Engagements in science
• Conclusions

Page 3: Accelerating Discovery in Science and Engineering

Some personal introductory remarks

• I am here again: since 2001 I have not missed this event a single time!
• Happy to be associated with the pioneering work of Poland in HPC, networking and Grid computing
• Honoured to witness the present success
• Good opportunity to review the progress of my activity since last year
• Last year I spoke about e-infrastructure, Grids and Microsoft plans for Science
• Let’s review the progress now

Page 4: Accelerating Discovery in Science and Engineering

Progress in grid computing

• Microsoft has sponsored GGF16 and GGF17 and took the initiative of proposing an HPC profile within the OGSA WG; a Data Management profile is also being discussed
• On the application side, we were prime sponsor at HealthGrid in Valencia, with a keynote by David Heckerman (AIDS vaccine research)
• Rapid adoption by the IT industry is essential for the future of Grid technology: GGF and EGA have merged into the Open Grid Forum (OGF), which held its first conference in Washington in early September this year
• Industry is now represented on a board of directors: all major vendors, including Microsoft (Tony Hey)
• Microsoft is also participating in the AdCom (myself) and in some of the WGs (OGSA and Security)

Page 5: Accelerating Discovery in Science and Engineering

Progress in grid computing 2

• Major issues that still have to be solved to bring grid computing from academia to industry and commerce:
• Security
• Interoperability
• Ease of integration and use
• Reliability of the infrastructure
• Adequate new business models

• Microsoft is now considering most of these issues in the context of OGF

Page 6: Accelerating Discovery in Science and Engineering

Microsoft progress in HPC

• Windows Compute Cluster Server released
• Successful experience with the Microsoft HPC institutes around the world

Page 7: Accelerating Discovery in Science and Engineering

Microsoft Compute Cluster Server

What it does:
• Solution for High-Performance Computing applications at the medium-to-low end of the scale
• Simplified administration and job management
• Built-in job scheduler and MPI library
• Four basic job scheduling policies supported in V1

Key advantages:
• Fully integrated cluster solution
• Interoperability with Unix systems
• Leverages existing Windows infrastructure and security

[Architecture diagram. Labels: User (Desktop App, Cmd line, User Console); Admin (Admin Console, Cmd line); Head Node (Job Mgmt, Resource Mgmt, Cluster Mgmt, Scheduling); Node Manager (Job Execution, User App, MPI); Active Directory; DB/FS; Data, Input, Job, Policy, reports, Management; High speed, low latency interconnect]

Page 8: Accelerating Discovery in Science and Engineering

Institutes for High Performance Computing

• Cornell Theory Center, Ithaca, NY, USA
• University of Tennessee, Knoxville, TN, USA
• University of Virginia, Charlottesville, VA, USA
• University of Utah, Salt Lake City, UT, USA
• TACC – University of Texas, Austin, TX, USA
• Southampton University, Southampton, UK
• HLRS – University of Stuttgart, Stuttgart, Germany
• Shanghai Jiao Tong University, Shanghai, PRC
• Tokyo Institute of Technology, Tokyo, Japan
• Nizhni Novgorod University, Nizhni Novgorod, Russia

Page 9: Accelerating Discovery in Science and Engineering


HPC Market Trends

Page 10: Accelerating Discovery in Science and Engineering

Top 500 Supercomputer Trends

Industry usage rising

Clusters over 50%

x86 is leading

GigE is gaining

Page 11: Accelerating Discovery in Science and Engineering

Supercomputing Goes Personal

| | 1991 | 1998 | 2005 |
|---|---|---|---|
| System | Cray Y-MP C916 | Sun HPC10000 | Small Form Factor PCs |
| Architecture | 16 x Vector, 4GB, Bus | 24 x 333MHz Ultra-SPARCII, 24GB, SBus | 4 x 2.2GHz Athlon64, 4GB, GigE |
| OS | UNICOS | Solaris 2.5.1 | Windows Server 2003 SP1 |
| GFlops | ~10 | ~10 | ~10 |
| Top500 # | 1 | 500 | N/A |
| Price | $40,000,000 | $1,000,000 (40x drop) | < $4,000 (250x drop) |
| Customers | Government Labs | Large Enterprises | Every Engineer & Scientist |
| Applications | Classified, Climate, Physics Research | Manufacturing, Energy, Finance, Telecom | Bioinformatics, Materials Sciences, Digital Media |
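The price drops quoted on this slide can be checked with a few lines of arithmetic. The sketch below uses only the slide's own figures (roughly 10 GFlops for every system, and the three prices). The names are illustrative, and the 2005 price is taken at its stated upper bound of $4,000, so this is a rough check rather than a precise costing.

```python
# Figures taken from the slide; all names here are illustrative only.
GFLOPS = 10  # every system is quoted at roughly 10 GFlops

prices_usd = {
    1991: 40_000_000,  # Cray Y-MP C916
    1998: 1_000_000,   # Sun HPC10000
    2005: 4_000,       # small form factor PCs (upper bound, "< $4,000")
}


def price_drop(earlier: int, later: int) -> float:
    """Factor by which the price fell between two of the listed years."""
    return prices_usd[earlier] / prices_usd[later]


def usd_per_gflop(year: int) -> float:
    """Approximate cost of one GFlops of capacity in the given year."""
    return prices_usd[year] / GFLOPS


for year in sorted(prices_usd):
    print(year, f"${usd_per_gflop(year):,.0f} per GFlops")

print("1991 -> 1998 drop:", price_drop(1991, 1998))
print("1998 -> 2005 drop:", price_drop(1998, 2005))
```

With performance held constant at about 10 GFlops, the whole price change shows up as a change in cost per GFlops, which is the point the slide makes.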

Page 12: Accelerating Discovery in Science and Engineering

Technology challenges

Moore’s law continues, but power consumption and heat dissipation are reaching their limits

The gap between memory and data access speeds is widening

Applications are becoming more data intensive

Page 13: Accelerating Discovery in Science and Engineering

The Future: Supercomputing on a Chip

IBM Cell processor
• 256 Gflops today
• 4 node personal cluster => 1 Tflops
• 32 node personal cluster => Top100

MS Xbox
• 3 custom PowerPCs + ATI graphics processor
• 1 Tflops today
• $300
• 8 node personal cluster => “Top100” for $2500 (ignoring all that you don’t get for $300)

Intel many-core chips
• “100’s of cores on a chip in 2015” (Justin Rattner, Intel: http://www.hpcwire.com/hpc/629783.html)
• “4 cores”/Tflop => 25 Tflops/chip
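The back-of-the-envelope figures on this slide follow from simple multiplication. The sketch below reproduces them from the quoted numbers. The names are illustrative, peak rates are simply summed across nodes (ignoring interconnect and software overheads), and "100's of cores" is read as exactly 100 cores, which is an assumption made only to recover the 25 Tflops figure.

```python
# Peak figures as quoted on the slide; sustained performance would be lower.
CELL_GFLOPS = 256        # IBM Cell processor
XBOX_PRICE_USD = 300     # MS Xbox, quoted at 1 Tflops per unit
INTEL_CORES = 100        # assumption: "100's of cores" read as 100
CORES_PER_TFLOP = 4      # "4 cores"/Tflop


def cluster_peak(nodes: int, per_node: float) -> float:
    """Aggregate peak rate, assuming perfect scaling across nodes."""
    return nodes * per_node


cell_4 = cluster_peak(4, CELL_GFLOPS)      # 1024 Gflops, about 1 Tflops
cell_32 = cluster_peak(32, CELL_GFLOPS)    # 8192 Gflops, about 8 Tflops
xbox_8_tflops = cluster_peak(8, 1)         # 8 Tflops from eight consoles
xbox_8_cost = 8 * XBOX_PRICE_USD           # 2400 USD, quoted as $2500
intel_tflops_per_chip = INTEL_CORES / CORES_PER_TFLOP

print("4 x Cell :", cell_4, "Gflops")
print("32 x Cell:", cell_32, "Gflops")
print("8 x Xbox :", xbox_8_tflops, "Tflops for", xbox_8_cost, "USD")
print("Intel    :", intel_tflops_per_chip, "Tflops per chip")
```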

Page 14: Accelerating Discovery in Science and Engineering

The Microsoft project in Barcelona

Microsoft is interested in helping computer scientists to develop new computing architectures with a high level of parallelism

Mateo Valero and his BSC centre in Barcelona are leaders in this field in Europe

Microsoft will collaborate with BSC to research and develop an entirely new parallel computing ecosystem

http://www.hpcwire.com/hpc/633342.html

Page 15: Accelerating Discovery in Science and Engineering

Microsoft Technical Computing:

Radical Computing
• Research in potential breakthrough technologies

Advanced Computing for Science and Engineering
• Application of new algorithms, tools and technologies to scientific and engineering problems

High Performance Computing and tools
• Application of high performance clusters and database technologies to industrial applications
• Application of existing and new tools for science

Page 16: Accelerating Discovery in Science and Engineering

Can “Here and Now” technologies accelerate discovery?

Can “business” tools and techniques for dealing with […] be used in scientific research to allow researchers to be scientists and not computer scientists?

Page 17: Accelerating Discovery in Science and Engineering

[Diagram. Labels: Computational Modeling; Real-world Data; Interpretation & Insight; Persistent Distributed Data; Workflow, Data Mining & Algorithms]

Page 18: Accelerating Discovery in Science and Engineering

[Diagram. Labels: Computational Modeling; Real-world Data; Interpretation & Insight; Persistent Distributed Data; Workflow, Data Mining & Algorithms]

Page 19: Accelerating Discovery in Science and Engineering

The Problem for the e-Scientist

[Diagram. Labels: Experiments & Instruments, Simulations, Literature and Other Archives, each supplying facts; questions and answers]

• Data ingest
• Managing a petabyte
• Common schema
• How to organize it?
• How to reorganize it?
• How to coexist & cooperate with others?
• Data Query and Visualization tools
• Support/training
• Performance
• Execute queries in a minute
• Batch (big) query scheduling

Page 20: Accelerating Discovery in Science and Engineering

Visual Programming

Persistent Distributed Storage

Page 21: Accelerating Discovery in Science and Engineering

Distributed Computation

Interoperability & Legacy Support via Web Services

Page 22: Accelerating Discovery in Science and Engineering

Live Documents

Searching & Visualization

Reputation & Influence

Page 23: Accelerating Discovery in Science and Engineering

Faster Time to Insight

Better integration with existing Windows infrastructure

Integrated and familiar development environment

Page 24: Accelerating Discovery in Science and Engineering

Data acquisition from source systems and integration

Data transformation and synthesis

Data enrichment, with business logic, hierarchical views

Data discovery via data mining

Data presentation and distribution

Data access for the masses

Integrate, Analyze, Report

Research

Page 25: Accelerating Discovery in Science and Engineering

Comparison of soil moisture

[Figure: two scatter plots of soil water content with Tonzi and Vaira as the axis labels, both axes running from 0.0 to 0.6. Water Content at 5 cm: y = 0.4712x, R² = 0.7039. Water Content at 20 cm: y = 0.5854x, R² = 0.9163.]

Thanks to Gretchen Miller – UC Berkeley & Catharine Van Ingen (MSR)
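The trend lines on this slide have the form y = bx with no intercept, which is what a least-squares fit constrained to pass through the origin produces. The sketch below shows one way such a slope and an R² value could be computed. The readings are hypothetical and are not the measurements behind the slide, the assignment of Tonzi to x and Vaira to y is an assumption, and the R² definition used (1 - SS_res / SS_tot about the mean) is only one of several conventions for fits through the origin.

```python
def fit_through_origin(x, y):
    """Least-squares slope b for the model y = b * x (no intercept)."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / sxx


def r_squared(x, y, slope):
    """1 - SS_res / SS_tot, with SS_tot taken about the mean of y."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - slope * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical water-content readings, NOT the data shown on the slide.
tonzi = [0.10, 0.20, 0.30, 0.40, 0.50]
vaira = [0.05, 0.10, 0.15, 0.20, 0.25]  # constructed as exactly half of tonzi

slope = fit_through_origin(tonzi, vaira)
print("slope:", round(slope, 4))
print("R^2  :", round(r_squared(tonzi, vaira, slope), 4))
```

Because the sample data lie exactly on a line through the origin, the fit recovers that line; real sensor data would scatter around it and give R² below 1, as on the slide.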

Page 26: Accelerating Discovery in Science and Engineering

SharePoint Products and Technologies
Microsoft Office SharePoint Server 2007

• Platform Services: Workspaces, Mgmt, Security, Storage, Topology, Site Model
• Collaboration: Docs/tasks/calendars, blogs, wikis, e-mail integration, project management “lite”, Outlook integration, offline docs/lists
• Portal: Enterprise Portal template, Site Directory, My Sites, social networking, privacy control
• Search: Enterprise scalability, contextual relevance, rich people and business data search
• Content Management: Integrated document management, records management, and Web content management with policies and workflow
• Business Forms: Rich and Web forms based front-ends, LOB actions, enterprise SSO
• Business Intelligence: Server-based Excel spreadsheets and data visualization, Report Center, BI Web Parts, KPIs/Dashboards

Page 27: Accelerating Discovery in Science and Engineering

Excel Services Overview

Excel 2007
• Design and author
• Publish spreadsheets

SharePoint platform and Excel Services
• Spreadsheets stored in document libraries
• Spreadsheet calculation and rendering
• External data retrieval and caching
• 100% calculation fidelity

Browser: View and Interact
• High quality web rendering
• Zero-footprint
• Interactive: set parameters, sort, filter, explore
• Limit to browser access

Custom applications: Programmatic Access
• Set values, perform calculations, get updated values via web services
• Retrieve full workbook file

Excel 2007: Export/Snapshot into Excel
• Open in Excel for rich exploration and analysis
• Open snapshots

Page 28: Accelerating Discovery in Science and Engineering

Development:
• .NET & Visual Studio
• F#
• IronPython

Data:
• SQL Server
• SQL Server Analysis Services

Workflow:
• Windows Workflow

Collaboration:
• SharePoint Server 2007
• Instant Messenger
• ConferenceXP

Publications:
• Academic Live, Onfolio…

Page 29: Accelerating Discovery in Science and Engineering

Questions to our scientist colleagues

Can these tools/technologies provide value/insight to scientists?

What’s missing? E.g. in HPC, analysis, etc.?

How best to test/integrate these technologies?

How to communicate these ideas? Conferences, workshops, website?

Share code, samples

Page 30: Accelerating Discovery in Science and Engineering

Conclusions

Industry is moving HPC to commodity

Microsoft is a world leader in commodity computing and will play a major role in scientific and technical computing solutions

Key figures in scientific computing, such as Burton Smith and Tony Hey, have recently joined the company in senior strategic positions

We are interested in getting your opinion and collaborating with you to develop the most productive computing environment for science

Thanks again for the invitation and see you next year!

Page 31: Accelerating Discovery in Science and Engineering

More info:

Windows HPC: www.windowshpc.net

Data mining: www.sqlserverdatamining.com/

Develop without Borders Challenge: www.developwithoutborders.com

Technical Computing Blog: http://blogs.msdn.com/eScience