Developing a Strategy for e-Science Indiana University Malcolm Atkinson Director e-Science Institute...

Developing a Strategy for e-Science Indiana University Malcolm Atkinson Director e-Science Institute UK e-Science Envoy www.nesc.ac.uk 5 th February 2008


Page 1

Developing a Strategy for e-Science

Indiana University

Malcolm Atkinson
Director, e-Science Institute
UK e-Science Envoy

www.nesc.ac.uk
5th February 2008

Page 2

Outline

• What is e-Science
• What we gained from an e-Science initiative
• Why we need a strategy
• What should the strategy achieve
• What computing research do we need
  • Theory & pioneering steer each other
  • Realistic models
  • Sustainable farming for the e-Science Ecosphere
• The global challenge

Page 3

Definition of e-Science

Computing has become a fundamental tool in all research disciplines, which often proceed by assembling and managing large data collections and exploiting computer models and simulations (a topic called e-Science)

e-Science is the invention and application of computer-enabled methods to achieve new, better, faster or more efficient research in any discipline. It draws on advances in computing science, computation and digital communications. As such it has been an important tool for researchers for many decades. The data deluge and the scale and complexity of today's research challenges have greatly increased its importance for researchers. As a consequence, in 2001 the UK led the world by initiating a coordinated e-Science research programme to stimulate the development of e-Science across all fields of research. That investment, £250 million, has developed assets on which the Strategy for Century-of-Information Research will build.

Page 4

Strengths of e-Science

Research using e-Science

Research enabling e-Science

Communities and e-Infrastructure supporting research and innovation

Page 5

e-Science Centres in the UK

[Map. Recoverable labels:]
• Oxford
• Edinburgh
• Belfast
• Cambridge
• STFC Daresbury
• Manchester
• LeSC
• Newcastle
• Southampton
• Cardiff
• STFC RAL
• Glasgow
• Leicester
• UCL
• Birmingham
• White Rose Grid
• Lancaster
• Reading
• Access Grid Support Centre
• Digital Curation Centre
• National Grid Service
• National Centre for e-Social Science
• National Centre for Text Mining
• National Institute for Environmental e-Science
• Open Middleware Infrastructure Institute
• Sheffield
• York
• Leeds

Coordinated by: Directors’ Forum & NeSC

Page 6

Web: www.omii.ac.uk

OMII-UK: For all kinds of users

Taverna: effortless workflows for scientists

OGSA-DAI: data integration for service providers

PAG: AG video-conferencing for anyone

Campus Grid Toolkit: easy to install grid for job submission

SAGA: abstraction & code mobility

Page 7

NGS & Partners, 2007

Page 8

ESI Themes

Slide from Dr Anna Kenway

Theme 8: Trust and Security in Virtual Communities

Theme 4: Spatial Semantics for Automating Geographic Information Processes

Theme 5: Distributed Programming Abstractions

Theme 6: e-Science in the Arts and Humanities

Theme 7: Neuroinformatics and Grid Techniques to Build a Virtual Fly Brain

Theme 9: Provenance

Page 9

Outline

• What is e-Science
• What we gained from an e-Science initiative
• Why we need a strategy
• What should the strategy achieve
• What computing research do we need
  • Theory & pioneering steer each other
  • Realistic models
  • Sustainable farming for the e-Science Ecosphere
• The global challenge

Page 10

Official UK Research Goals

Page 11

Tremendous global challenges

Page 12

Scale, Urgency, Complexity, …

Page 13

The 21st Century

This is the century of information

PM G. Brown, University of Westminster, 25 October 2007

• We can collect it
• We can generate it
• Can we move it?
• We can store it
• Can we use it?
• Dramatic increase in data from sensors
• Dramatic drop in cost of computation
• Web-scale effects
• Ubiquitous digital communications
• Community intelligence
• Global challenges
• Transforming research, design, diagnosis, social behaviour, …

Page 14

[Diagram labels, translated from Japanese:]
• Security
• Grid / petascale computing
• Ubiquitous computing
• ITS
• "... is not" (fragment)
• Information-systems umbrella

…And then there is now the Information Explosion

161 EB (2006, by IDC)

988 EB (2010) ≈ 1 ZB

Slide: Satoshi Matsuoka
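
The two figures on this slide (161 EB in 2006, 988 EB in 2010) imply a growth rate that is easy to work out. The sketch below is only a back-of-envelope illustration: it assumes smooth compound growth between the two years, which the slide does not claim, and the function name `implied_annual_growth` is made up for this example.

```python
def implied_annual_growth(start_eb: float, end_eb: float, years: int) -> float:
    """Compound annual growth rate that turns start_eb into end_eb over `years`.

    Assumes smooth compound growth, which is an illustration-only assumption.
    """
    return (end_eb / start_eb) ** (1.0 / years) - 1.0


# Figures quoted on the slide (exabytes).
START_EB = 161.0   # 2006, attributed to IDC
END_EB = 988.0     # 2010
YEARS = 2010 - 2006

rate = implied_annual_growth(START_EB, END_EB, YEARS)
growth_factor = END_EB / START_EB
zettabytes = END_EB / 1000.0  # 1 ZB = 1000 EB with decimal SI prefixes

if __name__ == "__main__":
    print(f"growth factor: {growth_factor:.2f}x over {YEARS} years")
    print(f"implied annual growth: {rate:.1%}")
    print(f"2010 volume: {zettabytes:.3f} ZB")
```

Under that assumption the volume grows roughly sixfold in four years, which works out to a little under 60% per year, and 988 EB is just short of 1 ZB.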

Page 15

Outline

• What is e-Science
• What we gained from an e-Science initiative
• Why we need a strategy
• What should the strategy achieve
• What computing research do we need
  • Theory & pioneering steer each other
  • Realistic models
  • Sustainable farming for the e-Science Ecosphere
• The global challenge

Page 16

High-Level Goals for CIR

• New world-leading research
  • New methods & new technology
• High impact (transformative)
  • Sustained rapid transfer from invention to wide use
  • Much wider engagement => More Research & Innovation
  • Cultural changes
  • Effective transfer between business & academia
• Cost effective
  • Shared e-Infrastructure (Cyberinfrastructure)
  • Shared support for developing advances in: Tools, Services, Trust

Page 17

Elements of CIR

• Establish an Office of Strategic Coordination of Century-of-Information Research

• Support the continuous innovation of research methods

• Provide easily used, pervasive and sustained e-Infrastructure for all research

• Enlarge the productive research community who exploit the new methods fluently

• Generate capacity, propagate knowledge and develop a culture via new curricula

Page 18

Enable extreme e-Science

• Sustain support for interdisciplinary teams
  • Breakthroughs depend on talented research leaders
  • Plus strong supporting teams
• Provide an environment of composable components
  • Significant advances from familiar components
  • Composed in new ways
• Provide powerful tools and services
  • With licence to experiment
• Inject energy through challenges & long-term funds

Page 19

CIR Sustain method invention

[Diagram. Recoverable labels:]
• Applied Scientist
• e-Scientist
• Computer Science
• Researcher communities using e-Science Methods
• e-Science e-Infrastructure
• Evidence, Methods, Models & challenges
• Algorithms, Models, Notations, Methods
• Technology
• Supports
• Challenges, Ideas, Models, Tests, Uses, Deploys, Evaluates, Adapts
• Infrastructure Provision and Support
• Infrastructure Development
• Adoption
• Challenges
• Challenges & supports
• Operational data

Slide from John Darlington with modifications

Real invention has more complex interactions

Page 20

CIR Enable fluent mass use

[Diagram. Recoverable labels:]
• Applied Scientist
• e-Scientist
• Computer Science
• Researcher communities using e-Science Methods
• e-Science e-Infrastructure
• Evidence, Methods, Models & challenges
• Algorithms, Models, Notations, Methods
• Technology
• Supports
• Challenges, Ideas, Models, Tests, Uses, Deploys, Evaluates, Adapts
• Infrastructure Provision and Support
• Infrastructure Development
• Adoption
• Challenges
• Challenges & supports
• Operational data

Page 21

Balancing Three Strands of CIR

• Pioneering
  • Invention of new data & computational methods
  • Advances in the ways they are used
  • Advances in the technology that supports them
• Provision
  • e-Infrastructure or Cyberinfrastructure
    support, consultancy, training, tools, services
    Curated digital data resources, Computation, Communication networks & CSCW
• Education & cultural change
  • Preparing graduates to flourish in the digital economy
  • Developing a culture & trust that enables data sharing

Page 22

The social process of science

[Diagram. Recoverable labels:]
• scientists
• Local Web
• Repositories
• Digital Libraries
• Graduate Students
• Undergraduate Students
• Virtual Learning Environment
• Technical Reports
• Reprints
• Peer-Reviewed Journal & Conference Papers
• Preprints & Metadata
• Certified Experimental Results & Analyses
• experimentation
• Data, Metadata, Provenance, Workflows, Ontologies

Slide: Dave De Roure

Page 23

[Diagram. Recoverable labels, in extraction order:]
• Web Services, RESTful APIs, cmd lines, ssh, http
• Web Browser, Mobile phone, iPod, Car, Equipment, PDA
• P2P
• mashups
• workflows
• services
• applications
• Subject
• ICT experts
• Computer Scientists
• Software Companies
• Workflow tools
• Ruby on Rails
• ecosystem
• Scientists
• open source
• Software Engineers
• nesc

Slide: Dave De Roure

Page 24

The social process of science 2.0

[Diagram. Recoverable labels:]
• scientists
• Local Web
• Repositories
• Graduate Students
• Undergraduate Students
• Virtual Learning Environment
• Technical Reports
• Reprints
• Peer-Reviewed Journal & Conference Papers
• Preprints & Metadata
• Certified Experimental Results & Analyses
• experimentation
• Data, Metadata, Provenance, Workflows, Ontologies
• Digital Libraries

Slide: Dave De Roure

Page 25

Organised Sharing?

• Application researchers choose / lead
  • Leads to diversity & little fluent use
• Sustaining & improving community effects
  • Group & subject community cultures
• Sharing advantageous
  • Costs of development, deployment, operations
  • Costs of improvement, scaling & green computing
• What and when to share
  • Low-level services & libraries shared across disciplines
  • Curated digital resources across discipline groups
  • Tools may be discipline specific or widely used

Page 26

Developing Trust

• Researchers share networks & computing
  • And trust them
• Will they trust a shared storage service?
  • How would you build such trust?
• System, model and data complexity increase
  • How can we build trust in the results they give?
• Much data is personal, medical or financial
  • Blunders happen
  • How do we get the public to trust research use?

Page 27

Education and Training

• Training
  • Targeted
  • Immediate goals
  • Specific skills
  • Building a workforce
• Education
  • Pervasive
  • Long term and sustained
  • Generic conceptual models
  • Developing a culture
• Both are needed

[Diagram: two cycles, reconstructed from the labels]
Organisation → Invests → Training → Prepares → Skilled Workers → Develop → Services & Applications → Strengthens → Organisation
Society → Invests → Education → Prepares → Graduates → Create → Innovation → Enriches → Society

Page 28

Outline

• What is e-Science
• What we gained from an e-Science initiative
• Why we need a strategy
• What should the strategy achieve
• What computing research do we need
  • Theory & pioneering steer each other
  • Realistic models
  • Sustainable farming for the e-Science Ecosphere
• The global challenge

Page 29

Matrix to analyse e-Science

Rows: Observation, Modelling, Analysis, Action, Collaboration

Columns: Anthropology, Archaeology, Astronomy, Biology, Biochemistry, Chemistry, Demography, Economics, Engineering, Geography

Other labels: Scholarship, Design, Diagnosis, Exploitation

Page 30

Climate, Observation

• Satellite & ground based imaging, ocean buoys, atmospheric, ocean & coastal surveys, robotic mobile devices, distributed urban, rural & river sensors
• Past from trees, corals, ice, sediments, geology, …
• Long-term phenomena
  • Observations decades to centuries
  • Data used for centuries
• Large & sustained data flows
• Economic long-term data storage / management
• Complexity, variety of data: >40 ISO standards (OGC+)
• Stability & change, calibration & normalisation
• Sufficient coverage & resolution
• Speed for exceptional environmental events (E3)
• Dependable accuracy
• Data discovery, understanding metadata & ontologies

Source: Next Generation Science for Planet Earth: NERC strategy 2007-2012

Page 31

Climate, Modelling

• Many interacting subsystems: solar, atmospheric physics & chemistry, oceans, air+water interface, cryosphere, air+ice interface, biosphere+ air+land interface, land surface, fires, volcanoes, human activity, …
• Interacting models
• Multiple versions
• Large (global) team efforts
• Dependent on many parameters (estimates)
• No one understands fully even one model
• Constructing trusted models - mathematics to hindcasting
• Composing models
• Combining data & observation
• Computational power
• Managing & using data produced
• Curation, cataloguing & metadata
• Managing & tracking model revision
• Rapid execution for E3
• Making models usable

Page 32

Climate, Analysis

• Identify & bring together multiple data sets
• Transform them to align & expose information
• Statistical comparisons
• Visualisations
• Finding, accessing & transforming data
• Moving data reduction steps to data
• Necessary data movement
• Tools that cope with the scale: statistics, data handling & visualisation
• Curating, cataloguing results
• Agreeing trusted analysis methods
• Automating analysis
• Stability & change

Page 33

Climate, Action

• Scholarship:
  • papers, contribution to national & international reports.
• Advice & policy:
  • planning for & response to E3
  • Planning agriculture, epidemiology & coastal retreat
• Public outreach
• Prediction services
• Traditional quality of results / arguments
• With 10-year time to truth
• Cross-discipline for
• Socio-economic impact data
• Privacy & ethics
• Recognition & responsibility
• Many model & data sources & contributors
• Rational debate about validity and significance of results
• Multi-disciplinary effects

Page 34

Climate, Collaboration

• Already
  • International (UN, INSPIRE, scholarly) collaboration
• Economic, social & political drivers
• Usual CSCW
  • Skype, Blogs, tele/video conferencing, wiki, facebook, telepresence, OptIPort, …
• Shared data resources
• Quality metadata
• Shared code development & testing
• Ontologies & standards
• Multi-site computational steering & spatio-temporal visualisation
• Business case to support the research

Page 35

Climate, pervasive

• How do you build & sustain the business case?
  • Stern report helps
• How do you provide security without inhibiting collaboration, open inspection, alternative interpretations?
• Cost reductions
  • Pooling data collection
  • Pooling storage
  • Sharing responsibilities
  • Pooling model development
• But diversity for safety
• Security
  • Prevent damage to data
  • Prevent misuse of resources

Page 36

Please join me in the Matrix

• Populate columns you care about
  • Music, fine art, chemistry, linguistics, …
• Integrate & digest the list of requirements
  • Identify the current barriers
  • Think up strategies for overcoming them
• Start communities following those strategies

Page 37

Outline

• What is e-Science
• What we gained from an e-Science initiative
• Why we need a strategy
• What should the strategy achieve
• What computing research do we need
  • Theory & pioneering steer each other
  • Realistic models
  • Sustainable farming for the e-Science Ecosphere
• The global challenge

Page 38

Data is the Key

• e-Science is different
  • We are responsible for our data
  • We curate it / select it / throw it away
  • Our program executions build & reshape it
• We need a safe model for fluent mass use
  • Transactional & idempotent
  • Safety - avoiding accidental data loss / corruption
  • Realistic - nothing is perfect: S/W, H/W, People, Organisations
• We need eXtreme e-Science
  • Smart engineers working with extreme care
  • Ramp & flow between mass use and eXtreme e-Science
• Foundation requires careful engineering

How do we manage it?
How do we move it?
How do we protect it?
How do we trust it?
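
To make "transactional & idempotent" concrete, here is a minimal sketch of one way a data-writing step could behave. It is not taken from the talk: the function name `store_idempotently`, the content-hash file naming and the `.dat` suffix are all assumptions made for this illustration. The write is all-or-nothing (a temporary file is renamed into place only once complete) and repeating the same call leaves the store unchanged.

```python
import hashlib
import os
import tempfile


def store_idempotently(directory: str, payload: bytes) -> str:
    """Store payload under a name derived from its content (hypothetical scheme).

    Idempotent: storing the same bytes twice yields the same single file.
    Transactional: readers see either no file or the complete file.
    """
    digest = hashlib.sha256(payload).hexdigest()
    target = os.path.join(directory, digest + ".dat")
    if os.path.exists(target):
        return target  # already stored; repeating the call changes nothing

    # Write to a temporary file in the same directory, then rename it into place.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before publishing
        os.replace(tmp_path, target)  # atomic when both paths share a filesystem
    except BaseException:
        # Nothing is perfect: on any failure, leave no partial file behind.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as store:
        first = store_idempotently(store, b"observation batch 1")
        second = store_idempotently(store, b"observation batch 1")
        print(first == second, len(os.listdir(store)))
```

A retry after a crash or a duplicated workflow step is then harmless, which is one small piece of the "safe model for fluent mass use" the slide asks for.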

Page 39

Careful Engineering Requires

• Good quality models
  • Specifying realistic target behaviours
  • Stochastic Pi Calculus?
  • Computer scientists, mathematicians & statisticians wanted
• Benchmarks & Measurement
  • Long-term, multi-purpose & realistic scale
  • Agreed measurement against the models
  • Shaped by & shaping standards
  • Foundations for trust
• Engineering effort
  • Collaborative & Competitive worldwide
  • Expect incremental progress not magic
  • We’ve come a long way
• We have much further to go

Page 40

Questions

Photographer: Kathy Humphry