Data Requirements for Climate and Carbon Research

Data Requirements for Climate and Carbon Research David C. Bader Chief Scientist, DOE Climate Change Prediction Program Presentation Material Courtesy of: John Drake, ORNL Don Middleton, National Center for Atmospheric Research and CCPP Scientists UCRL-PRES-202932


Page 1: Data Requirements for Climate and Carbon Research

Data Requirements for Climate and Carbon Research

David C. Bader
Chief Scientist, DOE Climate Change Prediction Program

Presentation Material Courtesy of:
John Drake, ORNL
Don Middleton, National Center for Atmospheric Research
and CCPP Scientists

UCRL-PRES-202932

Page 2: Data Requirements for Climate and Carbon Research

Why is DOE Interested in Climate?

Page 3: Data Requirements for Climate and Carbon Research
Page 4: Data Requirements for Climate and Carbon Research

Courtesy Warren Washington

Page 5: Data Requirements for Climate and Carbon Research

IPCC Fourth Assessment Runs

Plan:
• Commitment runs: B1 scenario simulated with ensemble calculations
• Dedicated 6 (32-way) nodes of Cheetah for Sept-Nov 2003
• Dedicated 12 (32-way) nodes of Cheetah for Mar-July 2004
• Cray X1 science runs in 2004

Page 6: Data Requirements for Climate and Carbon Research
Page 7: Data Requirements for Climate and Carbon Research

Baseline Numbers

• T42 CCSM (current, 280 km)
– 7.5 GB/yr, 100 years -> 0.75 TB

• T85 CCSM (140 km)
– 29 GB/yr, 100 years -> 2.9 TB

• T170 CCSM (70 km)
– 110 GB/yr, 100 years -> 11 TB
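The totals above are simply the yearly output rate times the run length. A minimal sketch of that arithmetic, assuming decimal units (1 TB = 1000 GB) and a 100-year run; the function and dictionary names are illustrative and do not come from the presentation:

```python
# Output rates in GB per simulated year, as quoted on the slide.
RATES_GB_PER_YEAR = {"T42": 7.5, "T85": 29.0, "T170": 110.0}


def run_volume_tb(gb_per_year: float, years: int = 100) -> float:
    """Total output of one run in TB, assuming 1 TB = 1000 GB."""
    return gb_per_year * years / 1000.0


for name, rate in RATES_GB_PER_YEAR.items():
    print(f"{name}: {run_volume_tb(rate):.2f} TB per 100-year run")
```

Under these assumptions the three resolutions come out at 0.75, 2.9, and 11 TB, matching the slide.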

Page 8: Data Requirements for Climate and Carbon Research

Diagnostic Analysis of Coupled Models

Courtesy PCMDI

Courtesy CCSM/AMWG

Page 9: Data Requirements for Climate and Carbon Research

Tools

Page 10: Data Requirements for Climate and Carbon Research

The Earth System Grid

• U.S. DOE SciDAC funded R&D effort
• Build an “Earth System Grid” that enables management, discovery, distributed access, processing, & analysis of distributed terascale climate research data
• A “Collaboratory Pilot Project”
• Build upon ESG-I, Globus Toolkit, DataGrid technologies, and deploy
• Potential broad application to other areas

http://www.earthsystemgrid.org

Page 11: Data Requirements for Climate and Carbon Research

ESG Team

• ANL
– Ian Foster (PI)
– Veronika Nefedova
– (John Bresenhan)
– (Bill Allcock)

• LBNL
– Arie Shoshani
– Alex Sim

• ORNL
– David Bernholdte
– Kasidit Chanchio
– Line Pouchard

• LLNL/PCMDI
– Bob Drach
– Dean Williams (PI)

• USC/ISI
– Anne Chervenak
– Carl Kesselman
– (Laura Perlman)

• NCAR
– David Brown
– Luca Cinquini
– Peter Fox
– Jose Garcia
– Don Middleton (PI)
– Gary Strand

Page 12: Data Requirements for Climate and Carbon Research
Page 13: Data Requirements for Climate and Carbon Research

Capacity-related Improvements

Increased turnaround, model development, ensemble of runs

Increase by a factor of 10, linear data

• Current T42 CCSM
– 7.5 GB/yr, 100 years -> 0.75 TB * 10 = 7.5 TB

Page 14: Data Requirements for Climate and Carbon Research

Capability-related Improvements

Spatial Resolution: T42 -> T85 -> T170

Increase by factor of ~10-20, linear data

Temporal Resolution: Study diurnal cycle, 3-hour data

Increase by factor of ~4, linear data

CCM3 at T170 (70km)

Page 15: Data Requirements for Climate and Carbon Research

Capability-related Improvements

Quality: Improved boundary layer, clouds, convection, ocean physics, land model, river runoff, sea ice

Increase by another factor of 2-3, data flat

Scope: Atmospheric chemistry (sulfates, ozone…), biogeochemistry (carbon cycle, ecosystem dynamics), Middle Atmosphere Model…

Increase by another factor of 10+, linear data

Page 16: Data Requirements for Climate and Carbon Research

Model Improvements cont.

Grand Total:

Increase compute by a Factor O(1000-10000)
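One way to see where a figure of this order could come from is to multiply the per-slide compute factors. The presentation does not say which factors enter the grand total, so the sketch below shows two readings: the capability factors alone, and those combined with the capacity factor of 10. Treating the scope factor "10+" as exactly 10 is an assumption, as are all names in the code.

```python
from math import prod

# (low, high) compute multipliers quoted on the preceding slides.
# Assumption: the scope factor "10+" is taken as exactly 10.
CAPACITY = (10, 10)
CAPABILITY = {
    "spatial resolution": (10, 20),
    "temporal resolution": (4, 4),
    "quality": (2, 3),
    "scope": (10, 10),
}


def compound(factors):
    """Multiply the low ends and the high ends of (low, high) ranges."""
    lows, highs = zip(*factors)
    return prod(lows), prod(highs)


capability_only = compound(CAPABILITY.values())
with_capacity = compound([CAPACITY, *CAPABILITY.values()])

print("capability only:", capability_only)
print("with capacity:  ", with_capacity)
```

Under these assumptions the capability factors alone give 800 to 2400, and including capacity gives 8000 to 24000, both roughly in line with the order of magnitude quoted on the slide.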

Page 17: Data Requirements for Climate and Carbon Research

ESG: Challenges

• Enabling the simulation and data management team
• Enabling the core research community in analyzing and visualizing results
• Enabling broad multidisciplinary communities to access simulation results

We need integrated scientific work environments that enable smooth WORKFLOW for knowledge development: computation, collaboration & collaboratories, data management, access, distribution, analysis, and visualization.

Page 18: Data Requirements for Climate and Carbon Research

ESG: Strategies

• Move data a minimal amount, keep it close to computational point of origin when possible
– Data access protocols, distributed analysis

• When we must move data, do it fast and with a minimum amount of human intervention
– Storage Resource Management, fast networks

• Keep track of what we have, particularly what’s on deep storage
– Metadata and Replica Catalogs

• Harness a federation of sites, web portals
– Globus Toolkit -> The Earth System Grid -> The UltraDataGrid
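The first and third strategies can be illustrated with a toy replica catalog that maps a logical file name to its known physical copies and prefers a copy that avoids moving data. This is only a sketch: the class, catalog contents, site labels, and URLs are hypothetical and do not represent the ESG or Globus Toolkit interfaces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    site: str              # hypothetical site label
    url: str               # hypothetical physical location
    on_deep_storage: bool  # True if the copy sits on tape / mass storage


# Hypothetical catalog: logical file name -> known physical copies.
CATALOG = {
    "ccsm/t42/run01/atm_0001.nc": [
        Replica("ORNL", "hpss://ornl.example/ccsm/atm_0001.nc", True),
        Replica("NCAR", "file://ncar.example/disk/atm_0001.nc", False),
    ],
}


def choose_replica(logical_name, local_site):
    """Prefer a local copy; among remote copies prefer disk over deep storage."""
    replicas = CATALOG.get(logical_name, [])
    if not replicas:
        return None

    def cost(replica):
        # Tuples compare element by element, and False sorts before True.
        return (replica.site != local_site, replica.on_deep_storage)

    return min(replicas, key=cost)


print(choose_replica("ccsm/t42/run01/atm_0001.nc", "ORNL"))
print(choose_replica("ccsm/t42/run01/atm_0001.nc", "LLNL"))
```

A job running at the hypothetical ORNL site is pointed at its local copy even though that copy is on deep storage, while a job at a site with no copy is pointed at the remote disk copy.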

Page 19: Data Requirements for Climate and Carbon Research

HRM aka “DataMover”

• Running well across DOE/HPSS systems
• New component built that abstracts NCAR Mass Storage System
• Defining next generation of requirements with climate production group
• First “real” usage

Page 20: Data Requirements for Climate and Carbon Research

OPeNDAP

An Open Source Project for a Network Data Access Protocol

(originally DODS, the Distributed Oceanographic Data System)
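For readers unfamiliar with the protocol: a DAP2 client requests a subset of a remote array by appending a constraint expression to the dataset URL, with one `[start:stride:stop]` hyperslab per dimension. The sketch below only builds such a URL string and makes no network request; the server address and the variable name `TS` are hypothetical.

```python
def dap2_subset_url(dataset_url, variable, slices):
    """Build a DAP2 binary-data request URL for an array hyperslab.

    slices holds one (start, stride, stop) tuple per dimension;
    in DAP2 the stop index is inclusive.
    """
    hyperslab = "".join(
        f"[{start}:{stride}:{stop}]" for start, stride, stop in slices
    )
    return f"{dataset_url}.dods?{variable}{hyperslab}"


# Hypothetical dataset and variable: first 12 time steps, every other latitude.
print(dap2_subset_url("http://server.example/data/atm.nc", "TS",
                      [(0, 1, 11), (0, 2, 63)]))
```

Only the requested hyperslab crosses the network, which fits the strategy of moving as little data as possible.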

Page 21: Data Requirements for Climate and Carbon Research

ESG: Metadata Services

[Architecture diagram; labels recovered from the figure]

• Layers: ESG Clients API & User Interfaces; High Level Metadata Services; Core Metadata Services; Metadata Holdings
• Services: Search & Discovery; Administration; Browsing & Display; Analysis & Visualization; Publishing; Metadata & Data Registration; Metadata Extraction; Metadata Display; Metadata Browsing; Metadata Query; Metadata Annotation; Metadata Validation; Metadata Aggregation; Metadata Discovery; Metadata Access (update, insert, delete, query); Service Translation Library
• Holdings: Data & Metadata Catalog; Dublin Core Database; COARDS Database; Dublin Core XML Files (mirror); Comments XML Files

Page 22: Data Requirements for Climate and Carbon Research

Collaborations & Relationships

• CCSM Data Management Group
• The Globus Project
• Other SciDAC Projects: Climate, Security & Policy for Group Collaboration, Scientific Data Management ISIC, & High-performance DataGrid Toolkit
• OPeNDAP/DODS (multi-agency)
• NSF National Science Digital Libraries Program (UCAR & Unidata THREDDS Project)
• U.K. e-Science and British Atmospheric Data Center
• NOAA NOMADS and CEOS-grid
• Earth Science Portal group (multi-agency, international)

Page 23: Data Requirements for Climate and Carbon Research
Page 24: Data Requirements for Climate and Carbon Research
Page 25: Data Requirements for Climate and Carbon Research