Data center requirements and services

S. Kindermann (DKRZ) Stephan Kindermann, Michael Lautenschlager, Katharina Berger, Tobias Weigel, Hans Dieter Hollweg Deutsches Klimarechenzentrum (DKRZ) DKRZ Data center requirements and services

Transcript of Data center requirements and services

Page 1: Data center requirements and services

S. Kindermann (DKRZ)

Stephan Kindermann, Michael Lautenschlager, Katharina Berger, Tobias Weigel, Hans Dieter Hollweg

Deutsches Klimarechenzentrum (DKRZ)

DKRZ

Data center requirements and services

Page 2: Data center requirements and services


Overview

§  Update: the new data infrastructure hosting environment at DKRZ

§  ESGF: DKRZ data life cycle services
   §  LTA/WDCC – ESGF integration
   §  Quality assurance
   §  Data near processing
   §  Towards PID based services

§  CMIP6 at DKRZ

2  12/8/15

Page 3: Data center requirements and services


DKRZ data center update

[Diagram: DKRZ ESGF and archive infrastructure. Labels: (pre-shutdown) ESGF infrastructure; 4 data nodes; index node; no separate DTNs; CERA LTA infrastructure; 1 CERA/ESGF data node; GPFS; HPSS; Oracle DB cluster; CERA portal; Mistral; CERA/ESGF portal; 2 data nodes; LTA (Oracle); LTA data node; 2 DTNs; LUSTRE; ro NFS; DFN: 2x3..5 GB; HH: 2x10 GB; HPC + interactive nodes + visualization nodes; OpenStack cloud; all: behind firewall; VMs (XEN). Timeline labels: from mid 2015; from 2016; until end 2015.]

„national MIP data cache“
•  management etc. tbd

Migration to new integrated HPC/data system
•  separate DTNs (starting 2016)
•  establishment of a „national MIP data analysis cache“
•  data cloud to support data ingest process

Page 4: Data center requirements and services


DKRZ long term archival and data citation

Major use case
§  Replication
§  Support data evaluation
§  Quality Assurance
§  Long Term Archival
§  DOI assignment
§  Exposure as ESGF data node

[Diagram: DKRZ data life cycle. Labels: container cache; container server; CERA (Oracle); LTA (HPSS); CERA Portal / DDC ..; ESGF QA DOI; Process; replication, versioning; National climate data node (MIP cache); Data near processing; ESGF shutdown; ESGF data node, COG portal, WPS; ingest.]

Page 5: Data center requirements and services


WDCC/CERA/HPSS ↔ ESGF integration

Improved system for CMIP6:
•  FUSE based mounting of DKRZ HPSS/cache legacy system
•  Extraction of CERA metadata for ESGF mapfile
•  „standard“ ESGF publication in an „offline mode“

Operational for CMIP5
•  CERA metadata (Oracle) → ESGF index
•  Thredds server with ESGF security filter + HPSS data container server → ESGF data node

→ Future COG portal visibility of (non CMIP) WDCC LTA project data

[Diagram labels: container cache; container server; CERA (Oracle); LTA (HPSS); FUSE; Mapfile generation; COG portal; LTA ESGF data node; ESGF Solr index; ESGF index node; ESGF Publisher; Postgres/THREDDS.]
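
The mapfile generation step can be illustrated with a minimal sketch. It assumes the archive is already visible as an ordinary directory (for example through a FUSE mount) and that one mapfile line has the pipe-separated layout "dataset#version | path | size | key=value ..." commonly used by the ESGF publisher. The function names, the dataset ID passed in, and the choice of SHA256 are assumptions for illustration; the slides do not show how DKRZ extracts this information from CERA.

```python
import hashlib
import os


def mapfile_line(dataset_id, version, path):
    """Format one mapfile entry for a file on a mounted archive path."""
    sha = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in 1 MiB chunks so large NetCDF files are not loaded at once.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            sha.update(chunk)
    stat = os.stat(path)
    fields = [
        f"{dataset_id}#{version}",          # dataset ID would come from CERA metadata
        os.path.abspath(path),
        str(stat.st_size),
        f"mod_time={stat.st_mtime:.6f}",
        f"checksum={sha.hexdigest()}",
        "checksum_type=SHA256",
    ]
    return " | ".join(fields)


def build_mapfile(dataset_id, version, root):
    """Collect mapfile lines for all .nc files below root (hypothetical helper)."""
    lines = []
    for dirpath, dirs, files in os.walk(root):
        dirs.sort()  # deterministic traversal order
        for name in sorted(files):
            if name.endswith(".nc"):
                lines.append(mapfile_line(dataset_id, version, os.path.join(dirpath, name)))
    return lines
```

Writing the returned lines to a text file would give an input for a publication run; whether the checksum is computed at this stage or taken from archive metadata is a design choice the slides leave open.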

Page 6: Data center requirements and services


(CMIP data) Quality Assurance Software

[Diagram: QA software structure. Labels: mainFile; NetCDF File; CF Conventions Tables; Project Configuration & Tables; User-modified Directives; NC-API M-D Store; CF Conv. Checks; Annotations; QA; Time; Data; Consistency between sub-temporal files; DRS CV; Variable Requirements (CMOR); Project Rules.]

CF Conventions Check
•  Versions: 1.4-1.6
•  8-9 Chapters of rules
•  table based config (area-type, cf-standard-name, stand-region-name, ..)

•  Source code: https://github.com/h-dh/QA-DKRZ
•  Pre-packaged versions: conda based, docker based
•  Documentation: http://qa-dkrz.readthedocs.org/en/latest/qa-user-manual.html

Completely re-structured and modularized:
•  Flexible configuration
•  Used heavily for CORDEX – will support CMIP6
•  Separate cf-checker module
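
One of the checks named in the diagram, consistency between sub-temporal files, can be sketched as follows. This is not the QA-DKRZ implementation; it is a simplified illustration that assumes monthly data and filenames ending in "_YYYYMM-YYYYMM.nc", and it only looks at filenames rather than at the time axis inside the files. All function names are hypothetical.

```python
import re

# Assumed filename layout: <anything>_<YYYYMM>-<YYYYMM>.nc (monthly data).
_RANGE = re.compile(r"_(\d{4})(\d{2})-(\d{4})(\d{2})\.nc$")


def month_index(year, month):
    """Map a (year, month) pair to a running month count."""
    return year * 12 + (month - 1)


def parse_range(filename):
    """Return (first_month, last_month) encoded in the filename."""
    m = _RANGE.search(filename)
    if m is None:
        raise ValueError(f"no time range in filename: {filename}")
    y0, m0, y1, m1 = (int(g) for g in m.groups())
    return month_index(y0, m0), month_index(y1, m1)


def check_sub_temporal_files(filenames):
    """Return a list of annotation strings; an empty list means consistent."""
    annotations = []
    ranges = sorted((parse_range(f), f) for f in filenames)
    for (start, end), name in ranges:
        if end < start:
            annotations.append(f"{name}: end before start")
    # Compare each file with its successor in time order.
    for ((_, prev_end), prev), ((start, _), cur) in zip(ranges, ranges[1:]):
        if start <= prev_end:
            annotations.append(f"overlap between {prev} and {cur}")
        elif start > prev_end + 1:
            annotations.append(f"gap between {prev} and {cur}")
    return annotations
```

Returning annotations instead of raising on the first problem mirrors the "Annotations" box in the diagram: a QA run can then report every finding for a dataset in one pass.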

Page 7: Data center requirements and services


National MIP data analysis cache/node

„Ad hoc“ approach → transparent solution:
§  Data needed → helpdesk → data manager
§  RO mounted on HPC data analysis nodes
§  Support for data analysis VM deployment
§  Support for tool dependency management (install recipes, conda, docker)
§  WPS framework to support web service deployments
   §  Birdhouse (https://github.com/bird-house)
   §  conda/docker support
   §  Support for home institution (test-)deployments

[Diagram labels: ESGF; replication, versioning; National climate data node (MIP cache); Data near processing; WPS; ingest.]
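
To make the WPS item more concrete, the sketch below builds key-value-pair request URLs in the style of OGC WPS 1.0.0 (GetCapabilities, DescribeProcess, Execute). The endpoint and the process identifier used in the usage note are hypothetical placeholders, not actual DKRZ or Birdhouse services, and the helper name `wps_url` is an assumption.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def wps_url(endpoint, request, identifier=None, inputs=None):
    """Build a WPS 1.0.0 KVP request URL (illustrative helper)."""
    params = {"service": "WPS", "request": request}
    if request != "GetCapabilities":
        # DescribeProcess and Execute carry an explicit version parameter.
        params["version"] = "1.0.0"
    if identifier is not None:
        params["identifier"] = identifier
    if inputs:
        # KVP encoding of inputs: name=value pairs joined by semicolons.
        params["DataInputs"] = ";".join(f"{k}={v}" for k, v in inputs.items())
    return f"{endpoint}?{urlencode(params)}"
```

For example, `wps_url("https://wps.example.org/wps", "Execute", "subset", {"variable": "tas"})` yields a URL that a client could send to a data-near processing service, so that only the result, not the input data, leaves the data center.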

Page 8: Data center requirements and services


Stable file/collection management!?

[Diagram labels: container cache; container server; CERA (Oracle); LTA (HPSS); CERA Portal / ..; ESGF QA DOI; Process; replication, versioning; National climate data node (MIP cache); Data near processing; ESGF data node, COG portal, WPS; ingest.]

Page 9: Data center requirements and services


Towards PID based services

Motivation: Stable ESGF data space based on PID infrastructure

Collaborations:
•  ePIC: DKRZ partner → prefix registration
•  EUDAT: DKRZ leads PID task → API
•  RDA: DKRZ co-chairs PIT and collections WGs
•  Envri+: PIDs in environmental sciences

Next ESGF steps:
•  Test-Environment (PID system + publisher)
•  Scalable, stable PID assignment:
   •  CMOR integration, CDNOT involvement
   •  PID API / ESGF publisher integration
   •  Highly available message queuing system integration
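
The idea of decoupling PID assignment from registration through a message queue can be sketched as below. This is an illustration under stated assumptions, not the planned ESGF design: the prefix is a made-up placeholder, an in-memory `queue.Queue` stands in for a real message broker, a plain dictionary stands in for the PID (Handle) service, and all function names are hypothetical.

```python
import queue
import uuid

PREFIX = "99.EXAMPLE"  # hypothetical prefix, for illustration only


def new_pid(prefix=PREFIX):
    """Create a handle-style identifier of the form <prefix>/<suffix>."""
    return f"{prefix}/{uuid.uuid4()}"


def submit(q, filename, checksum):
    """Publisher side: assign a PID immediately and queue the registration."""
    pid = new_pid()
    q.put({"pid": pid, "filename": filename, "checksum": checksum})
    return pid


def drain(q, registry):
    """Consumer side: register all queued PIDs; returns the number processed."""
    processed = 0
    while True:
        try:
            msg = q.get_nowait()
        except queue.Empty:
            return processed
        registry[msg["pid"]] = {
            "filename": msg["filename"],
            "checksum": msg["checksum"],
        }
        processed += 1
```

Because the publisher only needs the queue to accept the message, publication does not block when the PID service is slow or briefly unavailable, which is one plausible reason for listing a highly available queuing system among the next steps.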

Page 10: Data center requirements and services


Summary

Long term archival use case
→ ESGF integration
→ Quality Assurance
→ PID assignment early in data life cycle
→ early citation and DOI assignment
→ future PID based data management services
→ future PID based end user services
→ future PID based provenance support

Page 11: Data center requirements and services

Thank You

Page 12: Data center requirements and services


DKRZ services

New developments
§  New integrated HPC/Data System installed in 2015, ~50 PByte Lustre
§  Storage cloud (OpenStack)
§  Community data analysis cache and platform

ESGF:
§  WDCC/HPSS/ESGF data node
§  WPS compute platform birdhouse
§  Towards PID / early citation services

[Diagram label: data ingest]

Page 13: Data center requirements and services


(Early) Data Citation (DM + ESGF)
§  Impact on CMIP6 data management (DM) and ESGF governance (ESGF)
§  Request from modelling groups for a data citation reference just after ESGF data publication
§  CMIP6 data publication workflow:

iCAS2015  13  12/8/15

CMIP6 citation granularities are collection levels:
§  Simulation
§  Model