Responsive Storage: Home Automation for Research Data … · 2018-09-02 · Home Automation for...

Responsive Storage: Home Automation for Research Data Management Ryan Chard Postdoc Fellow, Argonne National Laboratory


Page 1

Responsive Storage: Home Automation for Research Data Management

Ryan Chard
Postdoc Fellow, Argonne National Laboratory

Page 2

The Problem

- Data generation rates are exploding
- Complex analytics processes
- The data lifecycle often involves multiple organisations, machines, and people

This creates a significant strain on researchers:

- Best management practices (cataloguing, sharing, purging, etc.) can be overlooked
- Useful data may be lost, siloed, and forgotten

Page 3

RIPPLE: A prototype responsive storage solution

Transform static data graveyards into active, responsive storage devices

• Automate data management processes and enforce best practices
• Event-driven: actions are performed in response to data events
• Users define simple if-trigger-then-action recipes
• Combine recipes into flows that control end-to-end data transformations
• Passively waits for filesystem events (very little overhead)
• Filesystem agnostic – works on both edge and leadership platforms
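The way recipes combine into flows can be illustrated with a small in-memory sketch. This is not RIPPLE's implementation: the recipe dictionary layout, the `make_recipe` and `run_flow` helpers, and the example paths are all hypothetical. It assumes that an action may emit follow-on events, which in turn trigger other recipes.

```python
from collections import deque

# Hypothetical recipe format: a trigger predicate paired with an action.
# An action returns a list of follow-on events (possibly empty).
def make_recipe(name, trigger, action):
    return {"name": name, "trigger": trigger, "action": action}

def run_flow(recipes, first_event, max_events=100):
    """Dispatch events to matching recipes until no new events are produced."""
    queue = deque([first_event])
    log = []
    handled = 0
    while queue and handled < max_events:  # cap guards against endless loops
        event = queue.popleft()
        handled += 1
        for recipe in recipes:
            if recipe["trigger"](event):
                log.append((recipe["name"], event["path"]))
                queue.extend(recipe["action"](event))
    return log

recipes = [
    make_recipe(
        "compress",
        lambda e: e["type"] == "created" and e["path"].endswith(".h5"),
        # Stub action: pretend a compressed copy was created.
        lambda e: [{"type": "created", "path": e["path"] + ".gz"}],
    ),
    make_recipe(
        "archive",
        lambda e: e["type"] == "created" and e["path"].endswith(".gz"),
        lambda e: [],  # terminal step: emits no further events
    ),
]

flow_log = run_flow(recipes, {"type": "created", "path": "/data/scan.h5"})
print(flow_log)
```

Each recipe stays a simple if-trigger-then-action rule; the flow emerges because the output of one action is the trigger of the next.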

Page 4

RIPPLE Architecture

Agent:

- Sits locally on the machine

- Detects & filters filesystem events

- Facilitates execution of actions

- Can receive new recipes

Service:

- Serverless architecture

- Lambda functions process events

- Orchestrates execution of actions

[Figure: RIPPLE architecture diagram. Labels: Ripple Agent, SQLite, Filesystem, Docker, PBS, SLURM, …, Lambda Functions, Process Monitor, Observers, SNS Topics, External Services]
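A minimal sketch of how a Lambda function on the service side might process an event delivered through an SNS topic. The outer `Records[...]["Sns"]["Message"]` shape is the standard SNS-to-Lambda event format; the message payload (`action`, `args`), the `ACTIONS` registry, and the endpoint names are assumptions made for illustration, since the slides do not describe the payload. Real actions are replaced by stubs.

```python
import json

# Hypothetical action registry: maps an action name to a stub function.
ACTIONS = {
    "transfer": lambda args: f"transfer {args['source']} -> {args['dest']}",
    "email": lambda args: f"email {args['to']}",
}

def handler(event, context):
    """Lambda-style entry point: unpack SNS records and run the named action."""
    results = []
    for record in event.get("Records", []):
        # SNS delivers the published payload as a string in Sns.Message.
        message = json.loads(record["Sns"]["Message"])
        action = ACTIONS.get(message["action"])
        if action is None:
            results.append(f"unknown action: {message['action']}")
        else:
            results.append(action(message["args"]))
    return results

# Local invocation with a hand-built event; no AWS account is needed.
sample_event = {
    "Records": [
        {"Sns": {"Message": json.dumps({
            "action": "transfer",
            "args": {"source": "endpoint-a:/data/scan.h5",
                     "dest": "endpoint-b:/archive/"},
        })}}
    ]
}
handler_result = handler(sample_event, None)
print(handler_result)
```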

Page 5

RIPPLE Agent

Python Watchdog observers listen for events
- inotify, polling, for filesystem events (create, delete, etc.)
- Globus Transfer API for events (transfer, create, delete)

Recipes are stored locally in a SQLite database

Local and cloud-based actions
- Docker containers and subprocesses act on local files (metadata extraction, dispatch jobs, etc.)
- AWS Lambda performs other tasks (Globus transfers, create shared endpoints, send emails, invoke other Lambda functions, etc.)

[Figure: Ripple Agent diagram. Labels: Ripple Agent, SQLite, Filesystem, Docker, PBS, SLURM, …, Process Monitor, Observers]
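A sketch of what a local SQLite recipe store could look like, using Python's standard `sqlite3` module. The table schema, the JSON encoding of recipes, and the helper names (`open_store`, `add_recipe`, `recipes_for`) are assumptions; the slides state only that recipes are stored locally in a SQLite database.

```python
import json
import sqlite3

def open_store(path=":memory:"):
    """Open (or create) the recipe database. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS recipes ("
        "id INTEGER PRIMARY KEY, event_type TEXT NOT NULL, recipe TEXT NOT NULL)"
    )
    return conn

def add_recipe(conn, recipe):
    """Store a recipe as JSON, indexed by the event type of its trigger."""
    conn.execute(
        "INSERT INTO recipes (event_type, recipe) VALUES (?, ?)",
        (recipe["trigger"]["event"], json.dumps(recipe)),
    )
    conn.commit()

def recipes_for(conn, event_type):
    """Return all stored recipes whose trigger listens for `event_type`."""
    rows = conn.execute(
        "SELECT recipe FROM recipes WHERE event_type = ? ORDER BY id",
        (event_type,),
    )
    return [json.loads(row[0]) for row in rows]

conn = open_store()
add_recipe(conn, {"trigger": {"event": "created"}, "action": {"type": "transfer"}})
add_recipe(conn, {"trigger": {"event": "deleted"}, "action": {"type": "email"}})
print(recipes_for(conn, "created"))
```

Keeping recipes in a local database means the agent can filter events without contacting the cloud service, and newly received recipes only need an insert.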

Page 6

RIPPLE Recipes

IFTTT-inspired programming model:

Triggers describe where the event is coming from (filesystem create events) and the conditions to match (/path/to/monitor/.*.h5)

Actions describe what service to use (e.g., Globus transfer) and arguments for processing (source/dest endpoints).
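A recipe of this kind might be written down and matched as follows. The field names and the placeholder endpoint IDs are hypothetical; only the trigger pattern and the general trigger/action split come from the slide. The sketch assumes the condition is a regular expression applied to the full path.

```python
import re

# Hypothetical recipe layout; field names are illustrative only.
recipe = {
    "trigger": {
        "source": "filesystem",
        "event": "created",
        "pattern": r"/path/to/monitor/.*.h5",  # pattern as shown on the slide
    },
    "action": {
        "service": "globus",
        "type": "transfer",
        "source_endpoint": "SOURCE_ENDPOINT_ID",  # placeholder
        "dest_endpoint": "DEST_ENDPOINT_ID",      # placeholder
    },
}

def matches(recipe, event):
    """True if the event's source, type, and path satisfy the trigger."""
    trigger = recipe["trigger"]
    return (
        event["source"] == trigger["source"]
        and event["event"] == trigger["event"]
        and re.fullmatch(trigger["pattern"], event["path"]) is not None
    )

new_file = {"source": "filesystem", "event": "created",
            "path": "/path/to/monitor/run1.h5"}
other_file = {"source": "filesystem", "event": "created",
              "path": "/path/to/monitor/run1.txt"}
print(matches(recipe, new_file), matches(recipe, other_file))
```

Read as a regular expression, the unescaped dot before `h5` matches any character, so a stricter trigger would use `.*\.h5`.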

Page 7

Scenario: Large Synoptic Survey Telescope

Developed a representative testbed of the LSST storage requirements

• Automatically propagate data between storage tiers and facilities
• Invoke Docker containers to extract metadata and maintain a file catalog
• Compress and archive files
• Recover deleted/corrupted files when delete and modification events occur

[Figure: LSST testbed data flow. Labels: Custodial Store (Chile), Archive: ANL’s Sparrow, Archiver, Landing, Magnetic, Forwarder, File Catalog, File Catalog, Custodial Store (NCSA), Landing, Magnetic, Archive; processing steps: metadata, minid, gzip, catalog, ...; numbered step markers 1, 2, 3, 4, 6, 7]
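The compress-and-archive and recover-on-delete actions can be sketched with the standard library alone. This is an illustration under stated assumptions, not the testbed's code: the function names, the directory layout, and the file name `image.fits` are hypothetical, and only delete events are handled (the slide also mentions modification events).

```python
import gzip
import shutil
import tempfile
from pathlib import Path

def compress_to_archive(path, archive_dir):
    """Write a gzip copy of `path` into `archive_dir` and return its location."""
    target = Path(archive_dir) / (Path(path).name + ".gz")
    with open(path, "rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target

def recover_on_delete(event, archive_dir):
    """On a delete event, restore the file from its archived gzip copy."""
    if event["type"] != "deleted":
        return False
    original = Path(event["path"])
    archived = Path(archive_dir) / (original.name + ".gz")
    if not archived.exists():
        return False  # nothing archived, so nothing to restore
    with gzip.open(archived, "rb") as src, open(original, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return True

# Simulate: archive a file, delete it, then react to the delete event.
with tempfile.TemporaryDirectory() as tmp:
    landing = Path(tmp) / "landing"
    archive = Path(tmp) / "archive"
    landing.mkdir()
    archive.mkdir()
    image = landing / "image.fits"
    image.write_bytes(b"pixels")
    compress_to_archive(image, archive)
    image.unlink()
    recovered = recover_on_delete({"type": "deleted", "path": str(image)}, archive)
    restored_bytes = image.read_bytes()

print(recovered, restored_bytes)
```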

Page 8

Scenario: Advanced Light Source

Deployed Ripple on an ALS and NERSC machine to automate data analysis

• At ALS: Detect new heartbeat beamline data and initiate transfer to NERSC
• At NERSC: Extract metadata, create sbatch file, dispatch analysis job to Edison queue, detect result and transfer back to ALS
• At ALS: Create a shared endpoint, notify collaborators of result via email