Responsive Storage: Home Automation for Research Data … · 2018-09-02 · Home Automation for...
![Page 1: Responsive Storage: Home Automation for Research Data … · 2018-09-02 · Home Automation for Research Data Management Ryan Chard Postdoc Fellow, Argonne National Laboratory. ...](https://reader034.fdocuments.in/reader034/viewer/2022050423/5f9229eafe63a2799d0fa4b3/html5/thumbnails/1.jpg)
Responsive Storage: Home Automation for Research Data Management
Ryan Chard, Postdoc Fellow, Argonne National Laboratory
The Problem
- Data generation rates are exploding
- Complex analytics processes
- The data lifecycle often involves multiple organisations, machines, and people
This creates a significant strain on researchers
  - Best management practices (cataloguing, sharing, purging, etc.) can be overlooked
  - Useful data may be lost, siloed, and forgotten
RIPPLE: A prototype responsive storage solution
Transform static data graveyards into active, responsive storage devices
• Automate data management processes and enforce best practices
• Event-driven: actions are performed in response to data events
• Users define simple if-trigger-then-action recipes
• Combine recipes into flows that control end-to-end data transformations
• Passively waits for filesystem events (very little overhead)
• Filesystem agnostic – works on both edge and leadership platforms
RIPPLE Architecture
Agent:
- Sits locally on the machine
- Detects & filters filesystem events
- Facilitates execution of actions
- Can receive new recipes
Service:
- Serverless architecture
- Lambda functions process events
- Orchestrates execution of actions
[Figure: RIPPLE architecture diagram. Labels: Ripple Agent, SQLite, Filesystem, Docker, PBS, SLURM, …, Lambda Functions, Process Monitor, Observers, SNS Topics, External Services]
RIPPLE Agent
Python Watchdog observers listen for events
- inotify, polling, for filesystem events (create, delete, etc.)
- Globus Transfer API for events (transfer, create, delete)
Recipes are stored locally in a SQLite database
Local and cloud-based actions
- Docker containers and subprocesses act on local files (metadata extraction, dispatch jobs, etc.)
- AWS Lambda performs other tasks (Globus transfers, create shared endpoints, send emails, invoke other Lambda functions, etc.)
[Figure: Ripple Agent diagram. Labels: Ripple Agent, SQLite, Filesystem, Docker, PBS, SLURM, …, Process Monitor, Observers]
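The agent's local recipe store could be sketched with Python's built-in `sqlite3` module. The table name, the single JSON column, and the `add_recipe` / `load_recipes` helpers are assumptions made for illustration; the slides do not show RIPPLE's actual schema.

```python
import json
import sqlite3

# Hypothetical schema: one row per recipe, stored as JSON text.
conn = sqlite3.connect(":memory:")  # a real agent would use a file on disk
conn.execute("CREATE TABLE recipes (id INTEGER PRIMARY KEY, recipe TEXT NOT NULL)")


def add_recipe(conn, recipe):
    """Insert a recipe (a dict) and return its row id."""
    cur = conn.execute(
        "INSERT INTO recipes (recipe) VALUES (?)", (json.dumps(recipe),)
    )
    conn.commit()
    return cur.lastrowid


def load_recipes(conn):
    """Return all stored recipes as Python dicts, oldest first."""
    rows = conn.execute("SELECT recipe FROM recipes ORDER BY id").fetchall()
    return [json.loads(row[0]) for row in rows]


add_recipe(conn, {"trigger": {"event_type": "create"}, "action": {"service": "email"}})
print(load_recipes(conn))
```

Keeping recipes in a local database like this would let an agent accept new recipes at runtime and reload them after a restart.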
RIPPLE Recipes
IFTTT-inspired programming model:
Triggers describe where the event is coming from (filesystem create events) and the conditions to match (/path/to/monitor/.*.h5)
Actions describe what service to use (e.g., globus transfer) and arguments for processing (source/dest endpoints).
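A recipe of this shape can be sketched as a small Python structure. The field names (`trigger`, `action`, `source`, `event_type`, `pattern`, `service`, `args`), the endpoint names, and the `matches` helper are illustrative assumptions rather than RIPPLE's actual schema; the path pattern is the one from the slide, treated here as a regular expression.

```python
import re

# Hypothetical recipe layout; RIPPLE's real schema is not shown in the slides.
recipe = {
    "trigger": {
        "source": "filesystem",
        "event_type": "create",
        "pattern": r"/path/to/monitor/.*.h5",  # condition from the slide
    },
    "action": {
        "service": "globus_transfer",
        # Placeholder endpoint names, not real Globus endpoints.
        "args": {"source_endpoint": "example-source", "dest_endpoint": "example-dest"},
    },
}


def matches(recipe, event):
    """Return True when an event satisfies the recipe's trigger."""
    trigger = recipe["trigger"]
    return (
        event["source"] == trigger["source"]
        and event["event_type"] == trigger["event_type"]
        and re.fullmatch(trigger["pattern"], event["path"]) is not None
    )


event = {
    "source": "filesystem",
    "event_type": "create",
    "path": "/path/to/monitor/scan_001.h5",
}
if matches(recipe, event):
    print("would invoke", recipe["action"]["service"])
```

Separating the trigger from the action keeps matching cheap and local, while the action itself can be handed to whichever service the recipe names.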
Scenario: Large Synoptic Survey Telescope
Developed a representative testbed of the LSST storage requirements
• Automatically propagate data between storage tiers and facilities
• Invoke Docker containers to extract metadata and maintain a file catalog
• Compress and archive files
• Recover deleted/corrupted files when delete and modification events occur
[Figure: LSST testbed data flow with numbered steps. Labels: Custodial Store (Chile), Archive: ANL's Sparrow, Archiver, Landing, Magnetic, Forwarder, File Catalog, File Catalog, Custodial Store (NCSA), Landing, Magnetic, Archive, metadata, minid, gzip, catalog, …]
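The recovery step in this scenario (restoring files when delete or modification events occur) might look like the sketch below. It assumes an intact archived copy exists at a known path and that corruption is detected by comparing SHA-256 checksums; the `recover` helper, the file names, and the directory layout are hypothetical, not taken from the testbed.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def checksum(path):
    """SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def recover(path, archive_copy):
    """Restore `path` from `archive_copy` if it is missing or differs.

    Returns True when a restore happened, False when the file was intact.
    """
    path, archive_copy = Path(path), Path(archive_copy)
    if path.exists() and checksum(path) == checksum(archive_copy):
        return False  # file is intact, nothing to do
    path.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(archive_copy, path)
    return True


# Demo in a temporary directory: a delete event, then a modification event.
with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp, "archive", "image.fits")
    live = Path(tmp, "landing", "image.fits")
    archive.parent.mkdir()
    archive.write_bytes(b"raw pixels")

    print(recover(live, archive))  # file is missing, so it is restored
    live.write_bytes(b"corrupted")
    print(recover(live, archive))  # contents differ, so it is restored
    print(recover(live, archive))  # file now matches the archive
```

In an event-driven setup, a function like `recover` would be the action half of a recipe whose trigger is a delete or modify event on the monitored path.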
Scenario: Advanced Light Source
Deployed Ripple on an ALS and NERSC machine to automate data analysis
• At ALS: Detect new heartbeat beamline data and initiate transfer to NERSC
• At NERSC: Extract metadata, create batch file, dispatch analysis job to Edison queue, detect result and transfer back to ALS
• At ALS: Create a shared endpoint, notify collaborators of result via email
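The ALS scenario is an instance of combining recipes into a flow, where the result of one action becomes the event that triggers the next. The sketch below simulates that chaining in plain Python; the event names, action names, and the `emits` field are invented for illustration, and no real transfers or jobs are run.

```python
from collections import deque

# Hypothetical flow: each recipe maps a (site, event) trigger to an action name
# and to the follow-on event that the action is assumed to produce.
FLOW = [
    {"site": "ALS", "on": "new_beamline_data",
     "do": "transfer_to_nersc", "emits": ("NERSC", "data_arrived")},
    {"site": "NERSC", "on": "data_arrived",
     "do": "extract_metadata_and_submit_job", "emits": ("NERSC", "result_created")},
    {"site": "NERSC", "on": "result_created",
     "do": "transfer_to_als", "emits": ("ALS", "result_arrived")},
    {"site": "ALS", "on": "result_arrived",
     "do": "share_and_email", "emits": None},
]


def run_flow(flow, first_event):
    """Process events in order and return the (site, action) pairs invoked."""
    pending = deque([first_event])
    invoked = []
    while pending:
        site, name = pending.popleft()
        for recipe in flow:
            if recipe["site"] == site and recipe["on"] == name:
                # A real agent would execute the action here.
                invoked.append((site, recipe["do"]))
                if recipe["emits"]:
                    pending.append(recipe["emits"])
    return invoked


for site, action in run_flow(FLOW, ("ALS", "new_beamline_data")):
    print(site, action)
```

Because each recipe only knows its own trigger and action, the end-to-end pipeline emerges from the events themselves rather than from a central script.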