Australia’s Integrated Marine Observing System (IMOS)

16
Australian Ocean Data Network utilises Amazon Web Service Batch processing for gridded data Sebastien Mancini, Roger Proctor, Peter Blain, Kate Reid University of Tasmania 5 November 2018 Australia’s Integrated Marine Observing System (IMOS)

Transcript of Australia’s Integrated Marine Observing System (IMOS)

Australian Ocean Data Network utilises Amazon Web Service Batch processing for gridded dataSebastien Mancini, Roger Proctor, Peter Blain, Kate ReidUniversity of Tasmania

5 November 2018

Australia’s Integrated Marine Observing System (IMOS)

12 minutes, plus 3 minutes for questions

1. About IMOS

2. Gridded datasets

3. Cloud computing and AWS Batch

4. Helpdesk and user statistics

5. Future improvements

Outline of the talk

IMOS Facilities

https://portal.aodn.org.au

All data discoverable, accessible, usable and reusable

IMOSgriddeddatasets

• SeaSurfaceTemperature(SST)products• OceanColourproducts• Satellitealtimetryproducts• Coastalradarproducts• Climatology• Bathymetry

Whatdoestheuserwant?

• Visualisedatabeforedownloading• RetrieveatimeseriesataparticularlocationanddownloaddatainCSVformat

• SubsetandaggregatedataandgenerateoutputfileinnetCDFformat

• MostofthemdonotwanttoaccessdataonTHREDDS

Whatdoestheuserwant?MostdownloadeddatasetsontheAODNPortalfortheperiod01/2016to04/2018usingGoogleAnalyticsRank EventLabel TotalEvents1 IMOS- AustralianNationalMooringNetwork(ANMN)Facility- Currentvelocitytime-series 9442 IMOS- ArgoProfiles 8533 IMOS- AustralianNationalMooringNetwork(ANMN)Facility- Temperatureandsalinitytime-series 7764 IMOS- SRSSatellite- SSTL3S- 01daycomposite- nighttime 7655 IMOS- SRS- MODIS- 01day- Chlorophyll-aconcentration(OC3model) 6796 IMOS- AustralianNationalFacilityforOceanGliders(ANFOG)- delayedmodegliderdeployments 5017 IMOS- SRSSatellite- SSTL3S- 06daycomposite- daytime 4588 IMOS- OceanCurrent- Griddedsealevelanomaly- Delayedmode 439

9IMOSNationalReferenceStation(NRS)- Salinity,Carbon,Alkalinity,OxygenandNutrients(Silicate,Ammonium,Nitrite/Nitrate,Phosphate) 420

10 IMOS- SRSSatellite- SSTL3S- 1monthcomposite- dayandnighttimecomposite 40411 IMOS- SRSSATELLITE- SSTL3S- 01daycomposite- dayandnighttimecomposite 37612 IMOS- OceanCurrent- Griddedsealevelanomaly- Nearrealtime 36815 IMOS- SRS- MODIS- 01day- OceanColour- SST 32417 IMOS- SRSSATELLITE- SSTL3S- 03daycomposite- nighttime 28019 IMOS- SRSSatellite- SSTL3S- 03daycomposite- dayandnighttimecomposite 22720 IMOS- SRSSatellite- SSTL3S- 01daycomposite- daytime 20922 IMOS- SRSSatellite- SSTL3S- 06daycomposite- dayandnighttimecomposite 16923 IMOS- SRS- MODIS- 01day- Chlorophyll-aconcentration(GSMmodel) 15726 MARVL3- Australianshelftemperaturedataatlas 14427 IMOS- SRSSatellite- SSTL3S- 1monthcomposite- daytime 141

Serverlessarchitecture

• Serverdetailsgetabstractedaway• Serversonlyrunwhenneeded• Leaveservermanagementtoacompanythatdoesthatastheirbreadandbutter

• Focusonwhatmattersinstead- thecodeanddata

Image:Creator:JohnVoo,Url:https://www.flickr.com/photos/138248475@N03/Licence:CCBY2.0

ArchitectureoverviewofAWSbatch

AODNPortal

UserorAdmin

Helpdesk:QueueMonitoring

Helpdesk:QueueMonitoring

Jobdetails:• E-mailaddress• Datasetcollection• Filtersapplied

Userstatistics:Sumologic dashboard

Userstatistics:FebruarytoSeptember2018

240uniqueusers

42userspermonth

1600downloadsovertheentireperiod

350downloadsinJune

32faileddownloadsovertheentireperiod

Processtimeof<8minfora1000jobs10%jobshaveanaveragedprocessedtimeof900min

Queuetimeof<7minfora1000jobs10%jobshaveanaveragedqueuedtimeof550min

Costcomparison:

AWSBatch Normalapproach

EC2SPOTinstance+localstorage+Lambda

EC2instance(ifreserved)+Localstorage

80$permonth 350$permonth

Additionalcostforbothapproaches:• Storageforoutputfile• S3operations(dataaccess)

vs

VendorLock-in

+majorityof aggregationcodewrittenasagenericlibrary/utility

- requesthandlerLambdahassomeAWSspecificcode,butcouldberefactoredtoamoregenericAPIhandler

- statusserviceLambdaisquitespecifictoAWSBatch/S3currently,butcouldbemademoreabstract

- thecomponentsaregluedtogetherbyCloudFormationanddomakeassumptionsaboutrunningonLambda,howeverthemajorityoftheclassesaregeneric,soifrunningoutsideofAWSwasarequirement,theprojectcouldberefactoredtomakeconceptslikestorageandAPIsmoregeneric

Futureimprovements

• Implementcloudcomputingtootherelementsofourinfrastructure:• Developmentofourapplicationstack(AODNPortal,Geonetwork,Geoserver,Geowebcache,ncWMS …)

• Improvesubsetting/aggregatingcode:• Downloadmultiplepointtimeseriesatthesametime• Improveefficiencytoproduceresultquicker

• Useofsystemanalyticstoimprovequeuedesign• Multiplequeuesdependingonsizeofjobs,typeofusers…

• ApplyAWSBatchforothertypeofdatadownloads:• LargeCSVdownloadsusingWFSrequestsfromGeoserver• SubsetGeotiff (e.g.Bathymetrydata)