1st BDE SC5 pilot: rationale, components and reusability

1ST BDE SC5 PILOT: RATIONALE, COMPONENTS AND REUSABILITY, NCSR–Demokritos, 11/10/2016


Page 1: 1st BDE SC5 pilot: rationale, components and reusability

1ST BDE SC5 PILOT: RATIONALE, COMPONENTS AND REUSABILITY
NCSR–Demokritos
11/10/2016

Page 2: Overview

• Downscaling
• NetCDF, big data and abstraction
• BDE SC5 pilot #1 functionality
• Next steps

18-oct.-16

Page 3: Downscaling

• Downscaling of climatic and/or meteorological data:
  o Essential first step for any further analysis, assessment or processing in climate and related domains

Page 4: NetCDF Format, Model, Abstraction

• Numerical array data format
• Embedded metadata / variables, attributes
• Dimensions
• De facto standard for climate, weather and other Earth observation data
  o ESGF
  o Australia’s National Environmental Research Data Interoperability Platform (NERDIP)
• Transparent big data connectors to move from and to NetCDF format and file abstractions
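The NetCDF model listed above (dimensions, variables, attributes) can be illustrated with a small in-memory stand-in. This is only a sketch: the class names `Dataset` and `Variable`, the variable name `t2m` and the sample values are hypothetical, and no real NetCDF library or BDE connector is used.

```python
from dataclasses import dataclass, field


@dataclass
class Variable:
    """A named array laid out over an ordered tuple of dimensions."""
    dims: tuple                                  # dimension names, e.g. ("time", "lat")
    data: list                                   # nested lists, one level per dimension
    attrs: dict = field(default_factory=dict)    # embedded metadata, e.g. units


@dataclass
class Dataset:
    """Hypothetical stand-in for the contents of one NetCDF file."""
    dims: dict                                   # dimension name -> length
    variables: dict = field(default_factory=dict)
    attrs: dict = field(default_factory=dict)    # global attributes

    def add_variable(self, name, var):
        # Check the array against the declared dimension lengths,
        # inspecting only the first element at each nesting level.
        level = var.data
        for dim in var.dims:
            if len(level) != self.dims[dim]:
                raise ValueError(
                    f"{name}: dimension {dim!r} expects length {self.dims[dim]}"
                )
            if not level:
                break
            level = level[0]
        self.variables[name] = var


# Toy dataset: two time steps over three latitude points.
ds = Dataset(dims={"time": 2, "lat": 3}, attrs={"title": "toy example"})
ds.add_variable(
    "t2m",
    Variable(
        dims=("time", "lat"),
        data=[[280.1, 281.4, 279.9], [282.0, 283.2, 281.5]],
        attrs={"units": "K", "long_name": "2 m temperature"},
    ),
)
```

Keeping metadata next to the array is what lets a connector move the same structure into or out of another store without losing units or dimension names.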

Page 5: WRF: Weather Research and Forecasting Model

• Widely used and available
• Operational forecasting and atmospheric (weather and climate) research
• Open source / public domain

Page 6: Specification

• Supplement climate research community with big data technology
  o Discrepancy between big-data and data-intensive advances and research practice
  o Rigid policies at research sites: need a more flexible approach to technology
• To be used in conjunction with institutional infrastructure already in use

Page 7: BDE SC5 Pilot I Components

[Diagram: SC5 1st Pilot, surrounded by its components]

• Cassandra: metadata & data lineage
• Hive/Hadoop: raw data & analytics
• WRF Model: institutional resource connectors
• NetCDF: interfacing and transformation, Semagrow tools

Page 8: Operations Implemented

• Operations
  o Data ingestion (NetCDF files)
    - Both manually, for bootstrapping, as well as after downscaling
  o Data export (NetCDF files)
    - Selection of variables / time slices
  o Start and monitor WRF-based downscaling on institutional resources
    - If requested results already exist, they are retrieved
    - If not, WRF is started
  o Maintain data lineage records on BDE platform
    - Monitoring and further analysis
    - Subset of W3C PROV, http://www.w3.org/TR/prov-overview
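The "retrieve if it exists, otherwise start WRF" behaviour and the lineage record it leaves behind can be sketched as follows. Everything here is an assumption made for illustration: the function names, the request parameters, the hashing scheme and the record layout are hypothetical, a plain dict and list stand in for the platform's stores, and `run_model` is a placeholder callable rather than a real WRF launcher. The relation names `wasGeneratedBy`, `used`, `startedAtTime` and `endedAtTime` are borrowed from W3C PROV.

```python
import hashlib
import json
from datetime import datetime, timezone


def request_key(params):
    """Derive a stable key from the request parameters (hypothetical scheme)."""
    canonical = json.dumps(params, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def downscale(params, store, lineage, run_model):
    """Return (result, status); run the model only if no stored result exists."""
    key = request_key(params)
    if key in store:
        # Requested results already exist: retrieve them.
        return store[key], "retrieved"

    started = datetime.now(timezone.utc).isoformat()
    result = run_model(params)  # placeholder for starting WRF
    ended = datetime.now(timezone.utc).isoformat()

    store[key] = result
    # Minimal lineage record using a few PROV relation names.
    lineage.append({
        "entity": f"result:{key}",
        "wasGeneratedBy": f"run:{key}",
        "used": params.get("input"),
        "startedAtTime": started,
        "endedAtTime": ended,
    })
    return result, "started"


# Usage with a fake model that just records that it was called.
calls = []


def fake_model(params):
    calls.append(params)
    return {"variables": ["t2m"], "region": params["region"]}


store, lineage = {}, []
params = {"region": "demo", "start": "2016-07-01", "end": "2016-07-02",
          "input": "boundary-conditions"}
first = downscale(params, store, lineage, fake_model)
second = downscale(params, store, lineage, fake_model)
```

The second call finds the stored result, so the model runs once and exactly one lineage record is written.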

Page 9: Sample Analytics

• Climate-change indices / analytics (indicative)
  o Number of summer days, frost days
  o Tropical nights
  o Monthly minimum value of daily maximum temperature
  o Precipitation-based statistics
• Analytics for other applications
  o Comfort indices (temperature, humidity)
  o Risk for forest fires (wind speed, temperature, humidity)
  o Atmospheric pollution (wind speed, vertical gradient of temperature, heat fluxes)
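The temperature-based indices above can be sketched on plain lists of daily values in °C. The slides do not state thresholds, so this sketch assumes the commonly used ETCCDI definitions (summer day: daily maximum above 25 °C; frost day: daily minimum below 0 °C; tropical night: daily minimum above 20 °C). The function names and sample values are hypothetical.

```python
def summer_days(tmax, threshold=25.0):
    """Count days whose daily maximum exceeds the threshold (assumed 25 °C)."""
    return sum(1 for t in tmax if t > threshold)


def frost_days(tmin, threshold=0.0):
    """Count days whose daily minimum is below the threshold (assumed 0 °C)."""
    return sum(1 for t in tmin if t < threshold)


def tropical_nights(tmin, threshold=20.0):
    """Count days whose daily minimum exceeds the threshold (assumed 20 °C)."""
    return sum(1 for t in tmin if t > threshold)


def monthly_min_of_daily_max(tmax_by_day):
    """Map 'YYYY-MM' to the lowest daily maximum seen in that month.

    tmax_by_day maps ISO dates ('YYYY-MM-DD') to daily maximum temperature.
    """
    result = {}
    for day, t in tmax_by_day.items():
        month = day[:7]
        result[month] = min(t, result.get(month, t))
    return result


# Made-up sample values, for illustration only.
tmax = [24.0, 26.5, 30.1, 25.0]
tmin = [-1.2, 0.0, 21.3, 19.8]
txn = monthly_min_of_daily_max({
    "2016-07-01": 30.0,
    "2016-07-02": 27.5,
    "2016-08-01": 31.0,
})
```

The comparisons are strict, so a day with a maximum of exactly 25.0 °C is not counted as a summer day under this assumption.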

Page 10: Hangout and Evaluation

• Carried out an online hands-on and evaluation session (12 July 2016)
• Python UI for components
• Most promising components, warranting further development:
  o Tools to enable analytics
  o Data lineage

Page 11: Conclusions and Next Steps

• Big data technologies can aid climate and weather research
  o Advances in climate research feed into a number of societal challenges and areas of interest
• Data abstraction and data lineage are generic components which will enable further progress
  o We may contextualise and investigate further during the 2nd pilot