ICHEC - Observation systems, technologies and big data

Observation Systems, Technologies and Big Data Alastair McKinstry EPA Climate Workshop, 19 September 2013


Presentation by Alastair McKinstry


Page 1: ICHEC - Observation systems, technologies and big data

Observation Systems, Technologies and Big Data

Alastair McKinstry EPA Climate Workshop, 19 September 2013

Page 2: ICHEC - Observation systems, technologies and big data


ICHEC overview

•  National Technology Centre
•  Established in 2005
•  Hosted by NUI Galway
•  Mandate:
    –  HPC & Big Data/Data Analytics
    –  Industry engagement
    –  Platform Science & Technology
•  25 staff in Dublin & Galway
    –  Mix of software developers, domain specialists
    –  4 in Climate/Environmental area

Page 3: ICHEC - Observation systems, technologies and big data

Old vs New: a x1000 step change

Old:
•  Everybody downloads the data
•  e.g. data on 50km grid. Few MB/day.
•  1-3 bands.

New:
•  Move the work to the data
    –  100+ GB/day
    –  20-60m resolution, 12-15 bands
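The slide does not say which quantity the "x1000" refers to. One plausible reading, offered here only as an assumption, is the change in linear grid resolution from 50 km to 20-60 m. A minimal sketch of that arithmetic (the function name is illustrative):

```python
# Figures are taken from the slide; treating "x1000" as a linear-resolution
# ratio is an assumption, not something the slide states.
OLD_GRID_M = 50_000          # "50km grid"
NEW_GRID_RANGE_M = (20, 60)  # "20-60m resolution"


def linear_ratio(old_m: float, new_m: float) -> float:
    """How many new-grid cells fit along one edge of an old-grid cell."""
    return old_m / new_m


for new_m in NEW_GRID_RANGE_M:
    ratio = linear_ratio(OLD_GRID_M, new_m)
    print(f"{OLD_GRID_M} m / {new_m} m = x{ratio:.0f}")
```

Under this reading the ratio spans roughly x830 to x2500, so "x1000" is an order-of-magnitude figure rather than an exact one.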

Page 4: ICHEC - Observation systems, technologies and big data

Big Data: Networking

•  ICHEC and HEAnet have 10 Gb/s links
    –  Not affordable on commercial rates
    –  Used in CMIP5 data project with eINIS
    –  Point-to-Point with European partners
•  Move one copy to Ireland, process it at an “Exploitation portal”
    –  Share workflows.
    –  Processing triggered on data arrival
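"Processing triggered on data arrival" can be illustrated with a small polling loop. This is a sketch only: the directory layout, the function names and the `run_workflow` placeholder are all hypothetical, and the slides do not describe how the portal would actually detect new data.

```python
from pathlib import Path


def find_new_files(incoming: Path, seen: set) -> list:
    """Return files in `incoming` that have not been processed yet, sorted by name."""
    return sorted(
        p for p in incoming.iterdir() if p.is_file() and p.name not in seen
    )


def run_workflow(path: Path) -> str:
    """Placeholder for a shared workflow step (hypothetical)."""
    return f"processed {path.name}"


def poll_once(incoming: Path, seen: set) -> list:
    """Run the workflow once for every newly arrived file and remember it."""
    results = []
    for path in find_new_files(incoming, seen):
        results.append(run_workflow(path))
        seen.add(path.name)  # mark as done so the next poll skips it
    return results
```

Calling `poll_once` on a schedule processes each arriving file exactly once. A production system would more likely react to transfer-completion events than poll a directory, and would need to guard against files that are still being written.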


Page 5: ICHEC - Observation systems, technologies and big data

Big Data: Compute

•  Workflows are no longer a “hobby” task
    –  Not on a simple PC at 20-50m, but …
•  GPGPUs/Intel MIC Accelerators:
    –  80 Tflop/s of capability on upcoming ICHEC system
    –  cf. 40 Tflop/s needed to process EUMETSAT data
•  Shared workflows: atmospheric correction, QA
•  ICHEC has portal experience: BDI, Bioportal,
•  Automated: repeatability.


Page 6: ICHEC - Observation systems, technologies and big data

Curation: an unsolved problem

•  What to keep?
•  Useful to Ireland:
    –  Products, raw data not archived at primary sites
    –  Archiving “just Ireland” gives valuable time series

ICHEC could provide a platform for this:
    –  Funding needed from Beneficiaries or agencies.
    –  Lack of sustainability a problem (C4I, CMIP5)
    –  Curation needs human work: data scientists.


Page 7: ICHEC - Observation systems, technologies and big data

Processing in Ireland?

•  Some products may not be produced upstream
    –  E.g. Algal blooms for North Atlantic
        •  Need rapid processing of raw data
        •  Critical for aquaculture
        •  Time critical.
    –  May pave way for ground station for later satellites


Page 8: ICHEC - Observation systems, technologies and big data

Data Fusion

•  Combining Remote Sensing data with other datasets:
    –  Ground truthing
    –  Precipitation, soil moisture (SMOS), runoff, river gauges, …
•  Needs consistent data, interoperability:
    –  Technical limitations
    –  Organisations to make data available to each other: collaborations


Page 9: ICHEC - Observation systems, technologies and big data

Combining with models

•  Experience with weather and climate
•  Coupling models and data assimilation: key science skills at ICHEC
•  “Virtual Ireland”: assimilating observations and model data for:
    •  Pollution control: e.g. ICOS
    •  Flooding, hydrology
    •  Policy analysis


Page 10: ICHEC - Observation systems, technologies and big data

Other datasets

•  Not just Remote Sensing:
    –  Make other datasets available: same grids, etc.
        •  Model data, observations,
    –  Somewhere for users to upload data:
        •  Indexed, Archived, remapped to new formats
        •  Data scientists who understand metadata and the science behind the data


Page 11: ICHEC - Observation systems, technologies and big data

Citizen Science

•  Data to the citizen:
    –  A portal for making datasets available:
    –  Making WxS layers available for GIS, Google Earth, …
    –  Enable “mashups”, analysis apps.
•  From the citizen:
    –  Smart apps for uploading observations, measurements
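As an illustration of what "making WxS layers available" means for a client, the sketch below builds an OGC WMS 1.3.0 `GetMap` request URL, taking WMS as one member of the WxS family. The endpoint `https://example.org/wms`, the layer name and the bounding box (a rough box around Ireland) are all hypothetical; the slides do not name a server or any layers.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layer, bbox, width=512, height=512):
    """Build a WMS 1.3.0 GetMap URL.

    `bbox` is (min_lat, min_lon, max_lat, max_lon): WMS 1.3.0 uses
    latitude-first axis order for EPSG:4326.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty string requests the server's default style
        "CRS": "EPSG:4326",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and layer name; the bounding box is approximate.
print(wms_getmap_url("https://example.org/wms", "sea_surface_temperature",
                     (51.3, -10.7, 55.5, -5.3)))
```

A GIS client or a "mashup" web page would issue requests of this shape and overlay the returned images on a base map.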


Page 12: ICHEC - Observation systems, technologies and big data

Global opportunities

•  Commercial spinoffs: tech. startups looking for testbeds of global opportunities
    –  Promote tech. sector in Ireland, not just exploitation of data in Ireland, e.g. showcase big databases, fast networks


Page 13: ICHEC - Observation systems, technologies and big data

New value in old data

•  The big investment has been made
    –  Ireland’s contribution to ESA,
    –  “Random” Datasets in public sector, academia
•  Applications in:
    –  Agriculture
    –  Policy and planning
    –  Tourism
