I’ve found the data; it’s free and open access. Now what? Gilberto Câmara National Institute...

Post on 26-Dec-2015

I’ve found the data; it’s free and open access. Now what?

Gilberto Câmara, National Institute for Space Research (INPE), Brazil

Geospatial data catalogue

Source: [Bai and Di, 2011]

The hard-wired map metaphor

Cantino planisphere (1502)

Map metaphors live in GIS

Geospatial Database

Desktop GIS Web service

Birds do it… bees do it… even educated fleas do it… Let’s do it…

Distribution Model Algorithm Distribution map

Temperature

Precipitation

Environmental data

Ecological niche modelling

Species info

Species info, Precipitation

Soil

Temperature

Environmental data

openModeller

Bioclim, Neural Networks

GARP

Specimens

Modelling algorithms: openModeller

Natural disasters

Risk Analyses

Analysis

On-line data feed

DCP: rain total; fixed time and irregular (alert); point data; one file per DCP

Satellite/Radar: 4 km grid; total rain 1 h; total rain 24 h; current (mm/h); binary file

Models: ETA 40, 20, 5 km; ensemble 40 km; total rain 72 h; 72 files; ASCII grid file

Natural Disasters Monitoring and Alert System

Legend: up to 10%, 10-20%, 20-30%, 30-40%, 40-50%, 50-60%, 60-70%, 70-80%, 80-90%, 90-100%

Amazonia (4,000,000 km² = size of Europe)

Deforestation in Amazonia

Daily warnings of newly deforested large areas

Real-time Deforestation Monitoring

166-112

116-113

116-112

30 TB of data; 500,000 lines of code

150 man-years of software development; 200 man-years of interpreters

How much does it take to survey Amazonia?

Data Access Hitting a Wall: current science practice is based on data download

How do you download a petabyte? You don’t! Move the software to the archive.
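The arithmetic behind this slide is easy to sketch. The snippet below is a back-of-the-envelope estimate, not a figure from the talk. It assumes a decimal petabyte (10^15 bytes) and a link that sustains its nominal rate with no protocol overhead; the two link speeds are illustrative choices.

```python
SECONDS_PER_DAY = 86_400
PETABYTE = 1e15  # assumption: decimal petabyte, in bytes


def download_days(size_bytes: float, link_bits_per_second: float) -> float:
    """Days needed to move size_bytes over a link at a sustained rate."""
    return size_bytes * 8 / link_bits_per_second / SECONDS_PER_DAY


# Illustrative link speeds (assumptions, not taken from the slides).
for label, rate in [("100 Mbit/s", 100e6), ("1 Gbit/s", 1e9)]:
    print(f"{label}: {download_days(PETABYTE, rate):.0f} days")
```

Under these assumptions a single petabyte takes about three months at 1 Gbit/s and about two and a half years at 100 Mbit/s, which is the case for moving the software to the archive.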

Virtual Observatory

“If data is online, the internet is the world’s best telescope” (Jim Gray)

How many clouds do we need?

What happened here in the last 10 years?

source: INPE

< Corn > sugarcane ->

Are biofuels replacing food production in Brazil?

Chart: share of each land-use class (0% to 100%) per year, 2000 to 2009. Classes: agricultural area, sugarcane, citrus, pasture, tree vegetation.

3 TB of data behind this!

How much processing should be in the cloud?

Standard API? WPS?

Could this analysis be done in the cloud?

source: INPE

< Corn > sugarcane ->

Data chain in Earth System Science (source: NASA)

source: USGS

Getting to the Data

Requires solving the spatial semantics problem

Tentative solutions: catalogues, metadata, SDIs, ontologies, web services, semantic reference systems, linked open data, ...

Communicating location is easy

Deforestation hotspots in Amazonia

Weather

source: WMO

11,000 land stations (3,000 automated); 900 radiosondes; 3,000 aircraft; 6,000 ships; 1,300 buoys; 5 polar and 6 geostationary satellites

Communicating about data is feasible

Communicating concepts is hard

Image source: WMO

vulnerability? climate change? poverty?

degradation

We’re bad at representing meaning

deforestation? degradation? disturbance?

Communicating concepts is hard

When did the Aral Sea reach the tipping point?

Communicating change is very hard

Objects exist, events occur (Mount Etna 2002 eruption)

Observations allow us to get the measure of external reality

WMO’s global observing system

WMO GRIB: simple and clean

Code  Parameter                 Units
...
052   Relative humidity         %
053   Humidity mixing ratio     kg/kg
054   Precipitable water        kg/m2
055   Vapour pressure           Pa
056   Saturation deficit        Pa
057   Evaporation               kg/m2
058   Cloud ice                 kg/m2
059   Precipitation rate        kg/m2/s
060   Thunderstorm probability  %
061   Total precipitation       kg/m2
076   Cloud water               kg/m2
...
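A code table like this one is what makes an observation self-describing. As a minimal sketch, the lookup below turns a (code, value) pair into a labelled quantity. Only three of the codes listed on the slide are included; the `Parameter` type and the `describe` helper are hypothetical names used for illustration and are not part of any GRIB library.

```python
from typing import Dict, NamedTuple


class Parameter(NamedTuple):
    name: str
    units: str


# Subset of the code table shown on the slide.
GRIB_PARAMETERS: Dict[int, Parameter] = {
    52: Parameter("Relative humidity", "%"),
    59: Parameter("Precipitation rate", "kg/m2/s"),
    61: Parameter("Total precipitation", "kg/m2"),
}


def describe(code: int, value: float) -> str:
    """Label a raw value using the parameter code table (hypothetical helper)."""
    parameter = GRIB_PARAMETERS[code]
    return f"{parameter.name}: {value} {parameter.units}"


print(describe(61, 12.5))  # Total precipitation: 12.5 kg/m2
```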

When did the large flood occur in Angra?

When precipitation was > 10 mm/hour for 5 hours

Coverage set (hourly precipitation grid)

Cover change set (precipitation > 10 mm/hour)

When did the large flood occur in Angra?

CoverageSet p1 (“Precipitation”)
CoverChangeSet s1 = extract (p1 > 10, time1, time2)
TimeSeries t1 = intersect (s1, geom (“Angra”))
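One possible reading of the query above in plain Python, as a sketch only. It assumes the hourly precipitation series for the Angra area has already been extracted into a list of mm/hour values. The function name `find_heavy_rain_events` is hypothetical, the sample values are made up, and the thresholds (10 mm/hour, 5 hours) come from the slide.

```python
from typing import List, Tuple


def find_heavy_rain_events(
    hourly_mm: List[float], threshold: float = 10.0, min_hours: int = 5
) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, for every run in which
    precipitation stays above threshold for at least min_hours samples."""
    events = []
    run_start = None
    for i, value in enumerate(hourly_mm):
        if value > threshold:
            if run_start is None:
                run_start = i  # a new run of heavy rain begins here
        else:
            if run_start is not None and i - run_start >= min_hours:
                events.append((run_start, i))
            run_start = None
    # Close a run that lasts until the end of the series.
    if run_start is not None and len(hourly_mm) - run_start >= min_hours:
        events.append((run_start, len(hourly_mm)))
    return events


# Made-up hourly values, for illustration only.
series = [2, 12, 15, 3, 11, 14, 18, 22, 13, 4, 0]
print(find_heavy_rain_events(series))  # [(4, 9)]
```

The two-hour burst at the start is ignored because it is shorter than five hours; only the run from hour 4 to hour 8 qualifies.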

How many walruses reached Baffin Island?

Those whose trajectories touched Baffin Island

moving objects

trajectories

How many walruses reached Baffin island?

MovingObjectSet m1 (“walruses”)
Trajectories t1 = extract (m1, time1, time2)
Trajectories t2 = reach (t1, geom (“Baffin”))
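A minimal sketch of the `reach` step, under strong simplifying assumptions: each trajectory is a list of sampled (longitude, latitude) fixes, and the target region is approximated by a bounding box rather than a real coastline. The function names, the box, and the tracks are all hypothetical and made up for illustration.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]               # (longitude, latitude)
Box = Tuple[float, float, float, float]   # (min_lon, min_lat, max_lon, max_lat)


def reaches(trajectory: List[Point], region: Box) -> bool:
    """True if any sampled fix of the trajectory falls inside the box."""
    min_lon, min_lat, max_lon, max_lat = region
    return any(
        min_lon <= lon <= max_lon and min_lat <= lat <= max_lat
        for lon, lat in trajectory
    )


def count_reaching(trajectories: Dict[str, List[Point]], region: Box) -> int:
    """Number of moving objects whose trajectory touches the region."""
    return sum(1 for track in trajectories.values() if reaches(track, region))


# Crude illustrative box, not the real outline of the island.
region = (-90.0, 62.0, -61.0, 74.0)
tracks = {
    "w1": [(-55.0, 66.0), (-60.0, 67.0), (-65.0, 68.0)],
    "w2": [(-50.0, 70.0), (-52.0, 71.0)],
    "w3": [(-70.0, 60.0), (-70.0, 63.0)],
}
print(count_reaching(tracks, region))  # 2
```

This checks only the sampled fixes. A track that crosses the region between two fixes would be missed, so a real implementation would test the segments against the actual geometry.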

When was this area converted from food to biofuel production?

When the vegetation index peaked once a year.

Coverage set (remote sensing images)

Time Series (vegetation index)

When was this area converted from food to biofuel production?

CoverageSet c1 (“Cerrado”)
TimeSeries ts1 = extract (c1, “VegIndex”)
for year = y1, yn do
    time1 = year*52 + 1
    time2 = time1 + 52
    TimeSeries t2 = onepeak (ts1, time1, time2)
Time t1 = first (t2)
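The loop above can be sketched in Python as follows. This is one interpretation, not the talk's implementation. It assumes a weekly vegetation-index series (52 samples per year, matching the `year*52` arithmetic), defines a peak as a strict local maximum above a minimum height, and treats the first year with exactly one such peak as the conversion year. The function names, the 0.6 threshold, and the synthetic data are assumptions.

```python
from typing import List, Optional

SAMPLES_PER_YEAR = 52  # weekly samples, as in the slide's time1/time2 arithmetic


def count_peaks(values: List[float], min_height: float) -> int:
    """Count strict local maxima at or above min_height."""
    return sum(
        1
        for i in range(1, len(values) - 1)
        if values[i] >= min_height and values[i - 1] < values[i] > values[i + 1]
    )


def first_single_peak_year(series: List[float], min_height: float = 0.6) -> Optional[int]:
    """Index of the first year whose window holds exactly one peak, else None."""
    n_years = len(series) // SAMPLES_PER_YEAR
    for year in range(n_years):
        window = series[year * SAMPLES_PER_YEAR:(year + 1) * SAMPLES_PER_YEAR]
        if count_peaks(window, min_height) == 1:
            return year
    return None


def triangle(center: int, width: int) -> List[float]:
    """Synthetic one-season profile: baseline 0.2, maximum 0.8 at center."""
    return [
        0.2 + 0.6 * max(0.0, 1 - abs(i - center) / width)
        for i in range(SAMPLES_PER_YEAR)
    ]


# Made-up data: a year with two growing seasons, then a year with one.
two_peaks = [max(a, b) for a, b in zip(triangle(13, 10), triangle(39, 10))]
one_peak = triangle(26, 20)
print(first_single_peak_year(two_peaks + one_peak))  # 1
```

Real vegetation-index series are noisy, so a practical version would smooth the signal before counting peaks.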

A new kind of geospatial analysis engine?

TerraLib: spatio-temporal database as a basis for innovation

Visualization (TerraView)

Spatio-temporal Database (TerraLib)

Modelling (TerraME)

Data Mining (GeoDMA)

Statistics (aRT)

We need a new generation of GI appliances

Connect data brokering, sources, analysis

We need many clouds with remote processing

Describe observations, not events

Allow users to process the data

Conclusions