07 chris davis

Challenges and Opportunities of Linked Open Energy Data Chris Davis http://enipedia.tudelft.nl [email protected]

description

Presentation at Appsforenergy 2013, Take the challenge!

Transcript of 07 chris davis

Page 1: 07 chris davis

Challenges and Opportunities of Linked Open Energy Data

Chris Davis

http://enipedia.tudelft.nl

[email protected]

Page 2: 07 chris davis

Who am I?

● Postdoc Energy & Industry, TBM, TU Delft

● Focus on Industrial Ecology, Open Data, Collaborative Software, Modeling, Visualization, Analytics, etc.

Page 3: 07 chris davis

Motivations

● Energy and sustainability are some of the most important topics of the 21st century

● Need both aggregated and fine-grained data

● Research can be data intensive

● There's a lot out there, but connecting it is tedious

● Researchers often duplicate effort

● It would be great to revolutionize how we deal with this data

Page 4: 07 chris davis

Information wants to be free because it has become so cheap to distribute, copy, and recombine - too cheap to meter.

Stewart Brand

There's a Tension...

Page 5: 07 chris davis

It wants to be expensive because it can be immeasurably valuable to the recipient.

Stewart Brand

There's a Tension...

Page 6: 07 chris davis

That tension will not go away. It leads to endless wrenching debate about price, copyright, “intellectual property,” and the moral rightness of casual distribution, because each round of new devices makes the tension worse, not better.

Stewart Brand

There's a Tension...

Page 7: 07 chris davis

If you cling blindly to the expensive part of the paradox, you miss all the action going on in the free part. The pressure of the paradox forces information to explore incessantly.

Stewart Brand

There's a Tension...

Page 8: 07 chris davis

Pirolli & Card (2005) The Sensemaking Process and Leverage Points for Analyst Technology as Identified Through Cognitive Task Analysis

Page 9: 07 chris davis

Pirolli & Card (2005) The Sensemaking Process and Leverage Points for Analyst Technology as Identified Through Cognitive Task Analysis

Data Collectors

Data Scientists, Statisticians, Researchers

Policy/Decision Makers

Page 10: 07 chris davis

A Metaphor for Open Data...

Page 11: 07 chris davis

It's about Resource Efficiency

● Information is a resource just as much as physical resources

● ...however, it ideally gets better the more that it is used

● Data quality is (partly) a function of the amount of attention it gets

● Structure leads to benefits, but requires effort – figure out what has most value to the community

Page 12: 07 chris davis

Inspiration from Pokemon...

Page 13: 07 chris davis

http://www.youtube.com/watch?v=XpvQNn0n_Qw

Page 14: 07 chris davis
Page 15: 07 chris davis

OpenStreetMap (last 90 days)

http://www.itoworld.com/map/129

Page 16: 07 chris davis

enipedia.tudelft.nl/maps

Page 17: 07 chris davis

Enipedia.tudelft.nl

Page 18: 07 chris davis


Page 19: 07 chris davis


Page 20: 07 chris davis


About data quality...

Page 21: 07 chris davis


A tale of one (or four?) power stations and seven data sets

Page 22: 07 chris davis


How the European Commission manages data

Large Combustion Plants Directive

http://ec.europa.eu/environment/air/pollutants/stationary/lcp/legislation.htm

Page 23: 07 chris davis

I wish...

Page 24: 07 chris davis
Page 25: 07 chris davis
Page 26: 07 chris davis

Kraftwerk (Anlagennummer: 0001)

Who is this? (EU ETS Data)

Page 27: 07 chris davis

Who is this?

● Name: Kraftwerk (Anlagennummer: 0001)

● Account Holder: Felix Schoeller jr. Foto und Spezialpapiere GmbH & Co KG

● address1: Fabrikstraße 1

● address2: Felix Schoeller jr. Foto und Spezialpapiere GmbH & Co. KG

● City: Weißenborn/Erzgeb.

● CountryCode: DE

● InstallationIdentifier: 891

● InstallationName: Kraftwerk (Anlagennummer: 0001)

● MainActivityTypeCode: 1

● MainActivityTypeCodeLookup: Combustion installations with a rated thermal input exceeding 20 MW

● PermitIdentifier: 14310-0300

● ZipCode: 09600

Page 28: 07 chris davis

Inspiration

http://www.flickr.com/photos/maxbraun/98688824/ http://www.flickr.com/photos/acme/229065626/

Page 29: 07 chris davis

Matching Entities

0001 09600 1 909 anlagenkonto anlagennummer co erzgeb fabrikstrasse felix foto gmbh jr kg kraftwerk schoeller spezialpapiere technocell und weissenborn

09600 1 co erzgeb fabrikstrasse felix foto gmbh jr kg schoeller spezialpapiere und weissenborn werk

49086 burg co felix foto gmbh gretesch jr kg osnabruck schoeller spezialpapiere und

0001 09600 1 909 anlagenkonto anlagennummer co erzgeb fabrikstrasse felix foto gmbh jr kg kraftwerk schoeller spezialpapiere technocell und weissenborn

https://en.wikipedia.org/wiki/Claude_Shannon

http://en.wikipedia.org/wiki/Self-information
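The token lists above, together with the pointer to self-information, suggest weighting shared tokens by how rare they are across the records being compared. The following is a minimal sketch of that idea, not the method actually used for Enipedia: the function names (tokenize, self_information, match_score) and the scoring rule (sum of self-information over shared tokens) are assumptions made for illustration. The three records are the distinct token lists shown on this slide.

```python
import math
import re
import unicodedata
from collections import Counter


def tokenize(text):
    """Lowercase, fold German characters to ASCII, and return the set of tokens."""
    text = text.lower().replace("ß", "ss")
    # Drop diacritics (e.g. "ü" -> "u") by removing combining marks after NFKD.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return set(re.findall(r"[a-z0-9]+", text))


def self_information(records):
    """Map each token to log2(N / number of records containing it), in bits."""
    n = len(records)
    doc_freq = Counter(tok for rec in records for tok in rec)
    return {tok: math.log2(n / count) for tok, count in doc_freq.items()}


def match_score(a, b, info):
    """Sum the self-information of the tokens two records have in common."""
    return sum(info[tok] for tok in a & b)


# Token lists copied from the slide (already normalized there).
ets_entry = tokenize(
    "0001 09600 1 909 anlagenkonto anlagennummer co erzgeb fabrikstrasse felix "
    "foto gmbh jr kg kraftwerk schoeller spezialpapiere technocell und weissenborn"
)
candidate_a = tokenize(
    "09600 1 co erzgeb fabrikstrasse felix foto gmbh jr kg schoeller "
    "spezialpapiere und weissenborn werk"
)
candidate_b = tokenize(
    "49086 burg co felix foto gmbh gretesch jr kg osnabruck schoeller "
    "spezialpapiere und"
)

info = self_information([ets_entry, candidate_a, candidate_b])
score_a = match_score(ets_entry, candidate_a, info)
score_b = match_score(ets_entry, candidate_b, info)
print(f"candidate_a: {score_a:.3f} bits, candidate_b: {score_b:.3f} bits")
```

In this toy setting the company-name tokens appear in every record and therefore contribute nothing, so only the location tokens (postcode, street, town) separate the two candidates. A real data set would need frequencies estimated over many more records.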

Page 30: 07 chris davis


Current data management practices result in:

Unintentionally Anonymized Open Data

Optimized for Inefficient Maintenance

and an Uphill Battle to Enforce

Principles of Data Integrity

Page 31: 07 chris davis

It's power laws all the way down

● Both contributors & data

● Challenge is aligning the two

Page 32: 07 chris davis

Officially Curated vs. Crowdsourced Data

Page 33: 07 chris davis

Officially Curated vs. Crowdsourced Data

Page 34: 07 chris davis


Officially Curated vs. Crowdsourced Data

● Crowdsourcing generally OK for easily verifiable data

● Officially curated data needed for comprehensive, hard to verify data, small specialized communities

● Crowdsourced data is only possible because of revision control.

Page 35: 07 chris davis

How to Measure Data Quality?

Data Quality = Researcher Skill/Experience × # Viewers/Editors × Ease of Independent Verification

Low Editor Diversity

High Editor Diversity
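The product form of this relation means a single weak factor drags the whole estimate down. A minimal sketch of that behaviour follows; the function name data_quality, the normalization of each factor to the range 0 to 1, and the example values are all assumptions for illustration, since the slide gives no scale or units.

```python
def data_quality(skill, attention, verifiability):
    """Multiply three factors, each assumed to be normalized to [0, 1]."""
    for value in (skill, attention, verifiability):
        if not 0.0 <= value <= 1.0:
            raise ValueError("factors are assumed to lie in [0, 1]")
    return skill * attention * verifiability


# Hypothetical values: an expert-maintained but hard-to-verify data set
# versus one that is moderate on every factor.
hard_to_verify = data_quality(skill=0.9, attention=0.9, verifiability=0.1)
balanced = data_quality(skill=0.6, attention=0.6, verifiability=0.6)
print(hard_to_verify < balanced)
```

Under these assumptions the balanced data set scores higher, even though two of its three factors are lower.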

Page 36: 07 chris davis


How to Measure Data Quality?

● Eric Raymond – “With many eyes all bugs are shallow”

● But... not all eyes are evenly distributed

Page 37: 07 chris davis

Big Data (?)

Page 38: 07 chris davis


Questions?

Chris Davis

http://enipedia.tudelft.nl

[email protected]