Elizabeth Churchill, "Data by Design"

Transcript of Elizabeth Churchill, "Data by Design"

Page 1: Elizabeth Churchill, "Data by Design"

Elizabeth F. Churchill

Data by Design

Page 2: Elizabeth Churchill, "Data by Design"

Design/Science of participation

(1) Science through (platforms for mediated communication) TMSP

(2) Science on (social science contributions about fundamentals of psychology/communication/collaboration/cooperation)

“Hubble telescope” of social science

WE NEED TO ADDRESS THE DESIGN OF DATA (FOR) SCIENCE ISSUE DIRECTLY

Page 3: Elizabeth Churchill, "Data by Design"

On (1) – TMSP via SMPs

Awareness: conversation and content exchange good; content storage, indexing and search poor

Content sharing: malleable as well as stable content

Coordination: long and short term

Collaborative production: lightweight to complex

Longevity: currently questionable…

Page 4: Elizabeth Churchill, "Data by Design"

Cooperative activities, centralised

Collective action, decentralised

Collective action, centralised

Page 5: Elizabeth Churchill, "Data by Design"

On (2)- Sciences of the social

Data quality: descriptive/predictive; observed/understood; local/universal; reactive/proactive; stand-alone/replicated

Science quality: data stability/longevity, TOS, content and social responsibility

WE NEED TO ADDRESS THE DESIGN OF DATA (FOR) SCIENCE ISSUE DIRECTLY

Designers : Statisticians : Computer scientists : Data Scientists : Social scientists

Page 6: Elizabeth Churchill, "Data by Design"

Focus on (2)

Mike Loukides, http://radar.oreilly.com/2010/06/what-is-data-science.html

Page 7: Elizabeth Churchill, "Data by Design"

On Data Science

“What differentiates data science from statistics is that data science is a holistic approach. We’re increasingly finding data in the wild, and data scientists are involved with gathering data, massaging it into a tractable form, making it tell its story, and presenting that story to others.”

The first step of any data analysis project is “data conditioning,” or getting data into a state where it’s usable.

Page 8: Elizabeth Churchill, "Data by Design"

On Data Science

The most meaningful definition I’ve heard: “big data” is when the size of the data itself becomes part of the problem.

The need to define a schema in advance conflicts with reality of multiple, unstructured data sources, in which you may not know what’s important until after you’ve analyzed the data.
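One way to picture the point about not knowing what matters until after analysis: read loosely structured records without a fixed schema and only then look at which fields actually occur. The sketch below is a minimal illustration under stated assumptions; the sample records, field names, and the `field_counts` helper are hypothetical and are not taken from the talk or from Loukides' article.

```python
import json
from collections import Counter

# Hypothetical records from mixed sources; no two share the same fields.
raw_records = [
    '{"user": "a", "photo_id": 1, "tags": ["museum"]}',
    '{"user": "b", "comment": "great shot", "photo_id": 1}',
    '{"user": "c", "location": {"lat": 52.1, "lon": 5.1}}',
]

def field_counts(lines):
    """Count how often each top-level field appears across JSON records."""
    counts = Counter()
    for line in lines:
        record = json.loads(line)  # parse without assuming any schema
        counts.update(record.keys())
    return counts

counts = field_counts(raw_records)
for field, n in counts.most_common():
    print(field, n)
```

Only after a pass like this would one decide which fields deserve a place in a fixed schema.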

Page 9: Elizabeth Churchill, "Data by Design"

On Data Science

Data scientists … come up with new ways to view the problem, or to work with very broadly defined problems: “here’s a lot of data, what can you make from it?”

The future belongs to the companies who figure out how to collect and use data successfully.

…and the scientists?

Page 10: Elizabeth Churchill, "Data by Design"

Business logic is not science logic

Page 11: Elizabeth Churchill, "Data by Design"

http://www.forbes.com/sites/onmarketing/2012/06/28/social-media-and-the-big-data-explosion/

Page 12: Elizabeth Churchill, "Data by Design"

Data – the ‘this is the dataset’ problem

Page 13: Elizabeth Churchill, "Data by Design"

Verbeeldingskr8 on Flickr

Page 14: Elizabeth Churchill, "Data by Design"

Interface elements

….lead to data, inviting action and inviting information

Page 15: Elizabeth Churchill, "Data by Design"

Facebook

Page 16: Elizabeth Churchill, "Data by Design"
Page 17: Elizabeth Churchill, "Data by Design"

Like! Like? Agree! Disagree! (bookmarked)

Hello Sherry

Page 18: Elizabeth Churchill, "Data by Design"

Dating

Page 19: Elizabeth Churchill, "Data by Design"
Page 20: Elizabeth Churchill, "Data by Design"

profile creation

explicit versus passive “personalisation”

Page 21: Elizabeth Churchill, "Data by Design"
Page 22: Elizabeth Churchill, "Data by Design"
Page 23: Elizabeth Churchill, "Data by Design"

Anxiety, self reflection, identity….

Eva Illouz

Page 24: Elizabeth Churchill, "Data by Design"

Flickr

Page 25: Elizabeth Churchill, "Data by Design"
Page 26: Elizabeth Churchill, "Data by Design"

Recording and Sharing

Documenting: Personal and Collective Memory

Competition: Status

Affiliation: Group Membership

Learning: Emulating

Awareness: Near and Far

Curiosity/Voyeurism

Page 27: Elizabeth Churchill, "Data by Design"

Flickr – Photo sharing by user location

Page 28: Elizabeth Churchill, "Data by Design"

The Library of Congress, the Powerhouse Museum, the Smithsonian, New York Public Library, and Cornell University Library

Page 29: Elizabeth Churchill, "Data by Design"
Page 30: Elizabeth Churchill, "Data by Design"
Page 31: Elizabeth Churchill, "Data by Design"
Page 32: Elizabeth Churchill, "Data by Design"

http://www.flickr.com/photos/powerhouse_museum/2980051095/

Page 33: Elizabeth Churchill, "Data by Design"

http://www.museumsandtheweb.com/mw2011/papers/rethinking_evaluation_metrics_in_light_of_flic

Page 34: Elizabeth Churchill, "Data by Design"

Data longevity

“Like all Commons members, the other qualitative measure we value highly is the sheer inventiveness of Flickr members who engage with the photographs.

Currently, Cornell saves links to examples of reuse on delicious (http://www.delicious.com) and displays them as a feed on its website.”

Page 35: Elizabeth Churchill, "Data by Design"
Page 36: Elizabeth Churchill, "Data by Design"
Page 37: Elizabeth Churchill, "Data by Design"
Page 38: Elizabeth Churchill, "Data by Design"
Page 39: Elizabeth Churchill, "Data by Design"
Page 40: Elizabeth Churchill, "Data by Design"
Page 41: Elizabeth Churchill, "Data by Design"
Page 42: Elizabeth Churchill, "Data by Design"
Page 43: Elizabeth Churchill, "Data by Design"
Page 44: Elizabeth Churchill, "Data by Design"

Business logic is not science logic

Page 45: Elizabeth Churchill, "Data by Design"

Design/Science of participation

(1) Science through (platforms for mediated communication)

TMSP

(2) Science on (social science contributions about fundamentals of collaboration/cooperation)

“Hubble telescope” of social science

Page 46: Elizabeth Churchill, "Data by Design"

Reflections on requirements

Stability – the existence of content in an accessible (and hopefully the same) format over time

Science requires:

Consistency: consistently re-code the same data in the same way over a period of time

Reproducibility: the tendency for a group of coders to classify category membership in the same way

Accuracy: the extent to which the classification of a text corresponds to a standard or norm statistically

Validity: correspondence of the categories to the conclusions, avoiding ambiguity and addressing multiple possible classifications

Proof: trust in the inferential procedures and clarity of what level of implication is allowed, i.e. do the conclusions follow from the data or are they explainable due to some other phenomenon?

Generalizability of results to a theory

Cross-setting comparative interventions
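The reproducibility requirement on this slide (a group of coders classifying category membership in the same way) is often quantified with an inter-coder agreement statistic. The sketch below assumes Cohen's kappa for two coders as that measure; the slide does not name a statistic, and the labels and the `cohens_kappa` function are hypothetical illustrations.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders labelling the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty item set")
    n = len(coder_a)
    # Observed agreement: share of items given the same label by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of each coder's label proportions, summed.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both coders used one identical label throughout; kappa is
        # undefined here, so return 1.0 by convention (an assumption).
        return 1.0
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned to six photo comments by two coders.
coder_a = ["status", "memory", "status", "learning", "memory", "status"]
coder_b = ["status", "memory", "learning", "learning", "memory", "status"]
kappa = cohens_kappa(coder_a, coder_b)
print(round(kappa, 2))
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is usually preferred over a simple percentage when comparing coders.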

Page 47: Elizabeth Churchill, "Data by Design"

On (2)- Sciences of the social

Data quality: descriptive/predictive; observed/understood; local/universal; reactive/proactive; stand-alone/replicated

Science quality: data stability/longevity, TOS, content and social responsibility

WE NEED TO ADDRESS THE DESIGN OF DATA (FOR) SCIENCE ISSUE DIRECTLY

Designers : Statisticians : Computer scientists : Data Scientists : Social scientists

Page 48: Elizabeth Churchill, "Data by Design"

Questions?

[email protected]

xeeliz on Twitter

Page 49: Elizabeth Churchill, "Data by Design"

Acknowledgements

On dating: Elizabeth Goodman; on Flickr: Shyong (Tony) Lam; on instrumentation and analysis: David Ayman Shamma & M. Cameron Jones; on Flickr Commons: George Oates

Flickr photographers: Marina Noordegraaf (Verbeeldingskr8), Tim Jagenberg, Nicolas Nova