Scraping the Social Graph with Ushahidi and SwiftRiver



As delivered by Jon Gosier at Georgetown University on June 28th.

Transcript of Scraping the Social Graph with Ushahidi and SwiftRiver

SCRAPING THE SOCIAL GRAPH
CRISIS MONITORING WITH SOCIAL MEDIA

Georgetown University
jongos@gmail.com

@jongos

About Ushahidi
Ushahidi is a free, open-source platform used for crowdsourcing and visualizing data geospatially. It was born out of the 2008 post-election unrest in Kenya, when founders Juliana Rotich, Erik Hersman, Ory Okolloh and David Kobia wanted to give Kenyan citizens a way to report incidents via SMS and to know what was occurring around them. This was one of the earliest uses of crowdsourcing for crisis response.

Notable Uses
Ushahidi has been deployed in major global crisis scenarios, allowing organizations to draw situational awareness from the crowd. To date it's been downloaded over 15,000 times.

Some of the more notable deployments include recent events in Egypt, the Haiti earthquake, the fires in Russia, and the Queensland floods in Australia.

The Challenge
As the amount of data aggregated by Ushahidi users grows, they face a common problem: how do they effectively manage this real-time data? How can we help them discover credible and actionable information in the deluge of reports they'll get from the public? The SwiftRiver initiative was created to begin to answer some of these questions for Ushahidi deployers.

USHAHIDI HAITI

OIL SPILL CRISIS MAP

UCHAGUZI

RUSSIAN FIRES “HELP MAP”

PAKREPORT

TUBESTRIKE CROWDMAP

PRAGUEWATCH

HARASSMAP

U-SHAHID

CHRISTCHURCH

SINSAI.INFO

“It’s not information overload. It’s filter failure.”

- Clay Shirky
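The filtering problem described above can be pictured as a scoring-and-triage step over incoming reports. Below is a minimal sketch in Python, assuming a simple blend of author trust and keyword relevance; the Report class, keyword list, weights, and threshold are illustrative placeholders, not SwiftRiver's actual veracity algorithm.

```python
# Illustrative triage sketch: score each incoming report by a blend of
# author trust and crisis-keyword relevance, keep only what clears a
# threshold, and rank the rest for human review. All names, weights and
# keywords are hypothetical, not SwiftRiver's actual algorithm.
from dataclasses import dataclass


@dataclass
class Report:
    source: str          # e.g. "twitter", "sms", "email"
    author_trust: float  # 0.0 to 1.0, a hypothetical prior trust in the author
    text: str


CRISIS_KEYWORDS = {"fire", "flood", "injured", "trapped", "collapse"}


def relevance(report: Report) -> float:
    """Fraction of the crisis keywords that appear in the report text."""
    words = set(report.text.lower().split())
    return len(words & CRISIS_KEYWORDS) / len(CRISIS_KEYWORDS)


def score(report: Report) -> float:
    """Blend author trust and keyword relevance into a single score."""
    return 0.6 * report.author_trust + 0.4 * relevance(report)


def triage(reports: list[Report], threshold: float = 0.3) -> list[Report]:
    """Return the reports worth a moderator's attention, highest score first."""
    kept = [r for r in reports if score(r) >= threshold]
    return sorted(kept, key=score, reverse=True)


if __name__ == "__main__":
    incoming = [
        Report("twitter", 0.8, "bridge collapse near the market people trapped"),
        Report("sms", 0.2, "lol check out this video"),
    ]
    for r in triage(incoming):
        print(f"{score(r):.2f} [{r.source}] {r.text}")
```

In practice the scoring would draw on many more signals (location, source history, corroboration across channels), but the shape of the pipeline stays the same: score, filter, then rank for human review.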

PLATFORM GOALS

Consider the context; relevance is defined by the user

Offer an opt-in global database of trust and authority

Algorithms augment, but do not define, human decision-making

Work across media channels (Twitter, Email, Feeds, SMS)

Be accessible (offline/online/mobile)

Index massive amounts of the mobile/social web
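One of the goals above, working across media channels, comes down to normalizing Twitter, email, feed, and SMS items into a common record before they reach the filters. The sketch below illustrates that idea; the Item fields and the SmsChannel/RssChannel classes are hypothetical stand-ins, not SwiftRiver's plugin API.

```python
# Illustrative channel-agnostic ingestion: every channel normalizes its
# items into one common record so downstream filtering does not care where
# a report came from. The Item fields and Channel classes below are
# hypothetical stand-ins, not SwiftRiver's plugin API.
from abc import ABC, abstractmethod
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Item:
    """A single normalized item, whatever channel it arrived on."""
    channel: str
    author: str
    text: str
    received_at: datetime


class Channel(ABC):
    @abstractmethod
    def poll(self) -> list[Item]:
        """Fetch new content and normalize it into Items."""


class SmsChannel(Channel):
    def __init__(self, inbox: list[dict]):
        self.inbox = inbox  # stand-in for an SMS gateway

    def poll(self) -> list[Item]:
        return [Item("sms", m["from"], m["body"], datetime.now(timezone.utc))
                for m in self.inbox]


class RssChannel(Channel):
    def __init__(self, entries: list[dict]):
        self.entries = entries  # stand-in for parsed feed entries

    def poll(self) -> list[Item]:
        return [Item("feed", e["source"], e["title"], datetime.now(timezone.utc))
                for e in self.entries]


def collect(channels: list[Channel]) -> list[Item]:
    """Pull from every configured channel into one stream."""
    return [item for ch in channels for item in ch.poll()]


if __name__ == "__main__":
    channels = [
        SmsChannel([{"from": "+254700000000", "body": "Flooding on Main St"}]),
        RssChannel([{"source": "local-news", "title": "River bursts its banks"}]),
    ]
    for item in collect(channels):
        print(f"[{item.channel}] {item.author}: {item.text}")
```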

KNC AWARD & RIVER ID

final component of the veracity algorithm

needs to be able to scale massively

changing the backend (Hadoop & MongoDB)

research by data scientists

use-cases at scale and iterative improvements

THIS IS A DATA PROBLEM

PROGRESS

7,000+ downloads in 6 months

7,000+ API Users

100,000+ Lines of code

5 APIs and 2 Apps

Data Items Processed - 70,000,000 (liberal extrapolation)

Sweeper - User Interface

NETWORK DYNAMICS

Good crowdsourcing campaigns build upon the existing ties between people and their networks. There's a natural multiplier effect, where the people in the original network become nodes for new networks, and so on.

❖ Participation is permission
❖ Consent is not carte blanche
❖ Clarity is critical
❖ Trust is Earned or Burned
❖ Transparency is hard to teach

EARNING TRUST

❖ Protection of data is different than the protection of people/identity
❖ Standards like HTTPS or SSL
❖ Encryption
❖ Anonymity is not a given (TOR Project)
❖ The usual fail-points are still threats (weak passwords, compromised servers, careless employees)

PRIVACY

❖ Verify factual occurrences (location, time, date)
❖ Verify contributor identity (who?)
❖ Verify contributor credentials

VALIDATION

Everything beyond these three points is an educated guess. Anyone looking to game the campaign will only be effective if they are able to compromise the aforementioned.
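A hedged sketch of those three validation checks is shown below: a plausible location and timestamp, a known contributor identity, and some form of contributor credential. The Submission fields and the 48-hour recency window are assumptions for illustration, not a real verification system.

```python
# Illustrative validation sketch covering the three checks above: a
# plausible location and timestamp, a known contributor identity, and some
# form of contributor credential. Field names and the 48-hour window are
# assumptions for illustration, not a real verification system.
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Submission:
    lat: Optional[float]
    lon: Optional[float]
    reported_at: datetime
    contributor_id: Optional[str]  # e.g. a phone number or account handle
    contributor_verified: bool     # e.g. passed an SMS confirmation step


def check_occurrence(s: Submission) -> bool:
    """Location must be valid coordinates and the timestamp recent."""
    has_location = (s.lat is not None and s.lon is not None
                    and -90 <= s.lat <= 90 and -180 <= s.lon <= 180)
    recent = datetime.now(timezone.utc) - s.reported_at < timedelta(hours=48)
    return has_location and recent


def check_identity(s: Submission) -> bool:
    """We at least know who sent the report."""
    return s.contributor_id is not None


def check_credentials(s: Submission) -> bool:
    """The contributor has been through some verification step."""
    return s.contributor_verified


def validation_summary(s: Submission) -> dict:
    return {"occurrence": check_occurrence(s),
            "identity": check_identity(s),
            "credentials": check_credentials(s)}


if __name__ == "__main__":
    report = Submission(lat=-1.2921, lon=36.8219,
                        reported_at=datetime.now(timezone.utc) - timedelta(hours=2),
                        contributor_id="+254700000000",
                        contributor_verified=False)
    print(validation_summary(report))
```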

❖ Ease of participation
❖ Low risk of failure or shame
❖ Social Capital
❖ Repute & Accolade
❖ Barter
❖ Strategic Spending ($)
❖ Data Sharing
❖ Altruism & Charity

MOTIVATION

THANKS!
Knight News Challenge

jg@swiftly.org
@swiftriver