Leveraging the Power of Social Media
Leon Derczynski
Natural Language Processing Group
Department of Computer Science
Faculty of Engineering
University of Sheffield
work in the field of “computational linguistics”
focus on turning text into “understanding”
and “decision support”
the “AI effect”
Pamela McCorduck
artificial intelligence is less impressive when we know how it works
..so this talk won't have deep technical detail
language
(note huge evolutionary advancement)
??
social media
social media – a poster child for big data
big data:
promises new insights
is (was) a cool buzzword
causes headaches
what is it?
V: velocity
twitter: 255 000 000 users / month
Facebook: 1 280 000 000 users / month
VV: volume
reddit: 34 000 000 posts / month
twitter: 650 000 000 messages / month
VVV: variety
there are many online social networks
we need one of these
artificial intelligence
“Human knowledge is expressed in language. So computational linguistics is very important.”
- Mark Steedman
Start: sequence of bytes
[natural language processing goes here]
End: actionable knowledge
why bother programming at all?
… let the computer program itself!
machine learning:
make decisions about tasks based on things you've seen before
a little bit like human learning
give text and examples of what we want done
machine learns to do it from these examples
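To make "learning from examples" concrete, here is a minimal sketch of a naive Bayes text classifier in plain Python. It is not the system behind this talk; the labels ("sick", "fine"), the training messages, and the function names are all invented for illustration.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """Count words per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    label_counts = Counter()            # label -> number of examples
    vocab = set()
    for text, label in examples:
        words = text.lower().split()
        word_counts[label].update(words)
        label_counts[label] += 1
        vocab.update(words)
    return word_counts, label_counts, vocab

def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total)
        n_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Made-up training examples: text plus the decision we want.
EXAMPLES = [
    ("i think im sick ugh", "sick"),
    ("got the flu and a fever", "sick"),
    ("feeling sick staying in bed", "sick"),
    ("great dinner this evening", "fine"),
    ("no homework tonight", "fine"),
    ("going out with friends tonight", "fine"),
]
model = train(EXAMPLES)
print(predict(model, "ugh im sick with a fever"))
print(predict(model, "dinner with friends"))
```

Nobody wrote a rule saying "fever means sick"; the counts from the examples do that work.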
understanding language
social media text is surprisingly formal
they see me rollin
- a typo?
they hatin
- perhaps not. G-dropping mapped from speech
patrollin
- incidentally, this linguistic phenomenon is a good predictor of education level
tryna catch me ridin dirty
- a new style! flawless; not a single mistake
omb x
- surely they mean “omg”?
omb ✔
- the keys are like, right next to each other
X really? this guy?
Shall we go out for dinner this evening?
Ey yo wen u gon let me tap dat
spelling ability distribution in net slang users
with spelling ability distribution in non-slang users
Do you feel luccy, punk?
challenge 1: what language is this anyway
je bent Jacques cousteau niet die een nieuwe soort heeft ontdekt, het is duidelijk, ze bedekken hun gezicht. Get over it
RT @TomPIngram: VIVA LAS VEGAS 16 - NEWS #constantcontact http://t.co/VrFzZaa7
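One rough way to guess the language is to count how many very common words of each language appear in the message. The sketch below assumes tiny hand-picked word lists and a made-up function name, `guess_language`; real language identifiers use far more evidence than this.

```python
import re

# Tiny hand-picked lists of common words; illustrative, not complete.
STOPWORDS = {
    "nl": {"de", "het", "een", "en", "van", "is", "niet", "je", "die",
           "dat", "ze", "hun", "heeft", "bent", "ik", "op"},
    "en": {"the", "and", "of", "is", "not", "you", "that", "it", "they",
           "their", "has", "are", "we", "for", "this", "to", "in", "over", "get"},
    "fr": {"le", "la", "les", "et", "de", "un", "une", "est", "pas",
           "je", "que", "il", "elle", "des", "du"},
}

def guess_language(text):
    """Return (best_language_or_'unknown', per-language hit counts)."""
    # Keep runs of letters only; digits, punctuation and symbols are dropped.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    scores = {lang: sum(tok in words for tok in tokens)
              for lang, words in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    return (best if scores[best] > 0 else "unknown"), scores

dutch = ("je bent Jacques cousteau niet die een nieuwe soort heeft ontdekt, "
         "het is duidelijk, ze bedekken hun gezicht. Get over it")
retweet = "RT @TomPIngram: VIVA LAS VEGAS 16 - NEWS #constantcontact http://t.co/VrFzZaa7"

print(guess_language(dutch))
print(guess_language(retweet))
```

The first message scores highest for Dutch but also picks up English hits from "Get over it", and the retweet matches none of the lists, which is the challenge in miniature.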
challenge 2: pls type better
I wonde rif Tsubasa is okay..
- misplaced space = two new words
no homwork tonight.. suprising??
- maybe there should be!
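A minimal sketch of one way to clean up such typos: replace each unknown token with the closest word in a known vocabulary, using `difflib.get_close_matches` from the Python standard library. The vocabulary, the 0.8 cutoff, and the function name `normalize` are assumptions made for this example.

```python
import re
from difflib import get_close_matches

# Tiny illustrative vocabulary; a real system would use a large word list.
VOCAB = sorted({"no", "homework", "tonight", "surprising", "wonder", "okay"})

def normalize(text, cutoff=0.8):
    """Replace each out-of-vocabulary token with its closest vocabulary word, if any."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    fixed = []
    for tok in tokens:
        if tok in VOCAB:
            fixed.append(tok)
            continue
        match = get_close_matches(tok, VOCAB, n=1, cutoff=cutoff)
        # Keep the original token when nothing is similar enough.
        fixed.append(match[0] if match else tok)
    return fixed

print(normalize("no homwork tonight.. suprising??"))
```

This works token by token, so it does not rejoin words split by a misplaced space, as in the Tsubasa example.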
challenge 3: finding names
derek x is a person
miles x might be a person
Marie Claire x should not be a person
Exodus Porter x probably an OK person, but actually a beer
Spicy Pickle Jr. x apparently actually a person
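To see why this is hard, here is a minimal sketch of the obvious first attempt: look each mention up in a list of first names. The name list and the function `looks_like_person` are made up for this example and are not a real name finder.

```python
# Hypothetical mini-gazetteer of first names, invented for this sketch.
FIRST_NAMES = {"derek", "miles", "marie", "claire"}

def looks_like_person(mention):
    """Naive rule: a mention is a person if any of its tokens is a known first name."""
    tokens = mention.lower().replace(".", "").split()
    return any(tok in FIRST_NAMES for tok in tokens)

MENTIONS = ["derek", "miles", "Marie Claire", "Exodus Porter", "Spicy Pickle Jr."]
guesses = {m: looks_like_person(m) for m in MENTIONS}

for mention, is_person in guesses.items():
    print(mention, "->", "person" if is_person else "not a person")
```

The lookup calls Marie Claire a person and misses Spicy Pickle Jr., the opposite of what the slides say is right.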
???
old news
social media defends against earthquakes (2010)
Japanese and US quake response times:
down from ~20s to ~17.5s
social media predicts epidemics (2012)
exhibit a: one dead crow
social media mentions of dead crows predict WNV in humans
“There's a dead crow in my garden”
social media predicts you getting flu (2012)
@mari: i think im sick ugh..
great potential for misuse :)
this november:
social media dispatches fire engines (2014)
trust
if hospitals and fire stations act based on tweets,
wrong information is extra-harmful
rumours
speculation
misinformation
disinformation
who can you trust online?
Imagine a lie detector for politicians / Fox News
responsibility
1. Collect tweets
2. ????
3. Profit!
how long do we keep them for?
- “15 years is OK, right?” - NSA
what do we store and process?
- “just metadata, it's harmless” - GCHQ
(from Kurt Opsahl's slides at the Chaos Communication Congress, photo by Marion Marschalek)
bias
news style
social media
most of our language AI was trained on news text
the bias is:
- middle class
- white
- working age
- educated
- male
- 1980s/1990s
- from the US
- journalist
- following AP guidelines
your phone rewards you if you talk and write like
(ok.. sort of)
.. and punishes you when you don't.
(not cool!)
twitter bias is different
- not German or Nordic
- are young(ish)
lower requirements
- you can publish even if you're not a journalist
- still operates beyond the 1990s
some new requirements
- you do need access to the internet...
- ...and twitter (对不起,中国人: “sorry, Chinese people”)
the big picture
we're racing ahead and improving life quality
there is immense value in “trivia”
understanding social media lets us help people better
Thank you!
Leon Derczynski