Credibility, Identity Resolution, and Privacy on Online Social Media

Credibility, Identity Resolution, and Privacy on Online Social Media Indo Belarus Bilateral Workshop on Cyber Security CDAC Noida Nov 16, 2016 Ponnurangam Kumaraguru (“PK”) Associate Professor ACM Distinguished Speaker fb/ponnurangam.kumaraguru, @ponguru


Page 1: Credibility, Identity Resolution, and Privacy on Online Social Media

Credibility, Identity Resolution, and Privacy on Online Social Media

Indo Belarus Bilateral Workshop on Cyber Security
CDAC Noida
Nov 16, 2016

Ponnurangam Kumaraguru (“PK”)
Associate Professor
ACM Distinguished Speaker
fb/ponnurangam.kumaraguru, @ponguru

Page 2: Credibility, Identity Resolution, and Privacy on Online Social Media

Who am I?

- Associate Professor, IIIT-Delhi
- Ph.D. from School of Computer Science, Carnegie Mellon University (CMU)
- Research interests: Social Computing, Computational Social Science, Complex Networks pertaining to Human Behavior, specifically in the context of Security & Privacy
- Coordinate and manage Precog, precog.iiitd.edu.in
- ACM Distinguished Speaker

Page 3: Credibility, Identity Resolution, and Privacy on Online Social Media

https://www.youtube.com/channel/UCHWDvGDh4QjWbV79bM2neSg

Page 4: Credibility, Identity Resolution, and Privacy on Online Social Media

What we dabble with!

Page 5: Credibility, Identity Resolution, and Privacy on Online Social Media

https://arxiv.org/pdf/1611.01911v2

Page 6: Credibility, Identity Resolution, and Privacy on Online Social Media

http://labs.precog.iiitd.edu.in/killfie/

Page 7: Credibility, Identity Resolution, and Privacy on Online Social Media
Page 8: Credibility, Identity Resolution, and Privacy on Online Social Media

https://arxiv.org/pdf/1610.07772v1.pdf

Page 9: Credibility, Identity Resolution, and Privacy on Online Social Media

http://precog.iiitd.edu.in/Publications_files/aa-spyingextensions.pdf

Page 10: Credibility, Identity Resolution, and Privacy on Online Social Media


Non-trustworthy Content

FAKE

$

#ChennaiFloods

RUMORS

Page 11: Credibility, Identity Resolution, and Privacy on Online Social Media


Methodology

Page 12: Credibility, Identity Resolution, and Privacy on Online Social Media


Training Data

- 500 tweets per event
- Used CrowdFlower

Event                                    Tweets        Users
Boston Marathon Blasts (2013)            7,888,374     3,677,531
Typhoon Haiyan / Yolanda (2013)            671,918       368,269
Cyclone Phailin (2013)                      76,136        34,776
Washington Navy Yard shootings (2013)      484,609       257,682
Polar vortex cold wave (2014)              143,959       116,141
Oklahoma Tornadoes (2013)                  809,154       542,049
Total                                   10,074,150     4,996,448

Page 13: Credibility, Identity Resolution, and Privacy on Online Social Media


Credibility Modeling

Feature set: Features (45)

- Tweet meta-data: Number of seconds since the tweet; Source of tweet (mobile / web / etc.); Tweet contains geo-coordinates
- Tweet content (simple): Number of characters; Number of words; Number of URLs; Number of hashtags; Number of unique characters; Presence of stock symbol; Presence of happy smiley; Presence of sad smiley; Tweet contains 'via'; Presence of colon symbol
- Tweet content (linguistic): Presence of swear words; Presence of negative emotion words; Presence of positive emotion words; Presence of pronouns; Mention of self words in tweet (I; my; mine)
- Tweet author: Number of followers; friends; time since the user has been on Twitter; etc.
- Tweet network: Number of retweets; Number of mentions; Tweet is a reply; Tweet is a retweet
- Tweet links: WOT score for the URL; Ratio of likes / dislikes for a YouTube video
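
As an illustration of the "Tweet content (simple)" features listed above, here is a minimal Python sketch that computes them from raw tweet text. The function name, the regular expressions, and the smiley lists are assumptions made for this example; the slides do not say how the features were actually extracted.

```python
import re

# Hypothetical patterns for this sketch; not taken from the talk.
URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#\w+")
STOCK_RE = re.compile(r"\$[A-Za-z]{1,5}\b")  # e.g. $AAPL
VIA_RE = re.compile(r"\bvia\b", re.IGNORECASE)

HAPPY_SMILEYS = (":)", ":-)", ":D")
SAD_SMILEYS = (":(", ":-(")


def simple_content_features(text):
    """Return the simple content features of one tweet as a dict."""
    return {
        "num_chars": len(text),
        "num_words": len(text.split()),
        "num_urls": len(URL_RE.findall(text)),
        "num_hashtags": len(HASHTAG_RE.findall(text)),
        "num_unique_chars": len(set(text)),
        "has_stock_symbol": bool(STOCK_RE.search(text)),
        "has_happy_smiley": any(s in text for s in HAPPY_SMILEYS),
        "has_sad_smiley": any(s in text for s in SAD_SMILEYS),
        "has_via": bool(VIA_RE.search(text)),
        "has_colon": ":" in text,
    }


if __name__ == "__main__":
    tweet = "Stay safe :) #Boston update via @news http://t.co/abc"
    for name, value in simple_content_features(tweet).items():
        print(name, value)
```

The remaining feature groups (author, network, links) depend on metadata returned with the tweet and on external services, so they are left out of this sketch.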

Page 14: Credibility, Identity Resolution, and Privacy on Online Social Media

Implementation

Page 15: Credibility, Identity Resolution, and Privacy on Online Social Media


Feedback by Users

Page 16: Credibility, Identity Resolution, and Privacy on Online Social Media


Page 17: Credibility, Identity Resolution, and Privacy on Online Social Media

Harvard (1839) – Harvard – Harvard – Harvard – MIT – Northwestern – UIUC – WUSL – CMU (2009) – IIITD (2015)

Page 18: Credibility, Identity Resolution, and Privacy on Online Social Media


http://twitdigest.iiitd.edu.in/TweetCred/

Page 19: Credibility, Identity Resolution, and Privacy on Online Social Media


De-duplicating audience

Social audience = 437,632 + 153,000 + 805,097 or less??
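
The "or less??" can be made concrete with a small Python sketch. Adding the three counts gives 1,395,729, but that is only an upper bound: anyone who follows on more than one network is counted more than once. Assuming identity resolution has already mapped each follower account to a person ID, the de-duplicated audience is the size of the set union. The platform names and IDs below are hypothetical toy data.

```python
def naive_audience(audiences):
    """Sum per-network audience sizes (double-counts shared followers)."""
    return sum(len(people) for people in audiences.values())


def deduplicated_audience(audiences):
    """Count distinct people across networks via set union."""
    return len(set().union(*audiences.values()))


if __name__ == "__main__":
    # Toy data: person IDs assumed to come from an identity resolution step.
    audiences = {
        "network_a": {"p1", "p2", "p3"},
        "network_b": {"p2", "p3", "p4"},
        "network_c": {"p1", "p5"},
    }
    print(naive_audience(audiences))         # upper bound
    print(deduplicated_audience(audiences))  # distinct people
```

The gap between the two numbers is exactly the overlap that identity resolution across networks is meant to uncover.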

Page 20: Credibility, Identity Resolution, and Privacy on Online Social Media


Challenges

- Heterogeneous OSNs: Professional; Opinion; Dating; Personal
- Degree of Details
  - Quality and descriptive personal and professional information
  - Little personal information, descriptive opinions
- Attribute Evolution (over time)
  - Registration with same information on both OSNs: {paridhij, New Delhi}
  - Information evolved on one but not on other: {jainpari, Bangalore}

Page 21: Credibility, Identity Resolution, and Privacy on Online Social Media


Generic Identity Resolution

- Extract available & discriminative features
- IDENTITY SEARCH: Candidate Identities
- IDENTITY LINKING: Pairwise Comparisons
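
The search-then-link pipeline above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the profile fields, the use of `difflib.SequenceMatcher` as the similarity measure, the averaging of field scores, and the threshold are all choices made for this example, not the method presented in the talk.

```python
from difflib import SequenceMatcher

# Hypothetical profile fields used as discriminative features.
FIELDS = ("username", "name", "location")


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def profile_score(known, candidate, fields=FIELDS):
    """Average similarity over fields present in both profiles."""
    scores = [
        similarity(known[f], candidate[f])
        for f in fields
        if known.get(f) and candidate.get(f)
    ]
    return sum(scores) / len(scores) if scores else 0.0


def link_identity(known, candidates, threshold=0.7):
    """Pairwise-compare the known profile with each candidate identity
    and return the best one if it clears the threshold, else None."""
    best = max(candidates, key=lambda c: profile_score(known, c), default=None)
    if best is not None and profile_score(known, best) >= threshold:
        return best
    return None


if __name__ == "__main__":
    known = {"username": "paridhij", "name": "Paridhi Jain", "location": "New Delhi"}
    candidates = [  # would come from the identity search step
        {"username": "jainpari", "name": "Paridhi Jain", "location": "Bangalore"},
        {"username": "john_doe", "name": "John Doe", "location": "London"},
    ]
    print(link_identity(known, candidates, threshold=0.4))
```

Attributes that have evolved on one network but not the other, such as the location in this toy data, pull the score down, which is why a single fixed threshold is fragile in practice.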

Page 22: Credibility, Identity Resolution, and Privacy on Online Social Media

cerc.iiitd.ac.in

Heuristic Identity Search

Paridhi Jain, Ponnurangam Kumaraguru, and Anupam Joshi. 2013. @I seek ‘fb.me’: Identifying Users across Multiple Online Social Networks. In Proceedings of the 22nd International Conference on World Wide Web, WWW ’13 Companion. ACM, New York, NY, USA, 1259-1268. DOI=http://dx.doi.org/10.1145/2487788.2488160 [Honorable Mention Award]

Page 23: Credibility, Identity Resolution, and Privacy on Online Social Media

Harvard (1839) – Harvard – Harvard – Harvard – MIT – Northwestern – UIUC – WUSL – CMU (2009) – IIITD (2016)

Page 24: Credibility, Identity Resolution, and Privacy on Online Social Media


Page 25: Credibility, Identity Resolution, and Privacy on Online Social Media


How many of you have posted mobile numbers on Online Social Networks?

How many of you have seen mobile numbers being posted on Online Social Networks?

Page 26: Credibility, Identity Resolution, and Privacy on Online Social Media


Sample posts

Page 27: Credibility, Identity Resolution, and Privacy on Online Social Media


Sample posts

Page 28: Credibility, Identity Resolution, and Privacy on Online Social Media


Sample posts

Page 29: Credibility, Identity Resolution, and Privacy on Online Social Media


Sample posts

Page 30: Credibility, Identity Resolution, and Privacy on Online Social Media


Data statistics

- Twitter: 12th October 2012 – 20th October 2013
- Facebook: 16th November 2012 – 20th April 2013

Numbers          Category +91         Category 0           Category void        Total
                 Twitter  Facebook    Twitter  Facebook    Twitter  Facebook    Twitter  Facebook
Mobile numbers       885     2,191     14,909     8,873     25,566    25,294     41,360    36,358
User profiles      1,074     2,663     17,913     9,028     31,149    25,406     49,817    36,588
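
A minimal Python sketch of how posted numbers might be bucketed into the three categories. It assumes that "Category +91" means the number was written with the +91 country code, "Category 0" with a leading 0, and "Category void" with no prefix; the slides do not define the categories, so this reading is a guess. The pattern also assumes Indian mobile numbers are ten digits starting with 6 to 9, and the function name is hypothetical.

```python
import re

# Optional "+91" or "0" prefix, optional space/hyphen, then a 10-digit number.
MOBILE_RE = re.compile(r"(?<!\d)(\+91|0)?[\s-]?([6-9]\d{9})(?!\d)")

# Map the captured prefix to the assumed category label.
CATEGORY_BY_PREFIX = {"+91": "+91", "0": "0", None: "void"}


def categorize_mobile_numbers(post):
    """Return (number, category) pairs for each mobile number in a post."""
    return [
        (m.group(2), CATEGORY_BY_PREFIX[m.group(1)])
        for m in MOBILE_RE.finditer(post)
    ]


if __name__ == "__main__":
    post = "Call +91 9876543210 or 09123456789, alt 8765432109"
    for number, category in categorize_mobile_numbers(post):
        print(number, category)
```

A bare ten-digit match is the weakest signal of the three, since it can also be an order ID or some other numeric string.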

Page 31: Credibility, Identity Resolution, and Privacy on Online Social Media
Page 32: Credibility, Identity Resolution, and Privacy on Online Social Media


SocialCaller App

https://play.google.com/store/apps/details?id=com.ayush.socialcaller&hl=en

Page 33: Credibility, Identity Resolution, and Privacy on Online Social Media


Takeaways

- Online Social Media is a different beast in terms of privacy, identity, and credibility
  - Research / technologies should be developed
- Many interesting research, engineering, and innovation problems are waiting to be solved
- Interested in hosting students – B.Tech., M.Tech., Ph.D.

Page 34: Credibility, Identity Resolution, and Privacy on Online Social Media


https://www.facebook.com/PreCog.IIITD/