Comparing Automated Factual Claim Detection Against...

Post on 27-Sep-2020

4 views 0 download

Transcript of Comparing Automated Factual Claim Detection Against...

Comparing Automated Factual Claim Detection Against Judgments of Journalism OrganizationsNaeemul Hassan#, Mark Tremayne, Fatma Arslan, Chengkai Li

University of Texas at Arlington 2016 Computation + Journalism Symposium, Sept. 30th, 2016

# The author is now with University of Mississippi

©2015-2016 The University of Texas at Arlington. All Rights Reserved.

Fact-checking at Peak

2

©2015-2016 The University of Texas at Arlington. All Rights Reserved.

Fact-checking the Presidential Debates

3

http://www.politifact.com/

http://www.politifact.com/

©2015-2016 The University of Texas at Arlington. All Rights Reserved.4

©2015-2016 The University of Texas at Arlington. All Rights Reserved.

Closed-caption decoder

Live Coverage of Debates

5

©2015-2016 The University of Texas at Arlington. All Rights Reserved.

Presidential Debate

Transcripts (1960-2012)

Ground Truth

Finding Important Factual Claims: A Supervised Learning Task

Human Annotation Feature

Vectors

Feature Extraction

Learning Algorithm

2016 Presidential Debates

Important factual claims

6

©2015-2016 The University of Texas at Arlington. All Rights Reserved.

Training Dataset o Transcripts of all 30 debates (11 general elections) in history o 20k sentences by candidates: removed very short (< 5 words) sentences o Ground truth collection website

7

©2015-2016 The University of Texas at Arlington. All Rights Reserved.

Experiments 2016 U.S. Presidential Election Primary Debates o Transcripts of 21 debate (12 Republican, 9 Democratic) o 30737 sentences o Collected fact-checks by PolitiFact and CNN o Detected topics of the sentences 12 Most Important Topics (MIP) used by Gallup* For each topic, we manually created a set of keywords representing that

topic. E.g., keywords for topic “Abortion”: {abortion, pregnancy, planned parenthood}

8

* M. E. McCombs and D. L. Shaw. The agenda-setting function of mass media. Public opinion quarterly, 1972 * T. W. Smith. America’s most important problem-a trend analysis. Public opinion quarterly, 1980

©2015-2016 The University of Texas at Arlington. All Rights Reserved.

(1) Sentences selected by Journalism organizations (CNN and PolitiFact) for checking had ClaimBuster scores significantly higher (were more check-worthy) than sentences not selected for checking.

9

©2015-2016 The University of Texas at Arlington. All Rights Reserved.

Distribution of topics over sentences scored low (≤ 0.3) by ClaimBuster

Distribution of topics over sentences scored high (≥ 0.5) by ClaimBuster

10

©2015-2016 The University of Texas at Arlington. All Rights Reserved.

(2) ClaimBuster is similar to Journalism organizations (CNN and PolitiFact) in the distribution of topics among highly scored sentences (≥ 0.5) and fact-checked sentences.

11

©2015-2016 The University of Texas at Arlington. All Rights Reserved.

new factual claims: solicit analyses from professionals; algorithmic fact-checking

o debates o interviews o speeches o political ads o social media o web o news

repository of fact-checks by professionals

matched existing fact-checks: delivered via browser extensions, mobile apps, and smart-TV apps

The Quest to Automate Fact-Checking

claimed detected and ranked by check-worthiness

12

©2015-2016 The University of Texas at Arlington. All Rights Reserved.

One Day Robot Fact-checkers will Dominate

13

o Less subjective

©2015-2016 The University of Texas at Arlington. All Rights Reserved.

One Day Robot Fact-checkers will Dominate

14

o Restless and more efficient

©2015-2016 The University of Texas at Arlington. All Rights Reserved.

One Day Robot Fact-checkers will Dominate

15

o More intelligent

©2015-2016 The University of Texas at Arlington. All Rights Reserved.

Acknowledgment UTA Students o Naeemul Hassan (graduated) o Fatma Arslan o Siddhant Gawsane o Aaditya Kulkarni o Joseph Minumol (graduated) o Vikas Sable o Gensheng Zhang o Many more students who are

currently working on it

Collaborators o Bill Adair (Duke) o Pankaj Agarwal (Duke) o Peter Fray (UTS) o James Hamilton (Stanford) o Mark Tremayne (UTA) o Jun Yang (Duke) o Cong Yu (Google Research)

16

©2015-2016 The University of Texas at Arlington. All Rights Reserved.

Acknowledgment Funding sponsors

Disclaimer: This material is based upon work partially supported by the National Science Foundation Grants 1117369, 1408928 and 1565699 and a Knight Prototype Fund from the Knight Foundation. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the funding agencies.

17

©2015-2016 The University of Texas at Arlington. All Rights Reserved.

Thank You! Questions? o http://ranger.uta.edu/~cli cli@uta.edu o Twitter @ClaimBusterTM o Prototype http://idir.uta.edu/claimbuster o You are invited to contribute http://bit.ly/claimbusters

18