
Professional Credibility: Authority on the Web

Jack G. Conrad, Jochen Leidner, Frank Schilder

Thomson Reuters — Professional, Research & Development

Second Workshop on Information Credibility on the Web (WICOW08)

Napa, California USA — October 30, 2008

J. Conrad, WICOW 2008, Napa, CA, 30 October 2008

OUTLINE

• INTRODUCTION

• CASE STUDY

• PROPOSED APPROACH
  – To measure credibility of Web-based sources

• ANNOTATION EXPERIMENT

• RESEARCH QUESTIONS

• CONCLUSIONS


INTRODUCTION (1)

• Increasing importance of analyzing text for opinion

• Blogs and other forums continue to expand rapidly

• Growing need for information filtering or ranking – Ideally by authority or credibility

• Baseline metrics rely on Web-based popularity – E.g., leveraging blog activity and in-links alone is deficient

• Case Study to follow with illustration

– More resilient, complex models need to evolve


INTRODUCTION (2)

• Why does Thomson Reuters care about credibility?

– Products and information services focus on knowledge workers

– Customers tend to be trained professionals in their domains
  • E.g., lawyers, financial analysts, scientists, health-care professionals

– Delivered information and workflow solutions carry added value from human annotations, richly aggregated metadata, etc.

– Users typically pay for the information or services they receive
  • Quality of delivered products thus needs to be high


INTRODUCTION (3)

• Why does Thomson Reuters care about credibility? (cont.)

– Supporting materials assembled from the Web are thus expected to have additional properties
  • Information should be ranked by some form of utility

• Should come from credible, relevant authorities

• Presumption is that material from such authorities is more trustworthy


INTRODUCTION (4)

• Definitions

– Authority — an accepted source of information or advice, either an expert on the subject or a persuasive force

– Credibility — a quality of being believable, trustworthy

– Trust — a reliance on the integrity, ability, credibility of a person or source of information


INTRODUCTION (5)

• Example of Challenges

– Speaker A:
  • “For a lifetime, John McCain has inspired with his deeds … I ask you to join this cause. Join this cause and help America elect a great man as the next president of the United States.”

• Governor Sarah Palin, State of Alaska, John McCain’s Vice Presidential running mate

– Speaker B:
  • [On John McCain:] “He lost his brand as a maverick. … He did not live up to his pledge to run a clean campaign.”
  • Rep. Chris Shays, Head of McCain Campaign in the State of Connecticut

• Issues:
  – Which speaker is more authoritative?
  – Does the authority of the speaker contribute to the speaker’s credibility?
  – Do this authority and potential credibility carry the same weight with all audiences?


Case Study — Legal Domain

• User 1 — 2nd-year law student
  – Active in blogosphere (3 yrs)

– Frequently comments on legal + extracurricular topics

– Activity and Popularity metrics would be high for user

• Numerous responses to contributions exist

– Other features needed to better profile student’s limited legal expertise

– Revealing features may include:
  • c.v., alias, linguistic / grammatical / statistical features

• User 2 — federal court clerk
  – Recently started a blog (2 mos)

– Focuses largely on legal topics in area of expertise

– Activity and Popularity metrics would be low for user

• Limited responses to contributions exist

– Other features required to better profile clerk’s knowledge

– Essential features may include:
  • Professional bio, level of discourse, linguistic features, others


PREVIOUS WORK

(1) B. Ulicny and K. Baclawski, “New metrics for newsblog credibility,” In Proceedings of the First International Conference on Weblogs and Social Media (ICWSM07), Boulder, CO., www.icwsm.org, 2007.

— Similar use cases and a varied feature set, but no machine learning techniques employed

(2) N. Agarwal, H. Liu, L. Tang and P. Yu, “Identifying the influential bloggers,” In Proceedings of the First International Conference on Web Search and Data Mining (WSDM08), http://videolectures.net/wsdm08_agarwal_iib/, 2008.

— Works from the premise that influential bloggers are not necessarily among the most active

— Focus on social networks and influence; no credibility criteria


PROPOSED APPROACH — to measuring credibility of Web-based sources

• A hybrid technique that combines multiple features
  – Exploiting opportunities for ML techniques

• Use activity level and popularity as baseline methods

• Combine a rich set of diverse, measurable features

• Supplemental dimensions may include
  – Statistics from writing style (e.g., avg. sentence length)

• Can train an SVM classifier using training data

• Goal: to reduce false positive authority reporting

• Should be robust enough to work with sparse feature matrices

• Could subsequently ask viewers to rate resulting rankings
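As a rough sketch of the proposed pipeline: per-author features are combined into vectors and a classifier is trained on labeled examples. A simple perceptron stands in here for the SVM named above, and every feature value and label below is invented for illustration.

```python
# Sketch: combine per-author features into vectors and train a linear
# classifier. A perceptron stands in for the SVM proposed on the slide;
# all feature values and labels are invented for illustration.

def train_perceptron(samples, labels, epochs=20):
    """Learn weights w and bias b so that sign(w.x + b) predicts the label."""
    n_feats = len(samples[0])
    w = [0.0] * n_feats
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            pred = 1 if score > 0 else -1
            if pred != y:  # misclassified: nudge the boundary toward x
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

# Toy vectors: (normalized avg. sentence length, has professional bio)
samples = [(1.0, 1.0), (0.9, 1.0), (0.2, 0.0), (0.3, 0.0)]
labels = [1, 1, -1, -1]  # +1 = authoritative, -1 = not

w, b = train_perceptron(samples, labels)
print([predict(w, b, x) for x in samples])  # → [1, 1, -1, -1]
```

In practice a margin-based learner such as an SVM would replace this toy perceptron; the feature-vector construction is the part the slide emphasizes.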


CLASSES of FEATURES for MEASURING AUTHORITY

No.  Feature                    Illustration
(1)  Internet activity level    Cumulative blog participation
(2)  Nature of Web alias        Comical or witty vs. conventional
(3)  Proper name features       Arthur C. Clarke vs. M. Mouse
(4)  Title features             Authority levels of persons, URLs
(5)  Internet domain features   Transparent, from relevant country
(6)  Linguistic features        Degree, quality of entities, noun phrases
(7)  Grammatical features       Level of diction
(8)  Statistical features       Average length of sentences

Proposed Attributes for Feature Vector
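Two of the statistical features listed above (average sentence length, average word length) can be sketched as below; the sentence splitter is deliberately naive, and the function name is my own:

```python
import re

def statistical_features(text):
    """Return (avg. words per sentence, avg. characters per word)."""
    # Naive sentence split on terminal punctuation; a real system
    # would use a proper sentence tokenizer.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = text.split()
    # strip trailing punctuation when measuring word length
    word_lens = [len(w.strip(".,!?;:")) for w in words]
    avg_sent_len = len(words) / len(sentences) if sentences else 0.0
    avg_word_len = sum(word_lens) / len(word_lens) if word_lens else 0.0
    return avg_sent_len, avg_word_len

print(statistical_features("This is a test. It has two sentences."))  # → (4.0, 3.5)
```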


ANNOTATION EXPERIMENT

• Goal: To determine the degree of agreement possible between two human annotators when tagging blog comments for the authority of the respondents

• Two paralegal reviewers – Working on the same data

• Two blogs on legal topics
  – Reviewers to annotate each using identical guidelines

– Blogs contained ~20 comments

• Reviewers asked to identify – Reference point (replying to what?)

– Authority-level

– Polarity and Intensity of entry

• Compared overall inter-reviewer agreement


ANNOTATION SET

• Referent Id — for entry, comments, e.g., {e1, c1, c2, c3 …}

• Polarity — { AGREE, DISAGREE, BALANCED, UNRELATED }

• Degree — { 1 (mild), 2 (medium), 3 (intense) }

• Authority — { 0, 1, 2, 3 }
  • 0 – no evidence present indicating the author is an authority

• 1 – some indications of being an authority (e.g., writing style)

• 2 – greater indication that person is an authority (e.g., profession, legal blog, writing style)

• 3 – clear authority in the field (e.g., law professor, attorney with relevant practice area …)
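One way to encode a single annotation under the scheme above is a small validated record; the class and field names here are my own, not the authors':

```python
from dataclasses import dataclass

# Value sets taken from the annotation scheme on this slide
POLARITIES = {"AGREE", "DISAGREE", "BALANCED", "UNRELATED"}

@dataclass
class Annotation:
    referent_id: str   # entry/comment replied to, e.g. "e1", "c2"
    polarity: str      # AGREE / DISAGREE / BALANCED / UNRELATED
    degree: int        # 1 (mild) .. 3 (intense)
    authority: int     # 0 (no evidence) .. 3 (clear authority)

    def __post_init__(self):
        if self.polarity not in POLARITIES:
            raise ValueError(f"bad polarity: {self.polarity}")
        if self.degree not in (1, 2, 3):
            raise ValueError(f"bad degree: {self.degree}")
        if self.authority not in (0, 1, 2, 3):
            raise ValueError(f"bad authority: {self.authority}")

a = Annotation(referent_id="e1", polarity="DISAGREE", degree=2, authority=3)
print(a.authority)  # → 3
```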


ANNOTATION EXPERIMENT

Legal Blog          No. of     Complete    Percent     Within-1    Percent     Kendall tau
                    Comments   Agreement   Agreement   Agreement   Agreement   (inter-annotator)
Volokh Conspiracy   18         4           22%         17          94%         0.49
Balkanization       30         22          73%         30          100%        0.88
Combined            48         26          54%         47          98%         0.69

Inter-assessor agreement for Blog Respondent Authority Levels

• Conclusion: strict agreement not terrific, but loose agreement may be useful
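The three agreement figures reported in the table can be computed directly from two annotators' per-comment authority ratings. A sketch on invented toy ratings (the study's per-comment scores are not reproduced here); this uses Kendall's tau-a, while the study may have applied a different tie correction:

```python
def exact_agreement(a, b):
    """Fraction of items both annotators scored identically."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def within_one_agreement(a, b):
    """Fraction of items whose scores differ by at most one level."""
    return sum(abs(x - y) <= 1 for x, y in zip(a, b)) / len(a)

def kendall_tau_a(a, b):
    """Kendall's tau-a: (concordant - discordant) / total pairs.
    Tied pairs count only toward the denominator."""
    n = len(a)
    conc = disc = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (a[i] - a[j]) * (b[i] - b[j])
            if s > 0:
                conc += 1
            elif s < 0:
                disc += 1
    return (conc - disc) / (n * (n - 1) / 2)

# Invented authority ratings (0-3) from two annotators for 5 comments
r1 = [3, 2, 0, 1, 2]
r2 = [3, 1, 0, 1, 3]
print(exact_agreement(r1, r2))       # → 0.6
print(within_one_agreement(r1, r2))  # → 1.0
print(kendall_tau_a(r1, r2))         # → 0.7
```

The gap between exact and within-1 agreement in this toy run mirrors the table's pattern: annotators rarely disagree by more than one authority level.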


RESEARCH QUESTIONS

(1) What pieces of evidence best track an author’s level of authority and presumed credibility?

(2) How can relationships between authority/credibility and influence/trust be formally modeled?

(3) How do we ‘calibrate’ authority w.r.t. different audiences?

(4) How is authority/credibility propagated on the Web?

(5) How can a trust-based approach guard against adversarial behavior (attacks)?

(6) What are the consequences when authority does not correlate well with credibility?

(7) What are the most effective external properties (e.g., prof. affiliation) vs. structural properties (e.g., via links)?

(8) Is there an ideal mark-up scheme to track credibility for Web-based applications?


CONCLUSIONS

• Existing metrics for authority on the Web are markedly deficient

• Author authority can be better tracked via authority indicators
  – In our model, authority is viewed as a proxy for credibility

• Improved authority tracking is essential for next-generation customer-centric Web applications
  – Especially true for professional systems that need to provide added value to higher-end users (e.g., lawyers, doctors, scientists)

• Given a more resilient set of features, opportunities exist for training ML systems
  – Based on the literature, research on measuring credibility appears to remain in its infancy

• A great field of practical endeavor for opinion mining efforts

QUESTIONS?