Presentation on Affect Analysis and Ranking


Presentation for the weekly Artificial Intelligence meeting of the VU. It covers work on affect analysis (master project) and planned work on ranking of Linked Data.

Transcript of Presentation on Affect Analysis and Ranking

AFFECT ANALYSIS OF DUTCH SOCIAL MEDIA AND

RANKING OF QUERY RESULTS OVER LINKED DATA

Laurens Rietveld

Master Project Background

Affect analysis of Dutch social media
Finished July 2010
VU (Stefan), GfK (Daphne)

Marketing research, online dashboard
Not yet involved in web mining
Business case: the Dutch National Railway Company (NS)

Data collection → Data processing → Analysis

Project Background: Affect Analysis

Affect: experience of feeling or emotion[1]

Multiple measurements: physiological, behavioral, vocal, linguistic

[1] W. Huitt, The Affective System

[2] W. Parrott, Emotions in Social Psychology

Project Background: Affect Analysis

What is online affect analysis?
Detect emotions on web pages
Types of emotions[2]: love, joy, surprise, anger, sadness, fear
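To make "detect emotions on web pages" concrete, here is a minimal sketch (not from the slides) of a bag-of-words classifier over these six emotion categories, using scikit-learn; the Dutch training sentences and their labels are invented placeholders.

# Minimal sketch (assumption, not the project's actual pipeline): a
# bag-of-words emotion classifier over Parrott's six primary emotions.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Invented Dutch example sentences, one per emotion category.
train_sentences = [
    "ik hou van deze trein",           # love
    "wat een fijne reis vandaag",      # joy
    "dat had ik echt niet verwacht",   # surprise
    "alweer vertraging, belachelijk",  # anger
    "ik mis de oude dienstregeling",   # sadness
    "ik durf niet meer met de trein",  # fear
]
train_labels = ["love", "joy", "surprise", "anger", "sadness", "fear"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_sentences, train_labels)

# Prints the predicted emotion label for an unseen sentence.
print(model.predict(["alweer een kapotte kaartautomaat"]))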

Project Background: Affect Analysis

Main problems
Unstructured data: internet (HTML), text
Domain dependencies: "Go read the book" is positive in book reviews, negative in movie reviews
Ambivalence: a single text can carry more than one emotion

Project Background: Dutch Social Media

Social media types used: Blogs (www.blogspot.com)

Online news item reactions (www.fok.nl)

Micro-blogs (www.twitter.com)

Project Background: Crowd Sourcing

Problems:
Affect analysis needs training data
Annotating data is time-consuming
Every domain needs its own annotations
Normally done by the researcher

Solution: crowd sourcing (e.g. Amazon Mechanical Turk)
Outsourcing simple tasks to a large community

+               -
Many tasks      English only
Quick           Risk of lower quality
Cheap           Unethical (debatably)
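The annotation setup described above boils down to collecting several crowd judgments per sentence and aggregating them. A minimal sketch of majority-vote aggregation (an assumed procedure, the slides do not spell it out) could look like this:

# Minimal sketch (assumption): aggregate crowd-sourced labels per sentence by
# majority vote; fall back to "neutral" on a tie or too little support.
from collections import Counter

def majority_label(annotations, min_votes=2):
    counts = Counter(annotations)
    label, votes = counts.most_common(1)[0]
    if votes < min_votes or list(counts.values()).count(votes) > 1:
        return "neutral"
    return label

# Three hypothetical annotators judged the same sentence:
print(majority_label(["anger", "anger", "neutral"]))  # -> anger
print(majority_label(["joy", "neutral", "sadness"]))  # -> neutral (tie)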

Affect Analysis Approach

Research Questions

Is it possible to apply crowd-sourcing to affect analysis of Dutch social media?

Are there differences between social media types in affect analysis?

Results

Inter-annotator agreement: low
Neutral outvotes emotion
Possible causes: missing sentence context, too few annotators, noise introduced by translation
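The slides do not say how agreement was measured; Fleiss' kappa is one common measure when more than two annotators label each sentence, sketched here on invented vote counts:

# Minimal sketch (assumption: the agreement measure is not named in the slides):
# Fleiss' kappa over per-sentence vote counts.
def fleiss_kappa(counts):
    """counts: one row per sentence, each row holds the votes per label."""
    n_items = len(counts)
    n_raters = sum(counts[0])            # same number of annotators per sentence
    n_labels = len(counts[0])
    p_j = [sum(row[j] for row in counts) / (n_items * n_raters)
           for j in range(n_labels)]
    P_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in counts]
    P_bar = sum(P_i) / n_items
    P_e = sum(p * p for p in p_j)
    return (P_bar - P_e) / (1 - P_e)

# Hypothetical votes of 3 annotators over [neutral, joy, anger] for 4 sentences.
votes = [[3, 0, 0], [2, 1, 0], [1, 1, 1], [0, 0, 3]]
print(round(fleiss_kappa(votes), 2))     # -> 0.32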

[Chart: percentage of all documents expressing Joy, Surprise, Anger and Sadness, per month over all social media, January 2007 through March 2010 (x-axis: period; y-axis: % of all documents, roughly 0-9%).]

Results

Period          Event
July 2007       Problems in the payment system of ticket machines
January 2009    Chip card payment became required for students
December 2009   Train and railway malfunctions due to snow
February 2010   Filthy train stations due to cleaning crew strikes

Future work

Other list of emotions
Improve annotation process:
More voting
Other strategies for annotation tasks
Annotate paragraphs or documents instead of single sentences
Different social media types may need different feature-extraction, classifier and annotation strategies

AFFECT ANALYSIS OF DUTCH SOCIAL MEDIA AND

RANKING OF QUERY RESULTS OVER LINKED DATA

Laurens Rietveld

Data2Semantics

Wicherts JM, Bakker M, Molenaar D (2011). Willingness to Share Research Data Is Related to the Strength of the Evidence and the Quality of Reporting of Statistical Results. PLoS ONE 6(11).

Data2Semantics

Provide semantic infrastructure for e-Science
How to share, publish, access, analyze, interpret and reuse data?
Querying, ranking, information utility, enriched publications, provenance, annotation/interpretation

Clinical Decision Support

[Diagram: clinical decision support use case. CDS tools combine a patient profile (EMR, LIS), Linked Data (census, AERS, hospital data), an Elsevier-published clinical guideline, and clinical evidence such as a CT report.]

My Research

http://dbpedia.org/fct/ http://google.com

Ranking

My Research

Ranking
1. Relevance

No proper ‘PageRank’ equivalent for semantic web

Heterogeneous and imprecise data

2. Ordering Performance

Relevance

What query results are most relevant?

The semantic web comes with implicit orderings. Possible indicators:
Which ontologies are used more often?
What can we say about these ontologies?
Which query results are semantically similar?
Which query results can I trust?
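The first indicator, which ontologies are used more often, could be estimated by counting how frequently each vocabulary namespace occurs as a predicate or class. A minimal rdflib sketch (the file name data.ttl is a placeholder) is:

# Minimal sketch (assumption, not the project's implementation): count
# vocabulary namespaces used for predicates and rdf:type objects.
from collections import Counter
from rdflib import Graph, RDF

def namespace_of(uri):
    uri = str(uri)
    cut = max(uri.rfind("#"), uri.rfind("/"))
    return uri[:cut + 1]

g = Graph()
g.parse("data.ttl", format="turtle")     # placeholder dataset

usage = Counter()
for s, p, o in g:
    usage[namespace_of(p)] += 1          # vocabulary of the predicate
    if p == RDF.type:
        usage[namespace_of(o)] += 1      # vocabulary of the class

for ns, count in usage.most_common(5):
    print(count, ns)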

Ordering

SELECT ?price ?offer ?product ?vendor ((?rating + ?popularity) AS ?score)
WHERE {
  ?product :hasRating ?rating .
  ?product :producer ?producer .
  ?producer :hasPopularity ?popularity .
  ?offer :product ?product .
  ?offer :price ?price .
}
ORDER BY DESC(?score)
LIMIT 1

Berlin SPARQL Benchmark

[Query plan diagrams for slice 1 of the benchmark. The SPARQL-RankJoin plan combines the BGPs {?product ?rating}, {?producer ?popularity} and {?product ?producer ?offer ?price} with Rank and RankJoin operators, keeping intermediate results small (inputs of 646 and 679 bindings; 30, 29 and 195 intermediate bindings; 1 result). The traditional plan joins the same BGPs completely (646 and 679 inputs; 438634 and 13205 intermediate bindings) and only then orders them to obtain the same single result.]
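To make the difference between the two plans concrete, here is a small sketch of the rank-join idea (an illustration in the spirit of HRJN-style rank joins, not the actual SPARQL-Rank implementation): both inputs arrive sorted by descending score, and the join can stop as soon as no unseen combination can beat the best joined result found so far.

# Minimal sketch (assumption): top-1 rank join over two ranked inputs.
def rank_join_top1(left, right):
    """left/right: non-empty lists of (join_key, score), sorted by score desc."""
    seen_left, seen_right = {}, {}
    best = None                            # (total_score, join_key)
    i = j = 0
    while i < len(left) or j < len(right):
        # Alternate between the inputs; fall back when one side is exhausted.
        read_left = j >= len(right) or (i < len(left) and i <= j)
        if read_left:
            key, score = left[i]; i += 1
            seen_left[key] = score
            other = seen_right.get(key)
        else:
            key, score = right[j]; j += 1
            seen_right[key] = score
            other = seen_left.get(key)
        if other is not None and (best is None or score + other > best[0]):
            best = (score + other, key)
        # Upper bound on the score of any join result not produced yet.
        bound = max(
            left[0][1] + (right[j][1] if j < len(right) else float("-inf")),
            (left[i][1] if i < len(left) else float("-inf")) + right[0][1],
        )
        if best is not None and best[0] >= bound:
            return best                    # top result found without a full join
    return best

# Hypothetical ranked bindings keyed on a shared variable.
ratings    = [("p3", 9), ("p1", 8), ("p2", 5), ("p4", 1)]
popularity = [("p1", 7), ("p3", 4), ("p2", 3), ("p4", 2)]
print(rank_join_top1(ratings, popularity))  # -> (15, 'p1')

With ranked inputs the join can return the top result after touching only a handful of bindings, which is the effect the 646/679 → 30/29 → 195 → 1 numbers in the plan above illustrate.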

Ordering

Related work: Sara Magliacane

Current Question

What if reasoning is required to materialize information?

Top-k Closure (Stefan Schlobach): avoid full materialization while still being complete

Example of materialisation

Thank You