Selecting Attributes for Sentiment Classification Using Feature Relation Networks


Intelligent Database Systems Lab

Presenter: Jian-Ren Chen

Authors: Ahmed Abbasi, Stephen France, Zhu Zhang, and Hsinchun Chen

2011, IEEE TKDE


Outline
- Motivation
- Objectives
- Methodology
- Experiments
- Conclusions
- Comments

Motivation

Sentiment analysis has emerged as a method for mining opinions from large text archives.

It is a challenging problem:

1. It requires the use of large quantities of linguistic features.

2. These heterogeneous n-gram categories must be integrated into a single feature set, which introduces noise, redundancy, and computational limitations.

Sentiment involves both 1) polarity and 2) intensity, e.g., "I don't like you" vs. "I hate you".

n-grams (Markov model): e.g., weather states such as sunny, cloudy, and rainy; character n-grams can also separate 美麗 ("beautiful") from the misspelled 美痢.

"HAPAX" and "DIS" tags replace hapax and dis legomena (words occurring once or twice in the corpus), e.g., "I hate Jim" is replaced with "I hate HAPAX".
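As a rough illustration of this tagging step (a minimal sketch, not the authors' code; tag_legomena is a hypothetical helper), infrequent words can be swapped for the placeholder tags before n-gram extraction:

from collections import Counter

def tag_legomena(documents):
    """Replace hapax legomena (words seen once in the corpus) with "HAPAX"
    and dis legomena (words seen twice) with "DIS"."""
    counts = Counter(word for doc in documents for word in doc.split())
    tagged_docs = []
    for doc in documents:
        tagged = [("HAPAX" if counts[w] == 1 else "DIS" if counts[w] == 2 else w)
                  for w in doc.split()]
        tagged_docs.append(" ".join(tagged))
    return tagged_docs

# "Jim" occurs only once in this tiny corpus, so it becomes HAPAX;
# "Mondays" occurs twice, so it becomes DIS.
print(tag_legomena(["I hate Jim", "I hate Mondays", "I hate rainy Mondays"]))
# -> ['I hate HAPAX', 'I hate DIS', 'I hate HAPAX DIS']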

Objectives

• The Feature Relation Network (FRN) considers semantic information and also leverages the syntactic relationships between n-gram features.

- enhanced sentiment classification on extended sets of heterogeneous n-gram features

Methodology - Extended N-Gram Feature Set

Methodology - Subsumption Relations

A subsumes B (A → B), e.g., for "I love chocolate":
  unigrams: I, LOVE, CHOCOLATE
  bigrams: I LOVE, LOVE CHOCOLATE
  trigram: I LOVE CHOCOLATE

What about the bigrams and trigrams? It depends on their weight: they are retained only if their weight exceeds that of their more general, lower-order counterparts by a threshold t.
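A minimal sketch of this subsumption pruning (only the within-category, order-based case shown on the slide, not the full FRN procedure; the weights, e.g., information gain scores, and the threshold t are assumed to be given):

def prune_subsumed(features, weights, t=0.005):
    """Keep a higher-order n-gram only if its weight exceeds the weight of
    every lower-order n-gram it contains by more than threshold t.

    features: list of n-grams as tuples of tokens, e.g. ("love", "chocolate")
    weights:  dict mapping each n-gram to a weight such as information gain
    """
    kept = []
    for ngram in features:
        n = len(ngram)
        # All lower-order n-grams contained in this n-gram (its general counterparts).
        parents = [ngram[i:i + k] for k in range(1, n) for i in range(n - k + 1)]
        parents = [p for p in parents if p in weights]
        if all(weights[ngram] - weights[p] > t for p in parents):
            kept.append(ngram)
    return kept

weights = {
    ("love",): 0.30, ("chocolate",): 0.10,
    ("love", "chocolate"): 0.31,        # barely above "love": subsumed, dropped
    ("i", "love", "chocolate"): 0.45,   # clearly more informative: kept
}
print(prune_subsumed(list(weights), weights, t=0.05))
# -> [('love',), ('chocolate',), ('i', 'love', 'chocolate')]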

Methodology - Parallel Relations

A parallel B (A - B), e.g.:
  POS tag: "ADMIRE_VP" → "like"
  semantic class: "SYN-Affection" → "love"

If A and B have a correlation coefficient greater than some threshold p, one of the attributes is removed to avoid redundancy.
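A minimal sketch of this redundancy check, assuming features are compared pairwise by Pearson correlation over a document-feature matrix (the keep-first tie-breaking here is a simplification, not the paper's rule; prune_parallel is a hypothetical helper):

import numpy as np

def prune_parallel(feature_matrix, names, p=0.90):
    """Drop one feature from every pair whose absolute Pearson correlation
    exceeds threshold p, keeping the first feature encountered."""
    corr = np.corrcoef(feature_matrix, rowvar=False)
    keep = []
    for j in range(corr.shape[0]):
        if all(abs(corr[j, i]) <= p for i in keep):
            keep.append(j)
    return [names[j] for j in keep]

# Toy document-feature counts: "ADMIRE_VP" is perfectly correlated with "like",
# so one of the two parallel attributes is removed.
X = np.array([[3, 3, 1],
              [1, 1, 4],
              [0, 0, 0],
              [2, 2, 2]], dtype=float)
print(prune_parallel(X, ["like", "ADMIRE_VP", "chocolate"], p=0.90))
# -> ['like', 'chocolate']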

Methodology - The Complete Network

Methodology - Incorporating Semantic Information

Experiments - Datasets

Experiments - FRN vs Univariate

Experiments - FRN vs Univariate (WithinOne)

Experiments - FRN vs Multivariate

Experiments - FRN vs Multivariate (WithinOne)

Experiments - FRN vs Hybrid

Experiments - FRN vs Hybrid (WithinOne)

Experiments - Ablation

Experiments - Parameters
- t: 0.0005, 0.005, 0.05, and 0.5
- p: 0.80, 0.90, and 1.00
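One way to read these settings is as a joint grid search over the two thresholds. The sketch below assumes a hypothetical evaluate(t, p) callback that builds the FRN-selected feature set with the given thresholds, trains a classifier, and returns accuracy:

from itertools import product

T_VALUES = [0.0005, 0.005, 0.05, 0.5]   # subsumption thresholds tested
P_VALUES = [0.80, 0.90, 1.00]           # parallel-relation thresholds tested

def sweep(evaluate):
    """Score every (t, p) pair with the supplied callback and return the best one."""
    scores = {(t, p): evaluate(t, p) for t, p in product(T_VALUES, P_VALUES)}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Usage: best_pair, best_accuracy = sweep(my_evaluate)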

Experiments - Average Runtimes

Conclusions

• FRN achieved significantly higher best accuracy and best percentage within-one across the three testbeds.

• The ablation and parameter testing results show that the subsumption and parallel relation thresholds play an important role.

Comments

• Advantages
- accurate and computationally efficient

• Disadvantage
- performance is sensitive to the ablation and parameter (threshold) settings

• Applications
- sentiment classification
- feature selection method