Presenter : Jian-Ren Chen Authors : Sheng-Tun Li a,b,* , Fu-Ching Tsai a 2013 , KBS
Intelligent Database Systems Lab
Presenter : JIAN-REN CHEN
Authors : Sheng-Tun Li a,b,*, Fu-Ching Tsai a
2013 , KBS
A fuzzy conceptualization model for text mining with application in opinion
polarity classification
Outlines
• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments
Motivation
Most existing document classification algorithms are easily affected by ambiguous terms.
A classifier's ability to disambiguate is thus as important as its ability to classify accurately.
- opinion polarity classification
Objectives
We propose a concept-driven text classification approach based on
Formal Concept Analysis (FCA) to train a classifier using concepts
instead of documents, so as to reduce the inherent ambiguities.
We further utilize fuzzy formal concept analysis (FFCA) to take
uncertain information into consideration.
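A minimal sketch of how uncertain review-term information might be represented, assuming (the slides do not state this) that each review-term pair carries a membership degree in [0, 1] and that a cut-off α turns the fuzzy context into a crisp one. The membership values and the helper name `alpha_cut` are hypothetical, chosen only for illustration.

```python
# Hypothetical fuzzy formal context: review -> {term: membership degree in [0, 1]}.
# The degrees below are made up; they are not taken from the paper.
fuzzy_context = {
    "Review1": {"Phenomenal": 0.9, "Cover": 0.3},
    "Review4": {"Fantastic": 0.8, "Love": 0.4},
    "Review5": {"Cover": 0.7},
}


def alpha_cut(fuzzy_ctx, alpha):
    """Keep only the review-term pairs whose membership degree is at least alpha.

    Returns a crisp context: review -> set of terms.
    """
    return {
        review: {term for term, mu in terms.items() if mu >= alpha}
        for review, terms in fuzzy_ctx.items()
    }


crisp_context = alpha_cut(fuzzy_context, alpha=0.5)
```

A higher α keeps fewer, more certain review-term pairs; a lower α keeps more of them, so the resulting concepts depend on the value chosen.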
Formal concept analysis
Objects: {Review6, Review7}
Attributes: {Phenomenal, Fantastic, Love}
=> formal concept
positive class: "Phenomenal", "Fantastic" and "Love" {Review1, Review4, Review6, Review7}
neutral class: "Cover" {Review5}
negative class: "Awful" {Review2, Review3}
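The formal concept on this slide can be reproduced with a small sketch of the two FCA derivation operators. The review-term incidence table below is hypothetical: the slide does not say which review contains which term, so the table was chosen only to be consistent with the concept ({Review6, Review7}, {Phenomenal, Fantastic, Love}) and with the class lists. The function names are illustrative as well.

```python
from itertools import combinations

# Hypothetical formal context: review -> set of terms it contains.
context = {
    "Review1": {"Phenomenal"},
    "Review2": {"Awful"},
    "Review3": {"Awful"},
    "Review4": {"Fantastic", "Love"},
    "Review5": {"Cover"},
    "Review6": {"Phenomenal", "Fantastic", "Love"},
    "Review7": {"Phenomenal", "Fantastic", "Love"},
}


def intent(objects, ctx):
    """Attributes shared by every object in `objects` (all attributes if empty)."""
    shared = set().union(*ctx.values())
    for obj in objects:
        shared &= ctx[obj]
    return shared


def extent(attributes, ctx):
    """Objects that have every attribute in `attributes`."""
    return {obj for obj, attrs in ctx.items() if set(attributes) <= attrs}


def formal_concepts(ctx):
    """Enumerate all (extent, intent) pairs by closing every subset of objects.

    Brute force: fine for a toy context, exponential in the number of objects.
    """
    concepts = set()
    objects = sorted(ctx)
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            b = intent(subset, ctx)
            a = extent(b, ctx)
            concepts.add((frozenset(a), frozenset(b)))
    return concepts


concepts = formal_concepts(context)
```

A pair (A, B) is a formal concept exactly when `intent(A)` equals B and `extent(B)` equals A, which is what the closure step in `formal_concepts` guarantees.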
Formal concept analysis
positive class: {Review1, Review4, Review6, Review7}
negative class: {Review2, Review3}
neutral class: {Review5}
Methodology - Architecture
Methodology
tf-idf:
Inverted Conformity Frequency (ICF):
Uniformity (Uni):
tf-idf > 26
ICF < log(2)
Uni > 0.2
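The three thresholds on this slide can be read as a term filter. The sketch below assumes that a term is retained only when all three conditions hold and that log is the natural logarithm; neither assumption is confirmed by the slide. The formulas for the three measures did not survive in the transcript, so the scores are passed in precomputed, and the example values are hypothetical.

```python
import math

# Thresholds as listed on the slide (the log base is an assumption: natural log).
TFIDF_MIN = 26
ICF_MAX = math.log(2)
UNI_MIN = 0.2


def select_terms(scores):
    """Return the terms whose precomputed scores pass all three thresholds.

    `scores` maps a term to a dict with keys "tfidf", "icf" and "uni".
    """
    return [
        term
        for term, s in scores.items()
        if s["tfidf"] > TFIDF_MIN and s["icf"] < ICF_MAX and s["uni"] > UNI_MIN
    ]


# Hypothetical scores, for illustration only.
example_scores = {
    "fantastic": {"tfidf": 31.0, "icf": 0.4, "uni": 0.35},
    "cover": {"tfidf": 12.0, "icf": 0.2, "uni": 0.50},   # fails the tf-idf threshold
    "awful": {"tfidf": 40.0, "icf": 1.1, "uni": 0.30},   # fails the ICF threshold
    "love": {"tfidf": 28.0, "icf": 0.3, "uni": 0.10},    # fails the Uni threshold
}

selected = select_terms(example_scores)
```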
Methodology
Methodology
Experiments - Data set and evaluation
• Data set: Reuters-21578, movie reviews, e-book reviews
• Evaluation
Experiments (parameters)
Experiments
Experiments (conceptualization)
Experiments
Experiments
Conclusions
• FFCM successfully reduces the impact of textual ambiguity.
• The experimental results show that FFCM outperforms other
state-of-the-art algorithms on both Reuters-21578 and the two
opinion polarity collections.
Comments
• Advantages
- the formal concepts play an important role
• Disadvantages
- α may differ across datasets
- only focuses on single-class classification
• Applications
- text mining