University Synopsius (1)

8/12/2019 University Synopsius (1)

Introduction

Sentiment analysis, or opinion mining, is the computational study of people's opinions, appraisals,
attitudes, and emotions toward entities, individuals, issues, events, topics, and their attributes.
Businesses always want to know public or consumer opinions about their products and services. Potential
customers also want to know the opinions of existing users before they use a service or purchase a
product. With the explosive growth of social media (reviews, forum discussions, blogs, and social
networks) on the Web, individuals and organizations increasingly use the public opinions in these
media for decision making. However, finding and monitoring opinion sites on the Web and
distilling the information they contain remains a formidable task because of the proliferation of
diverse sites.

Motivation

The exponential increase in Internet usage and in the exchange of users' opinions is the motivation for
opinion mining. Because the number of reviews a product receives may grow rapidly, and the reviews
themselves are often lengthy, it is hard for customers to analyze them by manual reading
to make an informed purchase decision. A large number of reviews for a single product may
also make it harder for individuals to evaluate the true underlying quality of the product. In these cases,
customers naturally gravitate toward reading only a few reviews before forming a decision, and
may therefore get only a biased view of the product. Similarly, manufacturers want to read
the reviews to identify which elements of a product affect sales most, but a large number of reviews
makes it hard for manufacturers or business organizations to keep track of customers' opinions
and sentiments on their products and services. Since most reviews are stored in either
unstructured or semi-structured format, distilling knowledge from this huge repository
is a challenging task. It would be a great help to both customers and manufacturers if the
reviews could be processed automatically and presented in a summarized form that highlights the product
features and the users' opinions expressed about them.

Objectives

Most opinion mining tools classify reviews only as positive or negative. They fail to reveal which
product features the users liked or disliked. Moreover, the parsers used by sentiment
analysers are trained to deal with grammatically correct language, whereas the reviews
obtained from websites are not always grammatically correct. They generally contain
abbreviations, slang words, short codes, etc., which makes the parser's job difficult.

Methodology


The figure shows the steps for analysing the sentiment of product reviews. A web crawler extracts the
information from the website whose reviews are to be analysed, yielding HTML tags, text, and
links. In the next phase, data preparation, noisy data is filtered out: unwanted stop-words and
words not listed in the dictionary are removed, and the HTML tags are discarded. After
the data preparation stage, only the text, i.e. the review, is retained.
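The data preparation phase described above can be sketched as follows. This is a minimal illustration, not the system's actual implementation: the stop-word list, the dictionary, and the function names `strip_html` and `prepare` are assumptions made for the example, and a real system would load much larger word lists.

```python
from html.parser import HTMLParser

# Hypothetical stop-word list and dictionary, kept tiny for illustration.
STOP_WORDS = {"the", "a", "an", "is", "it", "and", "of", "to", "this"}
DICTIONARY = {"battery", "life", "great", "screen", "dull", "camera", "good"}


class TextExtractor(HTMLParser):
    """Collects text nodes and ignores tags, scripts, and styles."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0  # nesting depth inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def strip_html(page):
    """Return the visible text of an HTML page with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(page)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def prepare(page):
    """Return review tokens with tags, stop-words, and unknown words removed."""
    text = strip_html(page).lower()
    tokens = [t.strip(".,!?;:\"'()") for t in text.split()]
    return [t for t in tokens if t and t not in STOP_WORDS and t in DICTIONARY]
```

Calling `prepare` on a crawled page returns only the review words that survive all three filters, so the later stages never see markup or link text.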

The pre-processed data is then classified as informal or formal text. Before the data is sent for review
classification, informal text is substituted by its equivalent formal text. The data is then passed to the NLP
parser, which assigns POS tags. In this way each feature and its accompanying opinion are identified and
extracted.
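One simple way to substitute informal text with its formal equivalent is a lookup table, sketched below. The table entries and the names `is_informal` and `formalize` are hypothetical, chosen only for illustration; the synopsis does not specify how the substitution is performed, and a practical system would need a far larger lexicon or a learned classifier.

```python
import re

# Hypothetical informal-to-formal lookup table; entries are illustrative only.
INFORMAL_TO_FORMAL = {
    "gr8": "great",
    "u": "you",
    "luv": "love",
    "pls": "please",
    "b4": "before",
}

WORD_RE = re.compile(r"[A-Za-z0-9']+")


def is_informal(text):
    """Report whether the text contains any token from the lookup table."""
    return any(tok.lower() in INFORMAL_TO_FORMAL for tok in WORD_RE.findall(text))


def formalize(text):
    """Replace each known informal token with its formal equivalent."""

    def swap(match):
        token = match.group(0)
        return INFORMAL_TO_FORMAL.get(token.lower(), token)

    return WORD_RE.sub(swap, text)
```

Text for which `is_informal` returns true would be passed through `formalize` before parsing; formal text can go to the parser unchanged.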

The summarization stage follows. Because the parsers cannot classify informal text, summarizing
informal text directly would result in ineffective summarization.

The steps of the mining process are as follows:

1. Review Documents Retrieval: For a target review site, the crawler retrieves review documents and

stores them locally after filtering markup language tags.


2. Document Pre-processor: The filtered review documents are divided into manageable record-size
chunks. Pre-processing is then performed on the review documents to filter out noisy reviews.

3. Document Parser: This module facilitates the linguistic and semantic analysis
of text for information component extraction. It accepts the record-size chunks generated by the
document pre-processor as input and assigns Part-Of-Speech (POS) tags to each word. It also converts
each sentence into a set of dependency relations between pairs of words. The Stanford parser is used
for POS analysis and dependency relation generation.

4. Feature Pruning and Opinion Identification: Noun phrases generally correspond to product features,
adjectives refer to opinions, and adverbs are generally used as modifiers that express the degree of
an opinion. The system uses a POS-based filtering mechanism to exclude unwanted text from
further processing.

5. Opinion Classification and Summarization: After the features and opinions have been identified,
the opinion on each feature is classified (i.e., as positive or negative) to produce the summary.
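Steps 4 and 5 can be illustrated with a small sketch that works on already POS-tagged sentences. Several things here are assumptions rather than the system's method: the input is hand-tagged with Penn Treebank tags instead of coming from the Stanford parser, each adjective is paired with the nearest noun instead of using dependency relations, features are single nouns rather than full noun phrases, and the polarity lexicon, negator list, and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical polarity lexicon and negator list, kept tiny for illustration.
POLARITY = {"great": 1, "good": 1, "sharp": 1, "poor": -1, "dull": -1, "slow": -1}
NEGATORS = {"not", "never", "n't"}


def extract_pairs(tagged):
    """Pair each adjective with the nearest noun in a POS-tagged sentence.

    `tagged` is a list of (word, Penn Treebank tag) tuples. Returns a list of
    (feature, opinion, negated) tuples.
    """
    nouns = [i for i, (_, tag) in enumerate(tagged) if tag.startswith("NN")]
    pairs = []
    for i, (word, tag) in enumerate(tagged):
        if not tag.startswith("JJ") or not nouns:
            continue
        nearest = min(nouns, key=lambda n: abs(n - i))
        # A negator within the two tokens before the adjective flips polarity.
        window = [w.lower() for w, _ in tagged[max(0, i - 2):i]]
        negated = any(w in NEGATORS for w in window)
        pairs.append((tagged[nearest][0].lower(), word.lower(), negated))
    return pairs


def summarize(tagged_sentences):
    """Count positive and negative opinions per feature."""
    summary = defaultdict(lambda: {"positive": 0, "negative": 0})
    for tagged in tagged_sentences:
        for feature, opinion, negated in extract_pairs(tagged):
            score = POLARITY.get(opinion, 0)
            if negated:
                score = -score
            if score > 0:
                summary[feature]["positive"] += 1
            elif score < 0:
                summary[feature]["negative"] += 1
    return dict(summary)
```

Given tagged sentences for "The screen is very sharp." and "The battery is not good.", `summarize` counts one positive opinion for `screen` and one negative opinion for `battery`. Adverbs such as "very" are ignored in this sketch, although the text notes that they express the degree of an opinion.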

References

1. A. Kamal, M. Abulaish, and T. Anwar. Mining Feature-Opinion Pairs and Their Reliability Scores from
Web Opinion Sources. Proc. 2nd Intl. Conference on Web Intelligence, Mining and Semantics, 2012.

2. Bing Liu. Sentiment Analysis and Opinion Mining. Morgan & Claypool Publishers, May 2012.

3. C. C. Aggarwal and C. Zhai, editors. Mining Text Data. Springer, 2012.

4. R. Feldman. Techniques and applications for sentiment analysis. Communications of the ACM, Vol.
56, Issue 4, 2013, 82-89.

5. Dongjoo Lee, Ok-Ran Jeong, and Sang-goo Lee. Opinion mining of customer feedback data on the web.
Proceedings of the 2nd International Conference on Ubiquitous Information Management and
Communication. ACM, 2008.

6. Elizabeth D. Liddy. Natural language processing. 2001.

7. Marie-Catherine de Marneffe and Christopher D. Manning. Stanford typed dependencies manual. 2008.

8. Mohammad Sadegh Hajmohammadi, Roliana Ibrahim, and Zulaiha Ali Othman. Opinion Mining and
Sentiment Analysis: A Survey. International Journal of Computers & Technology, Volume 2, No. 3,
June 2012.

9. B. Pang and L. Lee. Opinion mining and sentiment analysis. Foundations and Trends in
Information Retrieval, 2(1-2):1-135, 2008.

10. Fadi Abu Sheikha and Diana Inkpen. Learning to classify documents according to formal and
informal style. Linguistic Issues in Language Technology 8, 2012.
