Sentiment Analysis Using Solr

By: Pradeep Pujari

description

Solr is a popular, widely used open source information retrieval (IR) engine. It can also serve as a simple sentiment analysis and sentiment retrieval tool. Its multi-language analyzers, together with the UIMA (Unstructured Information Management Architecture) framework, can be extended for sentiment extraction. Each sentence passes through a series of pluggable annotators, which detect the entity and its associated polarity. The polarity of each sentence is stored in the Solr index. Persistent model files can be created from training data and accessed at run time.

Transcript of Sentiment Analysis Using Solr

Page 1: Sentiment Analysis Using Solr

By: Pradeep Pujari

Page 2: Sentiment Analysis Using Solr

Working mostly in Search domain

Search = IR + ML + NLP

Who am I?

Works for

Page 3: Sentiment Analysis Using Solr

Contributing to SolrSherlock

- Open Source Project

Who am I?

http://solrsherlock.github.io/SolrSherlock/

Page 4: Sentiment Analysis Using Solr

What is Sentiment Analysis?

A linguistic analysis technique that identifies opinion early in a piece of text.

The movie is great.

The movie stars Mr. X

The movie is horrible.
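A minimal Python sketch of the idea, applied to the three movie sentences on this slide. The word lists are a hypothetical toy lexicon, not something the talk specifies, and simple word counting is only one of many ways to assign polarity.

```python
import re

# Hypothetical toy lexicon (an assumption); a real system would use a far larger resource.
POSITIVE = {"great", "good", "excellent"}
NEGATIVE = {"horrible", "bad", "awful"}


def polarity(sentence: str) -> str:
    """Label a sentence positive, negative, or neutral by counting lexicon hits."""
    words = re.findall(r"[a-z']+", sentence.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"  # no opinion words found: treated as a factual sentence


for s in ["The movie is great.", "The movie stars Mr. X", "The movie is horrible."]:
    print(s, "->", polarity(s))
```

The middle sentence carries no opinion word, so this sketch labels it neutral, which matches the contrast the slide draws between opinions and plain facts.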

Page 5: Sentiment Analysis Using Solr

Challenging

[Figure: axes labeled "Difficulty" (from "Too easy" to "Too hard") and "misclassification"]

What is Sentiment Analysis?

Page 6: Sentiment Analysis Using Solr

Sentiment Analysis

NLP

Cognitive Science

What is Sentiment Analysis?

Page 7: Sentiment Analysis Using Solr

Humans can easily understand emotions.

Can a machine be trained to do it?

What is Sentiment Analysis?

Page 8: Sentiment Analysis Using Solr

Solr ?

HTTP Request Servlet

Admin Interface

Update Servlet

Standard Request Handler

Custom Request Handler

Response Writer

Solr Core

Lucene

Analysis

UIMA

Config

Caching

Update Handler

Page 9: Sentiment Analysis Using Solr

Linguistics module

- Stems, lemmas and synonyms

Multi-language capability

- CJKAnalyzer, UIMA analyzers

UIMA integration

- UpdateProcessorChain

Why Solr ?

Page 10: Sentiment Analysis Using Solr

Why Solr ?

Extract domain-specific entities and concepts

Time and Cost

Solr Set Up – 5 mins

UIMA Annotators - 5 days

Enrich text, write to dedicated field

Page 11: Sentiment Analysis Using Solr

Tagging entities in review text

Use case

I wasn't really in the market for another tablet, but my girlfriend ended up getting one for me so she got me on this one. I would like to say that this tablet reminds me of the first Motorola Droid smartphone that came out several years back. The phone jam packed a ton of bells & whistles into its hardware and software to give a lot of bang for your buck. This is what it feels like amazon has done with the Kindle Fire 8.9. They have put a lot of advanced hardware and innovative software, so for the average user, specially someone who absorbs a lot of media, you get a lot for the price. But just because you get a lot for the price, doesn't mean it is without its flaws.

Page 12: Sentiment Analysis Using Solr

Use case

Consumer feedback about products

Which product features are more relevant

Polarity

Page 13: Sentiment Analysis Using Solr

Digital SLR with Full 1080p HD Video

There are many preprogrammed scene modes that make this a very easy camera to use.

The picture quality is beyond belief, and even better for the price.

Price:

Use case

Page 14: Sentiment Analysis Using Solr

Why UIMA ?

UIMA framework manages components and data flow, no coding

Deploy pipeline of analysis engines

AEs wrap NLP algorithms

Person, Place, Organization

Language Detection

Aggregate analysis engine

Sentence Annotator

POS Annotator

NER
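The pipeline idea on this slide can be sketched in Python. This is not the UIMA API (UIMA itself is a Java framework); the `Cas` and `Annotation` classes, the annotator functions, and the gazetteer entries are all hypothetical stand-ins that only illustrate how an aggregate engine runs annotators in order over shared text.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Annotation:
    type: str
    begin: int
    end: int
    text: str


@dataclass
class Cas:
    """Simplified stand-in for a UIMA CAS: the text plus accumulated annotations."""
    text: str
    annotations: list = field(default_factory=list)


def sentence_annotator(cas: Cas) -> None:
    # Naive rule: a sentence is a run of characters ending in '.', '!' or '?'.
    for m in re.finditer(r"[^.!?]+[.!?]", cas.text):
        raw = m.group()
        begin = m.start() + (len(raw) - len(raw.lstrip()))  # skip leading spaces
        cas.annotations.append(Annotation("Sentence", begin, m.end(), cas.text[begin:m.end()]))


# Hypothetical gazetteer; a real NER annotator would use a trained model.
GAZETTEER = {"Amazon": "Organization", "Motorola": "Organization"}


def ner_annotator(cas: Cas) -> None:
    for name, entity_type in GAZETTEER.items():
        for m in re.finditer(re.escape(name), cas.text):
            cas.annotations.append(Annotation(entity_type, m.start(), m.end(), m.group()))


def aggregate_engine(text: str, annotators=(sentence_annotator, ner_annotator)) -> Cas:
    """Run each annotator in order; every annotator adds to the same Cas."""
    cas = Cas(text)
    for annotate in annotators:
        annotate(cas)
    return cas


cas = aggregate_engine("Amazon released the Kindle Fire. Motorola made the Droid.")
for a in cas.annotations:
    print(a.type, a.begin, a.end, repr(a.text))
```

Because each annotator only reads the text and appends annotations, steps can be added, removed, or reordered without touching the others, which is the property the slide attributes to the framework.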

Page 15: Sentiment Analysis Using Solr

Index

Lucene

Solr+UIMA

[Diagram: Data, Solr, UpdateRequestProcessor, UIMA AE, QParser]

Page 16: Sentiment Analysis Using Solr

NLP+UIMA

Use POS in query understanding

boosting terms

Synonym expansion

Extract concepts/entities

Faceting using entities

Identify places in query and use spatial queries
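As one illustration of using POS in query understanding to boost terms, the Python sketch below rewrites a user query so that nouns carry a higher weight, using the `term^boost` form of the Lucene query syntax. The POS lookup table, the boost value, and the field name `text` are assumptions made for the example; a real pipeline would take tags from a POS annotator and would also escape special characters in terms.

```python
# Hypothetical tiny POS lexicon (an assumption); real tags would come from a POS annotator.
POS_TAGS = {"cheap": "ADJ", "digital": "ADJ", "camera": "NOUN", "lens": "NOUN"}

# Assumed weights per POS tag, not tuned values.
BOOSTS = {"NOUN": 2}


def boost_query(query: str, field: str = "text") -> str:
    """Build a fielded query string in which nouns are boosted with term^boost."""
    clauses = []
    for term in query.lower().split():
        clause = f"{field}:{term}"
        boost = BOOSTS.get(POS_TAGS.get(term))  # None for unknown or unboosted tags
        if boost:
            clause += f"^{boost}"
        clauses.append(clause)
    return " ".join(clauses)


print(boost_query("cheap digital camera"))
```

The design choice here is to leave unknown terms unboosted instead of dropping them, so the rewritten query never matches fewer documents than the original.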

Page 17: Sentiment Analysis Using Solr

Ideas: Sentiment Analysis App

Identify Subjective Sentences from text

Remove noisy sentences – Regex, conditional probability

Graph min cut – LingPipe

Subjectivity Lexicons

Discard Facts and Objective Sentences
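A rough Python sketch of the filtering step, under stated assumptions: noisy sentences are removed with a regex, and a sentence counts as subjective if it contains a word from a small subjectivity lexicon. The cue words and noise patterns are hypothetical, and the conditional-probability and graph min-cut approaches named on the slide are not implemented here.

```python
import re

# Hypothetical subjectivity lexicon (an assumption for this example).
SUBJECTIVE_CUES = {"great", "horrible", "love", "hate", "easy", "better", "flaws"}

# Hypothetical noise patterns: label-style lines and sentences containing URLs.
NOISE = re.compile(r"^(price|sku|model)\s*:|https?://", re.IGNORECASE)


def split_sentences(text: str) -> list:
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def is_subjective(sentence: str) -> bool:
    words = set(re.findall(r"[a-z']+", sentence.lower()))
    return bool(words & SUBJECTIVE_CUES)


def subjective_sentences(text: str) -> list:
    """Drop noisy sentences first, then keep only those with a subjectivity cue."""
    return [s for s in split_sentences(text) if not NOISE.search(s) and is_subjective(s)]


review = (
    "The camera records 1080p video. "
    "The picture quality is great. "
    "Price: see http://example.com for details."
)
print(subjective_sentences(review))
```

Only the sentences that survive this filter would be passed on to a polarity classifier; factual and noisy sentences are discarded before any scoring happens.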

Page 18: Sentiment Analysis Using Solr

Subjectivity detector

Subjective

Objective

Polarity Classifier

Ideas: Sentiment Analysis App

Page 19: Sentiment Analysis Using Solr

Sentiment intensity: SentiWordNet

WordNet-Affect: WordNet + annotated concepts

Ideas: Sentiment Analysis App

Hybrid model with an added dictionary
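One way to read the hybrid idea is sketched below in Python: a general word-score table gives sentiment intensity, and a domain dictionary overrides or extends it. All scores are invented for illustration and are not actual SentiWordNet values; the averaging rule is likewise an assumption.

```python
import re

# Illustrative scores in [-1, 1]; NOT actual SentiWordNet values.
GENERAL_SCORES = {"great": 0.8, "good": 0.5, "horrible": -0.9, "cheap": -0.3}

# Hypothetical domain dictionary: in product reviews "cheap" may be a plus.
DOMAIN_SCORES = {"cheap": 0.4, "sharp": 0.6}


def intensity(sentence: str, general=GENERAL_SCORES, domain=DOMAIN_SCORES) -> float:
    """Average the scores of known words; domain entries win over general ones."""
    scores = {**general, **domain}
    words = re.findall(r"[a-z']+", sentence.lower())
    hits = [scores[w] for w in words if w in scores]
    return sum(hits) / len(hits) if hits else 0.0


print(intensity("The lens is sharp and cheap."))
print(intensity("The lens is cheap.", domain={}))
```

Passing an empty domain dictionary falls back to the general scores, which makes it easy to compare how much the added dictionary changes a sentence's intensity.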

Page 20: Sentiment Analysis Using Solr

Update Handler with processor chain

Remove Duplicates processor

Logging processor

Custom Transform processor

Index processor

Update Processor Chain

Text Analyzers

Lucene

Lucene Index

Sentence Detection processor

Sentiment Classifier

Company Name Annotator

Sentiment Score processor

Product Reviews
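The flow in this diagram can be approximated in plain Python. This is not Solr code (Solr update processors are Java classes configured in an update processor chain); the field names `review`, `sentences`, and `sentiment_score`, the word lists, and the in-memory list standing in for the Lucene index are all assumptions for illustration.

```python
import re

# Hypothetical toy lexicon (an assumption).
POSITIVE = {"great", "good", "easy"}
NEGATIVE = {"horrible", "bad", "flawed"}


def sentence_detection(doc: dict) -> dict:
    """Split the review text into sentences and store them on the document."""
    parts = re.split(r"(?<=[.!?])\s+", doc["review"])
    return {**doc, "sentences": [s.strip() for s in parts if s.strip()]}


def sentiment_score(doc: dict) -> dict:
    """Write a document-level score (positive hits minus negative hits) to a dedicated field."""
    words = re.findall(r"[a-z']+", " ".join(doc["sentences"]).lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return {**doc, "sentiment_score": score}


def run_chain(doc: dict, processors, index: list) -> dict:
    """Pass the document through each processor in order, then 'index' it."""
    for process in processors:
        doc = process(doc)
    index.append(doc)  # stands in for the final index processor
    return doc


index = []
review = {"id": "1", "review": "The picture quality is great. The battery is horrible. The zoom is great."}
indexed = run_chain(review, [sentence_detection, sentiment_score], index)
print(indexed["sentiment_score"], len(indexed["sentences"]))
```

Each processor returns an enriched copy of the document, so the score ends up in its own field alongside the original review text when the document reaches the index step.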

Page 22: Sentiment Analysis Using Solr

Questions ?

Page 23: Sentiment Analysis Using Solr

Thank You

Email: [email protected]