Telecom Data Analysis Using Social Media Feeds
Transcript of Telecom Data Analysis Using Social Media Feeds
TELECOM DATA ANALYSIS USING SOCIAL MEDIA FEEDS
CONTENTS
Introduction
Data Extraction
Data Pre-Processing
Classification
Word Cloud
Frequent Words and Association
Clustering
Business Value
Cross-sell/Up-sell
Customer Churn and Retention
Customer Genomics
Future Scope
DATA EXTRACTION
Hootsuite (uberVU) uses a web crawler to extract the data from different social media sources.
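A minimal sketch of loading the crawled feeds for the steps that follow, assuming they were exported to a CSV file with one row per post and a free-text `Content` column (the file name `social_feed.csv` is illustrative, not from the deck):

```r
# load the exported social media feed into the data frame used below
mydata <- read.csv("social_feed.csv", stringsAsFactors = FALSE)
str(mydata)  # inspect the columns before pre-processing
```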
DATA PRE-PROCESSING
• Part of the KDD process
• Removed missing values
• Reduced the data from 10,000 rows to 1,009 rows in Excel
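The deck performs the missing-value step in Excel; a hypothetical sketch of the same step in R, assuming the `mydata$Content` column used throughout:

```r
# drop rows with any missing value, then drop rows whose feed text is empty
mydata <- mydata[complete.cases(mydata), ]
mydata <- mydata[nzchar(trimws(mydata$Content)), ]
nrow(mydata)  # the deck reports a reduction from 10,000 to 1,009 rows
```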
TEXT STEMMING AND CLEANING

```r
# remove @-mentions
mydata$Content = gsub("@\\w+", "", mydata$Content)
# remove punctuation
mydata$Content = gsub("[[:punct:]]", "", mydata$Content)
# remove numbers
mydata$Content = gsub("[[:digit:]]", "", mydata$Content)
# remove html links
mydata$Content = gsub("http\\w+", "", mydata$Content)
# collapse runs of spaces/tabs to a single space
mydata$Content = gsub("[ \t]{2,}", " ", mydata$Content)
# trim leading/trailing whitespace
mydata$Content = gsub("^\\s+|\\s+$", "", mydata$Content)
```
TEXT CLASSIFICATION

Implemented the Naïve Bayes algorithm and a simple voter algorithm to determine the sentiment of customer feedback.

Classify polarity – the function classifies a piece of text as positive, negative, or neutral.

Classify emotion – the function analyses a piece of text and classifies it into one of six emotions: anger, disgust, fear, joy, sadness, and surprise.
```r
# classify_emotion()/classify_polarity() come from the R 'sentiment' package
# classify emotion (Naïve Bayes)
class_emo = classify_emotion(mydata$Content, algorithm="bayes", prior=1.0)
# get emotion best fit
emotion = class_emo[,7]

# classify emotion (voter)
class_emo = classify_emotion(mydata$Content, algorithm="voter", prior=1.0)
emotion = class_emo[,7]

# classify polarity (Naïve Bayes)
class_pol = classify_polarity(mydata$Content, algorithm="bayes")
# get polarity best fit
polarity = class_pol[,4]

# classify polarity (voter)
class_pol = classify_polarity(mydata$Content, algorithm="voter")
polarity = class_pol[,4]
```
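The word-cloud step later groups text by emotion through a data frame `sent_df` that the deck never constructs; a minimal sketch of how it could be assembled from the `emotion` and `polarity` vectors above (the column names follow the rdatamining tutorial listed in the references):

```r
# assemble text, emotion and polarity into one data frame;
# unclassified rows are labelled "unknown"
emotion[is.na(emotion)] = "unknown"
sent_df = data.frame(text = mydata$Content,
                     emotion = emotion,
                     polarity = polarity,
                     stringsAsFactors = FALSE)
```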
EMOTION ANALYSIS (Brand vs Emotion)
WORD CLOUD
An image composed of words used in a particular text or subject, in which the size of each word indicates its frequency or importance.
```r
# separate text by emotion
emos = levels(factor(sent_df$emotion))
nemo = length(emos)
emo.docs = rep("", nemo)
# collapse all posts carrying each emotion into one document
for (i in 1:nemo) {
  emo.docs[i] = paste(sent_df$text[sent_df$emotion == emos[i]], collapse = " ")
}

# remove stopwords
emo.docs = removeWords(emo.docs, stopwords("english"))

# create corpus
corpus = Corpus(VectorSource(emo.docs))
tdm = TermDocumentMatrix(corpus)
tdm = as.matrix(tdm)
colnames(tdm) = emos

# comparison word cloud
comparison.cloud(tdm, colors = brewer.pal(nemo, "Dark2"),
                 scale = c(3, .5), random.order = FALSE, title.size = 1.5)
```
FREQUENT WORDS AND ASSOCIATION

```r
dtms <- removeSparseTerms(dtm, 0.1)
inspect(dtms)
## <<DocumentTermMatrix (documents: 5, terms: 10)>>
## Non-/sparse entries: 50/0
## Sparsity           : 0%
## Maximal term length: 12
## Weighting          : term frequency (tf)
##      Terms
## Docs  #centurylink at&t can centurylink dear internet never service the will
##    1            39   99  25         177    4       51    10      55  40   38

findAssocs(dtms, c("service"), corlimit=0.98)
## $service
##         will #centurylink         at&t  centurylink        never
##         1.00         0.99         0.99         0.99         0.99

findAssocs(dtms, c("at&t"), corlimit=0.98)
## $`at&t`
##  centurylink     internet      service         will #centurylink          the
##         1.00         0.99         0.99         0.99         0.98         0.98
```
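The document-term matrix `dtm` used above is never constructed in the deck; a hedged sketch, assuming it is built from the same corpus as the word cloud:

```r
# plausible construction of dtm (this step is not shown in the deck)
dtm <- DocumentTermMatrix(corpus)
findFreqTerms(dtm, lowfreq = 30)  # list terms appearing at least 30 times
```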
TERM SIMILARITY BY CLUSTERING

HIERARCHICAL CLUSTERING
Remove uninteresting or infrequent words, then cluster the remaining terms.

```r
# keep a matrix that is only 10% empty space, maximum
dtmss <- removeSparseTerms(dtm, 0.1)
inspect(dtmss)

# hierarchical clustering on Euclidean distances between terms
d <- dist(t(dtmss), method="euclidean")
fit <- hclust(d=d, method="ward.D")
fit
## Call: hclust(d = d, method = "ward.D")
## Cluster method   : ward.D
## Distance         : euclidean
## Number of objects: 10
```
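A dendrogram makes the resulting term clusters visible; a minimal sketch using the `fit` object above (the cluster count k = 3 is an illustrative choice, not from the deck):

```r
# plot the dendrogram and outline an illustrative 3-cluster cut
plot(fit, hang = -1, main = "Term clusters")
rect.hclust(fit, k = 3, border = "red")
```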
CUSTOMER SEGMENTATION

Based on spend and sentiment:
• Low value: $30 to $50
• Medium value: $51 to $80
• High value: $81 to $120
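The spend bands above can be sketched as a simple binning step in R; the `spend` vector here is hypothetical sample data, not from the deck:

```r
# illustrative binning of monthly spend ($) into the deck's three bands
spend <- c(35, 75, 110, 52, 90)
segment <- cut(spend,
               breaks = c(30, 50, 80, 120),
               labels = c("Low", "Medium", "High"),
               include.lowest = TRUE)
table(segment)
```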
CROSS-SELL / UP-SELL

Based on customer spend and sentiment, we classify customers for cross-sell/up-sell campaigns and also for customer-retention campaigns.
FUTURE SCOPE

• Sarcasm detection in unstructured data using natural language processing.
• Improve the accuracy of sentiment analysis.
• Sarcasm detection methods:
1. Lexical analysis
2. Prediction using likes and dislikes
3. Fact negation
4. Temporal knowledge extraction
CUSTOMER GENOMICS
• Every customer is represented by a unique model created from their specific transactions.
• Predictive models assess over 200 dimensions for each person and assign labels across all dimensions, such as what they buy, what factors influence their purchase decisions, how they engage, and potential life events.
• Learns from every customer transaction via social media, loyalty, self-stated survey data, panel data, and other third-party appended information.
• Automatically learns from every new transaction about customer behaviour and updates every probability associated with the customer.
• Avoids over-fitting and counter-intuitive decisions by supervising the automation process to ensure that results are intuitive, accurate, and relevant.
References
• http://www.fractalanalytics.com/products-and-solutions/customer-genomics
• http://www.slideshare.net/rdatamining/text-mining-with-r-an-analysis-of-twitter-data
• https://sites.google.com/site/miningtwitter/questions/sentiment/sentiment