BI Reading Presentation v0.3

Gaining competitive intelligence from social media data

Wu He, Jiancheng Shen, Xin Tian, Yaohang Li (Old Dominion University, Norfolk, Virginia, USA)

Vasudeva Akula (VOZIQ Company, Reston, Virginia, USA)

Gongjun Yan (Romain College of Business, University of Southern Indiana, Evansville, Indiana, USA) and

Ran Tao (Donghua University, Shanghai, China)

SML 856 - Business Intelligence


Objective

► Business analytics techniques enable organizations to conduct deep analysis of their business data to identify potential issues, problems, opportunities and best practices. But it is also necessary to ask the following questions:

    ► How do organizations know whether or not their own best practices are actually “best”?

    ► What if an organization’s internal best practices are still poor when compared to peers?

Competitive Intelligence

► Competitive intelligence offers an approach for organizations to compare their performance against their peer organizations.

► As a result of the comparison, organizations can focus their efforts on improving the areas that are still poor when compared to peers and also develop efforts that can have the greatest impact.

Data Sources

► Internal Data Source – Transactional databases contain information about customers, products/services, etc.

► External Data Source – Social media data related to the organisation/product/service

Importance of Social Media

► Rapid development of social media is greatly influencing the way in which people communicate with one another and obtain information (Ngai et al., 2015).

► A large amount of user-generated content is available on social media sites. For example, in the business field, more and more consumers rely on user-generated reviews to evaluate products and services prior to making a purchase. Many tourists choose a restaurant based on reviews and ratings (Kang et al., 2013).

Need to analyse social media data

► User-generated social media content offers unprecedented opportunities as well as challenges to organizations because it contains a deluge of opinions, viewpoints and conversations by millions of users.

► Organizations need to efficiently manipulate and analyse user-generated social media content related to their organization/product/service.

Framework

Tools

► VOZIQ: to gather tweets

    ► Apache Solr for text searches

    ► Hadoop for big data analysis

    ► MySQL for storing processed data

    ► JavaServer Pages (JSP) for web-based visualization

► Twitter search APIs: to access Twitter

► NVivo 10: text analysis tool

► Leximancer: text mining tool

► Lexalytics: sentiment analysis tool

Case study – Walmart and Costco

► Walmart is the largest and Costco is the second largest retail chain in the world in terms of retail revenue.

► Tweets are short and constantly generated by online users.

► The authors collaborated with VOZIQ, a social media analytics company based in the USA, to gather the tweets (containing the two companies’ names) that were submitted to the Twitter service from December 1, 2014 to February 28, 2015.
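
The selection rule in the last bullet (company name present, posting date inside the study window) can be written as a small filter. This is only an illustrative sketch: the function name `in_study_sample` and the plain substring match are assumptions, not the collection logic VOZIQ actually used.

```python
from datetime import date

# Study window stated on the slide.
START, END = date(2014, 12, 1), date(2015, 2, 28)
COMPANIES = ("walmart", "costco")


def in_study_sample(text, posted_on):
    """Return True if a tweet names either company and falls inside the window.

    Hypothetical helper: a naive, case-insensitive substring match.
    """
    lowered = text.lower()
    names_company = any(name in lowered for name in COMPANIES)
    return START <= posted_on <= END and names_company


if __name__ == "__main__":
    print(in_study_sample("Costco run today", date(2015, 1, 5)))
    print(in_study_sample("Costco run today", date(2015, 3, 1)))
```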

Case study – contd.

► The tweets were collected using Twitter search APIs. VOZIQ crawls a huge amount of Twitter data containing specific keywords on a daily basis, utilizing the Twitter search APIs. In particular, VOZIQ uses Apache Solr for text searches, Hadoop for big data analysis, MySQL for storing processed data and JavaServer Pages (JSP) for web-based visualization.

Case study – contd.

► A popular sentiment analysis tool called Lexalytics was used to detect the sentiment of each tweet in the data set. Lexalytics offers a sentiment analysis algorithm to identify the emotive phrases within a document. After each phrase is scored (roughly −1 to +1), the scores of all the emotive phrases are combined to discern the overall sentiment of the sentence.
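
The idea of scoring emotive phrases and then combining them can be illustrated with a toy lexicon. The sketch below is not the Lexalytics algorithm: the phrases, their scores, the substring matching and the choice of a simple mean as the combination rule are all assumptions made for illustration.

```python
# Hypothetical phrase lexicon; each score lies roughly in [-1, +1].
PHRASE_SCORES = {
    "love": 0.8,
    "great deal": 0.6,
    "long line": -0.5,
    "terrible": -0.9,
}


def tweet_sentiment(text, lexicon=PHRASE_SCORES):
    """Average the scores of every lexicon phrase found in the text.

    Matching is a naive substring test, so "love" would also match "glove";
    a real tool would tokenize and handle negation, sarcasm and so on.
    """
    lowered = text.lower()
    found = [score for phrase, score in lexicon.items() if phrase in lowered]
    if not found:
        return 0.0  # no emotive phrase detected: treat the tweet as neutral
    return sum(found) / len(found)


if __name__ == "__main__":
    # One positive and two negative phrases: the mean comes out slightly negative.
    print(tweet_sentiment("Love the pizza but the long line was terrible"))
```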

Case study – contd.

► The authors derived three pieces of information from VOZIQ’s raw Twitter data: the number of tweets per day, the average positive sentiment and the average negative sentiment.
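
A minimal sketch of how these three daily figures could be computed from scored tweets. The slide does not define the averages, so the sketch assumes "average positive sentiment" means the mean of the scores above zero for that day (and likewise for negative); the sample data and the name `daily_summary` are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical scored tweets: (posting date, sentiment score in [-1, +1]).
scored_tweets = [
    (date(2014, 12, 1), 0.5),
    (date(2014, 12, 1), -0.25),
    (date(2014, 12, 1), 0.75),
    (date(2014, 12, 2), -0.5),
]


def daily_summary(tweets):
    """Return {day: (tweet_count, avg_positive, avg_negative)}."""
    by_day = defaultdict(list)
    for day, score in tweets:
        by_day[day].append(score)

    summary = {}
    for day, scores in sorted(by_day.items()):
        positives = [s for s in scores if s > 0]
        negatives = [s for s in scores if s < 0]
        # Assumed definition: mean over the positive (or negative) scores only.
        avg_pos = sum(positives) / len(positives) if positives else 0.0
        avg_neg = sum(negatives) / len(negatives) if negatives else 0.0
        summary[day] = (len(scores), avg_pos, avg_neg)
    return summary


if __name__ == "__main__":
    for day, row in daily_summary(scored_tweets).items():
        print(day, row)
```

Plotting the three values per day for each retailer would give the kind of volume and sentiment trend lines the case study compares.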

Case study – contd.

► In addition to the overall volume and sentiment trend analysis, volume and sentiment trends were also analysed at the individual product level, since Walmart and Costco are direct competitors and often sell the same types of product.

► Four highly popular grocery products (muffin, cookie, pizza and chicken) were chosen. A well-known text analysis tool called NVivo 10 was used to query the four products from the gathered tweets, and the content and sentiment of the matching tweets were then analysed.
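
The product-level query amounts to selecting tweets that mention both a retailer and one of the four products. The study used NVivo 10 for this step; the sketch below is only a stand-in using plain keyword matching, and the function name and the simple plural handling are assumptions.

```python
import re
from collections import Counter

RETAILERS = ("walmart", "costco")
PRODUCTS = ("muffin", "cookie", "pizza", "chicken")


def count_product_mentions(tweets):
    """Count tweets per (retailer, product) pair using whole-word matching."""
    counts = Counter()
    for text in tweets:
        words = set(re.findall(r"[a-z]+", text.lower()))
        for retailer in RETAILERS:
            if retailer not in words:
                continue
            for product in PRODUCTS:
                # Accept the singular form and a plain "s" plural.
                if product in words or product + "s" in words:
                    counts[(retailer, product)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Costco muffins are huge",
        "Walmart pizza tonight",
        "Costco's chicken and Costco pizza for dinner",
    ]
    for pair, n in sorted(count_product_mentions(sample).items()):
        print(pair, n)
```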

Product-wise volume and sentiment analysis

Results

► Although customers mentioned Walmart more than Costco on Twitter during that period, people tend to talk about Costco’s muffin, cookie, pizza and chicken more than Walmart’s muffin and cookie on Twitter.

► There were more positive comments and more negative comments on Costco’s muffin, cookie, pizza and chicken than on Walmart’s muffin, cookie, pizza and chicken. However, Costco received a lower percentage of positive comments and a higher percentage of negative comments than Walmart for muffin.

Results – contd.

► In contrast, Costco received a lower percentage of positive comments and a lower percentage of negative comments than Walmart for cookie and pizza.

► These product-level comparisons reveal potential room for improvement.
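
The results distinguish the number of positive or negative comments from their percentage, and the two can point in different directions. The sketch below shows the arithmetic with made-up counts; the figures are purely hypothetical and are not taken from the study.

```python
def sentiment_shares(positive, negative, neutral):
    """Return (percent_positive, percent_negative) of all comments."""
    total = positive + negative + neutral
    if total == 0:
        return 0.0, 0.0
    return 100.0 * positive / total, 100.0 * negative / total


if __name__ == "__main__":
    # Hypothetical counts for one product at two retailers.
    retailer_a = {"positive": 300, "negative": 100, "neutral": 600}
    retailer_b = {"positive": 40, "negative": 10, "neutral": 50}

    # Retailer A has far more positive comments in absolute terms,
    # yet a smaller share of its comments is positive.
    print("A:", sentiment_shares(**retailer_a))
    print("B:", sentiment_shares(**retailer_b))
```

A retailer with a much larger tweet volume can therefore lead on raw counts of both positive and negative comments while trailing on percentages, which is why both views are worth reporting.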

Clustering

► The authors used a popular text mining tool called Leximancer to mine and cluster the tweets related to each of the four products, in order to better understand what customers are saying about each product.

Cluster diagrams for cookie-related Costco and Walmart tweets
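
To give a feel for what sits behind a cluster diagram, the sketch below counts how often pairs of words appear in the same tweet; strongly co-occurring words tend to end up close together on such maps. This is a toy illustration under stated assumptions, not Leximancer's method: the stopword list, the sample tweets and the function name are all hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# Small hypothetical stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "and", "is", "are", "for", "of", "to", "at", "in", "my"}


def cooccurrence_counts(tweets):
    """Count, for every word pair, the number of tweets containing both words."""
    pairs = Counter()
    for text in tweets:
        words = sorted(set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS)
        # Sorting makes each pair appear in one canonical (alphabetical) order.
        pairs.update(combinations(words, 2))
    return pairs


if __name__ == "__main__":
    sample = [
        "Costco cookie is fresh",
        "Fresh cookie at Costco bakery",
    ]
    for pair, n in cooccurrence_counts(sample).most_common(3):
        print(pair, n)
```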

Conclusion

► Competitive analytics and intelligence have great potential to produce useful information, actionable knowledge and critical insights for companies to enhance competitiveness and solve business problems. The knowledge-intensive business activities caused by technical advances in competitive analytics and intelligence will generate tangible and intangible business values and contribute to a knowledge-based

Points to be analysed carefully

► Although sentiment analysis has made good progress, it still has issues with sarcastic and ironic sentences, and the sentiment analysis scores may contain many errors.

► Many people also write spam reviews on social media to promote their own products by giving undeserved positive opinions, or to defame their competitors’ products by giving false negative opinions.

► There are also various types of unwanted and malicious spam messages on social media.


THANK YOU

Overall Volume and Sentiment Analysis

Walmart

Costco

Twitter data comparison of muffin for Costco and Walmart

Twitter data comparison of cookies for Costco and Walmart