Web Appendix
This Appendix contains details on (A) the brand sample composition, (B) the sentiment analysis,
(C) YouGov data, (D) Model specifications, results on (E) brand and (F) industry level and (G)
forecasting.
Web Appendix A: Brand Sample Composition
To enable empirical generalizations across different types of products and brands, we selected 8
different industries from YouGov’s consumer and technology sectors. By analyzing Airlines,
Banking, Beverages, Foods, Cars, Consumer Electronics, Fashion, and Gastronomy, we include
different levels of involvement, expertise, buying frequencies and social media engagement
across our sample.
For each industry, we included both a high and low-end volume leader and several niche players.
Further, we aimed to achieve a similar brand awareness and impression composition within the
industry groups as displayed in Table A1. To match companies, we used the average YouGov
metrics for our observation period. The data period is identical for all brands: the 586 days
between November 12th, 2012 and June 20th, 2014.
Table A1: Within Industry Sample Matching
Brand | Awareness | Impression | Average Number of Fans | Number of Comments
American Airlines | 91.55 | 13.05 | 997326 | 310067
Frontier | 55.00 | 3.39 | 86648 | 40792
JennAir | 62.50 | 23.34 | 160076 | 7033
Jet-Blue | 80.05 | 11.04 | 726142 | 72209
Lufthansa | 61.14 | 12.62 | 60274 | 236543
Qantas | 50.70 | 10.74 | 31422 | 123551
Singapore Airlines | 33.90 | 4.36 | 37776 | 33777
Southwest Airlines | 89.61 | 41.68 | 3786358 | 380931
Fidelity | 76.11 | 12.47 | 112421 | 12868
5th3rd | 33.90 | -2.15 | 130772 | 8660
Huntington Bank | 18.33 | -1.83 | 9408 | 11620
JPMorgan | 86.63 | -12.02 | 120340 | 3794
Liberty Mutual | 18.63 | -0.02 | 9497 | 2844
PNC Bank | 81.76 | 6.64 | 1702367 | 74365
RaboBank | 45.69 | -0.35 | 209292 | 19018
US Bank | 3.98 | -2.16 | 573 | 32428
Dos Equis | 62.79 | 11.20 | 2846756 | 216537
Hennessey | 70.07 | 7.93 | 1381774 | 36289
Jack Daniels | 93.75 | 27.89 | 2357122 | 312358
Jameson | 49.05 | 11.70 | 522849 | 68494
Smirnoff | 91.44 | 20.22 | 2159448 | 197037
Audi | 91.67 | 20.92 | 764517 | 260013
BMW | 93.98 | 32.55 | 1054739 | 335931
Ford | 95.80 | 41.41 | 270638 | 123520
Kia | 92.31 | -0.44 | 46769 | 11682
Lexus | 93.08 | 28.96 | 1209543 | 183869
Subaru | 92.14 | 20.90 | 1492191 | 152959
Volkswagen | 93.69 | 26.75 | 797270 | 256394
Volvo | 92.59 | 23.69 | 71247 | 80659
Apple | 95.17 | 44.65 | 349038 | 82419
Lenovo | 48.21 | 7.38 | 332131 | 115804
Samsung | 94.96 | 47.43 | 18670111 | 87219
SanDisk | 57.42 | 20.66 | 243583 | 6230
Sony | 95.36 | 59.77 | 1648076 | 202015
Abercrombie & Fitch | 82.74 | -8.11 | 3085234 | 204297
Aeropostale | 63.41 | 5.44 | 5408882 | 316914
NineWest | 51.66 | 15.14 | 694069 | 17508
NorthFace | 52.23 | 24.00 | 2320379 | 46704
KettleBrand | 74.13 | 26.77 | 1162493 | 44785
JosABank | 63.80 | 17.95 | 375013 | 15220
Nestea | 90.61 | 26.29 | 174783 | 10957
Pepsi | 96.19 | 25.66 | 6217911 | 65391
Tostitos | 90.18 | 38.95 | 406750 | 21855
Burger King | 96.31 | 24.25 | 3009032 | 13140
Donato’s | 11.44 | 1.52 | 1 | 51809
KonaGrill | 13.95 | 1.74 | 29511 | 4713
McDonalds | 96.84 | 19.04 | 9222254 | 3990
Starbucks | 60.24 | -3.42 | 16219222 | 95939
For each brand, we used Facebook’s Graph API to identify the official US Facebook site. We
then used the API to extract all posts from the Facebook site (including company and customer
posts). For each post, we then used the specific post ID to re-crawl all related comments. This
procedure ensures that we extract all publicly available user generated content from Facebook.
Finally, we used the API to collect the number of likes and shares each post received.
It is important to note that Facebook provides time stamps for comments on posts
but not for likes and shares. Thus, we can only use comments in
our time series analysis. Table A1 also gives an overview of the number of comments and
user posts extracted per brand. Differences across brands abound, driven largely by factors such
as the number of Facebook followers, their engagement, and how long the company has had its
Facebook page.
To ensure that we analyze only user-generated content, we first removed company posts and
comments from the data sets. To do so, we again used the API to find the companies’ Facebook
IDs and then removed all comments and posts coming from these IDs. The
resulting data then entered the next stage: sentiment analysis.
Web Appendix B: Sentiment Analysis
Dictionary Based Sentiment Analysis
For the linguistic, dictionary-based approach, we use the Linguistic Inquiry and Word Count
(LIWC) software from Pennebaker et al. (2015). The software is not only among the most widely
used text analysis tools, but also provides researchers with reliable and valid dictionaries for
measuring a mixture of sentiments within textual data. It is easily accessible and
affordable to researchers and practitioners alike. The main LIWC dictionary has been developed
over more than 20 years (Pennebaker 1993) and is based on the linguistic analysis of over 181,000 text
files containing more than 231 million words. The developers used these documents to infer
words, word stems and symbols that are related to various human feelings and expressions.
The main dictionary now consists of over 6,400 entries related to over 30 categories, including
positive and negative emotions. The positive emotions category consists of 620 words such
as happy, smile, love and sweet. The negative emotions category features 744 words,
including hurt, ugly, nasty and fearful. Thus, LIWC offers a very broad dictionary
that allows researchers to measure very different concepts, for example emotions that are
distinguished on the two established dimensions of valence and arousal (Bradley et al. 1992).
Words such as furious express anger (negative valence, high arousal), depressed expresses
sadness (negative valence, low arousal), supercool expresses excitement (positive valence, high
arousal), and fine and okay express contentment (positive valence, low arousal). The scientific
community has so far developed a wide array of additional dictionaries in different languages
and for different purposes, for example measuring stated morality in texts (Haidt and Joseph
2004). LIWC has been successfully applied in marketing research (see e.g. Rooderkerk and
Pauwels 2016).
To determine a text’s overall sentiment, LIWC counts the total number of words in a
text document and the number of words related to positive and negative emotions. It then
calculates a relative score by dividing the number of positive (or negative) words by the total
number of words. For example, the sentence “Coke offers the best soft drink on the market” leads
to the following result. The total number of words is nine. “Best” is the only word associated with
positive emotions, so the positive emotion score becomes one divided by nine, equaling 0.11.
There is no word associated with negative emotions, so the negative score is 0. To arrive at a
final sentiment variable, we subtract the negative score from the positive score, ending up with a
Pos-Neg Score of 0.11.
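The scoring rule in the worked example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two word sets are tiny stand-ins for the LIWC emotion categories, matching is on exact lowercase words (no word stems), and the function name pos_neg_score is hypothetical rather than part of LIWC.

```python
import re

# Toy word lists for illustration only; the real LIWC categories are far larger.
POSITIVE = {"best", "happy", "love", "sweet", "smile"}
NEGATIVE = {"hurt", "ugly", "nasty", "fearful"}


def pos_neg_score(text):
    """Return (positive share, negative share, positive minus negative)."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0, 0.0, 0.0
    pos = sum(w in POSITIVE for w in words) / len(words)
    neg = sum(w in NEGATIVE for w in words) / len(words)
    return pos, neg, pos - neg


pos, neg, score = pos_neg_score("Coke offers the best soft drink on the market")
print(round(pos, 2), round(neg, 2), round(score, 2))  # 0.11 0.0 0.11
```

The sentence has nine words and one positive hit, so the sketch reproduces the scores of the example above.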
Machine Learning Based Sentiment Analysis
Based on the findings of Gamon (2004) and Agarwal et al. (2011), we chose Support Vector
Machines (SVM) as the focal point for our supervised machine learning based sentiment
analysis. SVMs are a specific form of supervised machine learning and are commonly used for
classification purposes (Kübler, Wieringa, and Pauwels 2017). To classify data into pre-set
categories, SVMs aim at maximizing the margin between the classified groups. To do so, the SVM
spans a plane between the two groups, bounded by support vectors that define the outer borders
of the two classes (in our case positive and negative sentiments). The
algorithm tries to identify the support vectors that maximize this margin.
Like all supervised machine learning algorithms, SVMs need a set of training data
to infer the classification rules for new data. Machine learning has the advantage that it can rely
on specific training data tailored to the analysis. However, this implies that
machine learning based models suffer from a classic garbage in, garbage out (GIGO) issue. If
the training data is unsuitable or biased, this bias will spill over into the classification routine and
lead to biased results. Thus, obtaining suitable and reliable training data is key for machine
learning based sentiment extraction. Machine learning research has therefore invested significant
time and resources to identify suitable training data sources. The main idea is that the training
data should be as similar as possible to the data that needs to be classified. In terms of text
analysis, this means that longer texts also require longer training texts, preferably from the same
author. Classification of short texts, such as those from social media, similarly needs shorter text
corpora for training, preferably from the same social media source (Pak and Paroubek 2010). Besides
source and style similarity, researchers need to trade off computational power against
training data size. More training data generally yields better classification results: the more text
samples are present in the training data, the more likely it is to observe rare combinations, leading to
better precision. However, larger training sets have two downsides: they consume significantly
more computational power and are harder to obtain.
One approach could be to slice off a part of the crawled data, have human coders
classify the text, and then use this subset for training purposes. Given the existence of service
platforms such as M-Turk, it becomes feasible to pre-classify textual data into desired categories
(like positive vs. negative sentiment or complaint vs. praise, etc.). Still, human coding is
expensive, takes time and is rather inflexible. Additionally, it bears the risk that coders could be
biased by instructions and overlook or over-accentuate patterns in the data. Therefore, such an
approach is especially useful in the case of smaller datasets, which require only small amounts of
training data. In our case, however, we face a vast amount of data from different sources and
different contexts. We would thus need multiple human coders to classify more than 200,000
comments and posts. Given that a single coder would not be able to achieve this, inter-coder
reliability would become an issue.
Another approach is the use of publicly available classified sentiment data from
online sources such as Twitter (e.g. Culotta and Cutler 2016; Toubia and Stephen 2013).
Different studies used emoticons and emojis within tweets to infer the sentiment of the related
tweets. A mixture of such datasets is available for training purposes (Naji 2013). Despite being
large, however, these sets are all unrelated to our companies, industries, products and
consumption settings. Another key problem here is how to determine whether the obtained data
is positive or negative. Naji (2013) uses emoticons in tweets (such as smiling and frowning faces)
to infer whether a tweet is positive or negative. Collecting more than 1.5 million tweets, he
successfully trains an NB classifier and is able to correctly classify non-classified tweets as
positive or negative with a hit rate of more than 75%. McAuley and Yang (2016) use Amazon
product reviews to build training sets. Online reviews commonly feature not only textual data but
also scale ratings (commonly from 1 to 5, with 1 expressing very bad and 5 expressing very good).
By excluding all reviews with ratings other than 1 and 5, they construct a reliable and very powerful
training set. A key advantage of rating-based training data is that it can be adapted to product
categories by including only ratings from a specific category. However, such large-scale social
media data for training purposes brings internal validity issues, as it is unclear whether the training
data can really capture sentiments specific to our data context. Imprecise training data
could in the worst case lead to misclassifications, significantly biasing our insights.
To achieve a sufficient level of reliability and validity, we thus use a mixed approach. We
use publicly available training data obtained from Amazon online reviews
(McAuley, Pandey, and Leskovec 2015). For each of our product categories, we use reviews from
a similar Amazon category. This approach ensures that we have a category-specific fit for each
of our 7 product categories. For airlines and banking, we use data from the web’s leading airline
rating website (Skytrax) and one of the leading rating websites for US banks and financial
services (depositaccounts.com). To obtain only unambiguously positive and negative reviews,
we use only reviews that come with a very low (1-star) or very high (5-star) rating. Table A2
shows the source and size of the training data for each of the 7 categories.
To apply SVMs to textual data, one needs to transform the data into a readable format.
Most commonly researchers use Term Frequency Matrices (TFM) to achieve this. TFMs are
sparse matrices that have a column for each word occurring at least once in one of the text
documents used for analysis. The rows of the TFM represent each document in the dataset.
Dummy codes are then used to indicate whether a specific word occurs in the respective
document/row. Each textual document within the data set can then be expressed as a
combination of word dummies.
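The TFM layout described above can be illustrated with a minimal Python sketch. It assumes plain whitespace tokenization and lowercase matching; the function name term_frequency_matrix and the two example texts are hypothetical and do not come from our data.

```python
def term_frequency_matrix(documents):
    """Return (vocabulary, matrix): one row per document, one 0/1 column per word."""
    tokenized = [set(doc.lower().split()) for doc in documents]
    # One column for every word that occurs at least once in any document.
    vocabulary = sorted(set().union(*tokenized)) if tokenized else []
    matrix = [[1 if word in tokens else 0 for word in vocabulary]
              for tokens in tokenized]
    return vocabulary, matrix


docs = ["great flight great crew", "flight delayed again"]
vocab, tfm = term_frequency_matrix(docs)
print(vocab)  # ['again', 'crew', 'delayed', 'flight', 'great']
print(tfm)    # [[0, 1, 0, 1, 1], [1, 0, 1, 1, 0]]
```

Each row is a document expressed as a combination of word dummies; repeated words ("great") still yield a single 1 because the entries indicate occurrence, not counts.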
We followed the standardized procedure of text preparation described in Feinerer (2017)
to prepare the necessary TFM for each training set. To reduce the number of columns and to save
computational power, we first apply the stop word list of R’s tm package to drop all unnecessary
words from the documents (e.g. the, and, or). We then stem the remaining
words to their core form to further reduce the number of TFM columns. We follow Mullen and
Collier (2004), who show that linear kernel based SVMs are best suited for sentiment analysis.
For the training and sentiment prediction, we use R’s RTextTools package, which combines
different machine learning and text processing packages in R. The SVM is trained with
the standard settings for linear SVMs as documented in Jurka and Collingwood (2015). We use
10% of our training material to test the power of our SVM with a holdout approach. These 10%
are not included in the training set, but used for classification. Table A2 displays the holdout hit
rates for each category’s SVM. Hit rates vary from 83% to 92%, indicating a generally good fit of
our SVM.
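The holdout logic can be sketched as follows. This is a minimal Python illustration, not the R routine used in the study: the helper names holdout_split and hit_rate, the random seed, and the toy labels are assumptions.

```python
import random


def holdout_split(items, holdout_share=0.10, seed=0):
    """Shuffle a copy of items and set aside a share of them as holdout data."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_holdout = int(round(len(shuffled) * holdout_share))
    return shuffled[n_holdout:], shuffled[:n_holdout]


def hit_rate(true_labels, predicted_labels):
    """Share of holdout documents whose predicted label equals the true label."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


train, holdout = holdout_split(range(20000))
print(len(train), len(holdout))  # 18000 2000
print(hit_rate(["pos", "neg", "pos", "neg"], ["pos", "neg", "neg", "neg"]))  # 0.75
```

With a training set of 20,000 reviews, 2,000 would be held out; the hit rate is then computed only on those held-out reviews.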
To perform the sentiment analysis on our data, we apply the same data preparation
approach to our social media data (stop word dropping, word stemming, TFM) before we use the
trained SVMs to classify texts into positive and negative sentiments. The RTextTools package
further delivers a classification likelihood ranging from 0 to 100% for each classified document.
Following Jurka et al. (2013), we only assign a text to a category if the classification likelihood is
above 75%. Otherwise, we classify the text as neutral, not belonging to either of the categories.
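The assignment rule amounts to a simple threshold on the classifier output. A minimal sketch, assuming the likelihood is expressed as a share between 0 and 1 and using a hypothetical function name:

```python
def assign_sentiment(label, likelihood, threshold=0.75):
    """Keep the SVM label only if its likelihood is above the threshold, else neutral."""
    return label if likelihood > threshold else "neutral"


print(assign_sentiment("positive", 0.91))  # positive
print(assign_sentiment("negative", 0.60))  # neutral
print(assign_sentiment("positive", 0.75))  # neutral (not strictly above 75%)
```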
Finally, to assess the quality of our classification, we randomly drew 500 positive,
negative and neutral comments from each category. Human coders unrelated to the project were
then asked to verify the classification. Table A2 reports these human post-classification
confirmation checks. Again, hit rates ranging from 87% to 96% (with intercoder reliability
above 89%) indicate a very good fit.
Table A2: Human post-classification confirmation checks.
Category | Training Data Source | Training Data Size | Prediction Hit Rate (Holdout) | Human Confirmation Rate (Post-Estimation)
Airlines | Skytrax | 12,000 | 85% | 89%
Banking & Finance | depositaccounts.com | 8,000 | 83% | 87%
Beverages | Amazon Food & Beverages Reviews | 20,000 | 87% | 91%
Electronics | Amazon Electronics Reviews | 20,000 | 92% | 96%
Fashion | Amazon Fashion Reviews | 20,000 | 85% | 90%
Food | Amazon Food & Beverages Reviews | 20,000 | 89% | 93%
Gastronomy | Amazon Food & Beverages Reviews | 20,000 | 87% | 94%
Web Appendix C: Details on the YouGov data
YouGov’s Brandindex is a daily measure of consumer mind-set metrics among the public,
tracking many brands across multiple consumer sectors simultaneously. For the US market,
YouGov surveys 5,000 randomly selected consumers (from a panel of 5 million) on a daily
basis. Table A3 provides details on the exact question for each item.
The data collection of YouGov can be described as follows: For each item, a minimum of
100 respondents per day are randomly drawn from the panel and provided with a set of up to 30
brands for a pre-selected industry. To reduce common method bias, respondents evaluate only
one brand item per industry per enquiry. First, for the Customer Satisfaction measure, respondents
select those brands (per click) for which they agree with the positive statement of the brand item
(e.g. satisfied with this brand). Then, they select those brands for which they agree with the
negative statement of the brand item (e.g. not satisfied with the brand). The aggregate brand
measure (e.g. Customer Satisfaction) is calculated by subtracting the number of respondents who
agree with the negative statement from the number of respondents who agree with the
positive statement, dividing by the total number of respondents, and multiplying by 100
(= [number of positive - number of negative] / [number of positive + negative + neutral] × 100).
As a consequence, the Customer Satisfaction measure in YouGov Brandindex is a ratio-scaled
variable that lies within the range of -100 to +100. Similarly, for the awareness, impression,
purchase intent and recommendation metrics, the respondents are prompted with the respective
questions. Again, respondents can answer yes or no to these questions, implying that, similarly to
Customer Satisfaction, these metrics measure a ratio-scaled variable that lies within the range of
-100 to +100.
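The aggregate score is a net share scaled to 100. The following Python sketch restates the formula; the function name net_score and the respondent counts are illustrative assumptions, not YouGov figures.

```python
def net_score(n_positive, n_negative, n_neutral):
    """(positive - negative) / (positive + negative + neutral) * 100."""
    total = n_positive + n_negative + n_neutral
    if total == 0:
        raise ValueError("at least one respondent is required")
    return 100.0 * (n_positive - n_negative) / total


print(net_score(40, 10, 50))  # 30.0
print(net_score(0, 100, 0))   # -100.0
```

The score reaches +100 only if every respondent agrees with the positive statement and -100 only if every respondent agrees with the negative one.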
The collection procedure yields about 5,000 daily responses across all the brand items. To
ensure representativeness, individual sampling weights are applied to correct for variations in the
selection probability of respondents. Although panellists might be re-invited after a period of two
weeks, they will be blocked for the respective sector and brand item they have answered before
for a period of at least two months. This is important to eliminate repeated measurement as a
source for demand effects and serial correlation in brand perceptions. Brand competition effects
are also controlled for because respondents rate the competing brands within one sector
simultaneously.
Table A3: Consumer Mind-set Metrics
(Questionnaire YouGov Brandindex)
Consumer Journey Stage | Question
Awareness | Which of the following brands have you *ever* heard of? Please select all that apply.
Impression | a. Overall, of which of the following brands do you have a POSITIVE impression?
 | b. Now which of the following brands do you have an overall NEGATIVE impression?
Purchase Intent | From which of these would you be most likely to purchase?
Satisfaction | a. Of which of the following brands would you say that you are a "satisfied customer"?
 | b. Of which of the following brands would you say that you are a "dissatisfied customer"?
Recommendation | a. Which of the following brands would you RECOMMEND to a friend or colleague?
 | b. And which of the following brands would you tell a friend or colleague to AVOID?
Web Appendix D: VAR Model Specifications
In this part of the Web Appendix, we provide more details on the VAR models estimated in the
study. Our analysis consists of several methodological steps (see Table A4) which we apply to
each brand separately (e.g. Pauwels and Hanssens 2007).
Table A4
Analysis steps in the Vector Autoregression Modeling Approach
Methodological Step | Relevant Literature | Research Question
1. Tests | |
Granger causality test | (Granger 1969) | What is the temporal causality among variables?
Augmented Dickey-Fuller Test | (Enders 2014) | Are variables stationary or evolving?
KPSS test* | (Maddala and Kim 1998) | Are the results robust to null hypothesis?
Cointegration test | (Johansen, Mosconi, and Nielsen 2000) | Are evolving variables in long-term equilibrium?
2. Model of Dynamic Interactions | |
Vector autoregressive (VAR) model | (Dekimpe and Hanssens 1999) | How do SETs and consumer mind-set metrics interact in the long run accounting for the unit roots and cointegration?
3. Forecast error variance decomposition (FEVD) | (Nijs, Srinivasan, and Pauwels 2007) | What fraction of performance variance comes from each SETs?
Note: *Null hypothesis: Series are stationary
Unit roots
A mean-reverting series exhibits stationarity, whereas a series that changes permanently exhibits
evolution. Following Enders (2014), we use the augmented Dickey–Fuller (ADF) test with
evolution as the null hypothesis. We complement the ADF test with the KPSS test (stationarity
as the null hypothesis) proposed by Kwiatkowski et al. (1992). Each test is estimated in two
forms: with and without a deterministic trend. Ideally, the two tests should converge in their results
(Maddala and Kim 1998). In addition, we conduct panel unit root tests (see Table A5). We
find that all variables are stationary and enter the system in levels.
Table A5
Panel Unit Roots
Panel and individual unit root tests | Levin, Lin and Chu (no intercept, no trend) | Breitung (individual intercept and trend) | ADF-Fisher (no intercept, no trend)
Null Hypothesis | Common | Common | Individual
Volume | | |
Likes | .000 | .000 | .000
Comments | .000 | .000 | .000
Shares | .000 | .000 | .000
SVM | | |
Positive | .000 | .000 | .000
Negative | .000 | .000 | .000
Neutral | .000 | .000 | .000
LIWC | | |
Positive | .000 | .000 | .000
Negative | .000 | .000 | .000
SVM Dispersion | | |
Positive | .000 | .000 | .000
Negative | .000 | .000 | .000
Consumer Mindset Metrics | | |
Awareness | .000 | .000 | .000
Impression | .000 | .000 | .000
Purchase Intent | .000 | .000 | .000
Recommendation | .000 | .000 | .000
Satisfaction | .000 | .000 | .000
Lag Selection
The optimal lag order (“p”, the upper limit of the lag index n in the model equations) is chosen
by the Akaike Information Criterion (AIC), taking into account the Breusch–Godfrey serial
correlation Lagrange multiplier test (Breusch 1978). Our
goal is to balance lag-selection criteria with autocorrelation bias (Slotegraaf and Pauwels 2008).
We first select the appropriate lag based on the AIC, estimate the model, and check whether we
should add lags to pass diagnostic tests on residual autocorrelation (Franses 2005). We add lags
until there is no serial autocorrelation in the model. Overall, we select a model with seven lags for
each SETs-mindset metrics model for each brand.
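The lag-selection procedure can be sketched as a small loop. This is a schematic Python illustration only: the AIC values and the has_serial_correlation callback are hypothetical placeholders for the fitted VAR models and the Breusch–Godfrey test, which are not reproduced here.

```python
def select_lag_order(aic_by_lag, has_serial_correlation, max_lag):
    """Start at the AIC-minimizing lag, then add lags until residuals pass the test."""
    lag = min(aic_by_lag, key=aic_by_lag.get)
    while lag < max_lag and has_serial_correlation(lag):
        lag += 1
    return lag


# Hypothetical inputs: AIC is lowest at lag 3, but residual autocorrelation
# only disappears once seven lags are included.
aic = {1: 10.2, 2: 9.8, 3: 9.5, 4: 9.6, 5: 9.7, 6: 9.9, 7: 10.0}
chosen = select_lag_order(aic, lambda lag: lag < 7, max_lag=10)
print(chosen)  # 7
```

The max_lag argument is an added safeguard so the loop terminates even if the diagnostic test never passes.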
Separate VAR models
We estimate the VAR models for each brand separately for two reasons. First, we want to
isolate time series support or refutation for our arguments and resulting recommendations. For
instance, brand A may enjoy a large number of positive comments and a higher Purchase Intent
than brand B, but that does not mean that brand B can increase its mindset metrics by increasing
its positive comments. Estimating the model for each brand allows us both to show for how
many brands a relationship holds and to give brand-specific advice (based on a second-stage
analysis). Second, brand-specific estimation allows easier comparison with previous papers that
demonstrated the effect of social media on consumer mindset metrics (Colicev et al. 2017) and
the effectiveness of social media (Luo, Zhang, and Duan 2013; Tirunillai and Tellis 2012).
Control Variables
We control for advertising, as previous research provides wide evidence that advertising has an
impact on customer acquisition (Trusov, Bucklin, and Pauwels 2009) and mindset metrics (Colicev et
al. 2016; Srinivasan, Vanhuele, and Pauwels 2010). We include YouGov’s “advertising
awareness” measure, which proxies for advertising expenditures.
Model Estimation
We estimate five main models, one for each type of SETs, four models that combine different
SETs in the same model, and a final model that combines the best-performing individual metrics.
Specifically, in Models 1-5 we estimate how the Volume metrics (likes,
comments, shares), SVM negative and positive, LIWC negative and positive, SVM positive and
negative dispersion, and SVM positive, negative and neutral comments, respectively, explain the
five consumer mindset metrics (Awareness, Impression, Purchase Intent, Satisfaction and
Recommendation) (see Equations 1-5). Next, we test how different combinations of SETs can
explain the consumer mindset metrics in Models 6-9. In particular, Model 6 combines Volume
and SVM Neutral, Model 7 combines Volume and LIWC, Model 8 combines LIWC and SVM
Neutral, and Model 9 combines Volume, LIWC and SVM Neutral. Finally, from the above
models we select the best-performing individual metrics and combine them in Model 10.
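As a compact overview, the ten model specifications can be written down as lists of endogenous variables. The Python sketch below is only a bookkeeping aid: the variable labels follow the model descriptions in this appendix, while the names MODELS and endogenous_variables are hypothetical. It shows how many equations each VAR system contains.

```python
MINDSET = ["Awareness", "Impression", "Purchase Intent", "Satisfaction", "Recommendation"]
VOLUME = ["Likes", "Comments", "Shares"]
SVM = ["SVM positive", "SVM negative"]
SVM_NEUTRAL = SVM + ["SVM neutral"]
LIWC = ["LIWC positive", "LIWC negative"]
SVM_DISPERSION = ["SVM positive dispersion", "SVM negative dispersion"]

# SETs entering each of the ten models.
MODELS = {
    1: VOLUME,
    2: SVM,
    3: LIWC,
    4: SVM_DISPERSION,
    5: SVM_NEUTRAL,
    6: VOLUME + SVM_NEUTRAL,
    7: VOLUME + LIWC,
    8: LIWC + SVM_NEUTRAL,
    9: VOLUME + LIWC + SVM_NEUTRAL,
    10: ["Likes", "LIWC negative", "SVM positive"],
}


def endogenous_variables(model):
    """SETs of the given model followed by the five mindset metrics."""
    return MODELS[model] + MINDSET


sizes = {m: len(endogenous_variables(m)) for m in MODELS}
print(sizes)  # {1: 8, 2: 7, 3: 7, 4: 7, 5: 8, 6: 11, 7: 10, 8: 10, 9: 13, 10: 8}
```

The system sizes correspond to the dimensions of the coefficient matrices in the respective model equations.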
Model Equations 1-10: Ten (10) Models of SETs and Consumer Mindset Metrics
Model 1: Volume
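\begin{equation}
\begin{bmatrix}
\text{Likes}_{t} \\ \text{Comments}_{t} \\ \text{Shares}_{t} \\ \text{Awareness}_{t} \\ \text{Impression}_{t} \\ \text{Purchase Intent}_{t} \\ \text{Satisfaction}_{t} \\ \text{Recommendation}_{t}
\end{bmatrix}
= \sum_{n=1}^{p}
\begin{bmatrix}
\gamma_{1,1}^{n} & \cdots & \gamma_{1,8}^{n} \\
\vdots & \ddots & \vdots \\
\gamma_{8,1}^{n} & \cdots & \gamma_{8,8}^{n}
\end{bmatrix}
\begin{bmatrix}
\text{Likes}_{t-n} \\ \text{Comments}_{t-n} \\ \text{Shares}_{t-n} \\ \text{Awareness}_{t-n} \\ \text{Impression}_{t-n} \\ \text{Purchase Intent}_{t-n} \\ \text{Satisfaction}_{t-n} \\ \text{Recommendation}_{t-n}
\end{bmatrix}
+
\begin{bmatrix}
\varphi_{1,1} & \varphi_{1,2} \\
\vdots & \vdots \\
\varphi_{8,1} & \varphi_{8,2}
\end{bmatrix}
\begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix}
+
\begin{bmatrix} e_{1t} \\ e_{2t} \\ e_{3t} \\ e_{4t} \\ e_{5t} \\ e_{6t} \\ e_{7t} \\ e_{8t} \end{bmatrix}
\tag{1}
\end{equation}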
Model 2: SVM (without neutral)
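\begin{equation}
\begin{bmatrix}
\text{SVM positive}_{t} \\ \text{SVM negative}_{t} \\ \text{Awareness}_{t} \\ \text{Impression}_{t} \\ \text{Purchase Intent}_{t} \\ \text{Satisfaction}_{t} \\ \text{Recommendation}_{t}
\end{bmatrix}
= \sum_{n=1}^{p}
\begin{bmatrix}
\gamma_{1,1}^{n} & \cdots & \gamma_{1,7}^{n} \\
\vdots & \ddots & \vdots \\
\gamma_{7,1}^{n} & \cdots & \gamma_{7,7}^{n}
\end{bmatrix}
\begin{bmatrix}
\text{SVM positive}_{t-n} \\ \text{SVM negative}_{t-n} \\ \text{Awareness}_{t-n} \\ \text{Impression}_{t-n} \\ \text{Purchase Intent}_{t-n} \\ \text{Satisfaction}_{t-n} \\ \text{Recommendation}_{t-n}
\end{bmatrix}
+
\begin{bmatrix}
\varphi_{1,1} & \varphi_{1,2} \\
\vdots & \vdots \\
\varphi_{7,1} & \varphi_{7,2}
\end{bmatrix}
\begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix}
+
\begin{bmatrix} e_{1t} \\ e_{2t} \\ e_{3t} \\ e_{4t} \\ e_{5t} \\ e_{6t} \\ e_{7t} \end{bmatrix}
\tag{2}
\end{equation}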
Model 3: LIWC
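\begin{equation}
\begin{bmatrix}
\text{LIWC positive}_{t} \\ \text{LIWC negative}_{t} \\ \text{Awareness}_{t} \\ \text{Impression}_{t} \\ \text{Purchase Intent}_{t} \\ \text{Satisfaction}_{t} \\ \text{Recommendation}_{t}
\end{bmatrix}
= \sum_{n=1}^{p}
\begin{bmatrix}
\gamma_{1,1}^{n} & \cdots & \gamma_{1,7}^{n} \\
\vdots & \ddots & \vdots \\
\gamma_{7,1}^{n} & \cdots & \gamma_{7,7}^{n}
\end{bmatrix}
\begin{bmatrix}
\text{LIWC positive}_{t-n} \\ \text{LIWC negative}_{t-n} \\ \text{Awareness}_{t-n} \\ \text{Impression}_{t-n} \\ \text{Purchase Intent}_{t-n} \\ \text{Satisfaction}_{t-n} \\ \text{Recommendation}_{t-n}
\end{bmatrix}
+
\begin{bmatrix}
\varphi_{1,1} & \varphi_{1,2} \\
\vdots & \vdots \\
\varphi_{7,1} & \varphi_{7,2}
\end{bmatrix}
\begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix}
+
\begin{bmatrix} e_{1t} \\ e_{2t} \\ e_{3t} \\ e_{4t} \\ e_{5t} \\ e_{6t} \\ e_{7t} \end{bmatrix}
\tag{3}
\end{equation}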
Model 4: SVM (adjusted for dispersion)
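\begin{equation}
\begin{bmatrix}
\text{SVM positive dispersion}_{t} \\ \text{SVM negative dispersion}_{t} \\ \text{Awareness}_{t} \\ \text{Impression}_{t} \\ \text{Purchase Intent}_{t} \\ \text{Satisfaction}_{t} \\ \text{Recommendation}_{t}
\end{bmatrix}
= \sum_{n=1}^{p}
\begin{bmatrix}
\gamma_{1,1}^{n} & \cdots & \gamma_{1,7}^{n} \\
\vdots & \ddots & \vdots \\
\gamma_{7,1}^{n} & \cdots & \gamma_{7,7}^{n}
\end{bmatrix}
\begin{bmatrix}
\text{SVM positive dispersion}_{t-n} \\ \text{SVM negative dispersion}_{t-n} \\ \text{Awareness}_{t-n} \\ \text{Impression}_{t-n} \\ \text{Purchase Intent}_{t-n} \\ \text{Satisfaction}_{t-n} \\ \text{Recommendation}_{t-n}
\end{bmatrix}
+
\begin{bmatrix}
\varphi_{1,1} & \varphi_{1,2} \\
\vdots & \vdots \\
\varphi_{7,1} & \varphi_{7,2}
\end{bmatrix}
\begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix}
+
\begin{bmatrix} e_{1t} \\ e_{2t} \\ e_{3t} \\ e_{4t} \\ e_{5t} \\ e_{6t} \\ e_{7t} \end{bmatrix}
\tag{4}
\end{equation}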
Model 5: SVM (with Neutral)
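\begin{equation}
\begin{bmatrix}
\text{SVM positive}_{t} \\ \text{SVM negative}_{t} \\ \text{SVM neutral}_{t} \\ \text{Awareness}_{t} \\ \text{Impression}_{t} \\ \text{Purchase Intent}_{t} \\ \text{Satisfaction}_{t} \\ \text{Recommendation}_{t}
\end{bmatrix}
= \sum_{n=1}^{p}
\begin{bmatrix}
\gamma_{1,1}^{n} & \cdots & \gamma_{1,8}^{n} \\
\vdots & \ddots & \vdots \\
\gamma_{8,1}^{n} & \cdots & \gamma_{8,8}^{n}
\end{bmatrix}
\begin{bmatrix}
\text{SVM positive}_{t-n} \\ \text{SVM negative}_{t-n} \\ \text{SVM neutral}_{t-n} \\ \text{Awareness}_{t-n} \\ \text{Impression}_{t-n} \\ \text{Purchase Intent}_{t-n} \\ \text{Satisfaction}_{t-n} \\ \text{Recommendation}_{t-n}
\end{bmatrix}
+
\begin{bmatrix}
\varphi_{1,1} & \varphi_{1,2} \\
\vdots & \vdots \\
\varphi_{8,1} & \varphi_{8,2}
\end{bmatrix}
\begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix}
+
\begin{bmatrix} e_{1t} \\ e_{2t} \\ e_{3t} \\ e_{4t} \\ e_{5t} \\ e_{6t} \\ e_{7t} \\ e_{8t} \end{bmatrix}
\tag{5}
\end{equation}
Model 6: Volume + SVM (with Neutral)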
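\begin{equation}
\begin{bmatrix}
\text{Likes}_{t} \\ \text{Comments}_{t} \\ \text{Shares}_{t} \\ \text{SVM positive}_{t} \\ \text{SVM negative}_{t} \\ \text{SVM neutral}_{t} \\ \text{Awareness}_{t} \\ \text{Impression}_{t} \\ \text{Purchase Intent}_{t} \\ \text{Satisfaction}_{t} \\ \text{Recommendation}_{t}
\end{bmatrix}
= \sum_{n=1}^{p}
\begin{bmatrix}
\gamma_{1,1}^{n} & \cdots & \gamma_{1,11}^{n} \\
\vdots & \ddots & \vdots \\
\gamma_{11,1}^{n} & \cdots & \gamma_{11,11}^{n}
\end{bmatrix}
\begin{bmatrix}
\text{Likes}_{t-n} \\ \text{Comments}_{t-n} \\ \text{Shares}_{t-n} \\ \text{SVM positive}_{t-n} \\ \text{SVM negative}_{t-n} \\ \text{SVM neutral}_{t-n} \\ \text{Awareness}_{t-n} \\ \text{Impression}_{t-n} \\ \text{Purchase Intent}_{t-n} \\ \text{Satisfaction}_{t-n} \\ \text{Recommendation}_{t-n}
\end{bmatrix}
+
\begin{bmatrix}
\varphi_{1,1} & \varphi_{1,2} \\
\vdots & \vdots \\
\varphi_{11,1} & \varphi_{11,2}
\end{bmatrix}
\begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix}
+
\begin{bmatrix} e_{1t} \\ e_{2t} \\ e_{3t} \\ e_{4t} \\ e_{5t} \\ e_{6t} \\ e_{7t} \\ e_{8t} \\ e_{9t} \\ e_{10t} \\ e_{11t} \end{bmatrix}
\tag{6}
\end{equation}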
Model 7: Volume+ LIWC
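\begin{equation}
\begin{bmatrix}
\text{Likes}_{t} \\ \text{Comments}_{t} \\ \text{Shares}_{t} \\ \text{LIWC positive}_{t} \\ \text{LIWC negative}_{t} \\ \text{Awareness}_{t} \\ \text{Impression}_{t} \\ \text{Purchase Intent}_{t} \\ \text{Satisfaction}_{t} \\ \text{Recommendation}_{t}
\end{bmatrix}
= \sum_{n=1}^{p}
\begin{bmatrix}
\gamma_{1,1}^{n} & \cdots & \gamma_{1,10}^{n} \\
\vdots & \ddots & \vdots \\
\gamma_{10,1}^{n} & \cdots & \gamma_{10,10}^{n}
\end{bmatrix}
\begin{bmatrix}
\text{Likes}_{t-n} \\ \text{Comments}_{t-n} \\ \text{Shares}_{t-n} \\ \text{LIWC positive}_{t-n} \\ \text{LIWC negative}_{t-n} \\ \text{Awareness}_{t-n} \\ \text{Impression}_{t-n} \\ \text{Purchase Intent}_{t-n} \\ \text{Satisfaction}_{t-n} \\ \text{Recommendation}_{t-n}
\end{bmatrix}
+
\begin{bmatrix}
\varphi_{1,1} & \varphi_{1,2} \\
\vdots & \vdots \\
\varphi_{10,1} & \varphi_{10,2}
\end{bmatrix}
\begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix}
+
\begin{bmatrix} e_{1t} \\ e_{2t} \\ e_{3t} \\ e_{4t} \\ e_{5t} \\ e_{6t} \\ e_{7t} \\ e_{8t} \\ e_{9t} \\ e_{10t} \end{bmatrix}
\tag{7}
\end{equation}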
Model 8: LIWC+ SVM Neutral
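\begin{equation}
\begin{bmatrix}
\text{LIWC positive}_{t} \\ \text{LIWC negative}_{t} \\ \text{SVM positive}_{t} \\ \text{SVM negative}_{t} \\ \text{SVM neutral}_{t} \\ \text{Awareness}_{t} \\ \text{Impression}_{t} \\ \text{Purchase Intent}_{t} \\ \text{Satisfaction}_{t} \\ \text{Recommendation}_{t}
\end{bmatrix}
= \sum_{n=1}^{p}
\begin{bmatrix}
\gamma_{1,1}^{n} & \cdots & \gamma_{1,10}^{n} \\
\vdots & \ddots & \vdots \\
\gamma_{10,1}^{n} & \cdots & \gamma_{10,10}^{n}
\end{bmatrix}
\begin{bmatrix}
\text{LIWC positive}_{t-n} \\ \text{LIWC negative}_{t-n} \\ \text{SVM positive}_{t-n} \\ \text{SVM negative}_{t-n} \\ \text{SVM neutral}_{t-n} \\ \text{Awareness}_{t-n} \\ \text{Impression}_{t-n} \\ \text{Purchase Intent}_{t-n} \\ \text{Satisfaction}_{t-n} \\ \text{Recommendation}_{t-n}
\end{bmatrix}
+
\begin{bmatrix}
\varphi_{1,1} & \varphi_{1,2} \\
\vdots & \vdots \\
\varphi_{10,1} & \varphi_{10,2}
\end{bmatrix}
\begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix}
+
\begin{bmatrix} e_{1t} \\ e_{2t} \\ e_{3t} \\ e_{4t} \\ e_{5t} \\ e_{6t} \\ e_{7t} \\ e_{8t} \\ e_{9t} \\ e_{10t} \end{bmatrix}
\tag{8}
\end{equation}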
Model 9: Volume+ LIWC+SVM (with Neutral)
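\begin{equation}
\begin{bmatrix}
\text{Likes}_{t} \\ \text{Comments}_{t} \\ \text{Shares}_{t} \\ \text{LIWC positive}_{t} \\ \text{LIWC negative}_{t} \\ \text{SVM positive}_{t} \\ \text{SVM negative}_{t} \\ \text{SVM neutral}_{t} \\ \text{Awareness}_{t} \\ \text{Impression}_{t} \\ \text{Purchase Intent}_{t} \\ \text{Satisfaction}_{t} \\ \text{Recommendation}_{t}
\end{bmatrix}
= \sum_{n=1}^{p}
\begin{bmatrix}
\gamma_{1,1}^{n} & \cdots & \gamma_{1,13}^{n} \\
\vdots & \ddots & \vdots \\
\gamma_{13,1}^{n} & \cdots & \gamma_{13,13}^{n}
\end{bmatrix}
\begin{bmatrix}
\text{Likes}_{t-n} \\ \text{Comments}_{t-n} \\ \text{Shares}_{t-n} \\ \text{LIWC positive}_{t-n} \\ \text{LIWC negative}_{t-n} \\ \text{SVM positive}_{t-n} \\ \text{SVM negative}_{t-n} \\ \text{SVM neutral}_{t-n} \\ \text{Awareness}_{t-n} \\ \text{Impression}_{t-n} \\ \text{Purchase Intent}_{t-n} \\ \text{Satisfaction}_{t-n} \\ \text{Recommendation}_{t-n}
\end{bmatrix}
+
\begin{bmatrix}
\varphi_{1,1} & \varphi_{1,2} \\
\vdots & \vdots \\
\varphi_{13,1} & \varphi_{13,2}
\end{bmatrix}
\begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix}
+
\begin{bmatrix} e_{1t} \\ e_{2t} \\ e_{3t} \\ e_{4t} \\ e_{5t} \\ e_{6t} \\ e_{7t} \\ e_{8t} \\ e_{9t} \\ e_{10t} \\ e_{11t} \\ e_{12t} \\ e_{13t} \end{bmatrix}
\tag{9}
\end{equation}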
Model 10: Combination of best performant metrics: Likes, Negative (LIWC) and Positive
(SVM)
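\begin{equation}
\begin{bmatrix}
\text{Likes}_{t} \\ \text{LIWC negative}_{t} \\ \text{SVM positive}_{t} \\ \text{Awareness}_{t} \\ \text{Impression}_{t} \\ \text{Purchase Intent}_{t} \\ \text{Satisfaction}_{t} \\ \text{Recommendation}_{t}
\end{bmatrix}
= \sum_{n=1}^{p}
\begin{bmatrix}
\gamma_{1,1}^{n} & \cdots & \gamma_{1,8}^{n} \\
\vdots & \ddots & \vdots \\
\gamma_{8,1}^{n} & \cdots & \gamma_{8,8}^{n}
\end{bmatrix}
\begin{bmatrix}
\text{Likes}_{t-n} \\ \text{LIWC negative}_{t-n} \\ \text{SVM positive}_{t-n} \\ \text{Awareness}_{t-n} \\ \text{Impression}_{t-n} \\ \text{Purchase Intent}_{t-n} \\ \text{Satisfaction}_{t-n} \\ \text{Recommendation}_{t-n}
\end{bmatrix}
+
\begin{bmatrix}
\varphi_{1,1} & \varphi_{1,2} \\
\vdots & \vdots \\
\varphi_{8,1} & \varphi_{8,2}
\end{bmatrix}
\begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix}
+
\begin{bmatrix} e_{1t} \\ e_{2t} \\ e_{3t} \\ e_{4t} \\ e_{5t} \\ e_{6t} \\ e_{7t} \\ e_{8t} \end{bmatrix}
\tag{10}
\end{equation}
Second-Stage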
In this section, we provide details on the second-stage estimations. To systematically uncover
contingency factors, we regress, in a second stage, the brand-specific FEVD results on brand
strength and industry sentiment. Our dependent variables are the quality scores given by the
difference in FEVD between SVM with a neutral option (the best classification method, on
average) and (1) the volume measures, (2) SVM without a neutral option, (3) LIWC and (4)
SVM dispersion. In equations (7-10) below, we show the second-stage estimation for these
differences.
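\begin{equation}
\text{FEVD}_{i}^{\text{SVM neutral} - \text{Volume}} = \ldots + \beta_{3}\,\text{BrandStrength}_{i} \times \text{Average Industry Sentiment}_{i} + \beta_{4}\,\text{Search}_{i} + \beta_{5}\,\text{Search}_{i} \times \text{BrandStrength}_{i} + \varepsilon_{i}
\tag{7}
\end{equation}
\begin{equation}
\text{FEVD}_{i}^{\text{SVM neutral} - \text{SVM}} = \ldots + \beta_{3}\,\text{BrandStrength}_{i} \times \text{Average Industry Sentiment}_{i} + \beta_{4}\,\text{Search}_{i} + \beta_{5}\,\text{Search}_{i} \times \text{BrandStrength}_{i} + \varepsilon_{i}
\tag{8}
\end{equation}
\begin{equation}
\text{FEVD}_{i}^{\text{SVM neutral} - \text{LIWC}} = \ldots + \beta_{3}\,\text{BrandStrength}_{i} \times \text{Average Industry Sentiment}_{i} + \beta_{4}\,\text{Search}_{i} + \beta_{5}\,\text{Search}_{i} \times \text{BrandStrength}_{i} + \varepsilon_{i}
\tag{9}
\end{equation}
\begin{equation}
\text{FEVD}_{i}^{\text{SVM neutral} - \text{SVM dispersion}} = \ldots + \beta_{3}\,\text{BrandStrength}_{i} \times \text{Average Industry Sentiment}_{i} + \beta_{4}\,\text{Search}_{i} + \beta_{5}\,\text{Search}_{i} \times \text{BrandStrength}_{i} + \varepsilon_{i}
\tag{10}
\end{equation}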
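To make the dependent variables concrete, the following Python sketch computes the four FEVD differences for a single brand. The input values are the Awareness FEVD shares reported for Abercrombie Fitch in Table A6; the function name quality_scores and the sign convention (SVM neutral minus the alternative) are assumptions made for illustration.

```python
# Awareness FEVD shares for one brand (Abercrombie Fitch row of Table A6).
fevd = {
    "Vol": 0.057,
    "SVM": 0.068,
    "LIWC": 0.067,
    "SVM dispersion": 0.025,
    "SVM neutral": 0.075,
}


def quality_scores(fevd_row, reference="SVM neutral"):
    """FEVD of the reference method minus the FEVD of each alternative method."""
    return {
        name: round(fevd_row[reference] - value, 3)
        for name, value in fevd_row.items()
        if name != reference
    }


print(quality_scores(fevd))
```

A positive score means that SVM with a neutral option explains a larger share of the forecast error variance than the alternative; these brand-level differences are the dependent variables in the second-stage regressions.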
Web Appendix E: Brand-level results
This part of the Appendix contains the brand-level results for each mindset metric. We present
these results in Tables A6-A10.
Table A6: FEVD Brand Level Results for Awareness
Brand Industry Vol SVM LIWC SVM (Dispersion) SVM (Neutral) Vol+SVM (Neutral) Vol+LIWC LIWC+SVM (Neutral) Vol+LIWC+SVM (Neutral) Likes+LIWC(negative)+SVM(positive)
Abercrombie Fitch Fashion 0.057 0.068 0.067 0.025 0.075 0.109 0.101 0.094 0.135 0.081
Aeropostale Fashion 0.044 0.027 0.035 0.035 0.044 0.089 0.089 0.099 0.135 0.064
American Airlines Airlines 0.078 0.032 0.036 0.046 0.044 0.120 0.119 0.061 0.144 0.079
Apple Electronics 0.066 0.074 0.041 0.047 0.083 0.128 0.085 0.115 0.146 0.077
Audi Cars 0.078 0.077 0.089 0.031 0.086 0.099 0.100 0.128 0.134 0.082
BMW Cars 0.048 0.021 0.031 0.069 0.037 0.089 0.066 0.077 0.114 0.041
Burger King Gastronomy 0.082 0.029 0.032 0.033 0.056 0.114 0.109 0.086 0.224 0.064
Donatos Gastronomy 0.040 0.021 0.015 0.029 0.037 0.080 0.066 0.048 0.112 0.030
DosEquis Beverages 0.028 0.032 0.018 0.022 0.045 0.080 0.048 0.071 0.100 0.029
Fidelity Banks 0.027 0.036 0.042 0.032 0.043 0.072 0.070 0.082 0.107 0.060
Fifth Third Bank Banks 0.073 0.014 0.019 0.030 0.034 0.111 0.093 0.066 0.147 0.064
Ford Cars 0.064 0.035 0.034 0.015 0.041 0.095 0.094 0.066 0.126 0.048
Frontier Airlines 0.047 0.035 0.050 0.047 0.063 0.112 0.095 0.095 0.139 0.045
Hennessy Beverages 0.032 0.036 0.045 0.026 0.053 0.092 0.086 0.095 0.133 0.053
Huntington Bank Banks 0.029 0.019 0.019 0.018 0.032 0.064 0.059 0.062 0.094 0.023
JPMorgan Banks 0.041 0.050 0.031 0.039 0.063 0.092 0.067 0.080 0.104 0.061
Jack Daniels Beverages 0.055 0.036 0.041 0.028 0.057 0.094 0.084 0.087 0.123 0.057
Jameson Beverages 0.039 0.020 0.026 0.031 0.031 0.078 0.074 0.066 0.119 0.043
JennAir Airlines 0.020 0.029 0.019 0.040 0.067 0.087 0.047 0.079 0.105 0.032
JetBlue Airlines 0.036 0.019 0.043 0.023 0.050 0.088 0.083 0.082 0.117 0.049
JosABank Fashion 0.053 0.021 0.031 0.021 0.047 0.095 0.077 0.084 0.131 0.044
Kettle Brand Chips Food 0.047 0.033 0.026 0.025 0.054 0.098 0.067 0.083 0.119 0.049
Kia Cars 0.098 0.014 0.018 0.030 0.023 0.126 0.118 0.039 0.141 0.095
Kona Grill Gastronomy 0.054 0.028 0.034 0.021 0.038 0.106 0.082 0.066 0.137 0.054
Lenovo Electronics 0.056 0.027 0.014 0.045 0.058 0.107 0.084 0.095 0.140 0.050
Lexus Cars 0.058 0.048 0.021 0.055 0.068 0.114 0.087 0.101 0.145 0.040
Liberty Mutual Banks 0.152 0.076 0.070 0.068 0.094 0.202 0.172 0.120 0.222 0.149
Lufthansa Airlines 0.035 0.101 0.074 0.101 0.107 0.118 0.099 0.114 0.128 0.116
McDonalds Gastronomy 0.029 0.019 0.022 0.011 0.023 0.060 0.054 0.051 0.090 0.033
Nestea Food 0.102 0.085 0.069 0.060 0.095 0.157 0.131 0.109 0.174 0.119
Nine West Fashion 0.066 0.048 0.029 0.022 0.059 0.100 0.084 0.080 0.130 0.055
NorthFace Fashion 0.046 0.046 0.036 0.029 0.057 0.103 0.081 0.092 0.135 0.046
PNC Bank Banks 0.039 0.032 0.040 0.013 0.039 0.060 0.072 0.077 0.098 0.040
Pepsi Food 0.046 0.023 0.030 0.019 0.033 0.090 0.073 0.062 0.118 0.040
Qantas Airlines 0.031 0.026 0.034 0.006 0.032 0.058 0.063 0.050 0.077 0.055
Rabobank Banks 0.038 0.037 0.018 0.047 0.051 0.097 0.062 0.071 0.116 0.046
Samsung Electronics 0.026 0.024 0.014 0.034 0.041 0.066 0.040 0.054 0.078 0.024
SanDisk Electronics 0.037 0.032 0.026 0.033 0.041 0.092 0.074 0.059 0.108 0.035
Singapore Airlines Airlines 0.034 0.028 0.029 0.049 0.037 0.092 0.087 0.083 0.127 0.054
Smirnoff Beverages 0.067 0.050 0.043 0.048 0.060 0.106 0.092 0.074 0.119 0.077
Sony Electronics 0.032 0.023 0.016 0.029 0.032 0.068 0.047 0.044 0.076 0.023
Southwest Airlines 0.054 0.027 0.022 0.065 0.043 0.097 0.074 0.077 0.119 0.045
Starbucks Frappucino Gastronomy 0.050 0.026 0.038 0.029 0.032 0.096 0.102 0.072 0.124 0.059
Subaru Cars 0.034 0.031 0.020 0.058 0.061 0.101 0.062 0.081 0.122 0.028
Tostitos Food 0.065 0.059 0.051 0.014 0.076 0.095 0.088 0.092 0.107 0.065
US Bank Banks 0.038 0.028 0.037 0.028 0.045 0.079 0.062 0.070 0.099 0.053
Volkswagen Cars 0.049 0.021 0.023 0.032 0.034 0.084 0.064 0.056 0.099 0.036
Volvo Cars 0.060 0.022 0.024 0.010 0.041 0.089 0.078 0.052 0.101 0.048
Table A7: FEVD Brand Level Results for Impression
Brand Industry Vol SVM LIWC SVM (Dispersion) SVM (Neutral) Vol+SVM (Neutral) Vol+LIWC LIWC+SVM (Neutral) Vol+LIWC+SVM (Neutral) Likes+LIWC(negative)+SVM(positive)
Abercrombie Fitch Fashion 0.089 0.118 0.087 0.123 0.130 0.165 0.127 0.140 0.184 0.127
Aeropostale Fashion 0.038 0.029 0.017 0.045 0.042 0.083 0.063 0.059 0.100 0.032
American Airlines Airlines 0.076 0.047 0.041 0.038 0.071 0.139 0.115 0.112 0.174 0.078
Apple Electronics 0.038 0.025 0.045 0.038 0.048 0.093 0.077 0.079 0.122 0.054
Audi Cars 0.049 0.041 0.040 0.037 0.056 0.088 0.085 0.084 0.110 0.055
BMW Cars 0.028 0.015 0.023 0.029 0.047 0.078 0.055 0.060 0.105 0.035
Burger King Gastronomy 0.039 0.033 0.025 0.029 0.042 0.074 0.069 0.072 0.160 0.049
Donatos Gastronomy 0.043 0.028 0.013 0.069 0.037 0.085 0.063 0.057 0.115 0.031
DosEquis Beverages 0.058 0.032 0.024 0.025 0.052 0.104 0.081 0.081 0.128 0.040
Fidelity Banks 0.048 0.034 0.024 0.033 0.044 0.084 0.075 0.063 0.107 0.042
Fifth Third Bank Banks 0.037 0.024 0.034 0.020 0.035 0.073 0.068 0.076 0.115 0.038
Ford Cars 0.057 0.013 0.014 0.011 0.017 0.079 0.076 0.039 0.098 0.041
Frontier Airlines 0.041 0.029 0.023 0.019 0.051 0.092 0.063 0.086 0.119 0.041
Hennessy Beverages 0.047 0.047 0.060 0.045 0.067 0.098 0.110 0.121 0.147 0.092
Huntington Bank Banks 0.029 0.018 0.019 0.027 0.037 0.067 0.042 0.055 0.091 0.033
JPMorgan Banks 0.059 0.036 0.036 0.043 0.052 0.115 0.090 0.081 0.138 0.055
Jack Daniels Beverages 0.024 0.023 0.035 0.018 0.030 0.050 0.055 0.062 0.081 0.036
Jameson Beverages 0.088 0.032 0.023 0.047 0.056 0.130 0.102 0.065 0.142 0.059
JennAir Airlines 0.024 0.039 0.032 0.033 0.059 0.088 0.058 0.083 0.116 0.047
JetBlue Airlines 0.045 0.039 0.059 0.045 0.059 0.097 0.091 0.097 0.131 0.074
JosABank Fashion 0.059 0.041 0.021 0.035 0.054 0.119 0.082 0.075 0.140 0.065
Kettle Brand Chips Food 0.033 0.021 0.019 0.017 0.034 0.058 0.047 0.058 0.081 0.039
Kia Cars 0.030 0.017 0.023 0.030 0.026 0.057 0.052 0.050 0.085 0.039
Kona Grill Gastronomy 0.036 0.018 0.040 0.015 0.035 0.062 0.083 0.089 0.124 0.049
Lenovo Electronics 0.062 0.049 0.038 0.027 0.071 0.135 0.105 0.100 0.163 0.086
Lexus Cars 0.052 0.012 0.016 0.035 0.047 0.099 0.068 0.067 0.121 0.032
Liberty Mutual Banks 0.037 0.017 0.032 0.032 0.033 0.093 0.076 0.065 0.121 0.053
Lufthansa Airlines 0.026 0.039 0.032 0.047 0.054 0.072 0.050 0.070 0.087 0.045
McDonalds Gastronomy 0.047 0.033 0.027 0.026 0.042 0.096 0.079 0.080 0.130 0.070
Nestea Food 0.044 0.025 0.021 0.023 0.034 0.081 0.066 0.050 0.099 0.036
Nine West Fashion 0.040 0.023 0.024 0.017 0.036 0.073 0.060 0.066 0.099 0.030
NorthFace Fashion 0.051 0.034 0.031 0.022 0.037 0.082 0.076 0.058 0.106 0.035
PNC Bank Banks 0.052 0.046 0.042 0.048 0.055 0.103 0.084 0.087 0.132 0.078
Pepsi Food 0.040 0.030 0.032 0.018 0.051 0.077 0.073 0.068 0.100 0.038
Qantas Airlines 0.020 0.032 0.020 0.018 0.036 0.054 0.051 0.056 0.076 0.025
Rabobank Banks 0.056 0.010 0.025 0.032 0.039 0.100 0.087 0.067 0.130 0.046
Samsung Electronics 0.032 0.018 0.014 0.019 0.024 0.063 0.057 0.049 0.092 0.030
SanDisk Electronics 0.037 0.025 0.012 0.041 0.039 0.094 0.058 0.057 0.118 0.025
Singapore Airlines Airlines 0.032 0.025 0.033 0.061 0.043 0.085 0.074 0.078 0.110 0.033
Smirnoff Beverages 0.025 0.019 0.012 0.024 0.028 0.063 0.044 0.063 0.097 0.030
Sony Electronics 0.043 0.027 0.024 0.034 0.050 0.094 0.074 0.076 0.120 0.031
Southwest Airlines 0.059 0.041 0.038 0.039 0.062 0.100 0.090 0.098 0.140 0.072
Starbucks Frappucino Gastronomy 0.032 0.009 0.019 0.024 0.021 0.056 0.050 0.053 0.092 0.024
Subaru Cars 0.039 0.028 0.042 0.025 0.040 0.067 0.075 0.078 0.102 0.052
Tostitos Food 0.037 0.037 0.041 0.023 0.049 0.097 0.075 0.077 0.121 0.051
US Bank Banks 0.061 0.041 0.053 0.022 0.065 0.123 0.110 0.103 0.163 0.069
Volkswagen Cars 0.058 0.032 0.035 0.057 0.041 0.096 0.090 0.088 0.134 0.067
Volvo Cars 0.025 0.023 0.038 0.022 0.031 0.056 0.060 0.058 0.078 0.030
Table A8: FEVD Brand Level Results for Purchase Intent
Brand Industry Vol SVM LIWC SVM (Dispersion) SVM (Neutral) Vol+SVM (Neutral) Vol+LIWC LIWC+SVM (Neutral) Vol+LIWC+SVM (Neutral) Likes+LIWC(negative)+SVM(positive)
Abercrombie Fitch Fashion 0.032 0.031 0.027 0.042 0.037 0.069 0.057 0.057 0.087 0.038
Aeropostale Fashion 0.070 0.036 0.028 0.045 0.051 0.096 0.084 0.069 0.115 0.053
American Airlines Airlines 0.078 0.024 0.027 0.044 0.038 0.126 0.109 0.056 0.154 0.065
Apple Electronics 0.035 0.018 0.012 0.025 0.032 0.065 0.051 0.044 0.074 0.035
Audi Cars 0.036 0.019 0.035 0.033 0.023 0.069 0.079 0.074 0.110 0.058
BMW Cars 0.076 0.048 0.060 0.029 0.066 0.122 0.105 0.102 0.150 0.071
Burger King Gastronomy 0.068 0.046 0.039 0.046 0.066 0.116 0.103 0.089 0.231 0.072
Donatos Gastronomy 0.031 0.022 0.025 0.027 0.028 0.067 0.057 0.065 0.134 0.030
DosEquis Beverages 0.066 0.045 0.029 0.016 0.062 0.119 0.091 0.071 0.130 0.062
Fidelity Banks 0.050 0.026 0.034 0.035 0.035 0.089 0.087 0.062 0.117 0.048
Fifth Third Bank Banks 0.040 0.017 0.019 0.020 0.042 0.085 0.060 0.086 0.133 0.025
Ford Cars 0.033 0.016 0.019 0.022 0.022 0.059 0.052 0.051 0.084 0.032
Frontier Airlines 0.051 0.024 0.029 0.014 0.042 0.093 0.075 0.064 0.114 0.045
Hennessy Beverages 0.078 0.040 0.034 0.021 0.048 0.107 0.091 0.078 0.132 0.064
Huntington Bank Banks 0.032 0.031 0.022 0.018 0.051 0.101 0.058 0.074 0.124 0.049
JPMorgan Banks 0.024 0.029 0.038 0.025 0.038 0.069 0.071 0.078 0.117 0.070
Jack Daniels Beverages 0.029 0.022 0.022 0.025 0.034 0.069 0.053 0.057 0.090 0.038
Jameson Beverages 0.059 0.020 0.018 0.022 0.025 0.089 0.082 0.048 0.115 0.033
JennAir Airlines 0.050 0.011 0.017 0.023 0.022 0.077 0.076 0.065 0.123 0.026
JetBlue Airlines 0.062 0.035 0.034 0.031 0.041 0.085 0.080 0.073 0.120 0.070
JosABank Fashion 0.036 0.041 0.042 0.031 0.064 0.094 0.069 0.092 0.121 0.050
Kettle Brand Chips Food 0.063 0.010 0.009 0.027 0.018 0.090 0.077 0.045 0.109 0.020
Kia Cars 0.028 0.039 0.033 0.039 0.055 0.087 0.065 0.086 0.115 0.040
Kona Grill Gastronomy 0.063 0.046 0.068 0.041 0.077 0.121 0.121 0.124 0.173 0.077
Lenovo Electronics 0.053 0.022 0.016 0.017 0.028 0.079 0.066 0.053 0.098 0.040
Lexus Cars 0.042 0.013 0.020 0.024 0.030 0.068 0.060 0.067 0.106 0.030
Liberty Mutual Banks 0.044 0.017 0.023 0.024 0.034 0.076 0.057 0.055 0.091 0.040
Lufthansa Airlines 0.044 0.030 0.016 0.023 0.050 0.113 0.075 0.080 0.143 0.039
McDonalds Gastronomy 0.041 0.025 0.030 0.031 0.036 0.084 0.073 0.073 0.118 0.040
Nestea Food 0.041 0.016 0.016 0.034 0.051 0.082 0.062 0.068 0.096 0.037
Nine West Fashion 0.034 0.015 0.018 0.033 0.038 0.071 0.051 0.061 0.097 0.032
NorthFace Fashion 0.020 0.009 0.018 0.026 0.019 0.037 0.037 0.040 0.067 0.017
PNC Bank Banks 0.050 0.032 0.034 0.035 0.060 0.101 0.073 0.072 0.114 0.047
Pepsi Food 0.039 0.019 0.013 0.023 0.027 0.072 0.047 0.045 0.084 0.040
Qantas Airlines 0.043 0.024 0.019 0.022 0.028 0.071 0.064 0.049 0.084 0.049
Rabobank Banks 0.050 0.014 0.027 0.023 0.041 0.093 0.085 0.080 0.133 0.033
Samsung Electronics 0.037 0.018 0.026 0.033 0.037 0.068 0.070 0.072 0.101 0.039
SanDisk Electronics 0.030 0.018 0.020 0.038 0.028 0.065 0.052 0.063 0.088 0.031
Singapore Airlines Airlines 0.039 0.021 0.022 0.028 0.042 0.078 0.062 0.058 0.096 0.032
Smirnoff Beverages 0.021 0.021 0.018 0.035 0.031 0.050 0.036 0.060 0.074 0.025
Sony Electronics 0.042 0.020 0.021 0.027 0.030 0.063 0.063 0.054 0.097 0.038
Southwest Airlines 0.033 0.021 0.026 0.032 0.041 0.075 0.066 0.071 0.103 0.032
Starbucks Frappucino Gastronomy 0.036 0.039 0.029 0.033 0.046 0.080 0.067 0.087 0.129 0.036
Subaru Cars 0.051 0.022 0.014 0.031 0.044 0.098 0.067 0.058 0.118 0.037
Tostitos Food 0.056 0.046 0.044 0.013 0.066 0.104 0.091 0.085 0.118 0.055
US Bank Banks 0.035 0.030 0.029 0.034 0.046 0.077 0.072 0.079 0.112 0.043
Volkswagen Cars 0.035 0.021 0.017 0.023 0.035 0.071 0.051 0.057 0.089 0.032
Volvo Cars 0.075 0.076 0.067 0.019 0.088 0.142 0.128 0.134 0.182 0.092
Table A9: FEVD Brand Level Results for Satisfaction
Brand Industry Vol SVM LIWC SVM (Dispersion) SVM (Neutral) Vol+SVM (Neutral) Vol+LIWC LIWC+SVM (Neutral) Vol+LIWC+SVM (Neutral) Likes+LIWC(negative)+SVM(positive)
Abercrombie Fitch Fashion 0.056 0.022 0.025 0.034 0.028 0.085 0.935 0.064 0.117 0.053
Aeropostale Fashion 0.039 0.044 0.011 0.046 0.061 0.115 0.946 0.072 0.126 0.035
American Airlines Airlines 0.072 0.069 0.040 0.032 0.122 0.196 0.963 0.147 0.222 0.108
Apple Electronics 0.039 0.027 0.022 0.027 0.046 0.083 0.943 0.064 0.099 0.041
Audi Cars 0.038 0.024 0.023 0.042 0.030 0.061 0.907 0.044 0.086 0.039
BMW Cars 0.044 0.029 0.033 0.027 0.039 0.087 0.914 0.065 0.103 0.057
Burger King Gastronomy 0.042 0.019 0.021 0.022 0.030 0.075 0.947 0.063 0.146 0.041
Donatos Gastronomy 0.041 0.024 0.027 0.042 0.034 0.068 0.950 0.056 0.113 0.046
DosEquis Beverages 0.040 0.030 0.028 0.023 0.043 0.090 0.950 0.065 0.107 0.041
Fidelity Banks 0.038 0.017 0.020 0.034 0.033 0.072 0.929 0.056 0.094 0.034
Fifth Third Bank Banks 0.041 0.011 0.019 0.026 0.027 0.066 0.924 0.062 0.117 0.022
Ford Cars 0.047 0.023 0.018 0.038 0.030 0.089 0.964 0.055 0.108 0.046
Frontier Airlines 0.026 0.018 0.029 0.050 0.041 0.085 0.911 0.076 0.120 0.025
Hennessy Beverages 0.044 0.039 0.033 0.032 0.055 0.092 0.942 0.092 0.124 0.045
Huntington Bank Banks 0.031 0.029 0.015 0.028 0.048 0.102 0.940 0.070 0.116 0.049
JPMorgan Banks 0.029 0.031 0.047 0.023 0.046 0.086 0.939 0.106 0.131 0.072
Jack Daniels Beverages 0.022 0.021 0.035 0.019 0.045 0.062 0.937 0.074 0.098 0.033
Jameson Beverages 0.053 0.029 0.030 0.035 0.037 0.082 0.938 0.067 0.111 0.084
JennAir Airlines 0.043 0.033 0.028 0.021 0.043 0.083 0.950 0.069 0.119 0.057
JetBlue Airlines 0.045 0.045 0.058 0.030 0.076 0.126 0.946 0.131 0.170 0.063
JosABank Fashion 0.045 0.019 0.023 0.018 0.025 0.078 0.945 0.048 0.097 0.044
Kettle Brand Chips Food 0.059 0.037 0.029 0.017 0.047 0.090 0.920 0.066 0.107 0.054
Kia Cars 0.029 0.022 0.031 0.034 0.044 0.080 0.922 0.080 0.114 0.045
Kona Grill Gastronomy 0.033 0.041 0.025 0.034 0.054 0.087 0.952 0.073 0.105 0.036
Lenovo Electronics 0.033 0.028 0.034 0.008 0.040 0.070 0.919 0.066 0.100 0.050
Lexus Cars 0.039 0.038 0.038 0.047 0.052 0.086 0.945 0.086 0.122 0.053
Liberty Mutual Banks 0.028 0.034 0.031 0.033 0.050 0.080 0.940 0.067 0.092 0.031
Lufthansa Airlines 0.028 0.015 0.016 0.018 0.029 0.070 0.958 0.055 0.086 0.025
McDonalds Gastronomy 0.059 0.024 0.030 0.025 0.035 0.098 0.951 0.057 0.116 0.053
Nestea Food 0.046 0.021 0.012 0.019 0.033 0.085 0.953 0.054 0.101 0.041
Nine West Fashion 0.024 0.030 0.029 0.014 0.066 0.091 0.935 0.092 0.119 0.032
NorthFace Fashion 0.047 0.024 0.019 0.027 0.038 0.083 0.935 0.050 0.108 0.035
PNC Bank Banks 0.031 0.024 0.033 0.022 0.033 0.072 0.940 0.080 0.119 0.038
Pepsi Food 0.038 0.029 0.014 0.024 0.038 0.067 0.928 0.049 0.084 0.036
Qantas Airlines 0.042 0.034 0.031 0.015 0.042 0.074 0.955 0.060 0.099 0.049
Rabobank Banks 0.065 0.017 0.020 0.022 0.037 0.105 0.960 0.063 0.128 0.028
Samsung Electronics 0.039 0.027 0.015 0.034 0.035 0.083 0.932 0.067 0.115 0.041
SanDisk Electronics 0.058 0.022 0.025 0.046 0.049 0.106 0.953 0.070 0.124 0.046
Singapore Airlines Airlines 0.035 0.022 0.017 0.037 0.032 0.069 0.935 0.045 0.081 0.033
Smirnoff Beverages 0.041 0.042 0.035 0.037 0.058 0.087 0.934 0.081 0.110 0.050
Sony Electronics 0.045 0.026 0.025 0.029 0.042 0.077 0.891 0.062 0.092 0.042
Southwest Airlines 0.041 0.018 0.021 0.048 0.034 0.085 0.933 0.061 0.117 0.044
Starbucks Frappucino Gastronomy 0.028 0.010 0.023 0.037 0.017 0.053 0.919 0.051 0.092 0.041
Subaru Cars 0.065 0.026 0.018 0.037 0.038 0.097 0.938 0.070 0.135 0.047
Tostitos Food 0.040 0.015 0.026 0.025 0.021 0.077 0.915 0.053 0.095 0.023
US Bank Banks 0.025 0.035 0.035 0.053 0.069 0.105 0.933 0.097 0.140 0.051
Volkswagen Cars 0.041 0.032 0.025 0.034 0.042 0.078 0.953 0.070 0.107 0.041
Volvo Cars 0.035 0.021 0.027 0.027 0.029 0.062 0.933 0.063 0.098 0.037
Table A10: FEVD Brand Level Results for Recommendation
Brand Industry Vol SVM LIWC SVM (Dispersion) SVM (Neutral) Vol+SVM (Neutral) Vol+LIWC LIWC+SVM (Neutral) Vol+LIWC+SVM (Neutral) Likes+LIWC(negative)+SVM(positive)
Abercrombie Fitch Fashion 0.092 0.112 0.096 0.106 0.123 0.163 0.082 0.136 0.189 0.128
Aeropostale Fashion 0.027 0.032 0.014 0.031 0.041 0.081 0.061 0.056 0.097 0.030
American Airlines Airlines 0.067 0.047 0.026 0.044 0.082 0.145 0.134 0.119 0.172 0.059
Apple Electronics 0.043 0.030 0.046 0.017 0.053 0.080 0.054 0.079 0.113 0.061
Audi Cars 0.040 0.030 0.026 0.032 0.039 0.091 0.057 0.050 0.106 0.049
BMW Cars 0.039 0.022 0.033 0.018 0.047 0.072 0.073 0.083 0.125 0.040
Burger King Gastronomy 0.057 0.041 0.046 0.019 0.056 0.104 0.078 0.096 0.243 0.064
Donatos Gastronomy 0.047 0.033 0.031 0.064 0.039 0.074 0.059 0.060 0.125 0.040
DosEquis Beverages 0.038 0.026 0.024 0.023 0.050 0.092 0.065 0.072 0.101 0.034
Fidelity Banks 0.048 0.020 0.017 0.026 0.029 0.087 0.066 0.067 0.120 0.030
Fifth Third Bank Banks 0.039 0.014 0.032 0.017 0.032 0.066 0.068 0.064 0.091 0.026
Ford Cars 0.042 0.011 0.019 0.012 0.021 0.073 0.071 0.054 0.102 0.028
Frontier Airlines 0.043 0.035 0.023 0.024 0.058 0.090 0.063 0.079 0.113 0.031
Hennessy Beverages 0.044 0.023 0.050 0.017 0.029 0.072 0.071 0.077 0.126 0.061
Huntington Bank Banks 0.022 0.020 0.028 0.032 0.037 0.068 0.053 0.070 0.097 0.046
JPMorgan Banks 0.065 0.045 0.029 0.041 0.053 0.136 0.073 0.104 0.170 0.059
Jack Daniels Beverages 0.026 0.029 0.030 0.046 0.048 0.072 0.058 0.068 0.094 0.036
Jameson Beverages 0.045 0.044 0.024 0.060 0.060 0.113 0.079 0.084 0.135 0.056
JennAir Airlines 0.033 0.025 0.027 0.020 0.057 0.088 0.084 0.076 0.108 0.033
JetBlue Airlines 0.043 0.034 0.041 0.023 0.059 0.105 0.097 0.088 0.133 0.059
JosABank Fashion 0.053 0.039 0.035 0.025 0.050 0.119 0.068 0.108 0.173 0.066
Kettle Brand Chips Food 0.025 0.024 0.025 0.023 0.039 0.061 0.079 0.067 0.088 0.033
Kia Cars 0.040 0.013 0.019 0.017 0.014 0.056 0.070 0.060 0.108 0.051
Kona Grill Gastronomy 0.031 0.029 0.034 0.029 0.050 0.080 0.058 0.089 0.124 0.040
Lenovo Electronics 0.043 0.012 0.014 0.020 0.037 0.080 0.068 0.067 0.119 0.041
Lexus Cars 0.047 0.017 0.024 0.025 0.034 0.076 0.074 0.062 0.101 0.051
Liberty Mutual Banks 0.027 0.024 0.035 0.045 0.039 0.077 0.059 0.063 0.103 0.053
Lufthansa Airlines 0.038 0.041 0.037 0.040 0.053 0.087 0.047 0.071 0.104 0.053
McDonalds Gastronomy 0.054 0.016 0.031 0.024 0.020 0.084 0.083 0.061 0.119 0.060
Nestea Food 0.042 0.026 0.020 0.025 0.044 0.079 0.057 0.066 0.112 0.034
Nine West Fashion 0.030 0.033 0.032 0.020 0.053 0.071 0.051 0.079 0.097 0.037
NorthFace Fashion 0.048 0.051 0.041 0.030 0.056 0.095 0.065 0.079 0.132 0.041
PNC Bank Banks 0.047 0.039 0.039 0.051 0.047 0.094 0.070 0.072 0.115 0.058
Pepsi Food 0.041 0.024 0.026 0.015 0.042 0.073 0.051 0.061 0.101 0.043
Qantas Airlines 0.059 0.041 0.040 0.020 0.047 0.078 0.056 0.060 0.086 0.057
Rabobank Banks 0.050 0.020 0.017 0.020 0.024 0.075 0.091 0.048 0.095 0.054
Samsung Electronics 0.038 0.029 0.018 0.023 0.036 0.080 0.069 0.064 0.114 0.054
SanDisk Electronics 0.040 0.015 0.025 0.030 0.028 0.079 0.081 0.055 0.106 0.035
Singapore Airlines Airlines 0.031 0.036 0.034 0.047 0.047 0.098 0.055 0.065 0.110 0.051
Smirnoff Beverages 0.024 0.033 0.024 0.031 0.038 0.067 0.068 0.073 0.102 0.025
Sony Electronics 0.055 0.015 0.017 0.032 0.031 0.088 0.065 0.052 0.110 0.049
Southwest Airlines 0.033 0.014 0.012 0.022 0.019 0.057 0.073 0.040 0.078 0.024
Starbucks Frappucino Gastronomy 0.033 0.012 0.017 0.026 0.019 0.050 0.059 0.045 0.089 0.028
Subaru Cars 0.043 0.021 0.024 0.023 0.027 0.077 0.089 0.064 0.108 0.044
Tostitos Food 0.043 0.021 0.027 0.017 0.034 0.091 0.071 0.075 0.118 0.043
US Bank Banks 0.038 0.043 0.043 0.027 0.057 0.097 0.072 0.104 0.142 0.055
Volkswagen Cars 0.028 0.025 0.035 0.017 0.034 0.063 0.064 0.078 0.104 0.053
Volvo Cars 0.029 0.021 0.031 0.019 0.036 0.059 0.066 0.055 0.084 0.033
Figures of the Results
Figure A1: Explanatory Power (R²) across brands
Figure A2: FEVD across five models
Figure A3: FEVD for each metric
Web Appendix F: Industry Level Analysis (PVAR models)
In this part of the Appendix, we provide details on the industry-level analysis. To obtain the
industry-level estimates, we model the relation between our variables as a dynamic system of
simultaneous equations using the panel vector autoregressive (PVAR) model (Holtz-Eakin, Newey,
and Rosen 1988), which allows for unobserved individual (sector-level) heterogeneity. To estimate
our model, we use the Stata package pvar developed by Abrigo and Love (2015). PVAR has recently
been employed in the marketing (e.g., Hewett et al. 2016) and information systems literature
(e.g., Dewan and Ramaprasad 2014).
Similarly to our brand-level VAR models, we estimate a PVAR model of the following form (see
Equation 12) for each industry and each SET, where the SETs are the Volume metrics, SVM without
neutral, LIWC, SVM dispersion, and SVM with neutral. The optimal lag order (“p”) is chosen as in
the VAR models, based on a combination of selection criteria. We select the seventh-order panel
VAR because it has the smallest MAIC and Hansen’s J statistic and the largest CD.
We also checked the stability condition of the estimated panel VAR and found that it is
invertible, with all eigenvalues of the companion matrix having moduli strictly less than one
(Hamilton 1994). The panel VAR model is therefore stable, and we can proceed with the next steps
of the analysis.
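\[
\begin{bmatrix} SET_{i,t} \\ Awareness_{i,t} \\ Impression_{i,t} \\ Purchase_{i,t} \\ Satisfaction_{i,t} \end{bmatrix}
= \sum_{n=1}^{p} \Phi_{n}
\begin{bmatrix} SET_{i,t-n} \\ Awareness_{i,t-n} \\ Impression_{i,t-n} \\ Purchase_{i,t-n} \\ Satisfaction_{i,t-n} \end{bmatrix}
+ \mu_{i} + \varepsilon_{i,t} \tag{12}
\]
where \Phi_{n} is the coefficient matrix at lag n, p is the lag order, \mu_{i} are the fixed effects, and
\varepsilon_{i,t} are the error terms.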
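The stability check described above can be sketched in a few lines. This is a minimal illustration, assuming NumPy is available; the function names and the coefficient matrices are hypothetical and are not estimates from this study (the authors ran the check in Stata).

```python
import numpy as np

def companion_matrix(coef_mats):
    """Stack lag matrices [A1, ..., Ap] (each k x k) into the (k*p) x (k*p) companion matrix."""
    p = len(coef_mats)
    k = coef_mats[0].shape[0]
    top = np.hstack(coef_mats)                      # first block row: [A1 A2 ... Ap]
    if p == 1:
        return top
    # Lower block rows shift the lagged state down: [I 0]
    bottom = np.hstack([np.eye(k * (p - 1)), np.zeros((k * (p - 1), k))])
    return np.vstack([top, bottom])

def is_stable(coef_mats):
    """Return (stable?, eigenvalue moduli); stable means every modulus is strictly below one."""
    moduli = np.abs(np.linalg.eigvals(companion_matrix(coef_mats)))
    return bool(np.all(moduli < 1.0)), moduli

# Illustrative two-variable system with two lags (made-up coefficients).
A1 = np.array([[0.5, 0.1], [0.0, 0.3]])
A2 = np.array([[0.2, 0.0], [0.1, 0.1]])
stable, moduli = is_stable([A1, A2])

# A system with an explosive root, for contrast.
unstable, _ = is_stable([np.array([[1.1, 0.0], [0.0, 0.5]])])
```

For the seventh-order model with K endogenous variables, the companion matrix would be 7K x 7K, and the same check applies.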
In a panel VAR, the presence of lagged dependent variables on the right-hand side of the system
of equations biases the estimates even with large N (Nickell 1981). Therefore, the system of
equations is estimated jointly using the Generalized Method of Moments (GMM) estimator. However,
prior research has demonstrated that the fixed effects μi are correlated with the regressors
because of the lags of the dependent variables (Arellano and Bover 1995). To solve this issue,
Arellano and Bover (1995) proposed the forward orthogonal deviation (also known as the Helmert
transformation), which subtracts from each observation the average of all available future
observations, thereby minimizing data loss. Since past realizations are not included in this
transformation, they remain valid instruments and can be used to estimate the coefficients by
system GMM. In addition, the Helmert transformation does not induce autocorrelation in the error
terms (Drakos and Konstantinou 2014). To account for heteroscedasticity in the errors, we use
heteroscedasticity-robust standard errors.
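To make the forward orthogonal deviation concrete, the following is a minimal sketch for a single series without gaps. The function name and the example values are hypothetical; the scaling factor sqrt(T_t / (T_t + 1)), with T_t the number of future observations, is the standard one for this transformation and is assumed here rather than taken from the text.

```python
import math

def forward_orthogonal_deviations(series):
    """Helmert-transform one series; the last observation has no future values and is dropped."""
    transformed = []
    for t in range(len(series) - 1):
        future = series[t + 1:]
        n_future = len(future)
        weight = math.sqrt(n_future / (n_future + 1))   # keeps the error variance unchanged
        transformed.append(weight * (series[t] - sum(future) / n_future))
    return transformed

y = [1.0, 2.0, 3.0, 4.0]
fod_y = forward_orthogonal_deviations(y)

# Adding a constant (a fixed effect) to every observation leaves the result unchanged.
fod_shifted = forward_orthogonal_deviations([value + 5.0 for value in y])
```

Because each transformed value depends only on current and future observations, lagged levels of the series are untouched by the transformation, which is why they can serve as instruments.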
We then derive the FEVDs from each PVAR model for each industry and present the
findings in Tables A11-A15 and Figures A1-A5.
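For readers less familiar with FEVD, each reported value can be read as a variance share. The expression below is a sketch in standard textbook notation, assuming the orthogonalized (Cholesky) decomposition; this section does not state which FEVD variant or forecast horizon was used, and the symbols \Psi_s, P, e_j, K and h are introduced here for illustration only.

```latex
% Share of the h-step forecast error variance of variable j attributable to shocks in variable k.
% \Psi_s : moving-average coefficient matrices of the estimated (P)VAR
% P      : lower-triangular Cholesky factor of the residual covariance matrix
% e_j    : j-th column of the K x K identity matrix
\omega_{jk}(h) =
  \frac{\sum_{s=0}^{h-1} \left( e_j^{\top} \Psi_s P \, e_k \right)^{2}}
       {\sum_{s=0}^{h-1} \sum_{m=1}^{K} \left( e_j^{\top} \Psi_s P \, e_m \right)^{2}},
\qquad \sum_{k=1}^{K} \omega_{jk}(h) = 1 .
```

Under this reading, the SET column of a table gives the share of a mindset metric's forecast error variance that is attributed to shocks in that SET.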
For airlines, we find that Volume dominates for Awareness; SVM (dispersion) for Impression,
Purchase Intent and Recommendation; and SVM with neutral for Satisfaction. The airline industry
is characterized by frequent service requests, which often have neutral sentiment. Therefore,
brand managers in the airline sector should use SVM and include neutral comments in their
analysis. For banking, we find that the simple Volume SETs outperform the other SETs in all
consumer mindset metrics but Satisfaction, for which SVM Dispersion wins. This finding is in
line with our expectation, given the low average sentiment, that consumers use social media
mostly to complain about their experiences in the banking sector. Thus, ‘no news is good news’
in this industry, while UGC activity typically represents bad news. For electronics and
beverages, we find that the SVM (dispersion) metric consistently outperforms other sentiment
measures for all consumer mindset metrics. In these two industries, brands attract large amounts
of both positive and negative comments, implying high sentiment dispersion. Accordingly, managers
in these two industries should adjust for such sentiment dispersion and use SVM as their main
sentiment technique. For fashion, we find that SVM with neutral dominates for Impression and
Recommendation, while Volume dominates for Awareness, Purchase Intent and Satisfaction.
Table A11: Industry Level Results for Awareness
Industry Model 1 (Volume) Model 2 (SVM) Model 3 (LIWC) Model 4 (SVM dispersion adjusted) Model 5 (SVM with neutral)
Airlines 1.12% 0.29% 0.16% 0.75% 0.34%
Banking 16.62% 0.50% 0.90% 0.35% 0.66%
Beverages 3.72% 3.36% 2.72% 4.42% 3.71%
Cars 1.01% 0.28% 0.18% 1.12% 0.36%
Electronics 4.22% 1.45% 0.63% 16.88% 5.78%
Fashion 2.48% 1.51% 1.92% 1.15% 1.75%
Food 1.25% 0.98% 1.36% 0.41% 1.84%
Gastronomy 1.85% 0.88% 0.15% 1.09% 1.35%
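The statement that a SET "dominates" a metric within an industry amounts to taking the largest FEVD share in the corresponding table row. A minimal sketch using the Awareness shares from Table A11; the variable and function names are illustrative and not part of the original analysis.

```python
# FEVD shares (in percent) for Awareness, copied from Table A11.
# Column order: Volume, SVM, LIWC, SVM (dispersion), SVM (neutral).
SETS = ["Volume", "SVM", "LIWC", "SVM (dispersion)", "SVM (neutral)"]

fevd_awareness = {
    "Airlines":    [1.12, 0.29, 0.16, 0.75, 0.34],
    "Banking":     [16.62, 0.50, 0.90, 0.35, 0.66],
    "Beverages":   [3.72, 3.36, 2.72, 4.42, 3.71],
    "Cars":        [1.01, 0.28, 0.18, 1.12, 0.36],
    "Electronics": [4.22, 1.45, 0.63, 16.88, 5.78],
    "Fashion":     [2.48, 1.51, 1.92, 1.15, 1.75],
    "Food":        [1.25, 0.98, 1.36, 0.41, 1.84],
    "Gastronomy":  [1.85, 0.88, 0.15, 1.09, 1.35],
}

def dominant_set(shares):
    """Return the name of the SET with the largest FEVD share in one table row."""
    best_index = max(range(len(shares)), key=lambda i: shares[i])
    return SETS[best_index]

dominant = {industry: dominant_set(shares) for industry, shares in fevd_awareness.items()}

for industry, name in dominant.items():
    print(f"{industry}: {name}")
```

The same lookup can be repeated on Tables A12-A15 to compare SETs for the other mindset metrics.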
Figure A1: Industry Level Results for Awareness
Table A12: Industry Level Results for Impression
Industry Model 1 (Volume) Model 2 (SVM) Model 3 (LIWC) Model 4 (SVM dispersion adjusted) Model 5 (SVM with neutral)
Airlines 1.70% 1.31% 0.88% 2.99% 1.51%
Banking 12.02% 1.03% 1.28% 2.25% 1.32%
Beverages 1.06% 0.82% 0.54% 2.07% 0.86%
Cars 0.41% 0.36% 0.15% 0.52% 0.54%
Electronics 3.50% 0.81% 0.73% 19.23% 2.36%
Fashion 1.84% 2.02% 1.36% 1.86% 2.99%
Food 1.50% 1.08% 0.96% 0.34% 1.59%
Gastronomy 1.17% 0.77% 0.20% 0.99% 1.11%
Figure A2: Industry Level Results for Impression
[Chart of the FEVD shares in Table A12 by industry; legend: Volume, SVM, LIWC, SVM_DISP, SVM_3]
Table A13: Industry Level Results for Purchase Intent
Industry Model 1 (Volume) Model 2 (SVM) Model 3 (LIWC) Model 4 (SVM dispersion adjusted) Model 5 (SVM with neutral)
Airlines 0.38% 0.47% 0.52% 1.14% 0.76%
Banking 0.89% 0.58% 0.84% 0.83% 0.69%
Beverages 1.01% 0.85% 0.51% 1.65% 1.13%
Cars 0.67% 0.50% 0.44% 0.63% 0.66%
Electronics 5.46% 0.85% 0.53% 15.49% 5.93%
Fashion 3.14% 1.02% 1.06% 0.81% 1.66%
Food 1.47% 1.20% 0.87% 0.79% 1.49%
Gastronomy 0.33% 0.08% 0.06% 1.11% 0.16%
Figure A3: Industry Level Results for Purchase Intent
Table A14: Industry Level Results for Satisfaction
Industry Model 1 (Volume) Model 2 (SVM) Model 3 (LIWC) Model 4 (SVM dispersion adjusted) Model 5 (SVM with neutral)
Airlines 1.79% 1.26% 1.53% 0.54% 2.04%
Banking 0.92% 0.31% 0.63% 0.54% 0.55%
Beverages 2.07% 1.64% 1.53% 3.29% 1.95%
Cars 0.47% 0.68% 0.22% 0.58% 0.81%
Electronics 5.72% 0.88% 0.54% 14.74% 5.63%
Fashion 2.25% 1.05% 0.45% 0.67% 1.21%
Food 1.98% 0.92% 0.40% 1.12% 1.50%
Gastronomy 1.43% 0.85% 0.24% 1.42% 1.19%
Figure A4: Industry Level Results for Satisfaction
Table A15: Industry Level Results for Recommendation
Industry Model 1 (Volume) Model 2 (SVM) Model 3 (LIWC) Model 4 (SVM dispersion adjusted) Model 5 (SVM with neutral)
Airlines 0.46% 1.23% 0.76% 5.66% 1.62%
Banking 2.05% 0.46% 0.67% 3.13% 0.63%
Beverages 2.42% 2.10% 1.69% 3.69% 2.24%
Cars 0.49% 0.24% 0.23% 0.39% 0.45%
Electronics 4.93% 0.90% 0.67% 16.72% 2.82%
Fashion 0.51% 1.51% 1.33% 0.79% 2.63%
Food 0.90% 0.81% 0.79% 0.48% 1.29%
Gastronomy 1.45% 1.03% 0.33% 1.15% 1.38%
Figure A5: Industry Level Results for Recommendation
Web Appendix G: Forecasting
To investigate the relative benefit of using one SET over the others, we compare the
forecasting accuracy of each SET/mind-set metric combination. We assess the forecasting
accuracy of five main models and four combined models by estimating out-of-sample
forecasts. We report the results in Table A16.
We hold out the last 10 days of the sample (from 8th June 2014 to 17th June 2014) for each
brand and assess the forecasting accuracy of the model estimates. We report the Mean Absolute
Percentage Error (MAPE), Mean Absolute Error (MAE), Root Mean Square Error (RMSE) and the
Theil Index, averaged across brands for each model and mind-set metric combination.
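The four accuracy measures can be computed from a holdout window as sketched below. This is an illustration under stated assumptions: MAPE is expressed as a fraction (consistent with the magnitudes in Table A16), and the Theil index is taken to be Theil's U1 inequality coefficient, since the text does not say which variant was used. The function name and the example numbers are hypothetical.

```python
import math

def forecast_accuracy(actual, forecast):
    """Return MAPE, MAE, RMSE and Theil's U1 for paired actual and forecast values."""
    n = len(actual)
    errors = [f - a for a, f in zip(actual, forecast)]
    mae = sum(abs(e) for e in errors) / n
    # abs(a) in the denominator keeps MAPE positive for negative scores; actual values must be nonzero.
    mape = sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # Theil's U1 (assumed variant): RMSE scaled by the root mean squares of both series, bounded in [0, 1].
    rms_actual = math.sqrt(sum(a * a for a in actual) / n)
    rms_forecast = math.sqrt(sum(f * f for f in forecast) / n)
    theil = rmse / (rms_actual + rms_forecast)
    return {"MAPE": mape, "MAE": mae, "RMSE": rmse, "Theil": theil}

# Made-up daily values; in the setup described above the holdout would be the last 10 days.
holdout_actual = [10.0, 20.0, 30.0]
holdout_forecast = [12.0, 18.0, 33.0]
metrics = forecast_accuracy(holdout_actual, holdout_forecast)
```

Averaging each measure over brands, per model and mindset metric, would give one row of a table laid out like Table A16.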
Table A16: Out of Sample Dynamic Forecasting Accuracy Indexes
a) Awareness
Model MAPE MAE RMSE Theil
Volume 0.119 3.811 4.622 0.849
SVM 0.118 3.601 4.392 0.783
LIWC 0.117 3.615 4.409 0.784
SVM (Dispersion) 0.117 3.604 4.424 0.785
SVM (Neutral) 0.117 3.618 4.413 0.789
Volume+SVM (Neutral) 0.124 4.032 4.852 0.893
Volume+LIWC 0.124 3.956 4.789 0.881
LIWC+SVM (Neutral) 0.117 3.679 4.457 0.798
Volume+LIWC+SVM (Neutral) 0.122 3.975 4.786 0.881
Likes+LIWC(negative)+SVM(positive) 0.116 3.764 4.529 0.793
b) Impression
Model MAPE MAE RMSE Theil
Volume 0.341 4.890 5.947 0.783
SVM 0.339 4.618 5.596 0.709
LIWC 0.338 4.641 5.616 0.713
SVM (Dispersion) 0.358 4.752 5.734 0.708
SVM (Neutral) 0.340 4.633 5.619 0.713
Volume+SVM (Neutral) 0.355 5.346 6.481 0.864
Volume+LIWC 0.349 4.918 6.009 0.798
LIWC+SVM (Neutral) 0.344 4.725 5.732 0.735
Volume+LIWC+SVM (Neutral) 0.340 4.785 5.832 0.784
Likes+LIWC(negative)+SVM(positive) 0.346 4.803 5.809 0.747
c) Purchase Intent
Model MAPE MAE RMSE Theil
Volume 0.273 1.910 2.394 0.812
SVM 0.262 1.906 2.372 0.799
LIWC 0.252 1.909 2.375 0.809
SVM (Dispersion) 0.265 1.898 2.378 0.802
SVM (Neutral) 0.266 1.911 2.384 0.802
Volume+SVM (Neutral) 0.292 2.041 2.543 0.875
Volume+LIWC 0.278 1.940 2.418 0.829
LIWC+SVM (Neutral) 0.287 1.919 2.393 0.813
Volume+LIWC+SVM (Neutral) 0.306 2.001 2.480 0.848
Likes+LIWC(negative)+SVM(positive) 0.286 1.999 2.475 0.845
d) Satisfaction
Model MAPE MAE RMSE Theil
Volume 0.262 3.456 4.168 0.791
SVM 0.239 3.132 3.852 0.756
LIWC 0.236 3.139 3.874 0.758
SVM (Dispersion) 0.257 3.143 3.866 0.752
SVM (Neutral) 0.234 3.157 3.884 0.755
Volume+SVM (Neutral) 0.251 3.508 4.295 0.798
Volume+LIWC 0.253 3.448 4.162 0.789
LIWC+SVM (Neutral) 0.231 3.284 4.029 0.765
Volume+LIWC+SVM (Neutral) 0.246 3.283 4.011 0.803
Likes+LIWC(negative)+SVM(positive) 0.236 3.344 4.060 0.801
e) Recommendation
Model MAPE MAE RMSE Theil
Volume 0.400 4.762 5.794 0.762
SVM 0.440 4.386 5.389 0.744
LIWC 0.390 4.361 5.377 0.741
SVM (Dispersion) 0.423 4.427 5.425 0.748
SVM (Neutral) 0.431 4.395 5.403 0.743
Volume+SVM (Neutral) 0.439 4.643 5.722 0.760
Volume+LIWC 0.356 4.500 5.563 0.758
LIWC+SVM (Neutral) 0.430 4.486 5.568 0.747
Volume+LIWC+SVM (Neutral) 0.440 4.587 5.629 0.791
Likes+LIWC(negative)+SVM(positive) 0.405 4.605 5.595 0.758
References
Abrigo, Manuel and Inessa Love (2015), “Estimation of Panel Vector Autoregression in Stata: a Package of Programs.”
Agarwal, A., B. Xie, I. Vovsha, O. Rambow, and R. Passonneau (2011), “Sentiment Analysis of Twitter Data,” in Proceedings of the workshop on languages in social media, Association for Computational Linguistics, 30–38.
Arellano, Manuel and Olympia Bover (1995), “Another look at the instrumental variable estimation of error-components models,” Journal of econometrics, 68 (1), 29–51.
Bradley, M M, M K Greenwald, M C Petry, and P J Lang (1992), “Remembering pictures: pleasure and arousal in memory.,” Journal of experimental psychology: Learning, memory, and cognition, 18 (2), 379–90.
Breusch, T.S. (1978), “Testing for Autocorrelation in Dynamic Linear Models,” Australian Economic Papers, 17 (31), 334–55.
Colicev, Anatoli, Ashwin Malshe, Koen Pauwels, and Peter O’Connor (2016), “How Earned and Owned Social Media Impact Shareholder Value Through Consumer Mindset Metrics,” MSI Working Paper Series 16-115.
———, ———, ———, and ——— (2017), “Improving Consumer Mind-Set Metrics and Shareholder Value through Social Media: The Different Roles of Owned and Earned,” Journal of Marketing, (Forthcoming).
Culotta, Aron and Jennifer Cutler (2016), “Mining Brand Perceptions from Twitter Social Networks,” Marketing Science, Articles i, 1–16.
Dekimpe, Marnik G. and Dominique M. Hanssens (1999), “Sustained Spending and Persistent Response: A New Look at Long-Term Marketing Profitability,” Journal of Marketing Research, 36 (4), 397–412.
Dewan, Sanjeev and Jui Ramaprasad (2014), “Social Media, Traditional Media, and Music Sales,” MIS Quarterly, 2 (3), 101–21.
Drakos, Konstantinos and Panagiotis Th. Konstantinou (2014), “Terrorism, crime and public spending: Panel VAR evidence from Europe.,” Defence and Peace Economics, 25 (4), 349–61.
Enders, Walter (2014), Applied Econometric Time Series, New York: John Wiley & Sons.
Feinerer, Ingo (2017), “Introduction to the tm Package: Text Mining in R,” R vignette, 1–8.
Franses, Philip-Hans (2005), “On the Use of Econometric Models for Policy Simulation in Marketing,” Journal of Marketing Research, 42 (February), 4–14.
Gamon, M. (2004), “Sentiment classification on customer feedback data: noisy data, large feature vectors, and the role of linguistic analysis,” in Proceedings of the 20th international conference on Computational Linguistics, Association for Computational Linguistics, 841.
Granger, C.W.J (1969), “Investigating Causal Relations by Econometric Models and Cross-Spectral Methods,” Econometrica, 37 (3), 424–38.
Haidt, Jonathan and Craig Joseph (2004), “Intuitive Ethics: How Innately Prepared Intuitions Generate Culturally Variable Virtues,” Daedalus, 133, 55–66.
Hamilton, JD (1994), Time series analysis, Princeton University Press.
Hewett, Kelly, William Rand, Roland T Rust, and Harald J van Heerde (2016), “Brand Buzz in the Echoverse,” Journal of Marketing, 80 (3), 1–24.
Holtz-Eakin, Douglas, Whitney Newey, and Harvey S. Rosen (1988), “Estimating Vector Autoregressions with Panel Data,” Econometrica, 56 (6), 1371–95.
Johansen, Søren, Rocco Mosconi, and Bent Nielsen (2000), “Cointegration Analysis in the Presence of Structural Breaks in the Deterministic Trend,” Econometrics Journal, 3 (2), 216–49.
Jurka, Tp and Loren Collingwood (2015), “RTextTools: A Supervised Learning Package for Text Classification,” R Journal, 5 (1), 6–12.
Kübler, Raoul V, Jaap Wieringa, and Koen H. Pauwels (2017), “Big Data and Machine Learning,” in Advanced Methods for Modeling Markets, Leeflang, Wieringa, Bijmolt, and Pauwels, eds., Springer, Berlin, 1–35.
Kwiatkowski, Denis, Peter C.B. Phillips, Peter Schmidt, and Yongcheol Shin (1992), “Testing the Null Hypothesis of Stationary Against the Alternative of a Unit Root,” Journal of Econometrics, 54 (1–3), 159–78.
Luo, Xueming, Jie Zhang, and Wenjing Duan (2013), “Social media and firm equity value,” Information Systems Research, 24 (1), 146–63.
Maddala, G.S and In-Moo Kim (1998), Unit Roots, Cointegration, and Structural Change, Cambridge, UK: Cambridge University Press.
McAuley, Julian, Rahul Pandey, and Jure Leskovec (2015), “Inferring Networks of Substitutable and Complementary Products,” Knowledge Discovery and Data Mining, 12.
McAuley, Julian and Alex Yang (2016), “Addressing Complex and Subjective Product-Related Queries with Customer Reviews,” in IW3C2, 625–35.
Mullen, Tony and Nigel Collier (2004), “Sentiment analysis using support vector machines with diverse information sources,” Proceedings of Conference on Empirical Methods in Natural Language Processing, 412–18.
Naji, I. (2013), “Twitter Sentiment Analysis with Naives Bayes,” [available at http://thinknook.com/twitter-sentiment-analysis-training-corpus-dataset-2012-09-22/].
Nickell, Stephen (1981), “Biases in dynamic models with fixed effects,” Econometrica, 49 (6), 1417–26.
Nijs, Vincent R., Shuba Srinivasan, and Koen H. Pauwels (2007), “Retail-Price Drivers and Retailer Profits,” Marketing Science, 26 (4), 473–87.
Pak, Alexander and Patrick Paroubek (2010), “Twitter as a Corpus for Sentiment Analysis and Opinion Mining,” in Proceedings of the Seventh Conference on International Language Resources and Evaluation, 1320–26.
Pauwels, Koen H. and Dominique M. Hanssens (2007), “Performance Regimes and Marketing Policy Shifts,” Marketing Science.
Pennebaker, James W. (1993), “Putting stress into words: Health, linguistic, and therapeutic implications,” Behaviour Research and Therapy, 31 (6), 539–48.
Pennebaker, James W, Ryan L Boyd, Kayla Jordan, and Kate Blackburn (2015), “The development and psychometric properties of LIWC2015,” (September 2015), 1–22.
Rooderkerk, Robert P. and Koen H. Pauwels (2016), “No Comment?! The Drivers of Reactions to Online Posts in Professional Groups,” Journal of Interactive Marketing, 35, 1–15.
Slotegraaf, Rebecca J and Koen H. Pauwels (2008), “The Impact of Brand Equity and Innovation on the Long-Term Effectiveness of Promotions,” Journal of Marketing Research.
Srinivasan, Shuba, Marc Vanhuele, and Koen H. Pauwels (2010), “Mind-Set Metrics in Market Response Models: An Integrative Approach,” Journal of Marketing Research, XLVII (August), 672–84.
Tirunillai, Seshadri and Gerard J. Tellis (2012), “Does chatter really matter? Dynamics of user-generated content and stock performance,” Marketing Science, 31 (2), 198–215.
Toubia, Olivier and AT Stephen (2013), “Intrinsic vs. Image-Related Utility in Social Media: Why Do People Contribute Content to Twitter?,” Marketing Science, 32 (3), 368–92.
Trusov, Michael, Randolph E. Bucklin, and Koen H. Pauwels (2009), “Effects of Word-of-Mouth Versus Traditional Marketing: Findings from an Internet Social Networking Site,” Journal of Marketing, 73 (5), 90–102.