Customer-Base Analysis Using Aggregated Data (Or: The Joys of RCSS)
Kinshuk Jerath, Carnegie Mellon University
Peter S. Fader, Wharton/Univ. of Penn
Bruce G. S. Hardie, London Business School
2
Customer-Base Analysis
Faced with a customer transaction database, we may wish to determine:
The level of transactions we expect in future periods, both collectively and individually
Key characteristics of the cohort (e.g., degree of heterogeneity in behavior)
Formal financial metrics (such as “customer lifetime value”) to guide resource allocation decisions
3
Typical Data Structure
Models for customer-base analysis typically require access to individual-customer-level data
4
Long-Standing IT Challenges
5
Too-Much-Data Problem
6
Data Privacy Issues
7
Barriers to Disaggregate Data
Many firms may not (be able to) keep detailed individual-level records:
General weaknesses with the firm’s information systems capabilities
Corporate information silos make data integration difficult
Wariness given high-profile stories on data loss
Data protection laws (with bans on trans-border data flows)
“Anonymizing” (and other statistical disclosure control methods) costly and potentially ineffective
8
Key Challenges
What data formats are easy to create/maintain and privacy-preserving?
Can we adapt our “tried and true” models to accommodate these data limitations but still work well?
How much do we lose in the process?
9
Repeated Cross-Sectional Summary Data
10
Proof of Concept: Tuscan Lifestyles
11
Tuscan Lifestyles Data
12
How would we proceed if we had disaggregate data?
13
“Buy Till You Die” Model
Transaction Process (“Buy”)
While “alive”, a customer purchases randomly around his mean transaction rate
Transaction rates vary across customers
Dropout Process (“Till You Die”)
Each customer has an unobserved “lifetime”
Dropout rates vary across customers
14
“Shop Till You Drop”
15
The Pareto/NBD Model (Schmittlein, Morrison, and Colombo 1987)
Transaction Process:
While active, the number of transactions made by a customer follows a Poisson process with transaction rate λ
Transaction rates are distributed gamma(r,α) across the population
Dropout Process:
Each customer has an unobserved lifetime of length τ, which is distributed exponential with dropout rate μ
Dropout rates are distributed gamma(s,β) across the population
Astonishingly good fit and predictive performance
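The transaction and dropout assumptions on this slide can be sketched as a small simulation. This is only an illustration: the function names (simulate_customer, simulate_panel), the seed, and the choice of weeks as the time unit are assumptions, and the gamma distributions are taken to be parameterized by shape and rate (so the mean transaction rate is r/α).

```python
import random


def simulate_customer(r, alpha, s, beta, horizon, rng):
    """Return the purchase times of one customer in (0, horizon]."""
    # random.gammavariate takes (shape, scale), so a rate of alpha is a scale of 1/alpha.
    lam = rng.gammavariate(r, 1.0 / alpha)   # individual transaction rate (lambda)
    mu = rng.gammavariate(s, 1.0 / beta)     # individual dropout rate (mu)
    lifetime = rng.expovariate(mu)           # unobserved lifetime (tau)
    end = min(lifetime, horizon)

    # Poisson process while alive: exponential gaps between purchases.
    times = []
    t = rng.expovariate(lam)
    while t <= end:
        times.append(t)
        t += rng.expovariate(lam)
    return times


def simulate_panel(n, r, alpha, s, beta, horizon, seed=0):
    """Simulate n independent customers, all acquired at time 0."""
    rng = random.Random(seed)
    return [simulate_customer(r, alpha, s, beta, horizon, rng) for _ in range(n)]


# Illustrative parameter values only.
panel = simulate_panel(2500, r=0.5, alpha=5.0, s=0.5, beta=5.0, horizon=104.0)
```

Each element of panel is one customer's list of purchase times, which is exactly the individual-level (disaggregate) data the standard Pareto/NBD implementation needs.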
16
The Pareto/NBD works very well…
…given individual-level (disaggregate) data.
17
18
Pareto/NBD using RCSS data
Same assumptions as for the usual Pareto/NBD implementation
Calculate purchase probabilities over discrete intervals: P(X(t, t+1) = x), P(X(t+1, t+2) = x), P(X(t+2, t+3) = x), etc.
Apply to RCSS histograms and use standard MLE estimation
Parameter estimation is fast, stable, and robust
All of the usual Pareto/NBD diagnostics (e.g., “P(Alive)”) can be obtained from the parameter estimates
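The estimation idea above can be sketched as a multinomial log-likelihood over the histograms. The sketch below is not the authors' implementation: it evaluates the interval probabilities by simple numerical integration rather than a closed-form expression, it assumes every customer is acquired at time 0, it assumes the top histogram bucket is right-censored ("x or more"), and all names (nbd_pmf, survival, interval_pmf, rcss_log_likelihood) and the example counts are hypothetical.

```python
import math


def nbd_pmf(x, t, r, alpha):
    """P(x purchases in an active span of length t): Poisson mixed over gamma(r, alpha)."""
    if t <= 0.0:
        return 1.0 if x == 0 else 0.0
    log_p = (math.lgamma(r + x) - math.lgamma(r) - math.lgamma(x + 1)
             + r * math.log(alpha / (alpha + t))
             + x * math.log(t / (alpha + t)))
    return math.exp(log_p)


def survival(t, s, beta):
    """P(lifetime > t): exponential mixed over gamma(s, beta)."""
    return (beta / (beta + t)) ** s


def interval_pmf(x, a, b, r, alpha, s, beta, steps=400):
    """P(X(a, b) = x) for a customer acquired at time 0."""
    # Case 1: dropped out before the interval started, so x must be 0.
    dead_before = (1.0 - survival(a, s, beta)) if x == 0 else 0.0
    # Case 2: alive for the whole interval.
    alive_throughout = survival(b, s, beta) * nbd_pmf(x, b - a, r, alpha)
    # Case 3: dropped out at some u in (a, b]; midpoint-rule integration over u.
    h = (b - a) / steps
    died_inside = 0.0
    for k in range(steps):
        u = a + (k + 0.5) * h
        density = s * beta ** s / (beta + u) ** (s + 1)   # density of the lifetime at u
        died_inside += density * nbd_pmf(x, u - a, r, alpha) * h
    return dead_before + alive_throughout + died_inside


def rcss_log_likelihood(params, histograms):
    """histograms: list of (a, b, counts), counts[x] = customers with x purchases
    in (a, b]; the last bucket collects that many purchases or more."""
    r, alpha, s, beta = params
    ll = 0.0
    for a, b, counts in histograms:
        probs = [interval_pmf(x, a, b, r, alpha, s, beta)
                 for x in range(len(counts) - 1)]
        probs.append(max(1.0 - sum(probs), 1e-300))   # mass of the censored top bucket
        ll += sum(n * math.log(p) for n, p in zip(counts, probs) if n > 0)
    return ll


# Made-up counts for two 13-week histograms of the same 2500 customers.
histograms = [(0.0, 13.0, [2000, 300, 120, 80]),
              (13.0, 26.0, [2100, 250, 100, 50])]
ll = rcss_log_likelihood((0.5, 5.0, 0.5, 5.0), histograms)
```

Parameter estimates would then come from maximizing rcss_log_likelihood over (r, α, s, β) with any general-purpose numerical optimizer. Because each histogram carries its own (a, b) window, nothing in this formulation requires the windows to be equally long or contiguous.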
19
Model Fit
20
Do We Need All Five Years of Data?
Calibrate the model on years 1-3 only, predict for years 4 and 5.
21
Customer-Base Analysis Using Repeated Cross-Sectional Summary (RCSS) Data
Under more general conditions, what is the “information loss” from aggregating data?
Under what conditions can a model built using aggregated data accurately mimic its individual-level counterpart?
How much aggregated data is required to do this job well?
22
Reminder – RCSS Data
23
Research Design
Manipulate the four parameters of the Pareto/NBD
r, s = 0.5, 1.0, 1.5
α, β = 5, 10, 15
We have 3^4 = 81 “worlds”
For each “world,” simulate 104 weeks of data for five synthetic panels of 2500 customers (first 78 weeks for calibration, last 26 weeks for holdout)
Fit the Pareto/NBD model to the raw transaction data – obtain disaggregate LL and parameters
“Backward-looking” (“Chopping it up”) analysis
“Forward-looking” (“Build as you go”) analysis
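The full-factorial design above can be enumerated in a few lines. The variable names here are illustrative, not taken from the original study.

```python
from itertools import product

shape_levels = (0.5, 1.0, 1.5)   # levels used for r and s
scale_levels = (5, 10, 15)       # levels used for alpha and beta

# One dictionary per simulated "world": every combination of the four parameters.
worlds = [
    {"r": r, "alpha": alpha, "s": s, "beta": beta}
    for r, alpha, s, beta in product(shape_levels, scale_levels,
                                     shape_levels, scale_levels)
]

calibration_weeks, holdout_weeks = 78, 26   # 104 weeks in total
n_panels, panel_size = 5, 2500              # five synthetic panels per world

print(len(worlds))   # 3**4 = 81
```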
24
“Backward-Looking” Analysis
How many cross-sectional summaries should be created? (How to “chop it up”?)
One 78-week histogram? Two 39-week histograms? Three 26-week histograms? … Six 13-week histograms?
For each of the six aggregation conditions, fit the Pareto/NBD to the resulting RCSS data, and:
1. Compare RCSS parameter estimates to the disaggregate benchmarks
2. Evaluate the disaggregate LL functions using the RCSS parameter estimates and compare to the disaggregate benchmark LL
3. Evaluate the fit of the predicted histograms from RCSS and disaggregate parameter estimates to the actual holdout histograms
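The “chopping up” step can be sketched as follows, assuming each customer's record is a list of purchase times in weeks since acquisition. The function name, the toy panel, and the choice of a right-censored top bucket at five purchases are illustrative assumptions rather than details from the study.

```python
def chop_into_histograms(panel, n_hist, total_weeks=78.0, top_bucket=5):
    """Split the calibration period into n_hist equal windows and count, for each
    window, how many customers made 0, 1, ..., top_bucket-or-more purchases."""
    width = total_weeks / n_hist
    histograms = []
    for j in range(n_hist):
        start, end = j * width, (j + 1) * width
        counts = [0] * (top_bucket + 1)
        for times in panel:
            x = sum(1 for t in times if start < t <= end)
            counts[min(x, top_bucket)] += 1   # the last bucket is "top_bucket or more"
        histograms.append((start, end, counts))
    return histograms


# Toy panel of four customers (purchase times in weeks).
toy_panel = [[1.0, 20.0, 50.0], [], [30.0, 31.0, 32.0, 33.0, 34.0, 35.0, 70.0], [77.9]]
two_halves = chop_into_histograms(toy_panel, n_hist=2)
# First 39 weeks:  [2, 0, 1, 0, 0, 1]
# Second 39 weeks: [1, 3, 0, 0, 0, 0]
```

Calling the same function with n_hist from 1 to 6 produces the six aggregation conditions, from one 78-week histogram to six 13-week histograms.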
25
Scenario 1: r = 0.5, α = 5, s = 0.5, β = 5
# Hist.   Avg. LL     Dev.   RMSE   r      α      s      β
1         -23452.4    3.1%   37.1   0.37   3.83   1.65   40.01
2         -22813.3    0.3%    5.4   0.40   4.63   0.65   11.65
3         -22759.0    0.0%    5.0   0.41   4.48   0.57    7.89
4         -22759.4    0.0%    4.9   0.41   4.56   0.58    8.54
5         -22767.8    0.1%    5.0   0.41   4.58   0.56    8.18
6         -22754.9    0.0%    5.0   0.46   4.79   0.50    5.32
Disagg.   -22748.1            5.7   0.44   4.85   0.56    7.35
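The “Dev.” column appears to be the percentage deviation of each average LL from the disaggregate benchmark; that reading is an assumption, but it reproduces the tabulated values. The RMSE helper is likewise a generic sketch of a histogram-fit measure, since the slides do not spell out its exact definition.

```python
import math


def ll_deviation_pct(ll_rcss, ll_disagg):
    """Percentage deviation of an RCSS-based LL from the disaggregate benchmark LL."""
    return 100.0 * (ll_rcss - ll_disagg) / ll_disagg


def rmse(predicted, actual):
    """Root mean squared error between predicted and actual histogram counts."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


benchmark = -22748.1                                     # disaggregate LL from the table
print(round(ll_deviation_pct(-23452.4, benchmark), 1))   # one histogram:  3.1
print(round(ll_deviation_pct(-22813.3, benchmark), 1))   # two histograms: 0.3
```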
26
“Forward-Looking” Analysis
How many quarterly (13-week) histograms are required? (How many to “build as you go”?)
One (total 13 weeks)? Two (total 26 weeks)? Three (total 39 weeks)? … Six (total 78 weeks)?
For each of the six “number of histogram” conditions, fit the Pareto/NBD to the resulting RCSS data, and:
1. Compare RCSS parameter estimates to the disaggregate benchmarks
2. Evaluate the disaggregate LL functions on the full data using the RCSS parameter estimates and compare to the disaggregate benchmark LL
3. Evaluate the fit of the predicted histograms from RCSS and disaggregate parameter estimates to the actual holdout histograms
27
Scenario 1: r = 0.5, α = 5, s = 0.5, β = 5
# Qtrs.   Avg. LL     Dev.   RMSE    r      α      s      β
1         -23411.8    2.9%   167.1   0.45   4.20   3.28   40.01
2         -22761.9    0.1%    19.5   0.49   4.91   0.37    2.67
3         -22756.5    0.0%    17.0   0.49   4.88   0.44    3.49
4         -22749.8    0.0%     7.2   0.45   4.75   0.49    5.22
5         -22750.2    0.0%     4.8   0.46   4.83   0.49    4.93
6         -22750.1    0.0%     5.0   0.46   4.79   0.50    5.32
Disagg.   -22748.1             5.7   0.44   4.85   0.56    7.37
28
Summary of Results
Using three or more quarters always provides the same performance as disaggregate data in terms of:
Parameter recovery
In-sample LL
Out-of-sample predictions
29
Conclusions
We can estimate the Pareto/NBD using RCSS data; the findings from the Tuscan Lifestyles study are generalizable
Useful/interesting model diagnostics still emerge – even in the absence of any individual-level data
Three cross-sections are generally sufficient
30
Other Desirable Properties
Just the percentage of total customers in each bucket is sufficient – don’t even need actual numbers
Data can be “aperiodic” (they just have to be “repeated”)
Histograms can be of different time lengths, e.g., 3-month + 6-month + 4-month
Histograms can be missing, e.g., Qtr. 1, –, Qtr. 3, Qtr. 4
Data management/storage benefits
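One way to see why bucket percentages are enough for point estimation: under the assumption that every histogram describes the same cohort of N customers, and that estimation uses a multinomial log-likelihood, the cohort size only rescales the objective. The notation below (n_{jx} for the count in bucket x of histogram j, π_{jx} for the corresponding proportion, p_{jx}(θ) for the model probability) is introduced here for illustration.

```latex
\begin{aligned}
LL(\theta) &= \sum_{j=1}^{J} \sum_{x} n_{jx} \log p_{jx}(\theta)
            = N \sum_{j=1}^{J} \sum_{x} \pi_{jx} \log p_{jx}(\theta),
\qquad \pi_{jx} = \frac{n_{jx}}{N}, \\
\hat{\theta} &= \arg\max_{\theta} LL(\theta)
              = \arg\max_{\theta} \sum_{j=1}^{J} \sum_{x} \pi_{jx} \log p_{jx}(\theta).
\end{aligned}
```

The maximizer does not depend on N, so the parameter estimates are the same whether counts or proportions are supplied. The value of the log-likelihood itself, and anything derived from its scale, would still depend on N.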
31
What Would Managers (and Customers) Rather Use?
or