#ARFAxS
Attribution & Identity: A Codependent Relationship
Michael Finnerty
Group Director, Product
Neustar
Robert Stratton, Ph.D.
Director
Neustar
Big Data’s Identity Crisis
• Identity fragmentation poses an increasing threat to the promise of big data in marketing analytics
• Identity graphs join together observations from separate ID spaces so that the inputs and outputs from the correct individual or household can be associated for analysis
• This analysis demonstrates the impact that real-world identity graph biases have on analytical accuracy, specifically in marketing attribution
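As a rough illustration of what "joining observations from separate ID spaces" means, here is a minimal sketch. The domain names, ID strings, and the `join_by_identity` helper are all hypothetical and are not taken from the presentation; a real identity graph is far larger and probabilistic.

```python
# Hypothetical identity graph: (domain, domain-specific ID) -> underlying person.
id_graph = {
    ("channel_1", "c1-001"): "person_A",
    ("channel_2", "c2-9f3"): "person_A",
    ("purchases", "cust-77"): "person_A",
    ("channel_1", "c1-002"): "person_B",
}

# Hypothetical observations: (domain, domain-specific ID, variable observed).
observations = [
    ("channel_1", "c1-001", "x1"),
    ("channel_2", "c2-9f3", "x2"),
    ("purchases", "cust-77", "y"),
    ("channel_1", "c1-002", "x1"),
    ("channel_2", "c2-zzz", "x2"),  # no graph entry, so it cannot be linked
]


def join_by_identity(observations, id_graph):
    """Collapse per-domain observations into one row per linked person."""
    records = {}
    unlinked = []
    for domain, local_id, variable in observations:
        person = id_graph.get((domain, local_id))
        if person is None:
            unlinked.append((domain, local_id, variable))
            continue
        row = records.setdefault(person, {"x1": 0, "x2": 0, "y": 0})
        row[variable] = 1
    return records, unlinked


records, unlinked = join_by_identity(observations, id_graph)
print(records)
print(unlinked)
```

The unlinked observation is the kind of gap that the bias scenarios later in the deck vary systematically.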
Study Design
The Data: Observations for two separate media channels and a purchase event were drawn from three different synthetic domains and connected by a synthetic identity graph.
The Analysis: A series of sensitivity analyses on a logistic regression designed to assess the impact of the media channel exposures on purchase behavior.
Simulated Bias in the Identity Graph:
1. Sparsity – incomplete connections between domains
2. Media channel graphs – graphs that are connected by the input variable
3. First-party only graphs – graphs that are connected by the output variable
4. Inaccuracy – wrongly wired connections between entities across the domains
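The baseline of such a design can be sketched as follows: simulate purchases from known coefficients, then check that a logistic regression on correctly linked data recovers them. Everything specific here is an assumption for illustration, not the presenters' setup: binary exposures, the values in `TRUE_BETA`, the exposure rates, the sample size, and the hand-rolled Newton-Raphson fit (used only to keep the example dependency-free).

```python
import math
import random

TRUE_BETA = (-2.0, 0.8, 0.5)  # assumed (base, X1, X2) log-odds effects


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n, rng):
    """Draw (x1, x2, y) rows with every link known, i.e. a 'true' graph."""
    rows = []
    for _ in range(n):
        x1 = 1.0 if rng.random() < 0.3 else 0.0
        x2 = 1.0 if rng.random() < 0.4 else 0.0
        p = sigmoid(TRUE_BETA[0] + TRUE_BETA[1] * x1 + TRUE_BETA[2] * x2)
        y = 1.0 if rng.random() < p else 0.0
        rows.append((x1, x2, y))
    return rows


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    x = [0.0] * 3
    for r in (2, 1, 0):
        s = m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = s / m[r][r]
    return x


def fit_logit(rows, iters=10):
    """Newton-Raphson fit of P(y=1) = sigmoid(b0 + b1*x1 + b2*x2)."""
    beta = [0.0, 0.0, 0.0]
    for _ in range(iters):
        grad = [0.0] * 3
        info = [[0.0] * 3 for _ in range(3)]
        for x1, x2, y in rows:
            feats = (1.0, x1, x2)
            p = sigmoid(sum(b * f for b, f in zip(beta, feats)))
            w = p * (1.0 - p)
            for i in range(3):
                grad[i] += (y - p) * feats[i]
                for j in range(3):
                    info[i][j] += w * feats[i] * feats[j]
        step = solve3(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-9:
            break
    return beta


rng = random.Random(0)
rows = simulate(20000, rng)
beta_hat = fit_logit(rows)
print("true:     ", TRUE_BETA)
print("estimated:", [round(b, 3) for b in beta_hat])
```

A sensitivity analysis would then degrade the links behind `rows` in a controlled way and refit, comparing each estimate with this baseline.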
A true graph and no graph…
True ID Graph
No ID Graph
Legend
Y1 – Purchase Event
X1i – Media Channel 1
X2i – Media Channel 2
A Sparse Graph: Overestimation of Base, Underestimation of Media
Scenario:
• Not all of the domain-specific identity labels can be mapped to the true underlying identity
Impact on Results:
• Media channel observations are no longer associated with the purchase observations
• Impact of base is overestimated
• Impact of media is underestimated and even turns negative at severe levels of sparsity
[Chart: Deviation From True Attribution vs. Bias Introduced to the Graph]
Impact Summary
% Bias Introduced    Base    X1      X2
10%                  70%     -14%    -11%
20%                  151%    -29%    -24%
50%                  408%    -72%    -66%
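The direction of these effects can be reproduced with simple arithmetic. The sketch below is a simplification, not the study's regression: it assumes a single binary media channel, treats the purchase rate of the apparently unexposed group as the "base", and uses made-up rates. It will not reproduce the magnitudes in the table.

```python
def observed_rates_sparse(share_exposed, rate_exposed, rate_unexposed, sparsity):
    """Return (exposed_rate, unexposed_rate) as seen through a sparse graph.

    A fraction `sparsity` of truly exposed people lose their link at random
    and are therefore counted as unexposed.
    """
    hidden = share_exposed * sparsity   # exposed, but the link is missing
    unexposed = 1.0 - share_exposed     # truly unexposed
    # Linked exposed people are a random subset, so their rate is unchanged.
    seen_exposed_rate = rate_exposed
    # The apparent "unexposed" group now mixes in the hidden exposed people.
    seen_unexposed_rate = (
        unexposed * rate_unexposed + hidden * rate_exposed
    ) / (unexposed + hidden)
    return seen_exposed_rate, seen_unexposed_rate


# Assumed inputs: half the population exposed, 20% vs. 10% purchase rates.
for sparsity in (0.0, 0.1, 0.2, 0.5):
    exposed, base = observed_rates_sparse(0.5, 0.20, 0.10, sparsity)
    print(f"sparsity={sparsity:.0%}  base={base:.4f}  lift={exposed - base:.4f}")
```

As sparsity grows, purchases driven by media leak into the apparent base, so the base rises and the measured lift shrinks.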
A Media Channel Graph: Overestimation of Base, Inconsistent Estimation of Media
Scenario:
• A single media channel (X1) and purchase data are tracked within the same ID space
Impact on Results:
• Media channel (X2) observations are no longer associated with the purchase observations
• Impact of base is overestimated
• Impact of properly associated media channel (X1) is overestimated
• Impact of media channel (X2) is underestimated, but does not turn negative
[Chart: Deviation From True Attribution vs. Bias Introduced to the Graph]
Impact Summary
% Bias Introduced    Base    X1     X2
10%                  16%     0%     -6%
20%                  35%     1%     -13%
50%                  95%     4%     -36%
A First-Party Graph: Underestimation of Base, Overestimation of Media
Scenario:
• Only purchase data is linked back to media channels
Impact on Results:
• Media channel events that do not lead to a purchase are not associated
• Impact of base is underestimated
• Impact of both media channels is overestimated
[Chart: Deviation From True Attribution vs. Bias Introduced to the Graph]
Impact Summary
% Bias Introduced    Base    X1    X2
10%                  -7%     1%    1%
20%                  -15%    3%    3%
50%                  -35%    6%    6%
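A small worked example shows why linking only through the outcome pushes the estimates in this direction. It is a simplification under stated assumptions (one binary channel, illustrative rates, and a `loss` parameter for the share of non-purchasers' exposures that go unlinked) rather than the study's regression.

```python
def observed_rates_first_party(share_exposed, rate_exposed, rate_unexposed, loss):
    """Return (exposed_rate, unexposed_rate) when only buyers link reliably.

    A fraction `loss` of exposed NON-purchasers cannot be linked to their
    media exposure, so they are counted as unexposed non-purchasers.
    """
    exp_buy = share_exposed * rate_exposed
    exp_nobuy = share_exposed * (1.0 - rate_exposed)
    unexp_buy = (1.0 - share_exposed) * rate_unexposed
    unexp_nobuy = (1.0 - share_exposed) * (1.0 - rate_unexposed)

    moved = exp_nobuy * loss  # exposed non-buyers that now look unexposed
    seen_exposed_rate = exp_buy / (exp_buy + exp_nobuy - moved)
    seen_unexposed_rate = unexp_buy / (unexp_buy + unexp_nobuy + moved)
    return seen_exposed_rate, seen_unexposed_rate


# Assumed inputs: half the population exposed, 20% vs. 10% purchase rates.
for loss in (0.0, 0.1, 0.2, 0.5):
    exposed, base = observed_rates_first_party(0.5, 0.20, 0.10, loss)
    print(f"loss={loss:.0%}  base={base:.4f}  lift={exposed - base:.4f}")
```

Removing non-purchasers from the exposed group inflates its purchase rate, while adding them to the unexposed group deflates the base.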
An Inaccurate Graph: Overestimation of Base, Underestimation of Media
Scenario:
• Wrong events are connected to each other through, for example, a flawed matching process, excessive noise, or conflict in the underlying data signal
Impact on Results:
• The inaccurate associations resemble random noise in the datasets, diffusing the true impact of X1 and X2 on Y
• Impact of base is overestimated
• Impact of both media channels is underestimated
[Chart: Deviation From True Attribution vs. Bias Introduced to the Graph]
Impact Summary
% Bias Introduced    Base    X1      X2
10%                  52%     -9%     -9%
20%                  111%    -19%    -20%
50%                  295%    -50%    -50%
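The diffusion effect of wrongly wired links can also be sketched with simple arithmetic. This assumes one binary channel, illustrative rates, and a mis-wiring model in which a fraction `error` of people receive the exposure flag of a randomly chosen person; none of these details come from the study.

```python
def observed_rates_miswired(share_exposed, rate_exposed, rate_unexposed, error):
    """Return (exposed_rate, unexposed_rate) when some links are wrong.

    For a fraction `error` of people the attached exposure flag belongs to a
    random other person, so it carries no information about their purchase.
    """
    overall = (
        share_exposed * rate_exposed + (1.0 - share_exposed) * rate_unexposed
    )
    # Each observed group is a blend of correctly wired people (true rate)
    # and mis-wired people (population-average rate).
    seen_exposed_rate = (1.0 - error) * rate_exposed + error * overall
    seen_unexposed_rate = (1.0 - error) * rate_unexposed + error * overall
    return seen_exposed_rate, seen_unexposed_rate


# Assumed inputs: half the population exposed, 20% vs. 10% purchase rates.
for error in (0.0, 0.1, 0.2, 0.5):
    exposed, base = observed_rates_miswired(0.5, 0.20, 0.10, error)
    print(f"error={error:.0%}  base={base:.4f}  lift={exposed - base:.4f}")
```

Under this model the measured lift shrinks in proportion to `1 - error`, and the difference is absorbed by the base.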
Learnings & Implications:
• Biases in the underlying identity graphs used for marketing analytics introduce systemic inaccuracies into the downstream analytics
• These impacts are particularly problematic to control for because the type of bias and the true level of inaccuracy are seldom known
• The four types of real-world biases studied systematically skew analytical results in predictable directions and introduce up to 30% bias in attribution results
• This analysis can inform the relative direction of bias introduced by the particular type of identity graph used
Questions?