Gousto ❤ Snowplow
Our journey of leveraging Snowplow Analytics
Dejan Petelin, Head of Data Science
About Gousto
• An online recipe box service.
• Customers come to our site, or use our apps, and select from 22 meals each week.
• They pick the meals they want to cook and say how many people they're cooking for.
• We deliver all the ingredients they need in exact proportions, with step-by-step recipe cards, in 2-3 days.
• No planning, no supermarkets and no food waste – you just cook (and eat).
• We're a rapidly growing business.
Our data journey…
• Transactional database and loads of external data sources, e.g. Excel spreadsheets, 3rd-party tools etc.
• Multiple ad-hoc analyses, mostly in Excel, which are difficult to update.
• Gap between web analytics (GA) and transactional data.
• Lack of customer event logs – we started snapshotting the transactional database.
• Loads of questions from our CEO Timo :)
[Diagram: sources (MySQL transactional read replica, Mailchimp, Excel spreadsheets, Google Analytics, Zendesk CRM, geo-demographic data) → CRONed data processing and ad-hoc analyses → MySQL data warehouse → Excel reports → stakeholders]
Growing data capabilities
Data Science · Analytics · Data Engineering
• As a subscription service we are very retention focused – linking all the data sources is challenging.
• We believe that data is the voice of our customers, so we try to collect as much data as possible.
• Therefore we invested a lot in Snowplow: we own the data, which is a very valuable asset and core to the business.
• The data is available to everyone – SQL is a core competency at Gousto.
Our data stack
[Diagram: transactional DB, WMS and other sources feed a data warehouse (lake); Airflow orchestrates the ETLs; outputs include daily email reports, ad-hoc analyses and predictive modelling]
Snowplow as unified log
[Diagram: microservices (Customer Service, Activity Log Service, Order Service, Product Service, Recipe Service, …) publish platform deployment messages to SNS; an AWS Lambda subscribed to all messages forwards them via the Event API to Snowplow, landing in Amazon Redshift; another AWS Lambda subscribed to customer-related messages writes to Amazon DynamoDB]
Snowplow on isomorphic JS
• Shiny and super quick, but… what happened to my events?!
• No page loads – no automatic page views.
• We developed our own custom framework for triggering events.
• We use structured events for that purpose, but store (unstructured) JSON objects in them.
• This approach allows us to be flexible and to introduce new events quickly.
• But no data validation can lead to garbage leaking in.
• Data modelling in Redshift.
[Diagram: Client ↔ Server ↔ App API]
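The pattern described above – structured events whose property field carries a JSON-encoded payload – might look roughly like this. The field names follow Snowplow's tracker protocol (`e=se`, `se_ca`, `se_ac`, `se_pr`); the event names and payload here are hypothetical, not Gousto's actual schema.

```python
import json

def build_struct_event(category, action, payload):
    """Build a Snowplow-style structured event whose property field
    carries an arbitrary JSON object. Field names follow Snowplow's
    structured-event convention: se_ca (category), se_ac (action),
    se_pr (property)."""
    return {
        "e": "se",                     # event type: structured event
        "se_ca": category,             # e.g. the area of the app
        "se_ac": action,               # e.g. what the user did
        "se_pr": json.dumps(payload),  # flexible JSON payload, NOT validated
    }

# Hypothetical event: a customer viewing a recipe on the menu page.
event = build_struct_event(
    "menu", "recipe_viewed",
    {"recipe_id": 123, "position": 4, "week": "2016-W20"},
)
```

The flexibility comes at the cost named in the bullets: nothing validates the payload, so the data modelling step in Redshift has to parse and clean `se_pr` defensively.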
Moving to the real-time pipeline – use case
[Diagram: (1) Snowplow events stream → (2) process event → (3) store churn score (churn model) → (4) if likely to churn, call the gift service → (5) store the action taken]
• Analyse customer behaviour in real-time.
• Automatically react as soon as possible.
• Feed the response back to Snowplow (serving as a unified log).
• So the whole customer journey is available to CRM & retention teams instantly.
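The stream-consumer side of that flow could be sketched like this; the scoring rule, threshold and `gift_service` callable are stand-ins for illustration, not the production model.

```python
# Hypothetical sketch of the event consumer: score each enriched event
# with a churn model and, past a threshold, trigger a retention action.

CHURN_THRESHOLD = 0.7  # illustrative cut-off

def score_churn(event):
    # Stand-in for the real model: recency of ordering as a crude signal.
    days_inactive = event.get("days_since_last_order", 0)
    return min(1.0, days_inactive / 30.0)

def handle_event(event, gift_service, action_log):
    score = score_churn(event)                   # 3. store the churn score
    if score >= CHURN_THRESHOLD:                 # 4. if likely to churn…
        action = gift_service(event["user_id"])  # …call the gift service
        action_log.append(                       # 5. record the action taken,
            {"user_id": event["user_id"],        #    to be fed back to Snowplow
             "action": action, "score": score}
        )
    return score
```

Feeding the recorded action back into Snowplow as an event is what keeps the unified log complete, so CRM and retention teams see the intervention alongside the behaviour that triggered it.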
How do we leverage Snowplow data?
From analytics to optimisation …
• Daily trading reports, e.g. signups by channel, conversion rate, orders etc.
• Analytics: customer behaviour, actionable insights
• Customer segmentation
• Marketing attribution · channel-mix optimisation
• Churn prediction · automated menu design · warehouse optimisation
• Tracking performance
[Diagram (source: Gartner): analytics maturity curve – raw data → ad-hoc reports → standard reports → generic predictive analytics → predictive modelling → optimisation; competitive advantage grows with complexity/maturity, moving from "Sense & Respond" to "Predict & Act"]
Churn prediction – intro
• As a subscription service, we are very retention focused.
• Some customers are immediately convinced and become very loyal, while others need a bit more effort to get hooked.
• We use Snowplow events data to model customer behaviour and find the customers most likely to churn, so we can focus on them.
• We then use a personalised approach to retain those customers.
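The first step of such a model is turning each customer's raw event log into features. A minimal sketch, with an illustrative feature set (the real one would be far richer):

```python
from collections import Counter
from datetime import datetime

def extract_features(events, now):
    """Turn a customer's raw event log into model features.
    Each event is a dict with a 'type' and a 'ts' (timestamp)."""
    kinds = Counter(e["type"] for e in events)
    last = max((e["ts"] for e in events), default=None)
    return {
        "n_page_views": kinds["page_view"],
        "n_menu_views": kinds["menu_view"],
        "n_orders": kinds["order"],
        "days_since_last_event": (now - last).days if last else None,
    }
```

A feature table like this, one row per customer, is what a standard classifier would then be trained on to produce the churn likelihoods discussed below.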
Churn prediction – how we did it
[Animated walkthrough comparing the event timelines of two example customers, Alice and Bob.]
Churn prediction – pitfalls
• What actually is churn? How do we define it?
• It might be better to predict the likelihood of a customer placing an order.
• How long should the horizon be? Where should we draw the line?
• Using events data there is an almost unlimited number of features – how do we find the really informative ones?
• How do we keep the model up to date if we are affecting customer journeys?
• How do we measure success? No matter how accurate the model, profit is what counts in the end.
Churn prediction – future
• Predicting when the next event will happen, rather than the probability of an event in the next X weeks.
• Using recurrent (deep) neural networks (RNNs) to model event sequences directly, rather than engineering features.
Churn predic9on – results
• Accuracy of the model is ~80%. • A bit too op@mis@c in the lower region
and a bit too pessimis@c in higher region.
• Significant upli[ in the reten@on. • Indeed, it depends on the ac@on taken.
• Loads of A/B tes@ng to find the right ac@ons to be taken.
• In the future, we want to build another model, sugges@ng what ac@on should be taken for each customer.
• Actually, why not build an autonomous system trying different approaches and communica@on channels to find the best approach?
[Calibration plot: predicted likelihood (x-axis) vs actual proportion (y-axis), 0–100%, for control, variation A and variation B]
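The optimistic/pessimistic pattern above is a calibration check: bin customers by predicted likelihood and compare each bin's prediction with the actual churn proportion. A minimal sketch of that computation:

```python
def calibration_bins(scores, outcomes, n_bins=10):
    """Group (score, outcome) pairs into equal-width score bins and
    return the actual outcome proportion per bin (None if empty).
    A well-calibrated model has proportions close to the bin centres."""
    bins = [[] for _ in range(n_bins)]
    for s, y in zip(scores, outcomes):
        idx = min(int(s * n_bins), n_bins - 1)  # clamp s == 1.0 into top bin
        bins[idx].append(y)
    return [sum(b) / len(b) if b else None for b in bins]
```

Plotting these proportions against the bin centres reproduces the chart: points above the diagonal in the low region mean the model is too optimistic there, points below it in the high region mean it is too pessimistic.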
Automated menu design – intro
• The food team used to design menus manually – every week.
• With 22 recipes this task has become too demanding – diversity, multiple constraints, costs etc.
• They should be focusing on recipe development, to keep delivering delicious recipes.
• Why not use machine learning to understand customers' taste from the data and design popular menus?
Automated menu design – how it works (I)
• We developed a very detailed ontology to describe our recipes.
• We built an internal Slack bot to collect data on recipe similarity.
• Insights from that data enable us to provide diverse menus.
• Understanding customers' taste is a crucial part of designing popular menus.
• Transactional data (orders) is not enough – Snowplow data gives us far more insight into how customers explore menus.
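One simple way to turn an ontology like this into a diversity measure is Jaccard similarity over each recipe's ontology tags; this is an illustrative sketch, not Gousto's actual measure.

```python
def recipe_similarity(tags_a, tags_b):
    """Jaccard similarity between two recipes' ontology tag sets."""
    a, b = set(tags_a), set(tags_b)
    return len(a & b) / len(a | b) if a | b else 0.0

def menu_diversity(menu):
    """Mean pairwise dissimilarity across a menu (list of tag sets).
    Higher means a more varied menu."""
    pairs = [(i, j) for i in range(len(menu)) for j in range(i + 1, len(menu))]
    if not pairs:
        return 0.0
    return sum(1 - recipe_similarity(menu[i], menu[j])
               for i, j in pairs) / len(pairs)
```

A score like `menu_diversity` is exactly the kind of quantity that can be fed into the multi-objective optimisation on the next slide.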
Automated menu design – how it works (II)
• Multi-objective optimisation: maximising recipe diversity, maximising menu popularity, balancing costs, matching forecasts.
• Using genetic algorithms (GAs) – speed is not an issue, as we have a whole week to generate the new menu :)
• Multiple solutions, so the food team can choose the menu that best fits their objectives.
[GA loop: Selection → Cross-over → Mutation → Evaluation]
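The GA loop above can be sketched in a few lines; this is a minimal illustration of the selection/cross-over/mutation/evaluation cycle, assuming recipes are hashable IDs and `fitness` is a single scalar combining the objectives listed (diversity, popularity, cost, forecast fit).

```python
import random

def evolve_menu(recipes, fitness, menu_size=22,
                pop_size=30, generations=50, seed=0):
    """Minimal, illustrative genetic algorithm for menu design."""
    rng = random.Random(seed)
    pop = [rng.sample(recipes, menu_size) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)   # evaluation
        parents = pop[: pop_size // 2]        # selection (elitist: kept as-is)
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, menu_size)  # single-point cross-over
            child = list(dict.fromkeys(a[:cut] + b))  # de-duplicate recipes
            while len(child) < menu_size:
                child.append(rng.choice(recipes))
            child = child[:menu_size]
            if rng.random() < 0.2:             # mutation: swap one recipe
                child[rng.randrange(menu_size)] = rng.choice(recipes)
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)
```

Because the top half of each generation survives unchanged, the best menu found never gets worse; returning several of the final population's top menus (rather than just `max`) would give the food team the multiple candidate menus the slide mentions.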
Concluding thoughts
• Snowplow has helped us scale our data capabilities with limited data engineering resources.
• TIP TO STARTUPS: start building data capabilities as early as possible – data is a huge asset.
• Snowplow also serves as our unified log – not necessarily limited to customer-focused data.
• Snowplow enables us to 'listen' to our customers and provide them with a more personalised experience.
• We are moving to the real-time pipeline to realise just-in-time personalisation, e.g. personalised recipe ordering, add-on recommendations (upselling) etc.