Post on 29-Nov-2019

Fast and Efficient A/B Testing Analysis with Shiny and SQL

Charlie Thompson, Storyblocks

A/B Testing at Storyblocks

Our search page for stock video

“Related Search” cards test: Test vs. Control

We store results for our tests in Shiny

We have > 100 metrics to analyze per test

We have thousands of A/B tests with millions of users

Multiple ways to measure “users”

Lots of metrics per user

A/B testing generates big data

Shiny and SQL together

A brief history

2011: Outsourced to 3rd party

2014: Ad hoc SQL queries

2015: Automated online dashboard in SQL

2016: To Shiny!

2017: Scaling within Shiny

Loading big data into Shiny

[Diagram: Raw A/B testing data (SQL) -> load_data.R -> test_1.RData, test_2.RData, test_3.RData, test_4.RData -> server.R -> Shiny Dashboard]

Overnight preprocessing on Shiny server: R script queries the SQL database and saves off an .RData file for each test that contains the raw data

Live in dashboard: As tests are selected in the dashboard, Shiny pulls the raw data file and computes all the metrics needed, including hypothesis tests
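
The overnight preprocessing flow on this slide can be sketched roughly as follows. This is only an illustration, written in Python rather than the R used in the talk: an in-memory SQLite database stands in for the production SQL database, and pickle files stand in for the .RData files. The `events` table, its columns, and the function names `preprocess_tests` and `load_test` are hypothetical and do not come from the slides.

```python
import os
import pickle
import sqlite3
import tempfile


def preprocess_tests(conn, out_dir):
    """Overnight job: save one raw-data file per test (stand-in for test_N.RData)."""
    test_ids = [row[0] for row in conn.execute(
        "SELECT DISTINCT test_id FROM events ORDER BY test_id")]
    paths = {}
    for test_id in test_ids:
        rows = conn.execute(
            "SELECT user_id, variant, converted FROM events "
            "WHERE test_id = ? ORDER BY user_id",
            (test_id,),
        ).fetchall()
        path = os.path.join(out_dir, f"test_{test_id}.pkl")
        with open(path, "wb") as fh:
            pickle.dump(rows, fh)
        paths[test_id] = path
    return paths


def load_test(path):
    """Dashboard side: read the raw rows for the test the user selected."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Tiny made-up dataset, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events "
    "(test_id INTEGER, user_id INTEGER, variant TEXT, converted INTEGER)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [(1, 101, "control", 0), (1, 102, "test", 1), (2, 201, "control", 1)],
)
out_dir = tempfile.mkdtemp()
paths = preprocess_tests(conn, out_dir)
raw_rows = load_test(paths[1])
```

In this layout the dashboard still receives one row per user, so every metric and hypothesis test has to be computed after the file is read.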

Constraints with Shiny at scale

[Diagram: Raw A/B testing data (SQL) -> load_data.R -> test_1.RData, test_2.RData, test_3.RData, test_4.RData -> server.R -> Shiny Dashboard]

Overnight preprocessing on Shiny server: R script queries the SQL database and saves off an .RData file for each test that contains the raw data

Live in dashboard: As tests are selected in the dashboard, Shiny pulls the raw data file and computes all the metrics needed, including hypothesis tests

Bottleneck #1: Reading in large tests

Bottleneck #2: Calculating hypothesis tests for 50+ metrics

Bottleneck #3: Users queue

Overcoming Shiny constraints

[Diagram: Raw A/B testing data (SQL) -> load_data.R -> test_1.RData, test_2.RData, test_3.RData, test_4.RData -> server.R -> Shiny Dashboard]

Overnight preprocessing on Shiny server: R script queries the SQL database, calculates hypothesis tests, and saves off an .RData file for each test that contains the aggregated data

Live in dashboard: As tests are selected in the dashboard, Shiny pulls the aggregated file for each test, which now contains historical values instead of daily snapshots

Bottleneck #1: Reading in large tests. FUHGETTABOUTIT! Aggregated data is wicked small

Bottleneck #2: Calculating hypothesis tests for 50+ metrics. NOT ANYMORE! This is done in the morning

Bottleneck #3: Users queue. NO WORRIES! The dashboard is so fast we won’t notice
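
One way to picture the aggregated approach: SQL collapses the per-user rows into one row per variant, and the hypothesis test is then computed from those counts ahead of time. The sketch below is an assumption-laden illustration in Python, not the talk's R code. The `events` table (assumed to hold one row per user), the `aggregate_test` helper, and the choice of a pooled two-proportion z-test are all hypothetical; the slides do not say which tests are used.

```python
import math
import sqlite3


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (conv_b / n_b - conv_a / n_a) / se
    return math.erfc(abs(z) / math.sqrt(2))


def aggregate_test(conn, test_id):
    """Let SQL do the aggregation: one row per variant instead of one per user."""
    rows = conn.execute(
        """
        SELECT variant, COUNT(*) AS users, SUM(converted) AS conversions
        FROM events
        WHERE test_id = ?
        GROUP BY variant
        """,
        (test_id,),
    ).fetchall()
    counts = {variant: (users, conversions) for variant, users, conversions in rows}
    n_c, conv_c = counts["control"]
    n_t, conv_t = counts["test"]
    return {
        "control_rate": conv_c / n_c,
        "test_rate": conv_t / n_t,
        "p_value": two_proportion_p_value(conv_c, n_c, conv_t, n_t),
    }


# Made-up data: 1000 users per variant, 100 vs. 130 conversions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events "
    "(test_id INTEGER, user_id INTEGER, variant TEXT, converted INTEGER)"
)
rows = []
for i in range(1000):
    rows.append((1, i, "control", 1 if i < 100 else 0))
    rows.append((1, 1000 + i, "test", 1 if i < 130 else 0))
conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", rows)
summary = aggregate_test(conn, 1)
```

The resulting summary is a handful of numbers per metric, which is why the saved file stays small no matter how many users a test has.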

Making the most of your data

When is a test done?

Aggregated data gives a time series view

[Figure: time series chart, annotated “Test begins”]

Time series helps prevent premature reads

[Figure: P-value (y-axis) by date (x-axis), annotated “Test looks 95% significant here!”]

P-value should stabilize over time

Win or lose, the P-value should stabilize before a test is “finished”

[Figure: P-value (y-axis) by date (x-axis)]
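
A P-value time series of this kind can be built by recomputing the test on cumulative totals at each date. The following Python sketch assumes daily per-variant counts are already available and again uses a pooled two-proportion z-test as a stand-in for whatever test the dashboard runs. The `has_stabilized` rule (the last few P-values staying within a small band) is a hypothetical heuristic added for illustration; the slides do not define a stopping rule.

```python
import math


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for a pooled two-proportion z-test."""
    if n_a == 0 or n_b == 0:
        return 1.0
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (conv_b / n_b - conv_a / n_a) / se
    return math.erfc(abs(z) / math.sqrt(2))


def p_value_series(daily):
    """daily: (date, control_users, control_conv, test_users, test_conv) per day.

    Returns (date, p_value) pairs computed on running cumulative totals.
    """
    n_c = conv_c = n_t = conv_t = 0
    series = []
    for date, users_c, c_c, users_t, c_t in daily:
        n_c += users_c
        conv_c += c_c
        n_t += users_t
        conv_t += c_t
        series.append((date, two_proportion_p_value(conv_c, n_c, conv_t, n_t)))
    return series


def has_stabilized(series, window=3, tolerance=0.02):
    """Hypothetical heuristic: the last `window` p-values span at most `tolerance`."""
    if len(series) < window:
        return False
    recent = [p for _, p in series[-window:]]
    return max(recent) - min(recent) <= tolerance


# Made-up daily counts, for illustration only.
daily = [
    ("2017-01-01", 200, 18, 200, 30),
    ("2017-01-02", 200, 22, 200, 21),
    ("2017-01-03", 200, 20, 200, 26),
    ("2017-01-04", 200, 19, 200, 25),
]
series = p_value_series(daily)
```

Plotting `series` by date gives the chart described above; a single early dip below 0.05 is the kind of premature read the time series view is meant to expose.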

When to think about scaling

Shiny: prototype vs production

                             Prototype                                Production
Hosting                      Local                                    Shiny server, shinyapps.io, etc
Number of concurrent users   One                                      Multiple
Page load time               Easy to overlook                         Instant, UX is important
Data storage                 Often pull in unused rows or columns     Loads only necessary data
Stability and maintenance    Only needs to be working when demoing    Minimal downtime

Measuring Shiny usage

Make sure you know how many users you have!

What we learned

Let SQL be SQL and R be R

                              R                        SQL
Big data aggregation          Possible, but slow       Made for exactly this
Hypothesis tests and charts   Made for exactly this    Painful, need tools

Data tips for Shiny in production

1. Subset your input data before reading it in

2. Use .RData files

3. Consider ETL process - do you really need real-time data?

4. Monitor usage
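
As a rough illustration of tip 1, the filtering can be pushed into the SQL query so that only the needed rows and columns ever reach the application. The sketch is in Python with SQLite purely for demonstration; the `events` table, its column names, and the `load_subset` helper are hypothetical.

```python
import sqlite3

# Hypothetical schema: the full table is wider than any one view needs.
ALL_COLUMNS = ("test_id", "user_id", "variant", "converted", "user_agent", "referrer")
NEEDED_COLUMNS = ("user_id", "variant", "converted")


def load_subset(conn, test_id, columns=NEEDED_COLUMNS):
    """Select only the requested columns and rows, instead of SELECT * then filtering."""
    unknown = set(columns) - set(ALL_COLUMNS)
    if unknown:
        raise ValueError(f"unknown columns: {sorted(unknown)}")
    # Column names are checked against a fixed allow-list before interpolation.
    query = (
        f"SELECT {', '.join(columns)} FROM events "
        "WHERE test_id = ? ORDER BY user_id"
    )
    return conn.execute(query, (test_id,)).fetchall()


# Made-up rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (test_id INTEGER, user_id INTEGER, variant TEXT, "
    "converted INTEGER, user_agent TEXT, referrer TEXT)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 1, "control", 0, "browser-a", "search"),
        (1, 2, "test", 1, "browser-b", "email"),
        (2, 3, "control", 1, "browser-a", "direct"),
    ],
)
subset = load_subset(conn, 1)
```

Only two rows of three columns come back here; the unused text columns and the other test's rows are never read into memory.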

Additional resources

A/B Testing in the Wild [Etsy] - Emily Robinson

A/B Testing at Stack Overflow - Julia Silge

Experiments at Airbnb - Jan Overgoor

Shiny server system performance monitoring - Huidong Tian

We’re hiring!

https://weare.storyblocks.com

Contact me

chuck@storyblocks.com

www.RCharlie.com

Twitter: @RCharlie425

Questions?