Page 1

Fast and Efficient A/B Testing Analysis with Shiny and SQL

Charlie Thompson, Storyblocks

Page 2

A/B Testing at Storyblocks

Page 3

Our search page for stock video

Page 4

“Related Search” cards test

Page 5

“Related Search” cards test

Test vs. Control

Page 6

We store results for our tests in Shiny

Page 7

We have > 100 metrics to analyze per test

Page 8

A/B testing generates big data

We have thousands of A/B tests with millions of users

Multiple ways to measure “users”

Lots of metrics per user

Page 9

Shiny and SQL together

Page 10

A brief history

2011: Outsourced to 3rd party

2014: Ad hoc SQL queries

2015: Automated online dashboard in SQL

2016: To Shiny!

2017: Scaling within Shiny

Page 11

Loading big data into Shiny

Raw A/B testing data (SQL) -> load_data.R -> test_1.RData, test_2.RData, test_3.RData, test_4.RData

Overnight preprocessing on Shiny server: R script queries the SQL database and saves off an .RData file for each test that contains the raw data

Page 12

Loading big data into Shiny

Raw A/B testing data (SQL) -> load_data.R -> test_1.RData, test_2.RData, test_3.RData, test_4.RData -> server.R -> Shiny Dashboard

Overnight preprocessing on Shiny server: R script queries the SQL database and saves off an .RData file for each test that contains the raw data

Live in dashboard: As tests are selected in the dashboard, Shiny pulls the raw data file and computes all the metrics needed, including hypothesis tests
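The two stages described on this slide can be sketched in base R. This is only an illustration of the pattern, not the Storyblocks code: the simulated `raw` data frame stands in for the SQL query result, and the column names, the `data_dir` path, and the `analyze_test()` helper are all hypothetical.

```r
# Hypothetical stand-in for the result of the overnight SQL query:
# one row per user, with test assignment and a conversion flag.
set.seed(1)
raw <- data.frame(
  test_id   = rep(c("test_1", "test_2"), each = 1000),
  user_id   = seq_len(2000),
  variant   = sample(c("control", "test"), 2000, replace = TRUE),
  converted = rbinom(2000, 1, 0.1)
)

data_dir <- file.path(tempdir(), "ab_tests")
dir.create(data_dir, showWarnings = FALSE)

# Overnight step (the role load_data.R plays on the slide):
# write one .RData file per test containing that test's raw rows.
for (id in unique(raw$test_id)) {
  test_data <- raw[raw$test_id == id, ]
  save(test_data, file = file.path(data_dir, paste0(id, ".RData")))
}

# Dashboard step (the role server.R plays): when a test is selected,
# load its file and compute the metric and hypothesis test on the fly.
analyze_test <- function(id) {
  env <- new.env()
  load(file.path(data_dir, paste0(id, ".RData")), envir = env)
  d <- env$test_data
  by_variant  <- split(d$converted, d$variant)
  conversions <- sapply(by_variant, sum)
  users       <- sapply(by_variant, length)
  result <- prop.test(conversions, users)  # two-sample test of proportions
  list(rates = conversions / users, p_value = result$p.value)
}

res <- analyze_test("test_1")
```

Because the file holds raw rows, both the read and the test run while the dashboard user waits, so the cost grows with the size of the test.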

Page 13

Constraints with Shiny at scale

Raw A/B testing data (SQL) -> load_data.R -> test_1.RData, test_2.RData, test_3.RData, test_4.RData -> server.R -> Shiny Dashboard

Overnight preprocessing on Shiny server: R script queries the SQL database and saves off an .RData file for each test that contains the raw data

Live in dashboard: As tests are selected in the dashboard, Shiny pulls the raw data file and computes all the metrics needed, including hypothesis tests

Bottleneck #1: Reading in large tests

Bottleneck #2: Calculating hypothesis tests for 50+ metrics

Bottleneck #3: Users queue

Page 14

Overcoming Shiny constraints

Raw A/B testing data (SQL) -> load_data.R -> test_1.RData, test_2.RData, test_3.RData, test_4.RData -> server.R -> Shiny Dashboard

Overnight preprocessing on Shiny server: R script queries the SQL database, calculates hypothesis tests, and saves off an .RData file for each test that contains the aggregated data

Live in dashboard: As tests are selected in the dashboard, Shiny pulls the aggregated file for each test, which now contains historical values instead of daily snapshots

Bottleneck #1: Reading in large tests. FUHGETTABOUTIT! Aggregated data is wicked small

Bottleneck #2: Calculating hypothesis tests for 50+ metrics. NOT ANYMORE! This is done in the morning

Bottleneck #3: Users queue. NO WORRIES! The dashboard is so fast we won’t notice
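A minimal sketch of the revised preprocessing step, assuming the SQL side returns daily counts per variant. The `daily` data frame is simulated, and its column names, the dates, and the output path are hypothetical. The point is that the hypothesis test runs once, ahead of time, and the saved file holds one small row per day.

```r
set.seed(42)

# Hypothetical daily counts per variant, as a SQL GROUP BY might return them.
days <- seq(as.Date("2018-01-01"), by = "day", length.out = 14)
daily <- data.frame(
  date        = rep(days, times = 2),
  variant     = rep(c("control", "test"), each = length(days)),
  users       = 500,
  conversions = c(rbinom(14, 500, 0.10), rbinom(14, 500, 0.12))
)

# Cumulative totals through each day, computed separately per variant.
daily <- daily[order(daily$variant, daily$date), ]
daily$cum_users       <- ave(daily$users, daily$variant, FUN = cumsum)
daily$cum_conversions <- ave(daily$conversions, daily$variant, FUN = cumsum)

# One row per day holding the rates and the p-value as of that day.
aggregated <- do.call(rbind, lapply(seq_along(days), function(i) {
  d <- daily[daily$date == days[i], ]
  d <- d[order(d$variant), ]  # control first, then test
  p <- prop.test(d$cum_conversions, d$cum_users)$p.value
  data.frame(date         = days[i],
             control_rate = d$cum_conversions[1] / d$cum_users[1],
             test_rate    = d$cum_conversions[2] / d$cum_users[2],
             p_value      = p)
}))

# The dashboard only ever loads this small aggregated file.
out_file <- file.path(tempdir(), "test_1.RData")
save(aggregated, file = out_file)
```

With this layout the dashboard does no statistics at request time; it loads 14 rows and plots them.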

Page 15

Making the most of your data

Page 16

When is a test done?

Page 17

Aggregated data gives a time series view

[Chart annotation: “Test begins”]

Page 18

Time series helps prevent premature reads

[Chart: P value (y-axis) by date (x-axis); annotation: “Test looks 95% significant here!”]

Page 19

P-value should stabilize over time

Win or lose, the P-value should stabilize before a test is “finished”

[Chart: P value (y-axis) by date (x-axis)]
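One way to turn "the P-value should stabilize" into a check is sketched below. The rule itself is an assumption for illustration, not something stated in the talk: the `is_stable()` helper, its `alpha` and `k` defaults, and the made-up `p_series` values are all hypothetical.

```r
# Hypothetical rule of thumb (not from the talk): treat the read as stable
# when the p-value has stayed on the same side of alpha for the last k days.
is_stable <- function(p_values, alpha = 0.05, k = 7) {
  if (length(p_values) < k) return(FALSE)
  recent <- tail(p_values, k) < alpha
  all(recent) || !any(recent)
}

# Made-up daily p-values that dip below 0.05 early and then drift back up.
p_series <- c(0.40, 0.04, 0.03, 0.20, 0.35, 0.30, 0.28, 0.31, 0.33, 0.29)

# Reading on day 3: the last three values straddle 0.05, so not stable.
early_read <- is_stable(p_series[1:3], k = 3)  # FALSE

# Reading on day 10: the last seven values all sit above 0.05.
late_read <- is_stable(p_series)               # TRUE
```

In this made-up series, someone who stopped on day 2 or 3 would have called a win that later disappeared, which is the premature read the time series view is meant to expose.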

Page 20

When to think about scaling

Page 21

Shiny: prototype vs production

                             Prototype                               Production
Hosting                      Local                                   Shiny server, shinyapps.io, etc.
Number of concurrent users   One                                     Multiple
Page load time               Easy to overlook                        Instant, UX is important
Data storage                 Often pull in unused rows or columns    Loads only necessary data
Stability and maintenance    Only needs to be working when demoing   Minimal downtime

Page 22

Measuring Shiny usage

Make sure you know how many users you have!
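One lightweight way to count users is to append a row to a log file on each visit and summarise it later. The sketch below is an assumption about how this could be done, not the approach used in the talk: the `log_visit()` helper, the log path, and the user names are hypothetical. It is written as a plain function so that it runs outside Shiny; in an app it could be called from the server function.

```r
log_file <- file.path(tempdir(), "shiny_usage.csv")

# Append one row per dashboard visit; write the header only on first use.
log_visit <- function(user, test_id, file = log_file) {
  already_there <- file.exists(file)
  entry <- data.frame(time    = format(Sys.time(), "%Y-%m-%d %H:%M:%S"),
                      user    = user,
                      test_id = test_id)
  write.table(entry, file, sep = ",", row.names = FALSE,
              col.names = !already_there, append = already_there)
}

# Hypothetical visits.
log_visit("alice", "test_1")
log_visit("bob",   "test_1")
log_visit("alice", "test_2")

# Summarise the log: distinct users and visits per test.
usage <- read.csv(log_file)
n_users <- length(unique(usage$user))
visits_per_test <- table(usage$test_id)
```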

Page 23

What we learned

Page 24

Let SQL be SQL and R be R

                              R                       SQL
Big data aggregation          Possible, but slow      Made for exactly this
Hypothesis tests and charts   Made for exactly this   Painful, need tools
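The division of labour can be sketched as follows. The query text is illustrative only and is not executed here: the table name `ab_test_users` and its columns are hypothetical, and the hand-written `agg` data frame stands in for the two rows such a query would return.

```r
# Illustrative query text only; table and column names are hypothetical.
# Assumes one row per user in the table.
aggregate_sql <- "
  SELECT variant,
         COUNT(DISTINCT user_id) AS users,
         SUM(converted)          AS conversions
  FROM ab_test_users
  WHERE test_id = 'test_1'
  GROUP BY variant
"

# With a live connection this might be run via DBI::dbGetQuery(con, aggregate_sql).
# Here a hand-written data frame stands in for the aggregated result.
agg <- data.frame(variant     = c("control", "test"),
                  users       = c(10000, 10000),
                  conversions = c(1000, 1100))

# R does what it is good at: the hypothesis test on two rows of counts.
test <- prop.test(agg$conversions, agg$users)
lift <- diff(agg$conversions / agg$users)  # test rate minus control rate
```

However many users the test has, only one row per variant crosses from the database into R.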

Page 25

Data tips for Shiny in production

1. Subset your input data before reading it in

2. Use .RData files

3. Consider ETL process - do you really need real-time data?

4. Monitor usage

Page 26

Additional resources

A/B Testing in the Wild [Etsy] - Emily Robinson

A/B Testing at Stack Overflow - Julia Silge

Experiments at Airbnb - Jan Overgoor

Shiny server system performance monitoring - Huidong Tian

Page 27

We’re hiring!

https://weare.storyblocks.com

Contact me

[email protected]

www.RCharlie.com

Twitter: @RCharlie425

Questions?