Snowplow is at the core of everything we do
Uploaded by: yalisassoon
Category: Data & Analytics
![Page 1: Snowplow is at the core of everything we do](https://reader036.fdocuments.in/reader036/viewer/2022062523/58738c3d1a28ab272d8b6e07/html5/thumbnails/1.jpg)
Snowplow drives everything we do
What and why?
![Page 2: Snowplow is at the core of everything we do](https://reader036.fdocuments.in/reader036/viewer/2022062523/58738c3d1a28ab272d8b6e07/html5/thumbnails/2.jpg)
Bauer Media
Digital and print publisher
Family-owned German company
116 sites across Australia and New Zealand
Tag management across all sites
![Page 3: Snowplow is at the core of everything we do](https://reader036.fdocuments.in/reader036/viewer/2022062523/58738c3d1a28ab272d8b6e07/html5/thumbnails/3.jpg)
Just start collecting
Snowplow data collection in 2014
We didn’t really have a use case
![Page 4: Snowplow is at the core of everything we do](https://reader036.fdocuments.in/reader036/viewer/2022062523/58738c3d1a28ab272d8b6e07/html5/thumbnails/4.jpg)
![Page 5: Snowplow is at the core of everything we do](https://reader036.fdocuments.in/reader036/viewer/2022062523/58738c3d1a28ab272d8b6e07/html5/thumbnails/5.jpg)
Stuff we record
Page views
Metadata around content
User logins
Email click-throughs
Ad impressions
![Page 6: Snowplow is at the core of everything we do](https://reader036.fdocuments.in/reader036/viewer/2022062523/58738c3d1a28ab272d8b6e07/html5/thumbnails/6.jpg)
Use cases started showing up
Cross-site integrated reporting
Ad hoc tricky analysis
Sanity checking industry audience reporting
Stalking individual users
Audience overlaps
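As an illustration of the audience-overlap use case, here is a minimal Python sketch that counts how many users were seen on each pair of sites. The `(user_id, site)` input shape and the function name are assumptions made for this example, not something taken from the slides; in practice the same idea would more likely be a query over the events table.

```python
from collections import defaultdict
from itertools import combinations


def audience_overlaps(page_views):
    """Count the users shared by each pair of sites.

    page_views: iterable of (user_id, site) tuples (hypothetical shape).
    Returns {(site_a, site_b): shared_user_count}, with site_a < site_b.
    """
    # Collect the distinct users seen on every site.
    users_by_site = defaultdict(set)
    for user_id, site in page_views:
        users_by_site[site].add(user_id)

    # Intersect the user sets for every pair of sites.
    overlaps = {}
    for site_a, site_b in combinations(sorted(users_by_site), 2):
        shared = users_by_site[site_a] & users_by_site[site_b]
        overlaps[(site_a, site_b)] = len(shared)
    return overlaps
```

Repeat visits by the same user on one site are counted once, because each site's audience is held as a set.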
![Page 7: Snowplow is at the core of everything we do](https://reader036.fdocuments.in/reader036/viewer/2022062523/58738c3d1a28ab272d8b6e07/html5/thumbnails/7.jpg)
![Page 8: Snowplow is at the core of everything we do](https://reader036.fdocuments.in/reader036/viewer/2022062523/58738c3d1a28ab272d8b6e07/html5/thumbnails/8.jpg)
![Page 9: Snowplow is at the core of everything we do](https://reader036.fdocuments.in/reader036/viewer/2022062523/58738c3d1a28ab272d8b6e07/html5/thumbnails/9.jpg)
![Page 10: Snowplow is at the core of everything we do](https://reader036.fdocuments.in/reader036/viewer/2022062523/58738c3d1a28ab272d8b6e07/html5/thumbnails/10.jpg)
User behaviour
Ad impressions
Content metadata
Trending service
Recommendations
Dashboards
Ad hoc analysis
![Page 11: Snowplow is at the core of everything we do](https://reader036.fdocuments.in/reader036/viewer/2022062523/58738c3d1a28ab272d8b6e07/html5/thumbnails/11.jpg)
Some things you can’t do in GA
Tag-based reporting
Accurate reporting of Facebook in-app browser traffic, by checking whether the user agent contains FBAN
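The FBAN rule is just a substring test on the raw user-agent string, which is available per event in Snowplow. A minimal Python sketch follows; the function names are hypothetical and the sample user agent in the usage note is illustrative only.

```python
def is_facebook_in_app(user_agent):
    """Return True when the user agent contains the FBAN marker."""
    # Treat a missing user agent as "not Facebook in-app".
    return "FBAN" in (user_agent or "")


def facebook_in_app_share(user_agents):
    """Return the fraction of user agents that contain FBAN."""
    agents = list(user_agents)
    if not agents:
        return 0.0
    return sum(is_facebook_in_app(ua) for ua in agents) / len(agents)
```

For example, `is_facebook_in_app("Mozilla/5.0 (iPhone) Mobile/13E233 [FBAN/FBIOS;FBAV/55.0]")` is true, while a plain Safari user agent is not matched.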
![Page 12: Snowplow is at the core of everything we do](https://reader036.fdocuments.in/reader036/viewer/2022062523/58738c3d1a28ab272d8b6e07/html5/thumbnails/12.jpg)
![Page 13: Snowplow is at the core of everything we do](https://reader036.fdocuments.in/reader036/viewer/2022062523/58738c3d1a28ab272d8b6e07/html5/thumbnails/13.jpg)
![Page 14: Snowplow is at the core of everything we do](https://reader036.fdocuments.in/reader036/viewer/2022062523/58738c3d1a28ab272d8b6e07/html5/thumbnails/14.jpg)
![Page 15: Snowplow is at the core of everything we do](https://reader036.fdocuments.in/reader036/viewer/2022062523/58738c3d1a28ab272d8b6e07/html5/thumbnails/15.jpg)
We’re using Snowplow 0.9.2 from 2014-04-29!
It just works
We’ve been busy building other stuff
![Page 16: Snowplow is at the core of everything we do](https://reader036.fdocuments.in/reader036/viewer/2022062523/58738c3d1a28ab272d8b6e07/html5/thumbnails/16.jpg)
But...
Page pings are b0rken: no time spent or scroll depth
(Out-of-the-box) browser categorisation is terrible
Hourly batches are a bit higher latency than we’d like
No context shredding, but JSON queries are performant enough
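Without shredding, each context stays as a raw JSON string on the event row and has to be parsed at query time. The Python sketch below shows the idea outside the database; the assumption that contexts arrive as a JSON object or a list of objects, and the `section` and `tags` field names in the usage note, are hypothetical.

```python
import json


def context_value(raw_contexts, key):
    """Return the first value stored under `key` in a raw JSON contexts string.

    Returns None when the string is empty, is not valid JSON, or has no such key.
    """
    if not raw_contexts:
        return None
    try:
        parsed = json.loads(raw_contexts)
    except ValueError:
        # Malformed JSON is skipped rather than failing the whole query.
        return None
    # Accept either a single object or a list of objects.
    items = parsed if isinstance(parsed, list) else [parsed]
    for item in items:
        if isinstance(item, dict) and key in item:
            return item[key]
    return None
```

For instance, `context_value('[{"section": "food"}, {"tags": ["dinner"]}]', "tags")` yields the tag list, which is the kind of lookup tag-based reporting needs.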
![Page 17: Snowplow is at the core of everything we do](https://reader036.fdocuments.in/reader036/viewer/2022062523/58738c3d1a28ab272d8b6e07/html5/thumbnails/17.jpg)
Architecture diagram (components and arrow labels as they appear on the slide)
runSnowPlow.sh
Web page (JavaScript in page creates image beacon)
CloudFront
SnowCannon (Node app in Elastic Beanstalk)
S3
ETL (Elastic MapReduce)
S3
events (Redshift)
events_temp (Redshift)
x_events (Redshift)
Arrow labels: Redirects to; Writes logs to
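The first step in the diagram, where JavaScript in the page creates an image beacon, amounts to requesting a tiny image whose query string carries the event. The Python sketch below only builds and decodes such a URL. The collector host is hypothetical, and the `/i` path and the parameter names (`e`, `url`, `page`, `aid`, `p`) are assumptions based on the Snowplow tracker protocol as generally documented, not details taken from the slides.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def beacon_url(collector_host, page_url, page_title, app_id):
    """Build a page-view beacon URL for a (hypothetical) collector host."""
    params = {
        "e": "pv",           # event type: page view (assumed name)
        "url": page_url,     # URL of the page being viewed
        "page": page_title,  # page title
        "aid": app_id,       # application / site identifier
        "p": "web",          # platform
    }
    # urlencode percent-escapes the values so they are safe in a query string.
    return "https://{}/i?{}".format(collector_host, urlencode(params))


def beacon_params(url):
    """Decode a beacon URL back into a flat {name: value} dict."""
    query = parse_qs(urlsplit(url).query)
    return {name: values[0] for name, values in query.items()}
```

Decoding with `beacon_params` is roughly what the collector's logs let the ETL step recover for each request.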
![Page 18: Snowplow is at the core of everything we do](https://reader036.fdocuments.in/reader036/viewer/2022062523/58738c3d1a28ab272d8b6e07/html5/thumbnails/18.jpg)
Tips
Redshift can get very expensive very quickly
Decent dashboarding platforms are rare
And plenty of crap ones are overpriced
Just tip everything in and worry about what you’ll do later
![Page 19: Snowplow is at the core of everything we do](https://reader036.fdocuments.in/reader036/viewer/2022062523/58738c3d1a28ab272d8b6e07/html5/thumbnails/19.jpg)
What’s next?
![Page 20: Snowplow is at the core of everything we do](https://reader036.fdocuments.in/reader036/viewer/2022062523/58738c3d1a28ab272d8b6e07/html5/thumbnails/20.jpg)
Future plans
Upgrade ETL to real-time: probably our own solution
Time spent and scroll depth
Shredding?