PyData London 2015 - How We Turned EverythingMe Into a Data Driven Company

Post on 14-Apr-2017

HOW WE TURNED EVERYTHINGME INTO A DATA DRIVEN COMPANY

Hello, I’m Arik.

We’re geeks. YMMV.

Requirements:
1. scalable
2. fast
3. easy to query (accessible)

Amazon Redshift

“Petabyte scale; massively parallel

Fully managed; zero admin…”

8TB total size of data*
9 Nodes Cluster
Loading new data every ~5 minutes
~1500 query executions per day

One main fact table: fact_events
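
With one main fact table, most questions reduce to a single `SELECT ... GROUP BY`. The sketch below only illustrates that idea. The slides do not show the schema, so the columns `event_time`, `user_id` and `event_type` are hypothetical, and an in-memory SQLite database stands in for Redshift.

```python
import sqlite3

# In-memory SQLite stands in for Redshift; the columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fact_events (event_time TEXT, user_id INTEGER, event_type TEXT)"
)
conn.executemany(
    "INSERT INTO fact_events VALUES (?, ?, ?)",
    [
        ("2015-06-01 10:00:00", 1, "app_open"),
        ("2015-06-01 10:05:00", 2, "app_open"),
        ("2015-06-01 11:00:00", 1, "search"),
        ("2015-06-02 09:00:00", 1, "app_open"),
    ],
)


def daily_event_counts(connection):
    """Return (day, event_type, events, distinct users), ordered by day and type."""
    return connection.execute(
        """
        SELECT date(event_time) AS day,
               event_type,
               COUNT(*) AS events,
               COUNT(DISTINCT user_id) AS users
        FROM fact_events
        GROUP BY day, event_type
        ORDER BY day, event_type
        """
    ).fetchall()


rows = daily_event_counts(conn)
```

On a real cluster the same query shape would run over the full table. The sample rows here exist only to make the sketch runnable.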

Step #2: Give Everyone Access

BI TOOLS?

psql, SQL Workbench

CSV Sharing

(hackathing!)

re:dash supports multiple data sources:
PostgreSQL
Redshift
MySQL
BigQuery
MongoDB
InfluxDB
Graphite

Everyone at the company is using re:dash in some capacity.

8K queries created to date, with >130 dashboards.

Over 25* companies using it & 24 contributors.

* Probably more. Not everyone reaches out to me.

Step #3: Improve

Giving everyone raw access is only the first step, not the end of the road.

Working with SQL all the time -> repeating yourself.

Template Queries, Parameters
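
One way to stop repeating yourself is to keep the SQL as a template and fill in the values at run time. The sketch below is illustrative and is not re:dash's implementation: the `{{ name }}` placeholder syntax, the `render_query` helper and the column names are all assumptions. It turns placeholders into bound parameters instead of pasting values into the SQL string.

```python
import re

# Matches {{ name }} placeholders (an assumed syntax for this sketch).
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_query(template, params):
    """Replace each {{ name }} with '?' and return (sql, values in order)."""
    values = []

    def bind(match):
        name = match.group(1)
        if name not in params:
            raise KeyError(f"missing query parameter: {name}")
        values.append(params[name])
        return "?"

    return PLACEHOLDER.sub(bind, template), values


# Hypothetical template against the fact table.
TEMPLATE = (
    "SELECT COUNT(*) FROM fact_events "
    "WHERE event_type = {{ event_type }} AND event_time >= {{ start_date }}"
)

sql, values = render_query(
    TEMPLATE, {"event_type": "search", "start_date": "2015-06-01"}
)
```

The resulting `sql` and `values` can be passed to any DB-API driver that uses `?` placeholders. Drivers with a different paramstyle would need a different substitute.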

Alerts
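
An alert can be as simple as re-running a saved query on a schedule and comparing one value to a threshold. A minimal sketch, assuming a hypothetical `check_alert` helper and an `ok` / `triggered` state model (neither is taken from the slides):

```python
import operator

# Supported comparisons for this sketch.
COMPARATORS = {">": operator.gt, "<": operator.lt, "==": operator.eq}


def check_alert(value, op, threshold, previous_state="ok"):
    """Return (new_state, should_notify); notify only when the state changes."""
    new_state = "triggered" if COMPARATORS[op](value, threshold) else "ok"
    return new_state, new_state != previous_state


# Example: the latest query result is 120 and the alert fires above 100.
state, notify = check_alert(value=120, op=">", threshold=100)
```

Notifying only on a state change avoids sending the same alert on every scheduled run.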

Search & Discovery

Pandas, IPython integration

• Give everyone at the company access to the data

• Make it easy (accessible)

• Avoid restrictions

Thank you.
@arikfr
arik@everything.me
http://redash.io/