Data Wrangling Kung Fu With pandas (PyData SV 2013)


Transcript of Data Wrangling Kung Fu With pandas (PyData SV 2013)

Page 1: Data Wrangling Kung Fu With pandas (PyData SV 2013)

Data Wrangling Kung Fu with pandas

Wes McKinney @wesmckinn

PyData Conference 2013, 2013-3-20

Saturday, April 13,

Page 2: Data Wrangling Kung Fu With pandas (PyData SV 2013)

Agile Tools for Real World Data

Wes McKinney

Python for Data Analysis


Page 3: Data Wrangling Kung Fu With pandas (PyData SV 2013)

Me

• Started pandas in 2008

• Other Python projects I’ve been involved with: statsmodels, vbench, gpustats

• http://blog.wesmckinney.com

• New project in 2013...stay tuned


Page 4: Data Wrangling Kung Fu With pandas (PyData SV 2013)

Observations

• Data often in wrong format for analysis

• Storage format is frequently not the analysis format

• Data preparation bottleneck in many workflows
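To make the storage-versus-analysis distinction concrete, here is a minimal sketch, not taken from the slides. The column names and values are hypothetical. It assumes data stored as one row per observation ("long" records) that is easier to analyze with one column per item ("wide").

```python
import pandas as pd

# Hypothetical "storage format": one row per (date, item) observation.
records = pd.DataFrame({
    "date": ["2013-01-01", "2013-01-01", "2013-01-02", "2013-01-02"],
    "item": ["a", "b", "a", "b"],
    "value": [1.0, 2.0, 3.0, 4.0],
})

# "Analysis format": one row per date, one column per item.
wide = records.pivot(index="date", columns="item", values="value")
```

In the wide layout, column-wise operations such as `wide["a"] - wide["b"]` line up by date without any manual joining.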


Page 5: Data Wrangling Kung Fu With pandas (PyData SV 2013)

pandas

• Productivity-focused structured data manipulation tools for Python

• Fast, intuitive data structures

• Filling the gap between Python and more domain-specific languages like R

• Huge growth in 2011-2012, continuing in 2013


Page 6: Data Wrangling Kung Fu With pandas (PyData SV 2013)

Agenda

• Data reshaping

• Hierarchical indexing

• GroupBy

• Miscellaneous munging
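As a rough preview of how the first three agenda topics fit together, here is a small sketch using made-up data. The frame and its column names (`key1`, `key2`, `value`) are illustrative assumptions, not examples from the talk.

```python
import pandas as pd

# Hypothetical data: two grouping keys and one numeric column.
df = pd.DataFrame({
    "key1": ["a", "a", "b", "b", "a"],
    "key2": ["x", "y", "x", "y", "x"],
    "value": [1, 2, 3, 4, 5],
})

# GroupBy: summing by two keys yields a Series with a hierarchical index.
totals = df.groupby(["key1", "key2"])["value"].sum()

# Hierarchical indexing: select every entry under the outer key "a".
a_totals = totals.loc["a"]

# Data reshaping: move the inner index level ("key2") into the columns.
table = totals.unstack("key2")
```

Here `totals` is indexed by (`key1`, `key2`) pairs, `a_totals` is indexed by `key2` alone, and `table` has one row per `key1` value and one column per `key2` value.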
