Python performance profiling

©2013 DataStax Confidential. Do not distribute without consent. @rustyrazorblade Jon Haddad Technical Evangelist, DataStax Python Performance Profiling 1

Transcript of Python performance profiling

Page 1: Python performance profiling

©2013 DataStax Confidential. Do not distribute without consent.

@rustyrazorblade

Jon Haddad, Technical Evangelist, DataStax

Python Performance Profiling


Page 2: Python performance profiling

What are our goals?
• Understand potential bottlenecks in dev
  • Testing
  • Call graphs
• Understand code once it's in production
  • Micro benchmarks
  • Automatic logging of slow DB queries / API calls
• Gather evidence
  • No guessing
  • We need insight into both environments

Page 3: Python performance profiling

Why do we need it in dev & prod?
• Dev != production
  • No network latency on our desktops
  • Round trips are cheap in dev
  • Rarely hitting disk (DB fully in memory)
  • Zero CPU contention
  • Failure / failover rarely tested

Page 4: Python performance profiling

Before Production

Page 5: Python performance profiling

Approaches in Dev
• Unit / functional tests
  • Code coverage is important
  • If you’re not testing it, it’s probably broken
  • Must be reliable, repeatable
• Always keep production in mind
  • Know your hardware
  • Load test regularly
  • Jenkins performance plugin

Page 6: Python performance profiling

Finding slow tests is easy
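The slide does not say which tool produced the timings. As a rough sketch using only the standard library, a `unittest.TestResult` subclass can record how long each test takes and rank the slowest ones. `TimingResult`, `slowest_tests`, and `ExampleTests` are hypothetical names made up for this illustration. (If you use pytest, its `--durations=N` option gives a similar report.)

```python
import time
import unittest


class TimingResult(unittest.TestResult):
    """Collects the wall-clock duration of every test that runs."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.durations = {}  # test id -> elapsed seconds
        self._started = {}

    def startTest(self, test):
        self._started[test.id()] = time.perf_counter()
        super().startTest(test)

    def stopTest(self, test):
        elapsed = time.perf_counter() - self._started.pop(test.id())
        self.durations[test.id()] = elapsed
        super().stopTest(test)


def slowest_tests(suite, limit=5):
    """Run the suite and return (test id, seconds) pairs, slowest first."""
    result = TimingResult()
    suite.run(result)
    ranked = sorted(result.durations.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:limit]


class ExampleTests(unittest.TestCase):
    """Hypothetical tests, only here to give the timer something to measure."""

    def test_fast(self):
        self.assertEqual(1 + 1, 2)

    def test_slow(self):
        time.sleep(0.05)  # stand-in for a slow operation
        self.assertTrue(True)
```

Usage: build a suite with `unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTests)` and pass it to `slowest_tests`.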

Page 7: Python performance profiling

Sometimes it's unavoidable…
• Make sure you mark tests that are expected to be slow
• These are frequently testing offline tasks in functional tests
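One possible way to mark slow tests, assuming plain `unittest`: a small decorator that tags the test and skips it unless an opt-in environment variable is set. The `slow` decorator, the `RUN_SLOW_TESTS` variable, and the test names are all hypothetical; the slides do not show how the tests were actually marked.

```python
import os
import unittest

# Hypothetical convention: export RUN_SLOW_TESTS=1 to include slow tests.
RUN_SLOW = os.environ.get("RUN_SLOW_TESTS") == "1"


def slow(test_func):
    """Tag a test as expected to be slow; skip it unless RUN_SLOW_TESTS=1."""
    test_func.is_slow = True
    reason = "slow test; set RUN_SLOW_TESTS=1 to run"
    return unittest.skipUnless(RUN_SLOW, reason)(test_func)


class ReportTests(unittest.TestCase):
    def test_quick_lookup(self):
        self.assertEqual(sum([1, 2, 3]), 6)

    @slow
    def test_nightly_export(self):
        # Stand-in for an offline task exercised by a functional test.
        self.assertEqual(len(list(range(1000))), 1000)
```

The fast suite stays fast by default, and a scheduled job can set the variable to run everything.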

Page 8: Python performance profiling

Profiler - Hotshot
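The `hotshot` module only exists in Python 2; it was removed in Python 3. The sketch below therefore swaps in `cProfile` and `pstats` from the standard library, which serve the same purpose. It is not the code from the slide. `slow_sum`, `handler`, and `profile_call` are hypothetical names.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately CPU-heavy helper so it shows up in the profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def handler():
    """Hypothetical entry point we want to profile."""
    return [slow_sum(20000) for _ in range(20)]


def profile_call(func, top=5):
    """Profile func() and return a text report sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func()
    finally:
        profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()
```

Calling `print(profile_call(handler))` lists the functions that account for the most cumulative time.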

Page 9: Python performance profiling

pycallgraph
• Understand code structure and flow
• Summarize times
• Darker colors represent more time spent
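To show the idea behind a call graph without installing pycallgraph or Graphviz, here is a minimal stand-in built on `sys.setprofile`. It only counts caller-to-callee edges between Python functions; it does not measure time or draw anything, so it is far simpler than what pycallgraph produces. All function names are hypothetical.

```python
import sys
from collections import Counter


def trace_calls(func):
    """Run func() and count (caller, callee) edges between Python functions."""
    edges = Counter()

    def profiler(frame, event, arg):
        # "call" fires for Python-level calls; C calls arrive as "c_call".
        if event == "call":
            caller = frame.f_back.f_code.co_name if frame.f_back else "<root>"
            edges[(caller, frame.f_code.co_name)] += 1

    sys.setprofile(profiler)
    try:
        func()
    finally:
        sys.setprofile(None)
    return edges


def load():
    return [1, 2, 3]


def transform(rows):
    return list(map(abs, rows))


def pipeline():
    """Hypothetical workload: pipeline -> load, pipeline -> transform."""
    return transform(load())
```

`trace_calls(pipeline)` returns a `Counter` whose keys are the edges of the graph; a tool such as pycallgraph adds timing and rendering on top of the same kind of data.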

Page 10: Python performance profiling

Blocking I/O
• Usually the problem with web servers
• Apps can be CPU bound but it's less frequent
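A quick way to tell the two cases apart is to compare wall-clock time with CPU time for the same call: when the wall time is much larger than the CPU time, the code spent most of its time waiting. This is a minimal sketch; `measure`, `io_like`, and `cpu_like` are hypothetical names, and `time.sleep` stands in for a real network or disk wait.

```python
import time


def measure(func):
    """Return (wall_seconds, cpu_seconds) spent running func()."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu


def io_like():
    time.sleep(0.2)  # stand-in for waiting on a socket or disk


def cpu_like():
    sum(i * i for i in range(300_000))  # pure computation, no waiting
```

For `io_like` the CPU time should be close to zero while the wall time is about 0.2 seconds; for `cpu_like` the two numbers should be much closer together.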

Page 11: Python performance profiling

Moving past blocking I/O
• Event libraries!
• libev most stable
• gevent is a beautiful wrapper
• Pool.map() is your friend
• async can hide issues & make code harder to profile
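The sketch below illustrates why a pooled `map()` helps with blocking calls. To stay runnable with only the standard library it uses `concurrent.futures.ThreadPoolExecutor` as a stand-in, not gevent; gevent's `gevent.pool.Pool` offers a similar `map()` over greenlets. `fake_request` and the other helpers are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request(delay):
    """Hypothetical stand-in for a blocking network call."""
    time.sleep(delay)
    return delay


def run_serial(delays):
    """One request after another: total time is the sum of the delays."""
    return [fake_request(d) for d in delays]


def run_pooled(delays, size=5):
    """Overlap the waits: total time approaches the longest single delay."""
    with ThreadPoolExecutor(max_workers=size) as pool:
        return list(pool.map(fake_request, delays))


def timed(func, *args):
    """Return (result, elapsed_seconds) for func(*args)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start
```

With five 0.1-second "requests", `timed(run_serial, [0.1] * 5)` takes roughly half a second, while `timed(run_pooled, [0.1] * 5)` finishes in roughly the time of one request. Results come back in input order in both cases.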

Page 12: Python performance profiling

Profiler - GreenletProfiler
• Takes into account greenlets
• Generates callgrind files
• Mac users: qcachegrind

Page 13: Python performance profiling

In Production

Page 14: Python performance profiling

Profile with minimal overhead
• We need something really lightweight!
• Our applications can time EVERYTHING
  • API requests
  • database queries
  • individual functions
  • small blocks of code
• statsd is our friend
  • microtimers, counters
  • Integrates w/ librato, graphite
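A minimal sketch of what timing "everything" with statsd can look like, assuming a statsd daemon listening for the plain-text UDP protocol on its default port 8125. `StatsdClient` is a hypothetical helper written for this example, not a published client library; in practice you would likely use an existing statsd package.

```python
import socket
import time
from contextlib import contextmanager


class StatsdClient:
    """Tiny fire-and-forget client for the plain-text statsd UDP protocol."""

    def __init__(self, host="127.0.0.1", port=8125, prefix="app"):
        self.addr = (host, port)
        self.prefix = prefix
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def _send(self, payload):
        try:
            self.sock.sendto(payload.encode("ascii"), self.addr)
        except OSError:
            pass  # metrics must never break the request being measured

    def incr(self, name, count=1):
        """Counter: '<prefix>.<name>:<count>|c'."""
        self._send(f"{self.prefix}.{name}:{count}|c")

    def timing(self, name, millis):
        """Timer: '<prefix>.<name>:<milliseconds>|ms'."""
        self._send(f"{self.prefix}.{name}:{millis:.3f}|ms")

    @contextmanager
    def timer(self, name):
        """Time a small block of code and report it as a timer metric."""
        start = time.perf_counter()
        try:
            yield
        finally:
            self.timing(name, (time.perf_counter() - start) * 1000.0)
```

Usage: `with client.timer("db.query"): run_query()` around any block you want timed. UDP keeps the overhead low because the sender never waits for a reply.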

Page 15: Python performance profiling

statsd + graphite / grafana

Page 16: Python performance profiling

Logging
• Log slow database queries / API calls automatically
• Log & aggregate errors
• What table was hit?
• Read or write?
• What was the query?
• Can we duplicate?
• Logstash / Splunk / etc
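One way to log slow calls automatically is a decorator that times the call and writes a structured log line only when a threshold is exceeded. This is a sketch under assumptions: the `log_if_slow` decorator, the `slow_calls` logger name, the 20 ms threshold, and `get_user` are all hypothetical. The log line carries the table, the operation type, and the arguments so the call can be reproduced later.

```python
import functools
import logging
import time

logger = logging.getLogger("slow_calls")  # hypothetical logger name


def log_if_slow(threshold_ms, table, operation):
    """Log a warning whenever the wrapped call takes threshold_ms or longer."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                elapsed_ms = (time.perf_counter() - start) * 1000.0
                if elapsed_ms >= threshold_ms:
                    logger.warning(
                        "slow call func=%s table=%s op=%s elapsed_ms=%.1f args=%r",
                        func.__name__, table, operation, elapsed_ms, args,
                    )

        return wrapper

    return decorator


@log_if_slow(threshold_ms=20, table="users", operation="read")
def get_user(user_id):
    """Hypothetical query; the sleep stands in for a slow database read."""
    time.sleep(0.03)
    return {"id": user_id}
```

Key=value fields keep the line easy to parse once the logs are shipped to an aggregator such as Logstash or Splunk.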
