http://www.flickr.com/photos/kveton/2910536252/
Tuesday, May 26, 2009
san francisco meetup
Pownce
• Large scale
• Hundreds of requests/sec
• Thousands of DB operations/sec
• Millions of user relationships
• Millions of notes
• Terabytes of static data
Pownce
• Encountered and eliminated many common scaling bottlenecks
• Real world example of scaling a Django app
• Django provides a lot for free
• I’ll be focusing on what you have to build yourself, and the rare places where Django got in the way
Scalability
Scalability is NOT:
• Speed / Performance
• Generally affected by language choice
• Achieved by adopting a particular technology
A Scalable Application
import time

def application(environ, start_response):
    time.sleep(10)
    start_response('200 OK', [('content-type', 'text/plain')])
    return ('Hello, world!',)
A High Performance Application
def application(environ, start_response):
    remote_addr = environ['REMOTE_ADDR']
    f = open('access-log', 'a+')
    f.write(remote_addr + "\n")
    f.flush()
    f.seek(0)
    hits = sum(1 for l in f.xreadlines()
               if l.strip() == remote_addr)
    f.close()
    start_response('200 OK', [('content-type', 'text/plain')])
    return (str(hits),)
Scalability
A scalable system doesn’t need to change when the size of the problem changes.
Scalability
• Accommodate increased usage
• Accommodate increased data
• Maintainable
Scalability
• Two kinds of scalability
• Vertical scalability: buying more powerful hardware, replacing what you already own
• Horizontal scalability: buying additional hardware, supplementing what you already own
Vertical Scalability
• Costs don’t scale linearly (a server that’s twice as fast costs more than twice as much)
• Inherently limited by current technology
• But it’s easy! If you can get away with it, good for you.
Vertical Scalability
“Sky scrapers are special. Normal buildings don’t need 10 floor foundations. Just build!”
- Cal Henderson
Horizontal Scalability
The ability to increase a system’s capacity by adding more processing units (servers)
Horizontal Scalability
It’s how large apps are scaled.
Horizontal Scalability
• A lot more work to design, build, and maintain
• Requires some planning, but you don’t have to do all the work up front
• You can scale progressively...
• The rest of the presentation is roughly in the order you’d tackle things if you’re scaling as you go...
Caching
• Several levels of caching available in Django
• Per-site cache: caches every page that doesn’t have GET or POST parameters
• Per-view cache: caches output of an individual view
• Template fragment cache: caches fragments of a template
• None of these are that useful if pages are heavily personalized
Caching
• Low-level Cache API
• Much more flexible, allows you to cache at any granularity
• At Pownce we typically cached
• Individual objects
• Lists of object IDs
• Hard part is invalidation
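The "individual objects plus lists of object IDs" pattern can be sketched in plain Python. This is only an illustration, not Pownce's code: the dict named `cache` stands in for memcached, `NOTES` stands in for the database, and every function name here is hypothetical.

```python
# Stand-ins for memcached and the database (both hypothetical).
cache = {}
NOTES = {
    1: {"id": 1, "author": "mike", "body": "hello"},
    2: {"id": 2, "author": "leah", "body": "hi"},
    3: {"id": 3, "author": "mike", "body": "scaling"},
}
db_queries = []  # records every simulated database hit


def fetch_note(note_id):
    """Cache each object under its own key."""
    key = "note:%s" % note_id
    note = cache.get(key)
    if note is None:
        db_queries.append(("note", note_id))
        note = NOTES[note_id]
        cache[key] = note
    return note


def fetch_note_ids_for(author):
    """Cache the list of IDs, not the objects themselves."""
    key = "note_ids:%s" % author
    ids = cache.get(key)
    if ids is None:
        db_queries.append(("ids", author))
        ids = sorted(n["id"] for n in NOTES.values() if n["author"] == author)
        cache[key] = ids
    return ids


def notes_for(author):
    return [fetch_note(i) for i in fetch_note_ids_for(author)]
```

Because the list holds only IDs, changing one note means invalidating one small key (`note:<id>`) rather than every cached list that happens to contain it.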
Caching
• Cache backends:
• Memcached
• Database caching
• Filesystem caching
Sessions
Or Tokyo Cabinet
http://github.com/ericflo/django-tokyo-sessions/
Thanks @ericflo
Caching
Basic caching comes free with Django:
from django.core.cache import cache
from django.db import models

class UserProfile(models.Model):
    ...
    def get_social_network_profiles(self):
        cache_key = 'networks_for_%s' % self.user.id
        profiles = cache.get(cache_key)
        if profiles is None:
            # Cache miss: hit the database, then store the result.
            profiles = self.user.social_network_profiles.all()
            cache.set(cache_key, profiles)
        return profiles
Caching
Invalidate when a model is saved or deleted:
from django.core.cache import cache
from django.db.models import signals

# Signal receivers are called with (sender, instance, **kwargs).
def nuke_social_network_cache(sender, instance, **kwargs):
    cache_key = 'networks_for_%s' % instance.user_id
    cache.delete(cache_key)

signals.post_save.connect(nuke_social_network_cache, sender=SocialNetworkProfile)
signals.post_delete.connect(nuke_social_network_cache, sender=SocialNetworkProfile)
Caching
• Invalidate post_save, not pre_save
• Still a small race condition
• Simple solution, worked for Pownce:
• Instead of deleting, set the cache key to None for a short period of time
• Instead of using set to cache objects, use add, which fails if there’s already something stored for the key
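The two bullets above can be sketched with a tiny in-memory cache. This is a hedged illustration only: `TinyCache`, `invalidate`, and `fill` are hypothetical names, and the class merely imitates memcached's `get`/`set`/`add` semantics with an injectable clock so the expiry can be exercised.

```python
import time


class TinyCache:
    """In-memory stand-in for memcached with get/set/add and expiry."""

    def __init__(self, clock=time.time):
        self._data = {}      # key -> (value, expires_at or None)
        self._clock = clock

    def _live(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires = entry
        if expires is not None and self._clock() >= expires:
            del self._data[key]
            return None
        return entry

    def get(self, key):
        entry = self._live(key)
        return None if entry is None else entry[0]

    def set(self, key, value, timeout=None):
        expires = None if timeout is None else self._clock() + timeout
        self._data[key] = (value, expires)

    def add(self, key, value, timeout=None):
        # Like memcached's add: refuse if anything (even None) is stored.
        if self._live(key) is not None:
            return False
        self.set(key, value, timeout)
        return True


def invalidate(cache, key, tombstone_seconds=5):
    """Instead of deleting, park a short-lived None on the key."""
    cache.set(key, None, tombstone_seconds)


def fill(cache, key, value):
    """Readers repopulate with add, so a tombstone blocks stale writes."""
    return cache.add(key, value)
```

A reader that loaded the old row just before the write will call `fill` with stale data, but `add` fails while the tombstone is present, so the stale value never lands in the cache. Once the tombstone expires, the next reader repopulates the key from the database.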
Advanced Caching
• Memcached’s atomic increment and decrement operations are useful for maintaining counts
• But they’re not available in Django 1.0
• Added in 1.1 by ticket #6464
Advanced Caching
• You can still use them if you poke at the internals of the cache object a bit
• cache._cache is the underlying cache object
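def incr_count(cache_key, delta=1):
    try:
        result = cache._cache.incr(cache_key, delta)
    except ValueError:  # nonexistent key raises ValueError
        # Do it the hard way, store the result.
        result = recompute_count(cache_key)  # hypothetical helper, not part of Django
        cache.set(cache_key, result)
    return result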
Advanced Caching
• Other memcached operations missing from Django’s cache API
• delete_multi & set_multi
• append: add data to existing key after existing data
• prepend: add data to existing key before existing data
• cas: store this data, but only if no one has edited it since I fetched it
Advanced Caching
• It’s often useful to cache objects ‘forever’ (i.e., until you explicitly invalidate them)
• User and UserProfile
• fetched almost every request
• rarely change
• But Django won’t let you
• IMO, this is a bug :(
The Memcache Backend
class CacheClass(BaseCache):
    def __init__(self, server, params):
        BaseCache.__init__(self, params)
        self._cache = memcache.Client(server.split(';'))

    def add(self, key, value, timeout=0):
        if isinstance(value, unicode):
            value = value.encode('utf-8')
        # timeout=0 is falsy, so "timeout or default" silently swaps in the default.
        return self._cache.add(smart_str(key), value, timeout or self.default_timeout)
The Memcache Backend
class CacheClass(BaseCache):
    def __init__(self, server, params):
        BaseCache.__init__(self, params)
        self._cache = memcache.Client(server.split(';'))

    def add(self, key, value, timeout=None):
        if isinstance(value, unicode):
            value = value.encode('utf-8')
        # Only fall back to the default when no timeout was given,
        # so an explicit timeout=0 reaches memcached unchanged.
        if timeout is None:
            timeout = self.default_timeout
        return self._cache.add(smart_str(key), value, timeout)
Advanced Caching
• Typical setup has memcached running on web servers
• Pownce web servers were I/O and memory bound, not CPU bound
• Since we had some spare CPU cycles, we compressed large objects before caching them
• The Python memcache library can do this automatically, but the API is not exposed
Monkey Patching core.cache
from django.core.cache import cache
from django.utils.encoding import smart_str
import inspect as i

if 'min_compress_len' in i.getargspec(cache._cache.set)[0]:
    class CacheClass(cache.__class__):
        def set(self, key, value, timeout=None, min_compress_len=150000):
            if isinstance(value, unicode):
                value = value.encode('utf-8')
            if timeout is None:
                timeout = self.default_timeout
            return self._cache.set(smart_str(key), value, timeout, min_compress_len)
    cache.__class__ = CacheClass
Advanced Caching
• Useful tool: automagic single object cache
• Use a manager to check the cache prior to any single object get by pk
• Invalidate assets on save and delete
• Eliminated several hundred QPS at Pownce
Advanced Caching
All this and more at:
http://github.com/mmalone/django-caching/
Advanced Caching
• Consistent hashing: hashes cached objects in such a way that most objects map to the same node after a node is added or removed.
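A minimal sketch of a consistent hash ring, assuming MD5 for placement and 100 virtual points per node; neither choice comes from the talk, and `HashRing` is a hypothetical name rather than any particular memcached client's API.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent hash ring with virtual nodes."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas
        self._points = []   # sorted positions on the ring
        self._owner = {}    # position -> node name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def add(self, node):
        for i in range(self.replicas):
            point = self._hash("%s#%d" % (node, i))
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove(self, node):
        for i in range(self.replicas):
            point = self._hash("%s#%d" % (node, i))
            del self._owner[point]
            self._points.remove(point)

    def node_for(self, key):
        # First ring position clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self._points, self._hash(key)) % len(self._points)
        return self._owner[self._points[idx]]
```

Adding a fourth node to a three-node ring should move only roughly a quarter of the keys, and every key that moves lands on the new node. With naive `hash(key) % node_count` placement, most keys would change nodes and the cache would effectively be flushed.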
http://www.flickr.com/photos/deepfrozen/2191036528/
Caching
Now you’ve made life easier for your DB server, the next thing to fall over: your app server.
Load Balancing
• Out of the box, Django uses a shared nothing architecture
• App servers have no single point of contention
• Responsibility pushed down the stack (to DB)
• This makes scaling the app layer trivial: just add another server
Load Balancing
Spread work between multiple nodes in a cluster using a load balancer.
• Hardware or software
• Layer 7 or Layer 4
[Diagram: Load Balancer, App Servers, Database]
Load Balancing
• Hardware load balancers
• Expensive, like $35,000 each, plus maintenance contracts
• Need two for failover / high availability
• Software load balancers
• Cheap and easy, but more difficult to eliminate as a single point of failure
• Lots of options: Perlbal, Pound, HAProxy, Varnish, Nginx
Load Balancing
• Most of these are layer 7 load balancers, and some software balancers do cool things
• Caching
• Re-proxying
• Authentication
• URL rewriting
Load Balancing
A common setup for large operations is to use redundant layer 4 hardware balancers in front of a pool of layer 7 software balancers.
[Diagram: Hardware Balancers, Software Balancers, App Servers]
Load Balancing
• At Pownce, we used a single Perlbal balancer
• Easily handled all of our traffic (hundreds of simultaneous connections)
• A SPOF, but we didn’t have $100,000 for black box solutions, and weren’t worried about service guarantees beyond three or four nines
• Plus there were some neat features that we took advantage of
Perlbal Reproxying
Perlbal reproxying is a really cool, and really poorly documented feature.
Perlbal Reproxying
1. Perlbal receives request
2. Passes it to an app server
3. App server checks auth (etc.)
4. Returns HTTP 200 with X-Reproxy-URL header set to internal file server URL
5. File served from file server via Perlbal
Perlbal Reproxying
• Completely transparent to end user
• Doesn’t keep large app server instance around to serve file
• Users can’t access files directly (like they could with a 302)
Perlbal Reproxying
Plus, it’s really easy:
from django.http import HttpResponse

def download(request, filename):
    # Check auth, do your thing
    response = HttpResponse()
    # FILE_SERVER is the internal file server's base URL, defined elsewhere.
    response['X-REPROXY-URL'] = '%s/%s' % (FILE_SERVER, filename)
    return response
Load Balancing
Best way to reduce load on your app servers: don’t use them to do hard stuff.
Queuing
• A queue is simply a bucket that holds messages until they are removed for processing by clients
• Many expensive operations can be queued and performed asynchronously
• User experience doesn’t have to suffer
• Tell the user that you’re running the job in the background (e.g., transcoding)
• Make it look like the job was done real-time (e.g., note distribution)
Queuing
• Lots of open source options for queuing
• Ghetto Queue (MySQL + Cron)
• this is the official name.
• Gearman
• TheSchwartz
• RabbitMQ
• Apache ActiveMQ
• ZeroMQ
Queuing
• Lots of fancy features: brokers, exchanges, routing keys, bindings...
• Don’t let that crap get you down, this is really simple stuff
• Biggest decision: persistence
• Does your queue need to be durable and persistent, able to survive a crash?
• This requires logging to disk which slows things down, so don’t do it unless you have to
Queuing
• Pownce used a simple ghetto queue built on MySQL / cron
• Problematic if you have multiple consumers pulling jobs from the queue
• No point in reinventing the wheel, there are dozens of battle-tested open source queues to choose from
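To make the multiple-consumer problem concrete, here is a rough sketch of a database-backed queue in which a job is claimed with a conditional UPDATE, so two consumers cannot both take it. It uses the standard library's `sqlite3` as a stand-in for MySQL; the table layout and function names are assumptions, not Pownce's schema.

```python
import sqlite3


def make_queue(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " payload TEXT NOT NULL,"
        " claimed_by TEXT)"
    )


def enqueue(conn, payload):
    with conn:  # commits on success
        conn.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))


def claim(conn, worker):
    """Claim the oldest unclaimed job for `worker`, or return None."""
    while True:
        row = conn.execute(
            "SELECT id, payload FROM jobs"
            " WHERE claimed_by IS NULL ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        job_id, payload = row
        with conn:
            cur = conn.execute(
                "UPDATE jobs SET claimed_by = ?"
                " WHERE id = ? AND claimed_by IS NULL",
                (worker, job_id),
            )
        if cur.rowcount == 1:
            return job_id, payload
        # Another consumer claimed it between SELECT and UPDATE; try again.
```

The `AND claimed_by IS NULL` condition is what keeps a job from being processed twice. A plain SELECT followed by an unconditional UPDATE would let two cron-driven consumers pick up the same row.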
Django Standalone Scripts
Consumers need to set up the Django environment:
from django.core.management import setup_environ
from mysite import settings

setup_environ(settings)
Django Standalone Scripts
Great blog post by James Bennett (@ubernostrum)
http://bit.ly/django-standalone-scripts
The Database
• Til now we’ve been talking about
• Shared nothing
• Pushing problems down the stack
• But we have to store a persistent and consistent view of our application’s state somewhere
• Enter, the database...
CAP Theorem
• Three properties of a shared-data system
• Consistency: all clients see the same data
• Availability: all clients can see some version of the data
• Partition Tolerance: system properties hold even when the system is partitioned & messages are lost
• But you can only have two
CAP Theorem
• Big long proof... here’s my version.
• Empirically, seems to make sense.
• Eric Brewer
• Professor at University of California, Berkeley
• Co-founder and Chief Scientist of Inktomi
• Probably smarter than me
CAP Theorem
• The relational database systems we all use were built with consistency as their primary goal
• But at scale our system needs to have high availability and must be partitionable
• The RDBMS’s consistency requirements get in our way
• Most sharding / federation schemes are kludges that trade consistency for availability & partition tolerance
The Database
• There are lots of non-relational databases coming onto the scene
• CouchDB
• Cassandra
• Tokyo Cabinet
• But they’re not that mature, and they aren’t easy to use with Django
The Database
• Django has no support for
• Non-relational databases like CouchDB
• Multiple databases (coming soon?)
• If you’re looking for a project, plz fix this.
• Only advice: don’t get too caught up in trying to duplicate the existing ORM API
I Want a Pony
• Save always saves every field of a model
• Causes unnecessary contention and more data transfer
• A better way:
• Use descriptors to determine what’s dirty
• Only update dirty fields when an object is saved
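The descriptor idea can be sketched outside Django in modern Python. Everything here is hypothetical (`TrackedField`, `Note`, and a `save` that only reports what it would write); it shows the mechanism, not a patch to Django's ORM.

```python
class TrackedField:
    """Data descriptor that records which attributes changed."""

    def __init__(self, default=None):
        self.default = default

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__.get(self.name, self.default)

    def __set__(self, instance, value):
        if instance.__dict__.get(self.name, self.default) != value:
            instance.__dict__.setdefault("_dirty", set()).add(self.name)
        instance.__dict__[self.name] = value


class Note:
    title = TrackedField("")
    body = TrackedField("")

    def __init__(self, title, body):
        self.title = title
        self.body = body
        self._dirty = set()  # treat a freshly loaded object as clean

    def save(self):
        """Return only the fields an UPDATE would need to touch."""
        changed = {name: getattr(self, name) for name in self._dirty}
        self._dirty.clear()
        return changed
```

A real implementation would turn the returned dict into an `UPDATE ... SET` over just those columns. Assigning a value equal to the current one does not mark the field dirty, so no write is needed.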
Denormalization
• Django encourages normalized data, which is usually good
• But at scale you need to denormalize
• Corollary: joins are evil
• Django makes it really easy to do joins using the ORM, so pay attention
Denormalization
• Start with a normalized database
• Selectively denormalize things as they become bottlenecks
• Denormalized counts, copied fields, etc. can be updated in signal handlers
Replication
• Typical web app is 80 to 90% reads
• Adding read capacity will get you a long way
• MySQL Master-Slave replication
[Diagram: master database (Read & Write), slave databases (Read only)]
Replication
• Django doesn’t make it easy to use multiple database connections, but it is possible
• Some caveats
• Slave lag interacts with caching in weird ways
• You can only save to your primary DB (the one you configure in settings.py)
• Unless you get really clever...
Replication
1. Create a custom database wrapper by subclassing DatabaseWrapper
class SlaveDatabaseWrapper(DatabaseWrapper):
    def _cursor(self, settings):
        if not self._valid_connection():
            kwargs = {
                'conv': django_conversions,
                'charset': 'utf8',
                'use_unicode': True,
            }
            # Merge in the chosen slave's connection settings
            # (a plain assignment here would discard the defaults above).
            kwargs.update(pick_random_slave(settings.SLAVE_DATABASES))
            self.connection = Database.connect(**kwargs)
            ...
        cursor = CursorWrapper(self.connection.cursor())
        return cursor
Replication
2. Custom QuerySet that uses primary DB for writes
class MultiDBQuerySet(QuerySet):
    ...
    def update(self, **kwargs):
        slave_conn = self.query.connection
        self.query.connection = default_connection
        try:
            return super(MultiDBQuerySet, self).update(**kwargs)
        finally:
            # Always switch back to the slave, even if the update raises.
            self.query.connection = slave_conn
Replication
3. Custom Manager that uses your custom QuerySet
class SlaveDatabaseManager(db.models.Manager):
    def get_query_set(self):
        return MultiDBQuerySet(self.model, query=self.create_query())

    def create_query(self):
        return db.models.sql.Query(self.model, connection)
san francisco meetup
Replication
Example on github:
http://github.com/mmalone/django-multidb/
Replication
• Goals:
• Read-what-you-write consistency for writer
• Eventual consistency for everyone else
• Slave lag screws things up
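One way to approximate read-what-you-write consistency is to pin a user's reads to the master for a short window after they write. The sketch below is an assumption about how that could look, not the django-multidb implementation: `ReplicaRouter` and `pin_seconds` are hypothetical, and the window has to be longer than the worst slave lag you expect.

```python
import random
import time


class ReplicaRouter:
    """Send a user's reads to the master shortly after they write."""

    def __init__(self, master, slaves, pin_seconds=5.0, clock=time.time):
        self.master = master
        self.slaves = list(slaves)
        self.pin_seconds = pin_seconds
        self._clock = clock
        self._last_write = {}  # user_id -> time of most recent write

    def for_write(self, user_id):
        self._last_write[user_id] = self._clock()
        return self.master

    def for_read(self, user_id):
        wrote_at = self._last_write.get(user_id)
        if wrote_at is not None and self._clock() - wrote_at < self.pin_seconds:
            return self.master  # the writer sees their own change
        return random.choice(self.slaves)  # everyone else: eventual consistency
```

With shared-nothing app servers, the last-write timestamp would need to live somewhere every server can see it, such as a cookie, the session, or memcached, rather than in process memory as it does here.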
Replication
What happens when you become write saturated?
Federation
• Start with Vertical Partitioning: split tables that aren’t joined across database servers
• Actually pretty easy
• Except not with Django
Federation
• At some point you’ll need to split a single table across databases (e.g., user table)
• Now auto-increment won’t work
• But Django uses auto-increment for PKs
• So specify your own PKs in the save() method
• Not a bad idea to start with UUIDs from day one since it’s a pain in the ass to migrate
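A small sketch of why application-generated keys help once a table is split: the key alone can decide which database holds the row, with no shared auto-increment counter. The shard names and the modulo placement are assumptions for illustration, and the standard library's `uuid.uuid4()` is used here rather than the talk's own generator.

```python
import uuid

# Hypothetical database aliases, one per shard of the user table.
SHARDS = ["users_0", "users_1", "users_2", "users_3"]


def new_pk():
    """Generate a primary key without asking any database."""
    return uuid.uuid4().hex


def shard_for(pk, shards=SHARDS):
    """Map a hex primary key to the shard that stores it."""
    return shards[int(pk, 16) % len(shards)]
```

Modulo placement is the simplest possible scheme. Changing the number of shards moves most keys, so a lookup table or consistent hashing may be a better fit if you expect to add databases later.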
Federation
class Model(models.Model):
    def save(self, force_insert=False, force_update=False):
        if not self.id:
            force_insert = True
            # uuid.uuid() as written on the slide; with the standard
            # library's uuid module the closest call would be uuid.uuid4().
            self.id = uuid.uuid()
        return super(Model, self).save(force_insert, force_update)

    class Meta:
        abstract = True
UUID Generator
http://gist.github.com/117292
Know your SQL
>>> Article.objects.filter(pk=3).query.as_sql()
('SELECT "app_article"."id", "app_article"."name", "app_article"."author_id" FROM "app_article" WHERE "app_article"."id" = %s ', (3,))
Know your SQL
>>> import sqlparse
>>> def pp_query(qs):
...     t = qs.query.as_sql()
...     sql = t[0] % t[1]
...     print sqlparse.format(sql, reindent=True, keyword_case='upper')
...
>>> pp_query(Article.objects.filter(pk=3))
SELECT "app_article"."id", "app_article"."name", "app_article"."author_id"
FROM "app_article"
WHERE "app_article"."id" = 3
Know your SQL
>>> from django.db import connection
>>> connection.queries
[{'time': '0.001', 'sql': u'SELECT "app_article"."id", "app_article"."name", "app_article"."author_id" FROM "app_article"'}]
Know your SQL
• It’d be nice if a lightweight stacktrace could be done in QuerySet.__init__
• Stick the result in connection.queries
• Now we know where the query originated
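The wish above can be approximated in plain Python with the standard `traceback` module. This is a sketch of the idea only: `TracedQuerySet` and `query_log` are hypothetical stand-ins for `QuerySet` and `connection.queries`, not a Django patch.

```python
import traceback

query_log = []  # stand-in for connection.queries


class TracedQuerySet:
    def __init__(self, sql, depth=3):
        self.sql = sql
        # Drop the last frame (this __init__) and keep a few callers above it.
        frames = traceback.extract_stack()[:-1][-depth:]
        self.origin = [
            "%s:%d in %s" % (f.filename, f.lineno, f.name) for f in frames
        ]

    def execute(self):
        query_log.append({"sql": self.sql, "origin": self.origin})


def article_list_view():
    qs = TracedQuerySet("SELECT 1")
    qs.execute()
```

Keeping only the last few frames keeps this cheap enough for a debugging build, though capturing a stack on every query still costs something, so it is probably best guarded by a DEBUG-style setting.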
Monitoring & Measuring
Django Debug Toolbar
http://github.com/robhudson/django-debug-toolbar/
Monitoring & Measuring
• Ganglia (http://ganglia.info)
• Munin (http://munin.projects.linpro.no/)
• Cacti (http://cacti.net)
You can’t improve what you don’t measure.
Monitoring & Measuring
• All Servers
• CPU Usage
• Disk utilization
• IO Wait
• Memory Usage
• Bandwidth Usage
Monitoring & Measuring
• Database Servers
• Queries per second
• Open connections
• Slave lag
• Cache hit rate
Monitoring & Measuring
• Web Servers
• Requests per second
• Response time
• Apache children (or equivalent)
Monitoring & Measuring
• Cache servers
• Requests per second
• Eviction rate
• LRU reference age
• Average object size
• Cache hit ratio
Monitoring & Measuring
• Application level
• Queue lengths
• Registration rate
• Anything interesting
• You should be able to correlate your server level metrics (like DB QPS) with application level metrics (like API traffic)
URLs
Always name URLs.
url(r'^user/(?P<username>\w+)/$', ..., name='user')

{% url user username %}

from django.core.urlresolvers import reverse
reverse('user', kwargs={'username': username})
The ORM
Use Model.objects.get_or_create()
token, created = Token.objects.get_or_create(
    key=access_token.key,
    defaults={'secret': access_token.secret})
if not created:
    token.secret = access_token.secret
    token.save()
Managers
Custom Managers are awesome. You should use them.
Managers
• Custom managers are good for
• Caching
• Denormalization
• Custom SQL
• Complex relationships
• Anything on a model that you want to hide behind a pretty API
Managers
class FollowingDescriptor(object):
    def __get__(self, instance, cls):
        class RelationshipManager(models.Manager):
            def get_query_set(self):
                return User.objects.filter(follower_relationships__user=instance)

            def add(self, user):
                instance.following_relationships.create(to_user=user)

            def remove(self, user):
                try:
                    relationship = instance.following_relationships.get(to_user=user)
                    relationship.delete()
                except ObjectDoesNotExist:
                    pass

        return RelationshipManager()
Class-based Views
Django views are callables that take a request object and return a response object.
Class-based Views
• Just implement the __call__() method
• Views instantiated when urls.py is imported
• View instances are global variables
• Not thread-safe
• Retain state between requests
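The state-sharing problem is easy to see without Django. In this hedged sketch, plain strings stand in for request and response objects, and `CountingView` and `per_request_view` are hypothetical names.

```python
class CountingView:
    """__call__-based view: one instance serves every request."""

    def __init__(self):
        self.items = []

    def __call__(self, request):
        self.items.append(request)  # state written on the shared instance
        return "seen %d item(s)" % len(self.items)


# What a urlconf entry would hold: created once, at import time.
shared_view = CountingView()


def per_request_view(request):
    """Build a fresh instance per request so nothing leaks between them."""
    return CountingView()(request)
```

The shared instance remembers the first request when the second one arrives, which is also why it is unsafe when several threads are serving requests at once. The per-request wrapper starts from a clean slate each time. It is one possible workaround and is not taken from the talk.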
Class-based Views
• Make your view subclass HttpResponse
• Kind of hacky, but it works
• Instantiated per request
• Thread safe
• Safe to maintain state in the view instances
• Jacob promises to fix the problems with the __call__()-based approach
Class-based Views
__call__()-based approach
http://www.djangosnippets.org/snippets/1071/
Class-based Views
Subclass approach
http://www.djangosnippets.org/snippets/1072/
http://gist.github.com/118277
Subclassy Models
• Abstract models added in Django 1.0
• Useful for creating a common base class
• Pownce: Note superclass would have been nice
• Work by creating multiple tables for superclass and subclasses
• But if you fetch an object via the superclass manager, you get an instance of the superclass... lame.
Subclassy Models
Use a custom Manager & QuerySet to return an instance of the base class
http://www.djangosnippets.org/snippets/1034/
Contact Me
Mike [email protected]
twitter.com/mjmalone