Realtime web2012

Building Real-Time Web http://tinyurl.com/realtime2012 http:// Timothy Fitz .com CTO Canvas

Transcript of Realtime web2012

Page 1: Realtime web2012

Building Real-Time Web
http://tinyurl.com/realtime2012

http:// Timothy Fitz .com

CTO Canvas

Page 2: Realtime web2012

What is “Realtime web”?

Page 3: Realtime web2012

What does “Realtime” look like?

Page 4: Realtime web2012

What does “Realtime” look like?

Page 5: Realtime web2012

What does “Realtime” look like?

Page 6: Realtime web2012

REALTIME WEB
“Push, not pull.”

Page 7: Realtime web2012

3 HARD PROBLEMS

• Talking to the browser
• High concurrency
• Scaling up

Page 8: Realtime web2012

Talking to the browser

• Short Polling
• Long Polling
• WebSocket
• Flash Socket

Page 9: Realtime web2012

Short Polling

Page 10: Realtime web2012

Long Polling
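A minimal sketch of the long-polling idea, assuming an asyncio-based server: the request is held open until a message arrives or a timeout expires, and the client then polls again right away. `Mailbox`, `long_poll`, and the timeout values are hypothetical names chosen for illustration, not something from the talk.

```python
import asyncio

class Mailbox:
    """Hypothetical per-user mailbox; a real server would key these by session."""

    def __init__(self):
        self._queue = asyncio.Queue()

    def push(self, message):
        self._queue.put_nowait(message)

    async def long_poll(self, timeout):
        """Hold the request open until a message arrives or the timeout expires."""
        try:
            return [await asyncio.wait_for(self._queue.get(), timeout)]
        except asyncio.TimeoutError:
            return []  # empty response; the client simply polls again

async def demo():
    box = Mailbox()
    # Nothing pending: the poll waits for the whole timeout, then returns empty.
    empty = await box.long_poll(timeout=0.05)
    # A message pushed while the poll is parked is delivered without further delay.
    asyncio.get_running_loop().call_later(0.01, box.push, "hello")
    delivered = await box.long_poll(timeout=1.0)
    return empty, delivered

LONG_POLL_RESULT = asyncio.run(demo())
```

Short polling would instead return immediately every time, so most responses would be empty and latency would be bounded by the polling interval.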

Page 11: Realtime web2012

Flash Socket

Page 12: Realtime web2012

WebSocket

Page 13: Realtime web2012

High Concurrency

• Blocking I/O
  – One thread or process per connection
  – Tops out at 200 to 1k connections

• Non-blocking I/O
  – One process, one thread
  – 10k to 100k connections
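A rough sketch of the "one process, one thread" model, using only the standard library's `selectors` module and local socket pairs in place of real browser connections. The function name `serve_once` and the count of 50 connections are assumptions for illustration; a production server would also buffer writes instead of calling `sendall` directly.

```python
import selectors
import socket

def serve_once(connections):
    """Answer one message per connection from a single thread using a selector."""
    sel = selectors.DefaultSelector()
    for conn in connections:
        conn.setblocking(False)
        sel.register(conn, selectors.EVENT_READ)
    handled = 0
    while handled < len(connections):
        # Wait for whichever sockets are readable; never block on one socket.
        events = sel.select(timeout=1.0)
        if not events:
            break  # nothing became readable; give up in this sketch
        for key, _mask in events:
            conn = key.fileobj
            data = conn.recv(1024)
            conn.sendall(data.upper())
            sel.unregister(conn)
            handled += 1
    sel.close()
    return handled

# 50 simulated clients, all served by the same thread.
pairs = [socket.socketpair() for _ in range(50)]
for i, (client, _server) in enumerate(pairs):
    client.sendall(f"ping {i}".encode())
HANDLED = serve_once([server for _client, server in pairs])
REPLIES = [client.recv(1024).decode() for client, _server in pairs]
for client, server in pairs:
    client.close()
    server.close()
```

No thread is ever parked waiting on a single idle connection, which is what lets this model hold far more open connections than a thread-per-connection design.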

Page 14: Realtime web2012

Django

Page 15: Realtime web2012

Django
Apache

Page 16: Realtime web2012

There is no Apache for realtime

Page 17: Realtime web2012

Non-blocking I/O Servers

• Python
  – Twisted
  – Tornado
  – gevent

• Not Python
  – Node.js
  – Erlang something

Page 18: Realtime web2012

Twisted

• Pro
  – Can talk every protocol ever
  – Oldest and most widely used in production

• Con
  – Overkill for web-only tasks
  – Not simple

Page 19: Realtime web2012

Tornado

• Pro
  – Simple
  – Does HTTP stuff simply

• Con
  – Might not interface with what you need

• Confusing
  – You can run Tornado (HTTP layer) on top of Twisted (networking layer)

Page 20: Realtime web2012

gevent

• Pro
  – Coroutines are a better model than callbacks
  – As such, very easy to write complicated logic

• Con
  – Least well documented
  – Least consensus on best practices
  – New, uncertain about production readiness
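To illustrate the coroutines-versus-callbacks point, here is the same two-step lookup written both ways. This sketch uses the standard library's `asyncio` as a stand-in, not gevent itself: gevent switches between greenlets implicitly rather than through `async`/`await`, but the readability argument is the same. `USERS` and the function names are hypothetical.

```python
import asyncio

# Hypothetical data source used by both styles.
USERS = {"alice": ["carol", "bob"]}

# Callback style: each step hands its result on to the next function.
def fetch_friends_cb(loop, name, on_done):
    def after_lookup():
        friends = USERS.get(name, [])
        loop.call_soon(on_done, sorted(friends))
    loop.call_soon(after_lookup)

# Coroutine style: the same steps read top to bottom.
async def fetch_friends(name):
    await asyncio.sleep(0)  # stands in for a network round trip
    friends = USERS.get(name, [])
    await asyncio.sleep(0)  # stands in for a second round trip
    return sorted(friends)

async def demo():
    loop = asyncio.get_running_loop()
    done = loop.create_future()
    fetch_friends_cb(loop, "alice", done.set_result)
    from_callbacks = await done
    from_coroutine = await fetch_friends("alice")
    return from_callbacks, from_coroutine

CALLBACK_RESULT, COROUTINE_RESULT = asyncio.run(demo())
```

Both versions produce the same result; the difference is that error handling, loops, and branching stay ordinary Python in the coroutine version.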

Page 21: Realtime web2012

Node.js

• Pro
  – Best documentation by far
  – Socket.IO abstracts away browser communication

• Con
  – Can’t share logic with your Django app
  – New, but has fairly large install base

Page 22: Realtime web2012

Erlang

• Pro
  – Hands down best for complex realtime tasks
  – Forces you to think about concurrency/scale
  – Abstracts away the network
  – Old and reliable

• Con
  – Forces you to think about concurrency/scale
  – Can’t share logic with your Django app
  – High spin-up cost (functional, concurrency driven)

Page 23: Realtime web2012

SCALING UP!

• Just one
• Frontend nodes x Backend nodes
• More architecture decisions!

Page 24: Realtime web2012

Just one

• Everything in memory
• Django nodes talk directly to box
• Spare for availability
• Failover = realtime data loss
  – Make realtime 100% redundant

Page 25: Realtime web2012

Probably good enough!

– WARNING: NAPKIN MATH
– 10k daily visits * 10.0 min avg visit = 70 average concurrent users
– One box can easily be built out to handle 3-5k concurrent users = roughly 450k-700k daily visits
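The napkin math can be reproduced in a few lines. This assumes visits are spread evenly across the day (real traffic peaks would be higher), and the function names are made up for illustration.

```python
MINUTES_PER_DAY = 24 * 60

def average_concurrent(daily_visits, avg_visit_minutes):
    """Average number of users online at any moment, assuming even traffic."""
    return daily_visits * avg_visit_minutes / MINUTES_PER_DAY

def daily_visits_supported(concurrent_capacity, avg_visit_minutes):
    """Daily visits one box could serve at a given concurrent capacity."""
    return concurrent_capacity * MINUTES_PER_DAY / avg_visit_minutes

CONCURRENT = average_concurrent(10_000, 10.0)     # about 69.4, the slide's "70"
LOW = daily_visits_supported(3_000, 10.0)         # 432,000 daily visits
HIGH = daily_visits_supported(5_000, 10.0)        # 720,000 daily visits
```

The exact figures (432k to 720k) bracket the slide's rounded "450k-700k".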

Page 26: Realtime web2012

Frontend nodes x Backend nodes

• Frontend handles users / connections
• Backend handles channels
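A minimal in-process sketch of that split, assuming frontends subscribe to a channel on the backend only when their first local user joins it. The `Backend` and `Frontend` classes are hypothetical; in a real deployment they would be separate processes talking over the network, and `sent` stands in for writing to browser connections.

```python
from collections import defaultdict

class Backend:
    """Hypothetical backend node: owns channels and their subscribed frontends."""

    def __init__(self):
        self._subscribers = defaultdict(set)  # channel -> frontends

    def subscribe(self, channel, frontend):
        self._subscribers[channel].add(frontend)

    def publish(self, channel, message):
        # One message per interested frontend, not one per user.
        for frontend in self._subscribers.get(channel, ()):
            frontend.deliver(channel, message)

class Frontend:
    """Hypothetical frontend node: owns user connections."""

    def __init__(self, backend):
        self._backend = backend
        self._users = defaultdict(set)  # channel -> users connected here
        self.sent = []  # (user, message) pairs; stands in for socket writes

    def connect(self, user, channel):
        if not self._users[channel]:
            self._backend.subscribe(channel, self)
        self._users[channel].add(user)

    def deliver(self, channel, message):
        for user in sorted(self._users[channel]):
            self.sent.append((user, message))

backend = Backend()
fe1, fe2 = Frontend(backend), Frontend(backend)
fe1.connect("ann", "lobby")
fe1.connect("bob", "lobby")
fe2.connect("cat", "lobby")
fe2.connect("dan", "other")
backend.publish("lobby", "hi")
```

With several backend nodes, each channel would need to map to exactly one of them (for example by hashing the channel name); that routing is left out of this sketch.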

Page 27: Realtime web2012

More architecture decisions!

• In memory backend
  – Redis Pub/Sub
  – ZeroMQ
  – Roll your own

• Persisted to disk
  – ActiveMQ
  – RabbitMQ
  – Amazon SQS

Page 28: Realtime web2012

Redis Pub/Sub

• Simplest to set up
• Simplest model
• SUBSCRIBE channel_name
• PUBLISH channel_name “Hello World!”
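To show how small the model is, here is a sketch of how those two commands look on the wire. Redis clients send each command as a RESP array of bulk strings; `encode_command` is a hypothetical helper written for this example, and in practice a client library would do this for you.

```python
def encode_command(*args):
    """Encode a command as a RESP array of bulk strings."""
    parts = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode("utf-8")
        parts.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(parts)

# What a subscriber (for example a frontend node) would send.
SUBSCRIBE = encode_command("SUBSCRIBE", "channel_name")

# What a publisher (for example a Django process) would send.
PUBLISH = encode_command("PUBLISH", "channel_name", "Hello World!")
```

Note that a connection which has issued SUBSCRIBE is dedicated to receiving messages, so publishers use a separate connection.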

Page 29: Realtime web2012

ZeroMQ

• Publish/Subscribe semantics
• Request/Response
• Push/Pull (round robin)
• Extremely fast

Page 30: Realtime web2012

Roll your own

• Same language as your frontend
  – (Twisted/Node/Whatever)

• Only do this if you have per-channel business logic
  – You probably don’t.

• Erlang maps really really well to this domain.

Page 31: Realtime web2012

Full Stack Services

• REST APIs to push to the browser
• http://pusher.com
• http://beaconpush.com

Page 32: Realtime web2012

Canvas

Amazon ELB
Nginx + Twisted
Redis

Page 33: Realtime web2012

Final Recommendations

• Need Python? Twisted
• Don’t? Node.js/Socket.IO
• Need scale/reliability? Redis backend.
• Complex? Going big? Erlang all the way.

Page 34: Realtime web2012

Questions?

Page 35: Realtime web2012

Further Reading

• IMVU IMQ talk: http://www.slideshare.net/JonWatte/message-queuing-on-a-large-scale-imvus-stateful-realtime-message-queue

• Twilio talk on gevent + ZeroMQ (given by Jeff Lindsay, highly recommended): http://www.twilio.com/conference/video/distributed-systems-with-gevent-and-zeromq

• Last.fm scaling Erlang/Mochiweb to 1 million concurrent connections on one machine: http://www.metabrew.com/article/a-million-user-comet-application-with-mochiweb-part-1

• The original Comet blog post: http://infrequently.org/2006/03/comet-low-latency-data-for-the-browser/

• Django + Socket.IO + gevent: http://codysoyland.com/2011/feb/6/evented-django-part-one-socketio-and-gevent/