Software at Scale


Transcript of Software at Scale

Page 1: Software at Scale
Page 2: Software at Scale

Why is scale important?

[Chart: “Usage” and “Difficulty” plotted over Jan–Dec (y-axis 0–80,000), with callouts marking a “Missed opportunity” and a “Permanent scaling need” as usage ramps up.]

“Do things that don’t scale!”

But scale if it’s on the way.

Page 3: Software at Scale

A tale of two startups (“Or how I spent 2013…”)

Clipless

• Built to scale.
• v1 developed in 3 months.
• PR blast to TechCrunch, AndroidPolice, etc. led to 1700% month over month growth.
• Handling over 10,000 QPS.
• Acquired 3 months from launch.

Shark Tank Startup

• Scaling ignored.
• v1 developed in 3 months.
• Reran on Shark Tank; service and website went down almost immediately.
• Still slow (but steady) growth.

Page 4: Software at Scale

What was different?

Clipless (Tomcat, 1-3 Digital Ocean VMs)

• Load balanced, replicated servers and DBs.
• Well-written RESTful API; any server could answer any query.
• Multithreaded backend.
• Batched, asynchronous DB operations.
• Caching by locality and time.
• Queued network operations.

S.T. Startup (Ruby on Rails, Heroku)

• No load balancing.
• Replicated DB via Heroku Postgres.
• Not truly REST; backends kept state.
• Single-threaded backend (one request blocked the entire Heroku dyno).
• Direct, blocking DB access.
• DB caching via ActiveRecord.

Page 5: Software at Scale

Potential Bottlenecks

• Client resources
  • CPU
  • Memory
  • I/O
• Server resources
• Database resources
  • Open connections
  • Running queries
• Network resources
  • Bandwidth
  • Connections / open sockets
  • Availability (esp. on Wifi / mobile networks)

Page 6: Software at Scale

Potential Bottlenecks

• Client resources
  • CPU
  • Memory
  • I/O
• Server resources
• Database resources
• Network resources
  • Bandwidth
  • Connections / open sockets
  • Availability (esp. on Wifi / mobile networks)

• Profile your algorithms
• Crunch less data
• Reuse more old work
• Offload some processing to the server

Page 7: Software at Scale

Potential Bottlenecks

• Client resources
• Server resources
  • CPU
  • Memory
  • I/O
• Database resources
• Network resources
  • Bandwidth
  • Connections / open sockets
  • Availability (esp. on Wifi / mobile networks)

• Profile your algorithms
• Crunch less data
• Reuse more old work (across users)
• Divide and Conquer (“shard”)
• Spin up and balance more servers

Page 8: Software at Scale

Potential Bottlenecks

• Client resources
• Server resources
• Database resources
  • Open connections
  • Running queries
• Network resources
  • Bandwidth
  • Connections / open sockets
  • Availability (esp. on Wifi / mobile networks)

• Optimize your queries
• Connection pooling
• Add a second-level cache
• Reuse more old work (across users)
• Divide and Conquer (“shard”)
• Batch DB requests (see the sketch after this list)
• Spin up and replicate more DBs
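A hedged sketch of the “batch DB requests” idea with plain JDBC; the deals table, its columns, and the DealWriter class are invented for illustration, and the connection would normally come from a pool:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class DealWriter {
  // Hypothetical example: send many inserts to the DB in one round trip.
  void saveAll(Connection conn, List<String> dealNames) throws SQLException {
    String sql = "INSERT INTO deals (name) VALUES (?)";
    try (PreparedStatement stmt = conn.prepareStatement(sql)) {
      for (String name : dealNames) {
        stmt.setString(1, name);
        stmt.addBatch();        // queue the row client-side
      }
      stmt.executeBatch();      // one round trip instead of one per row
    }
  }
}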

Page 9: Software at Scale

Potential Bottlenecks

• Client resources
• Server resources
• Database resources
• Network resources
  • Bandwidth
  • Connections / open sockets
  • Availability (esp. on Wifi / mobile networks)

• Add a local cache
• Send diffs
• Compress responses (CPU tradeoff)
• Connection pooling
• Batch network requests

Page 10: Software at Scale

Profiling

Purpose: find the “hotspots” in your program.

Things you care about:
• “CPU time” – time spent processing your program’s instructions.
• “Memory” – RAM being used to store your program’s data.
• “Wall time” – overall time spent waiting for the program.

Methods:
• Basic: “Stopwatch”
• Advanced: Profiler (e.g. jprof, JProfiler, hprof, NetBeans, Visual Studio)

(Diagnosing the problem)

Page 11: Software at Scale

Stopwatch

• Easy: just time methods.

Matlab:

function [result] = do_something_expensive(data)
  tic    % start the stopwatch
  % ... the expensive computation producing result goes here ...
  toc    % print the elapsed wall time
end

• In Java, use Guava’s Stopwatch class (start() and stop() methods).
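A minimal Java sketch of the same timing idea with Guava’s Stopwatch; doSomethingExpensive is a stand-in for the code under test:

import com.google.common.base.Stopwatch;
import java.util.concurrent.TimeUnit;

class Timing {
  static void timeIt() {
    Stopwatch stopwatch = Stopwatch.createStarted();   // or createUnstarted() + start()
    doSomethingExpensive();                             // the code being measured (stand-in)
    stopwatch.stop();
    System.out.println("Took " + stopwatch.elapsed(TimeUnit.MILLISECONDS) + " ms");
  }

  static void doSomethingExpensive() { /* placeholder for the real work */ }
}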

Page 12: Software at Scale

Profiler

Page 13: Software at Scale

Strategies

Page 14: Software at Scale

Caching and Reuse

• Trades space (memory) for CPU time.

• Look for repetition of input (including subproblems).

• Compute a key from the input.

• Associate the result with the key.

• Important: algorithm must be a deterministic mapping from input to output.

• Important: if you change what the algorithm depends on, update the cache key.

“There are only two hard things in Computer Science: cache invalidation and naming things.” -- Phil Karlton

[Diagram: a record (Name: Alice, Job: Developer, Salary: 100,000) stored in a cache under a key derived from it.]

Page 15: Software at Scale

Computing a Cache Key

• Hashing is a good strategy.

• Objects.hash (JDK 7) / Objects.hashCode (Guava)

• Beware: Hashes can collide – sanity check results!

• Searching:
  • Hash data
  • Query cache for hash key.
  • If found, return associated value.
  • If not, query live service and store the result in the cache.

[Diagram: the Alice record hashes to the cache key 0xAF724….]
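A small sketch of that lookup pattern in Java; ProfileCache and fetchFromService are stand-ins invented for this example, not anything from the talk:

import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

class ProfileCache {
  private final ConcurrentHashMap<Integer, String> cache = new ConcurrentHashMap<Integer, String>();

  String lookup(String name, String job, int salary) {
    int key = Objects.hash(name, job, salary);     // compute a key from the input
    String cached = cache.get(key);
    if (cached != null) {
      // Hit: reuse old work. (Hashes can collide, so a real cache should
      // sanity-check the result, as the slide warns.)
      return cached;
    }
    String fresh = fetchFromService(name);         // miss: do the real work
    cache.put(key, fresh);                         // remember it for next time
    return fresh;
  }

  private String fetchFromService(String name) {
    return "profile for " + name;                  // stand-in for the live service call
  }
}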

Page 16: Software at Scale

Concurrency

Sequential programs run like this: Work → Work → Work, one piece after another (a lot of time).

Concurrent programs run like this: the same pieces of Work running side by side (less time).

Page 17: Software at Scale

Race Conditions

Problem: Two threads can simultaneously write to the same variables.

If you ran this code in two threads:

if (x < 1) { x++; }

Then x would usually end up at 1.

But sometimes it would be 2!

• Race conditions such as that one are among the hardest bugs to find + fix.

• Three ways to manage this:

• Immutability

• Local state

• Synchronization

• Race conditions only happen when you write to shared, mutable state.

Page 18: Software at Scale

Immutability

• General tip: try to minimize the number of states your program can end up in.

• Concurrency

• REST

• (And your programs will just have less state, so you’ll produce fewer bugs)

• Declare variables final where possible, set them in the constructor, and don’t write setters unless you must:

// String is an immutable type - can’t change it at runtime.
// foo is an immutable variable - can’t reassign it.
public class Bar {
  private final String foo;

  public Bar(String foo) {
    this.foo = Preconditions.checkNotNull(foo);   // com.google.common.base.Preconditions
  }
}

Page 19: Software at Scale

Local State

• Sometimes you need to modify state.

• But you can still avoid locking if it’s only visible to you:

• Two threads can write copies of same data.

• Optionally, can be merged back in single thread afterwards.

• (This is how MapReduce works)

Java inner classes help tremendously with this!

// Every time you run sendToNetwork, you’ll use a new channel. No shared state!
void sendToNetwork() {
  final Channel channel = new HttpChannel(context);
  channel.connect();
  Thread foo = new Thread() {
    @Override
    public void run() {
      channel.send("I am the jabberwocky");
    }
  };
  foo.start();   // the send runs on its own thread, touching only local state
}

Page 20: Software at Scale

Synchronization

• If you do need to write shared state, you need to synchronize access to it.

• Last resort: it slows your program down and is deadlock-prone.

final Object lock = new Object();

synchronized (lock) {
  if (x < 1) { x++; }
}

Now x is always 1! No interruption possible between read and write.

• More advanced: read/write locks (ReentrantReadWriteLock…)

• Also check out Java “Atomic” classes and “concurrent” collections:
  • AtomicBoolean, AtomicInteger, …
  • ConcurrentHashMap, …
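For the check-then-increment example above, an atomic class can replace the lock entirely. A small sketch, assuming x starts at 0 and is only modified here:

import java.util.concurrent.atomic.AtomicInteger;

class Counter {
  private final AtomicInteger x = new AtomicInteger(0);

  void bumpToOne() {
    // compareAndSet makes the read-check-write a single atomic step:
    // x becomes 1 only if it is still 0, no matter how many threads race here.
    x.compareAndSet(0, 1);
  }
}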

Page 21: Software at Scale

Futures

• Threads compute asynchronously.

• Caller wants some way of knowing the result when it’s ready.

• Future: handle to a result that may or may not be available yet.
  • future.get(): waits for the result and returns it, with an optional timeout.

• Futures let asynchronous calls return immediately, and let the program wait for the results when it’s convenient.

• Also see Guava’s ListenableFuture.

The usual pattern:

// Types here come from java.util.concurrent.
ExecutorService pool = Executors.newFixedThreadPool(4);

Callable<String> action = new Callable<String>() {
  @Override
  public String call() throws NetworkException {
    return askTheNetworkForMyString();
  }
};

Future<String> result = pool.submit(action);

// Waits until the result is available; throws if an exception was thrown inside the Callable.
String myString = result.get();
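Since ListenableFuture is mentioned above, here is a rough sketch of the callback-style equivalent; the pool size and askTheNetworkForMyString are placeholders carried over from the slide:

import com.google.common.util.concurrent.ListenableFuture;
import com.google.common.util.concurrent.ListeningExecutorService;
import com.google.common.util.concurrent.MoreExecutors;
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;

class ListenableFutureExample {
  void fetchAsync() {
    ListeningExecutorService pool =
        MoreExecutors.listeningDecorator(Executors.newFixedThreadPool(4));

    ListenableFuture<String> result = pool.submit(new Callable<String>() {
      @Override
      public String call() {
        return askTheNetworkForMyString();   // placeholder network call from the slide
      }
    });

    // Instead of blocking on get(), run a callback when the result is ready.
    result.addListener(new Runnable() {
      @Override
      public void run() {
        System.out.println("Result is ready");
      }
    }, MoreExecutors.directExecutor());
  }

  String askTheNetworkForMyString() { return "hello"; }
}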

Page 22: Software at Scale

REST

• Scalable client / server architecture.

• Raw sockets are complicated, so REST usually runs over HTTP.

• Each HTTP request hits an “endpoint”, which does one thing.

e.g. GET http://api.clipless.co/json/deals/near/Times_Square

• Principles:

• Server does not store state (see immutability)

• Responses can be cached (see caching)

• Client doesn’t care if server is final endpoint or proxy.

• State usually ends up in the DB; the server communicates with the client using tokens.
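As a rough sketch only (the actual Clipless server code isn’t shown), an endpoint like the one above might look like this with JAX-RS annotations; lookUpDealsNear is a placeholder:

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

// Handles GET /json/deals/near/{place}: it does one thing and keeps no state
// between requests, so any replica of the server could answer it.
@Path("/json/deals/near/{place}")
public class NearbyDealsEndpoint {
  @GET
  @Produces(MediaType.APPLICATION_JSON)
  public String near(@PathParam("place") String place) {
    return lookUpDealsNear(place);   // placeholder: typically a (cached) DB query
  }

  private String lookUpDealsNear(String place) {
    return "[]";                     // placeholder JSON
  }
}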

Page 23: Software at Scale

Clipless Architecture

[Architecture diagram: 10,000 reqs/second of Protobuf over HTTP → Apache (mod_proxy_balancer) → Tomcat → MySQL, with content-addressable caches.]

Page 24: Software at Scale

Static Content

• Static content (e.g. HTML, images) is highly cacheable.

• Easiest way to cache: use a CDN
  • Akamai, S3, CloudFlare, CloudFront, MaxCDN, …
• Cache key:
  • Some HTTP headers (inc. the Cache-Control header)
  • Page requested
  • Last-modified (e.g. from a “HEAD” to your server)

• Added bonus: most CDNs are “closer” to your users than your server.

• Compressing content reduces bandwidth (see the gzip sketch after this list):
  • Browsers usually support gzip decompression.
  • Apache, nginx: gzip compression plugins
  • Javascript / CSS: minification
  • Images: Google PageSpeed service / CloudFlare
  • Program data: Protocol Buffers, Thrift

• Why use your bandwidth when you can use someone else’s?
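If you are not behind Apache or nginx, the gzip idea looks roughly like this by hand in servlet code; GzipHelper and the calling app are invented for illustration:

import java.io.IOException;
import java.util.zip.GZIPOutputStream;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

class GzipHelper {
  // Writes the body gzip-compressed only if the browser says it can decompress it.
  static void writeCompressed(HttpServletRequest req, HttpServletResponse resp, byte[] body)
      throws IOException {
    String accept = req.getHeader("Accept-Encoding");
    if (accept != null && accept.contains("gzip")) {
      resp.setHeader("Content-Encoding", "gzip");       // tell the browser to gunzip
      GZIPOutputStream gzip = new GZIPOutputStream(resp.getOutputStream());
      gzip.write(body);
      gzip.finish();                                    // flush the gzip trailer
    } else {
      resp.getOutputStream().write(body);               // fall back to raw bytes
    }
  }
}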

Page 25: Software at Scale

Sharding

[Diagram: users Alice, Bob, and Mallory send requests; requests for A–L go to one server and requests for M–Z go to another.]
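A small sketch of how requests might be routed to a shard; the host names and ShardRouter class are invented, and the first-letter split mirrors the diagram:

class ShardRouter {
  // Hypothetical shard hosts for users A–L and M–Z.
  private final String[] shards = { "db-a-to-l.example.com", "db-m-to-z.example.com" };

  String shardFor(String userName) {
    char first = Character.toUpperCase(userName.charAt(0));
    return (first <= 'L') ? shards[0] : shards[1];
    // Alternative: shards[Math.abs(userName.hashCode()) % shards.length]
    // spreads users more evenly but makes range queries harder.
  }
}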

Page 26: Software at Scale

Batching Network Requests: The Operation Queue / Proactor Pattern

[Diagram: producers push Work onto a thread-safe queue and immediately get back a ListenableFuture<Result>; a worker thread pool drains the queue; a NetworkListener suspends the queue when the connection drops (onDown: queue.suspend()) and resumes it when it returns (onUp: queue.resume()).]
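A simplified, hedged sketch of that pattern; OperationQueue and its internals are invented for illustration, and it returns a plain Future where the slide shows a ListenableFuture:

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.LinkedBlockingQueue;

class OperationQueue {
  private final BlockingQueue<FutureTask<?>> queue = new LinkedBlockingQueue<FutureTask<?>>();
  private final ExecutorService workers = Executors.newFixedThreadPool(4);  // worker thread pool
  private final Object pauseLock = new Object();
  private boolean suspended = false;

  OperationQueue() {
    // A dispatcher thread drains the queue and hands work to the pool,
    // but parks while the network is down.
    Thread dispatcher = new Thread() {
      @Override
      public void run() {
        try {
          while (true) {
            FutureTask<?> work = queue.take();
            synchronized (pauseLock) {
              while (suspended) { pauseLock.wait(); }
            }
            workers.execute(work);
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      }
    };
    dispatcher.setDaemon(true);
    dispatcher.start();
  }

  // Producers return immediately with a Future for the eventual result.
  <T> Future<T> submit(Callable<T> work) {
    FutureTask<T> task = new FutureTask<T>(work);
    queue.add(task);
    return task;
  }

  // Wired to the NetworkListener: onDown -> suspend(), onUp -> resume().
  void suspend() { synchronized (pauseLock) { suspended = true; } }
  void resume()  { synchronized (pauseLock) { suspended = false; pauseLock.notifyAll(); } }
}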

Page 27: Software at Scale

How to Test

• Mock large amounts of data, measure performance
  • Can be automated so you never encounter performance regressions
• Network stress tests
  • ab
  • blitz
  • loader.io
• ulimits
• Packet sniffers
• Round-trip-time services, e.g. NewRelic.

Page 28: Software at Scale

General Principles

• Scale when you anticipate the need.

• Scale eagerly when you don’t need to go far out of the way.
  • CDNs and gzip compression are good examples.
• Or when retrofitting will be painful.
  • RESTful architecture from the beginning: much easier than tacking it on later!
  • But caching is usually easy to add later.
• Focus on the big improvements:
  • 80/20 rule
  • Profile and knock out the biggest CPU / memory hogs first.
• Practice and internalize to reduce scaling costs!
  • Concurrency is much easier with mastery.
  • Caching seems much easier with mastery, but often isn’t.
  • Internalize immutability and you’ll just write better code.

Page 29: Software at Scale

Thanks!

Good luck, and always bring mangosteens to acquisition talks.