Inside election night at The New York Times | Altitude NYC


Transcript of Inside election night at The New York Times | Altitude NYC

Inside Election Night at The New York Times
Or, Panic in the Newsroom...
Nick Rockwell, CTO NYT
@nicksrockwell
03.21.17

Part I Oh sh!t, the election is coming up!

Preparation, News Style...

Good: Ready for Anything
Better: Ready for Anything + Exhaustive Prep

Pre-Post-Mortem

✘ Who’s responsible?
✘ What if something goes wrong?
✘ Oh it did go wrong in 2012…
✘ What if there’s more load than we expect?

✓ Team, Roles & Responsibilities
✓ Build an Election Night Runbook (16 pages!)
✓ Dry runs around debates
✓ Integrate a CDN...

Timeline

8/21 - Olympics are a wrap

8/24 - First Election prep meeting

9/21 - Meet w/ Fastly

9/23 - Commit to using Fastly

10/25 - In production

11/5 - Agreement signed

11/8 - Election night!

Before Fastly

Hello Fastly

Jon: can use 90 80 70, ok 60 percent of VCL code.

Plan B for Elections

8 Additional www-varnish for content requests

8 Additional www-varnish for userinfo requests

8 Additional www-fe (just in case)

4 Additional www-varnish for elections app

Mobileweb and video load tests next week to inform possible buildout

Final test tonight for MobileWeb

Also: auth scaling, warming Amazon ELBs, etc.

Part II Why a CDN, why Fastly?

You already know this but...
“A DDoS attack is like someone anonymously placing a press ad including your phone number and offering an Aston Martin for sale at $200. You’re bombarded by calls, your life is misery, the callers aren’t aware they’re part of a trick, and your attacker is almost impossible to trace.”
- https://www.sidewaysdictionary.com/

Joys of CDN
Obvious:
• Scaled caching
• Better performance due to edge delivery
• DDoS protection
Slightly less obvious:
• Consistent performance
• Better everything: TLS negotiation, compression, etc.
• Cascading effects of smaller, simpler infrastructure
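The "scaled caching" point can be made concrete with back-of-the-envelope arithmetic. The sketch below is illustrative only: the function name and the traffic figures are assumptions, not numbers from the talk. It shows how the load reaching the origin shrinks as the edge cache hit ratio rises.

```python
def origin_requests_per_second(total_rps: float, cache_hit_ratio: float) -> float:
    """Return the requests per second that miss the edge cache and reach the origin.

    total_rps and cache_hit_ratio are hypothetical inputs; the talk does not
    publish either figure.
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    return total_rps * (1.0 - cache_hit_ratio)


if __name__ == "__main__":
    # Hypothetical peak traffic, chosen only to show the shape of the effect.
    for ratio in (0.0, 0.5, 0.9, 0.99):
        print(f"hit ratio {ratio:.2f}: {origin_requests_per_second(100_000, ratio):.0f} rps at origin")
```

Under these assumed numbers, each extra point of hit ratio at the edge removes load that the datacenter would otherwise have to be provisioned for, which is one way to read the "smaller, simpler infrastructure" bullet.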

[Chart: Keynote, homepage, US average]

[Chart: Pingdom, EU]

[Chart: Catchpoint, TLS negotiation time (lower is better)]

[Chart legend: Fastly vs. our datacenter]

Part III #TheFailingNewYorkTimes

What is risk?
It’s not risk if someone else is responsible.
It’s not risk if there’s no chance of consequential failure.
It’s still risk if you mitigate it.
It’s still risk if you hedge, create contingencies, and plan.

Risk and Accountability
Our current ideologies of risk-taking and accountability are at odds.
Risk-taking can only take place within a context of judgment that is opaque.
A culture that values “boldness”, action-bias, or the appearance of certainty, usually destroys true risk-taking.

Boring bullet list of stuff we’re changing

Logic changes in Varnish if the request came from Fastly

Moving Abra back to the client-side (yay Ken)

Userinfo back to the client-side (can’t decrypt the session cookie... yet)

Audit what www services we can cache in Fastly

Connected CREAM to Fastly’s purge API

???????????????????? SO MANY THINGS
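For the purge API item in the list above, a minimal sketch of what a publish-time purge call might look like. The talk does not show how CREAM was wired up, so everything here is an assumption: the function name, the idea of purging by surrogate key, and the placeholder service ID and token are hypothetical. The endpoint shape (`POST /service/{service_id}/purge/{surrogate_key}` with a `Fastly-Key` header) follows Fastly's public API as I understand it. The code only builds the request and does not send it.

```python
import urllib.request
from urllib.parse import quote

FASTLY_API = "https://api.fastly.com"


def build_purge_request(service_id: str, surrogate_key: str, api_token: str,
                        soft: bool = False) -> urllib.request.Request:
    """Build (but do not send) a purge-by-surrogate-key request.

    All argument values are supplied by the caller; nothing here comes
    from the talk itself.
    """
    url = (f"{FASTLY_API}/service/{quote(service_id, safe='')}"
           f"/purge/{quote(surrogate_key, safe='')}")
    headers = {"Fastly-Key": api_token, "Accept": "application/json"}
    if soft:
        # Soft purge marks the object stale instead of evicting it outright.
        headers["Fastly-Soft-Purge"] = "1"
    return urllib.request.Request(url, method="POST", headers=headers)


if __name__ == "__main__":
    # Placeholder values; a real caller would pass this to urllib.request.urlopen.
    req = build_purge_request("SERVICE_ID", "article-123", "API_TOKEN")
    print(req.get_method(), req.full_url)
```

Building the request separately from sending it keeps the publishing hook easy to test without touching the network.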

When are things happening

10/4-5 - First rounds of production tests (WWW)

10/09 - Testing during debate (WWW)

10/13-19 - Testing with Mobile Web (internally, public, debate)

10/25 - Production launch

11/08 - Hide somewhere and hope Trump doesn’t win

11/10-ish - back to datacenter if necessary (it wasn’t…)

Jon, Vinessa and Chase

Part IV Election Night

Chartbeat

Yikes

Yikes.

To end: we are just getting started...

What’s next:
• Continuing to shrink provisioning
• Continuing to “purge” or replace downstream caching
• Logs into BigQuery
• Looking at edge processing opportunities:
  - Load balancing, WAF
  - Image service
  - Auth & Meter

Thank You!