Software architecture for high traffic website

Post on 15-Apr-2017

Case study - Stack Overflow

Presenter: Ngô Xuân Hòa (Novaon Adnetwork - Novanet)

Hanoi .NET Meetup

Contents

About Stack Overflow

● Founders

● Principles

SO architecture

● Beginning

● Restructure #1

● Restructure #2

Open-source Libs

● StackExchange.Redis

● Dapper

● Jil

About Stack Overflow

Founders

Jeff Atwood

Joel Spolsky

Timeline (2008 to 2011)

● 2008: Stack Overflow

● 2009 to 2011: Server Fault, Stack Exchange 1.0, Stack Exchange 2.0, Stack Overflow Careers

Rome wasn’t built in a day!

● 100+ Q&A Sites

● 600+ million pageviews a month

● 3000+ requests per second

● 16+ million users

● 8+ million questions

● 40+ million answers

Principles

● Performance Is a Feature

● Cache All the Things!

● Reinvention is OK

Stack Overflow Architecture

Two restructurings

Stack Exchange 1.0

● ASP.NET MVC

● SQL Server

● LINQ to SQL

● Wikipedia DB Design

Stack Exchange Network

● LINQ to SQL

● HAProxy

● Redis

● Lucene.NET

Scale Up

● Cache everything

● Elasticsearch

● Reinvention

Stack Exchange 1.0 Structure

● Load balancing: Windows NLB

● Web servers: IIS Server, IIS Server

● Database: SQL Server

Windows NLB

● Cons:

○ Limited to 8 nodes

○ Cannot detect a failed service

Web-tier

ASP.NET MVC

LINQ to SQL

SQL Server

● All-in-memory

● Full text search

● 16 million pageviews a month

● 3 million unique visitors a month

● 6 million visits a month

Follow none but learn from everyone!

Pros

● Simple

Cons

● Bottleneck: the SQL Server database

● High cost to scale up

Restructure #1 - Stack Exchange Network

● HAProxy

● Redis Cache

● Lucene.NET

● Tag Engine

Stack Exchange Network Structure

[Diagram: HAProxy, IIS Servers, Redis, Database; connections labeled http, http, protobuf, sql]

Load Balancing

● HAProxy:

○ Runs on Linux

○ Free

Web-tier

ASP.NET MVC 3

LINQ to SQL

jQuery 1.4.5

Lucene.Net

Redis

● In-memory cache

● Master-slave

● Messaging notification

3 Types of Cache

Local Cache

● Uses HttpRuntime.Cache

● Caches: user session, view count, ...

Site Cache

● Uses Redis

● Caches the site’s data: Q&As, acceptance rates, ...

Global Cache

● Uses Redis

● Caches system data: user info, inbox, ...

Update cache flow - Local cache

[Diagram: Local Cache, Redis, DB, Other sites; arrows numbered 1, 2.1, 2.2, 3, 4]

1 - On startup: subscribe to invalidation messages from Redis

2.1 - Data changed (by other sites, apps…)

2.2 - Send message to Redis

3 - Redis sends a notification to subscribers

4 - Get data from DB and update the local cache
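The numbered flow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Stack Overflow's code: `InMemoryBroker` is a hypothetical stand-in for Redis pub/sub so the example runs without a server, and the channel name, the `WebServer` class, and the `update` helper are invented for the sketch.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class InMemoryBroker:
    """Hypothetical stand-in for Redis pub/sub (no real Redis involved)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Callable[[str], None]) -> None:
        self._subscribers[channel].append(handler)

    def publish(self, channel: str, message: str) -> int:
        # Step 3: the broker notifies every subscriber of the channel.
        handlers = self._subscribers[channel]
        for handler in handlers:
            handler(message)
        return len(handlers)


class WebServer:
    CHANNEL = "cache-invalidation"  # assumed channel name

    def __init__(self, name: str, broker: InMemoryBroker, db: Dict[str, str]) -> None:
        self.name = name
        self.db = db
        self.local_cache: Dict[str, str] = {}
        # Step 1: on startup, subscribe to invalidation messages.
        broker.subscribe(self.CHANNEL, self.on_invalidate)

    def get(self, key: str) -> str:
        # Serve from the local cache, loading from the DB on a miss.
        if key not in self.local_cache:
            self.local_cache[key] = self.db[key]
        return self.local_cache[key]

    def on_invalidate(self, key: str) -> None:
        # Step 4: get fresh data from the DB and update the local cache.
        self.local_cache[key] = self.db[key]


def update(db: Dict[str, str], broker: InMemoryBroker, key: str, value: str) -> None:
    db[key] = value                         # Step 2.1: data changed
    broker.publish(WebServer.CHANNEL, key)  # Step 2.2: send message to the broker
```

With two `WebServer` instances sharing one broker and one `db` dict, calling `update(db, broker, key, value)` refreshes the local cache of both servers, which is the point of the pattern: each web server keeps a fast in-process cache without serving stale data for long.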

Deployment flow with HAProxy

● Tell HAProxy to take the server out of rotation via a POST

● Delay to let IIS finish current requests (~5 sec)

● Stop the website

● Copy files

● Start the website

● Local testing, update local cache, etc…

● Re-enable HAProxy via another POST
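As a rough sketch, the deployment steps above could be driven by a small script like the one below. Everything here is an assumption made for illustration: the slides do not show the actual tooling or the HAProxy endpoint, so each step is passed in as a hypothetical callable (`set_in_rotation`, `stop_site`, `copy_files`, `start_site`, `smoke_test`) rather than a real API call.

```python
import time
from typing import Callable, List


def deploy_to_server(
    set_in_rotation: Callable[[bool], None],  # hypothetical wrapper around the HAProxy POST
    stop_site: Callable[[], None],
    copy_files: Callable[[], None],
    start_site: Callable[[], None],
    smoke_test: Callable[[], bool],           # local testing / cache warm-up
    drain_seconds: float = 5.0,               # "~5 sec" from the slide
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Run the deployment steps for one server, in order."""
    steps: List[str] = []

    set_in_rotation(False)
    steps.append("removed from rotation")

    sleep(drain_seconds)  # let in-flight requests finish
    steps.append("drained")

    stop_site()
    steps.append("stopped")
    copy_files()
    steps.append("copied")
    start_site()
    steps.append("started")

    if not smoke_test():
        # Deliberately leave the server out of rotation on failure.
        raise RuntimeError("local test failed; server left out of rotation")
    steps.append("tested")

    set_in_rotation(True)
    steps.append("back in rotation")
    return steps
```

One design choice worth noting: the server is re-enabled only after the local test passes, so a bad build never receives live traffic.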

Pros

● High performance

● Low-cost load balancing (HAProxy)

● Uses Redis messaging for cache invalidation

Cons

● Too many SQL queries

● 95 million pageviews a month

● 800 requests per second

● 16 million users

Restructure #2 - Scale Up

● Cache All the Things

● Elasticsearch

● Reinvention

Stack Exchange Network Structure

[Diagram: HAProxy, Redis, Elasticsearch, Tag Engine, Databases]

5-Level Cache

Levels: Network Level, Local Cache, Redis Cache, SQL Server Cache, SSD

● Network Level: Browser cache…

● Local Cache: HttpRuntime.Cache - caches all data in memory

● Redis Cache: caches all data

● SQL Server Cache: caches all data in memory (the database servers have 384GB of RAM)

Cache Flow

● Check Local Cache

● Else, check Redis Cache and update Local Cache

● If Redis Cache doesn’t have the data, fetch it from the databases, then update Redis Cache and Local Cache
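The lookup order above is a read-through cache with two layers in front of the database. A minimal Python sketch follows, assuming a plain dict stands in for Redis and a hypothetical `load_from_db` callable stands in for the SQL query; the class name `LayeredCache` is also invented for the example.

```python
from typing import Any, Callable, Dict


class LayeredCache:
    """Local cache first, then the shared cache, then the database."""

    def __init__(self, shared_cache: Dict[str, Any],
                 load_from_db: Callable[[str], Any]) -> None:
        self.local: Dict[str, Any] = {}   # per-server, in-process cache
        self.shared = shared_cache        # dict standing in for Redis
        self.load_from_db = load_from_db  # hypothetical database lookup

    def get(self, key: str) -> Any:
        # 1. Check the local cache.
        if key in self.local:
            return self.local[key]

        # 2. Else check the shared cache and update the local cache.
        if key in self.shared:
            value = self.shared[key]
            self.local[key] = value
            return value

        # 3. Else fetch from the database, then update both caches.
        value = self.load_from_db(key)
        self.shared[key] = value
        self.local[key] = value
        return value
```

Two `LayeredCache` instances that share the same `shared_cache` dict behave like two web servers: after the first one loads a key from the database, the second finds it in the shared layer and never touches the database.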

Cache All the Things!

Pros

● Very, very fast (<400ms)

● Low server load:

○ IIS: 10-15% CPU usage

○ DB: 10% CPU usage

● 99% of requests served by cache

Cons

● Data has latency (cached data can lag behind the database)

● 95 million pageviews a month

● 800 requests per second

● 16 million users

Open-source Libs

• StackExchange.Redis - high-performance Redis client

• Dapper - a micro ORM - very fast

• Jil - fast JSON serializer

Reinvention is OK!

Reference sources

● http://stackoverflow.com

● http://highscalability.com

● http://codinghorror.com

● http://www.joelonsoftware.com

● http://nickcraver.com

● http://josephwoodward.co.uk/2014/02/the-architecture-of-stackoverflow/

Thank you!

Ngô Xuân Hòa

xuanhoa862001@gmail.com