Introduction to Content-aware Switch Presented by Li Zhao.


Content-aware Switch (CS)

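[Figure: clients reach www.yahoo.com over the Internet; the switch in front of the cluster directs requests to an image server, an application server, or an HTML server]

Example request: GET /cgi-bin/form HTTP/1.1 Host: www.yahoo.com…

Packet fields: APP. DATA | TCP | IP

• Front-end of a web cluster
• Routes packets based on layer 5/7 (content) information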
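As a rough illustration of routing on layer 7 content, the sketch below reads the request line of an HTTP request and picks a server pool from the URL path. The pool names, the path prefixes, and the function `choose_pool` are assumptions made for this example; the slides do not specify a rule set.

```python
# Hypothetical mapping from URL path prefix to a back-end pool (illustrative only).
ROUTES = [
    ("/cgi-bin/", "application-server"),
    ("/images/", "image-server"),
]
DEFAULT_POOL = "html-server"


def choose_pool(raw_request: bytes) -> str:
    """Pick a pool by inspecting the request line of an HTTP request."""
    request_line = raw_request.split(b"\r\n", 1)[0].decode("ascii", "replace")
    parts = request_line.split(" ")
    if len(parts) != 3:
        raise ValueError(f"malformed request line: {request_line!r}")
    _method, path, _version = parts
    for prefix, pool in ROUTES:
        if path.startswith(prefix):
            return pool
    return DEFAULT_POOL


request = b"GET /cgi-bin/form HTTP/1.1\r\nHost: www.yahoo.com\r\n\r\n"
print(choose_pool(request))  # application-server
```

A layer 4 switch could not make this choice, because it sees only IP addresses and TCP ports and never reads the request line.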

Why use CS

• Servers can be specialized for certain types of request
  – Content segregation
• Exploit locality
  – Affinity-based routing
  – Increases performance because of the improved hit rate
• Partial replication of server file set
  – Partition the server’s file set over different nodes
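One simple way to get affinity is to hash the requested path, so the same URL always lands on the same node and that node's cache stays warm. This is a minimal sketch under that assumption; the node names and `node_for` are hypothetical, and the systems cited in the references may use different policies.

```python
import hashlib

# Hypothetical back-end nodes; each one ends up serving only its share of the file set.
NODES = ["node-a", "node-b", "node-c"]


def node_for(path: str, nodes=NODES) -> str:
    """Map a URL path to one node so repeated requests reach the same cache."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(nodes)
    return nodes[index]


for path in ["/index.html", "/images/logo.gif", "/index.html"]:
    # The first and third lines always name the same node.
    print(path, "->", node_for(path))
```

A plain modulo hash reshuffles most paths when a node is added or removed, so a production design would likely need something more stable.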

Content-aware Switch Architecture

• Two-way architecture: the server returns the response to the switch
• One-way architecture: the server returns the response to the client

[Figure: client, switch, server]

Layer 7 Two-way Architecture

Layer-7 Two-way Mechanisms

• TCP gateway: an application-level proxy running on the web switch mediates the communication between the client and the server
• TCP splicing: reduces the overhead of the TCP gateway. Packet forwarding occurs at the network level, between the network interface driver and the TCP/IP stack, and is carried out directly by the OS kernel

[Figure: user-space and kernel layers for the two mechanisms]

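TCP Splicing

[Sequence diagram: client, content switch, server]

step1  client → switch: SYN(CSEQ)
step2  switch → client: SYN(DSEQ) ACK(CSEQ+1)
step3  client → switch: DATA(CSEQ+1) ACK(DSEQ+1)
step4  switch → server: SYN(CSEQ)
step5  server → switch: SYN(SSEQ) ACK(CSEQ+1)
step6  switch → server: DATA(CSEQ+1) ACK(SSEQ+1)
step7  server → switch: DATA(SSEQ+1) ACK(CSEQ+lenR+1)
       switch → client: DATA(DSEQ+1) ACK(CSEQ+lenR+1)
step8  client → switch: ACK(DSEQ+lenD+1)
       switch → server: ACK(SSEQ+lenD+1)

lenR: size of HTTP request. lenD: size of returned document.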
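The splicing exchange implies that the switch rewrites server sequence numbers (SSEQ-based) into the numbering the client saw during its handshake (DSEQ-based), and rewrites client ACKs the other way. The sketch below shows that arithmetic only; the class `Splice` and the sample values are assumptions for illustration, not code from any of the cited systems.

```python
MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits


class Splice:
    """Sequence-number mapping for one spliced connection.

    dseq: initial sequence number the switch sent to the client (DSEQ)
    sseq: initial sequence number the server sent to the switch (SSEQ)
    """

    def __init__(self, dseq: int, sseq: int):
        self.offset = (dseq - sseq) % MOD

    def server_to_client_seq(self, seq: int) -> int:
        # Server data numbered from SSEQ must reach the client numbered from DSEQ.
        return (seq + self.offset) % MOD

    def client_to_server_ack(self, ack: int) -> int:
        # Client ACKs use DSEQ numbering; the server expects SSEQ numbering.
        return (ack - self.offset) % MOD


splice = Splice(dseq=1000, sseq=5000)
len_d = 300  # size of the returned document (lenD), sample value
print(splice.server_to_client_seq(5000 + 1))          # DATA(SSEQ+1) -> DATA(DSEQ+1): 1001
print(splice.client_to_server_ack(1000 + len_d + 1))  # ACK(DSEQ+lenD+1) -> ACK(SSEQ+lenD+1): 5301
```

Client-side numbers need no conversion in this exchange, because the switch reuses CSEQ when it opens the connection to the server.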

TCP Splicing w/ Pre-forked Connections

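[Sequence diagram: client, switch, server; exchanges labeled step1 to step9]

Pre-forked connection (switch and server):
  switch → server: SYN(PSEQ)
  server → switch: SYN(SSEQ) ACK(PSEQ+1)
  switch → server: ACK(SSEQ+1)

Client connection and request:
  client → switch: SYN(CSEQ)
  switch → client: SYN(DSEQ) ACK(CSEQ+1)
  client → switch: DATA(CSEQ+1) ACK(DSEQ+1)

Request forwarding and response:
  switch → server: DATA(PSEQ+1) ACK(SSEQ+1)
  server → switch: DATA(SSEQ+1) ACK(PSEQ+lenR+1)
  switch → client: DATA(DSEQ+1) ACK(CSEQ+lenR+1)
  client → switch: ACK(DSEQ+lenD+1)
  switch → server: ACK(SSEQ+lenD+1)

lenR: size of HTTP request. lenD: size of returned document.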
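The point of pre-forking is that the switch-to-server handshake is already done when a request arrives. A minimal sketch of such a pool follows; `PreforkedPool`, `fake_connect`, and the idea of returning connections for reuse are assumptions for illustration (reuse in particular presumes the server keeps the connection open after a response).

```python
from collections import deque


class PreforkedPool:
    """Keeps idle, already-established connections to one back-end server."""

    def __init__(self, connect, size: int):
        self._connect = connect  # callable that performs the handshake
        self._idle = deque(connect() for _ in range(size))

    def acquire(self):
        # Use an idle connection when one exists; otherwise fall back to a
        # fresh handshake, as in plain TCP splicing.
        if self._idle:
            return self._idle.popleft()
        return self._connect()

    def release(self, conn):
        # Assumption: the connection is still open and can serve another request.
        self._idle.append(conn)


counter = {"handshakes": 0}


def fake_connect():
    # Stand-in for the SYN / SYN-ACK / ACK exchange with the server.
    counter["handshakes"] += 1
    return f"conn-{counter['handshakes']}"


pool = PreforkedPool(fake_connect, size=2)
before = counter["handshakes"]  # both handshakes happened ahead of time
conn = pool.acquire()           # nothing is added on the request path
print(conn, counter["handshakes"] - before)  # conn-1 0
```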

Pre-Allocate Server Scheme

[Sequence diagram: client, content switch, pre-allocated server]

step1  client → switch: SYN(CSEQ)
       switch → server: SYN(CSEQ)
step2  server → switch: SYN(SSEQ) ACK(CSEQ+1)
       switch → client: SYN(SSEQ) ACK(CSEQ+1)
step3  client → switch: DATA(CSEQ+1) ACK(SSEQ+1)
       switch → server: DATA(CSEQ+1) ACK(SSEQ+1)
step4  server → switch: DATA(SSEQ+1) ACK(CSEQ+lenR+1)
       switch → client: DATA(SSEQ+1) ACK(CSEQ+lenR+1)
step5  client → switch: ACK(SSEQ+lenD+1)
       switch → server: ACK(SSEQ+lenD+1)

• Make a guessed routing decision based on IP/port #/history
• Advantages:
  – Faster than TCP splicing
  – Reduces session processing overhead: no need to convert the server sequence #

Degenerated to TCP Splicing If Guess Wrong

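[Sequence diagram: client, content switch, pre-allocated server, right server; exchanges labeled step1 to step6]

Client and pre-allocated server (the guess):
  client → switch → pre-allocated server: SYN(CSEQ)
  pre-allocated server → switch → client: SYN(SSEQ) ACK(CSEQ+1)
  client → switch: DATA(CSEQ+1) ACK(SSEQ+1)

Guess wrong, so the switch releases the pre-allocated server:
  switch → pre-allocated server: FIN(CSEQ+1)

Connection to the right server:
  switch → right server: SYN(CSEQ)
  right server → switch: SYN(RSEQ) ACK(CSEQ+1)
  switch → right server: DATA(CSEQ+1) ACK(RSEQ+1)

Response (sequence # conversion needed at the switch):
  right server → switch: DATA(RSEQ+1) ACK(CSEQ+lenR+1)
  switch → client: DATA(SSEQ+1) ACK(CSEQ+lenR+1)
  client → switch: ACK(SSEQ+lenD+1)
  switch → right server: ACK(RSEQ+lenD+1)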
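The guess-then-verify flow can be sketched as follows: the switch picks a server when the SYN arrives, using per-client history, and checks the choice once the request is readable. The class `GuessingSwitch`, the rule in `content_route`, and the server names are assumptions for illustration only.

```python
def content_route(path: str) -> str:
    # Hypothetical content rule, used only to check the guess.
    return "app-server" if path.startswith("/cgi-bin/") else "html-server"


class GuessingSwitch:
    def __init__(self, default_server: str):
        self.default_server = default_server
        self.history = {}  # client IP -> server that handled its last request

    def on_syn(self, client_ip: str) -> str:
        # Guess before any request data has arrived.
        return self.history.get(client_ip, self.default_server)

    def on_request(self, client_ip: str, guessed: str, path: str):
        right = content_route(path)
        self.history[client_ip] = right
        if right == guessed:
            return right, "pre-allocated"  # no server sequence # conversion
        # Wrong guess: close the guessed connection and splice to the right server.
        return right, "spliced"


switch = GuessingSwitch("html-server")
guess = switch.on_syn("10.0.0.1")
print(switch.on_request("10.0.0.1", guess, "/cgi-bin/form"))  # ('app-server', 'spliced')
guess = switch.on_syn("10.0.0.1")
print(switch.on_request("10.0.0.1", guess, "/cgi-bin/form"))  # ('app-server', 'pre-allocated')
```

The first request pays the splicing cost; the second benefits from the recorded history.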

Case Study

• Linux-based content aware switch [Yang99]

• IBM Layer 5 [Pradhan00]

Functional Overview of Content-aware Distributor

Results

• Overhead of the switch
  – 89 usec reduced pre-forked connections
• CS vs. Layer 4 switch
  – Affinity-based routing vs. WRR
  – Content-segregation vs. WRR
    • CGI: 27%
    • Static: 36%

IBM Switch Architecture

• Switch core
• Port controller:
  – Identifies layer 5 packets and sends them to the CPU
  – Processes all other packets
• CPU: PowerPC 603e
  – Parses the HTTP request
  – URL-based routing

Flow Diagram on Layer 5 System

• Client ports vs. server ports
• Classifier: identifies packets

Results

• CS vs. Layer 4 switch
  – Entire set of files is replicated
  – Some servers share files by NFS
  – Partitioned file set

Layer-7 One-way Architecture

Layer-7 One-way Mechanisms

• TCP handoff: the switch hands off the TCP connection endpoint to the server
• TCP connection hop
  – Software-based proprietary solution
  – Encapsulates the IP packet in an RPX packet and sends it to the server

TCP Handoff

[Sequence diagram: client, content switch, server]

step1  client → switch: SYN(CSEQ)
step2  switch → client: SYN(DSEQ) ACK(CSEQ+1)
step3  client → switch: DATA(CSEQ+1) ACK(DSEQ+1)
step4  switch → server: Migrate(Data, CSEQ, DSEQ)
step5  server → client: DATA(DSEQ+1) ACK(CSEQ+lenR+1)
step6  client → switch: ACK(DSEQ+lenD+1)
       switch → server: ACK(DSEQ+lenD+1)

• Migrate the created TCP connection from the switch to the back-end server
  – Create a TCP connection at the back-end without going through the TCP three-way handshake
  – Retrieve the state of an established connection and destroy the connection without going through the normal message handshake required to close a TCP connection
• Once the connection is handed off to the back-end server, the switch must forward packets from the client to the appropriate back-end server
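The handoff can be pictured as two pieces of bookkeeping: the state carried by Migrate(Data, CSEQ, DSEQ), and a forwarding entry on the switch for later client packets. The sketch below models only that bookkeeping; `Migrate`, `HandoffSwitch`, `backend_state`, and the sample addresses are assumptions for illustration, and a real handoff would be implemented inside the kernel TCP stack.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Migrate:
    """State handed to the back-end, mirroring Migrate(Data, CSEQ, DSEQ)."""
    data: bytes  # the HTTP request already received by the switch
    cseq: int    # client's initial sequence number
    dseq: int    # switch's initial sequence number toward the client


def backend_state(m: Migrate):
    # The back-end resumes as if it had done the handshake itself:
    # it sends from DSEQ+1 and acknowledges CSEQ+lenR+1.
    return m.dseq + 1, m.cseq + len(m.data) + 1


class HandoffSwitch:
    def __init__(self):
        self.forward = {}  # (client IP, client port) -> back-end server

    def hand_off(self, client, server: str, data: bytes, cseq: int, dseq: int) -> Migrate:
        self.forward[client] = server
        return Migrate(data, cseq, dseq)

    def next_hop(self, client):
        # Later client packets (such as ACKs) still arrive at the switch.
        return self.forward.get(client)


switch = HandoffSwitch()
client = ("192.0.2.7", 40000)
request = b"GET /cgi-bin/form HTTP/1.1\r\nHost: www.yahoo.com\r\n\r\n"
msg = switch.hand_off(client, "server-1", request, cseq=100, dseq=900)
print(backend_state(msg), switch.next_hop(client))
```

Because the server continues with DSEQ numbering, the switch forwards client packets without any sequence # conversion.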

References

• [Pradhan00] G. Apostolopoulos, et al., Design, Implementation and Performance of a Content-Based Switch, Proceedings of IEEE INFOCOM 2000
• [Pai98] V.S. Pai, et al., Locality-Aware Request Distribution in Cluster-based Network Servers, Proceedings of the 8th Conference on Architectural Support for Programming Languages and Operating Systems, San Jose, CA, Oct. 1998
• [Aron00] Mohit Aron, et al., Scalable Content-aware Request Distribution in Cluster-based Network Servers, Proc. of the 2000 Annual USENIX Technical Conference, June 2000
• [Edward] C. Edward Chow, Introduction to content switch
• [Valeria01] Valeria Cardellini, et al., The State of the Art in Locally Distributed Web-server Systems, IBM research report
• [Yang99] Chu-Sing Yang, et al., Efficient support for content-based routing in web server clusters, Proc. of USITS ’99