Networking
![Page 1: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/1.jpg)
Networking
Ethan Kao
CS 6410
Oct. 18th 2011
![Page 2: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/2.jpg)
Papers
Active Messages: A Mechanism for Integrated Communication and Control, Thorsten von Eicken, David E. Culler, Seth Copen Goldstein, and Klaus Erik Schauser. In Proceedings of the 19th Annual International Symposium on Computer Architecture, 1992.
U-Net: A User-Level Network Interface for Parallel and Distributed Computing, Von Eicken, Basu, Buch and Werner Vogels. 15th SOSP, December 1995.
![Page 3: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/3.jpg)
Parallel vs. Distributed Systems
Parallel System:
  Multiple processors – one machine
  Shared memory
  Supercomputing
http://en.wikipedia.org/wiki/File:Distributed-parallel.svg
![Page 4: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/4.jpg)
Parallel vs. Distributed Systems
Distributed System:
  Multiple machines linked together
  Distributed memory
  Cloud computing
![Page 5: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/5.jpg)
Challenges
How to efficiently communicate?
  Between processors: Active Messages
  Between machines: U-Net
![Page 6: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/6.jpg)
Active Messages: Authors
Thorsten von Eicken
  Berkeley Ph.D. -> Assistant professor at Cornell -> UCSB
  Founded RightScale, Chief Architect at Expertcity.com
David E. Culler
  Professor at Berkeley
Seth Copen Goldstein
  Berkeley Ph.D. -> Associate professor at CMU
Klaus Erik Schauser
  Berkeley Ph.D. -> Associate professor at UCSB
![Page 7: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/7.jpg)
Active Messages: Motivation
Existing message passing multiprocessors had high communication costs
Message passing machines made inefficient use of underlying hardware capabilities
  nCUBE/2
  CM-5
  Thousands of nodes interconnected
Poor overlap between computation and communication
![Page 8: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/8.jpg)
Active Messages: Goals
Improve overlap between computation & communication
Aim for 100% utilization of resources
Low start-up costs for network usage
![Page 9: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/9.jpg)
Active Messages: Takeaways
Asynchronous communication
Minimal buffering
Handler interface
Weaknesses:
  Address of the message handler must be known
  Design needs to be hardware specific?
![Page 10: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/10.jpg)
Active Messages: Design
Asynchronous communication mechanism
Messages contain user-level handler address
Handler executed on message arrival
  Takes message off network
  Message body is argument
  Does not block
![Page 11: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/11.jpg)
Active Messages: Design
Sender blocks until messages can be injected into network
Receiver interrupted on message arrival - runs handler
User level program pre-allocates receiving structures
  Eliminates buffering
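The design points above can be sketched in a few lines of Python. This is only an illustrative simulation, not the paper's implementation: `Node`, `am_send`, `store_handler`, and the `deque` standing in for the network interface are all hypothetical names, and a real system would carry a handler address rather than a Python function reference.

```python
from collections import deque

class Node:
    """A simulated processor with pre-allocated receive structures."""
    def __init__(self, name, slots):
        self.name = name
        self.inbox = [None] * slots   # pre-allocated by the user-level program
        self.network = deque()        # assumption: stands in for the network interface

    def deliver(self):
        """Model the arrival interrupt: run each message's handler right away."""
        while self.network:
            handler, args = self.network.popleft()  # handler takes the message off the network
            handler(self, *args)                    # message body is the argument

def store_handler(node, slot, value):
    """Handler: move the payload into its pre-allocated slot and return without blocking."""
    node.inbox[slot] = value

def am_send(dest, handler, *args):
    """Inject a message whose head names the handler to run at the receiver."""
    dest.network.append((handler, args))

receiver = Node("n1", slots=4)
am_send(receiver, store_handler, 2, 3.14)
am_send(receiver, store_handler, 0, 42)
receiver.deliver()
print(receiver.inbox)
```

Because the handler writes straight into a slot the program allocated beforehand, no intermediate message buffer is needed in this sketch.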
![Page 12: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/12.jpg)
Traditional Message Passing
• Traditional send/receive models
![Page 13: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/13.jpg)
Active Messages: Performance
Key optimization in AM vs. send/receive is reduction of buffering.
AM can achieve near order of magnitude reduction:
  nCUBE/2 AM send/handle: 11 µs / 15 µs overhead
  nCUBE/2 async send/receive: 160 µs overhead
  CM-5 AM: <2 µs overhead
  CM-5 blocking: 86 µs overhead
  Prototype of blocking send/receive on top of AM: 23 µs overhead
![Page 14: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/14.jpg)
Active Messages: Split-C
Non-blocking implementations of PUT and GET
Implementations consist of a message formatter and a message handler
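A rough Python sketch of how a non-blocking PUT and GET might each split into a message formatter and a message handler. The names (`Proc`, `put`, `get`, `get_handler`, `get_reply_handler`) and the polling loop are assumptions made for illustration; they are not Split-C's actual interface.

```python
from collections import deque

class Proc:
    """A simulated processor with local memory and an incoming message queue."""
    def __init__(self, pid, size):
        self.pid = pid
        self.mem = [0] * size
        self.pending = 0          # GET replies still outstanding
        self.queue = deque()

    def poll(self):
        """Run the handler of every message that has arrived."""
        while self.queue:
            handler, args = self.queue.popleft()
            handler(self, *args)

def send(dest, handler, *args):
    dest.queue.append((handler, args))

def put_handler(me, addr, value):
    me.mem[addr] = value                      # handler: store into remote memory

def put(remote, addr, value):
    send(remote, put_handler, addr, value)    # formatter: build the message, return at once

def get_reply_handler(me, local_addr, value):
    me.mem[local_addr] = value                # store the fetched value locally
    me.pending -= 1

def get_handler(me, remote_addr, requester, local_addr):
    send(requester, get_reply_handler, local_addr, me.mem[remote_addr])

def get(me, remote, remote_addr, local_addr):
    me.pending += 1                           # formatter: count the outstanding request
    send(remote, get_handler, remote_addr, me, local_addr)

p0, p1 = Proc(0, 4), Proc(1, 4)
p1.mem[3] = 99
get(p0, p1, remote_addr=3, local_addr=0)      # returns immediately
put(p1, 1, 7)
# p0 is free to compute here; it waits only when it needs the data.
while p0.pending:
    p1.poll()
    p0.poll()
print(p0.mem, p1.mem)
```

The `pending` counter is one simple way to let the requester overlap computation with the outstanding GET and synchronize later.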
![Page 15: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/15.jpg)
Active Messages: Matrix Multiply
Multiplication of C = A x B. Each processor GETs one column of A after another and performs a rank-1 update with its own columns of B.
Achieves 95% of peak performance
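The rank-1 update scheme can be illustrated with a small sequential Python sketch. It is an assumption-laden toy: `get_column` stands in for the remote GET, `local_multiply` and `my_cols` are hypothetical names, and the overlap of communication with computation is not modeled.

```python
def get_column(A, k):
    """Stand-in for a remote GET of column k of A."""
    return [row[k] for row in A]

def local_multiply(A, B, my_cols):
    """Compute the columns of C = A x B owned by one processor.

    For each k, fetch column k of A and apply a rank-1 update using
    row k of the locally owned columns of B.
    """
    n = len(A)
    C = {j: [0] * n for j in my_cols}
    for k in range(len(B)):
        a_k = get_column(A, k)
        for j in my_cols:
            b_kj = B[k][j]
            for i in range(n):
                C[j][i] += a_k[i] * b_kj
    return C

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
left = local_multiply(A, B, my_cols=[0])    # processor owning column 0
right = local_multiply(A, B, my_cols=[1])   # processor owning column 1
print(left, right)
```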
![Page 16: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/16.jpg)
Message Driven Architectures
Computation occurs in the message handler.
  Specialized hardware -> Monsoon, J-Machine
  Memory allocation and scheduling required upon message arrival
  Tricky to implement in hardware
  Expensive
In Active Messages, handler only removes messages from the network.
Threaded Abstract Machine (TAM)
  Parallel execution model based on Active Messages
  Typically no memory allocation upon message arrival
  No test results
![Page 17: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/17.jpg)
Active Messages: Recap
Good performance
Not a new parallel programming paradigm
  “Evolutionary not Revolutionary”
AM systems?
Multiprocessor vs. Cluster
![Page 18: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/18.jpg)
U-Net: Authors
Thorsten von Eicken
Anindya Basu
  Advised by von Eicken
Vineet Buch
  M.S. from Cornell
  Co-founded Like.com -> Google
Werner Vogels
  Research Scientist at Cornell -> CTO of Amazon
![Page 19: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/19.jpg)
U-Net: Motivation
Bottleneck of local area communication at kernel
  Several copies of messages made
  Processing overhead dominates for small messages
Low round-trip latencies growing in importance
  Especially for small messages
Traditional networking architecture inflexible
  Cannot easily support new protocols or send/receive interfaces
![Page 20: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/20.jpg)
U-Net: Goals
Remove kernel from critical path of communication
Provide low-latency communication in local area settings
Exploit full network bandwidth even with small messages
Facilitate the use of novel communication protocols
![Page 21: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/21.jpg)
U-Net: Takeaways
Flexible
Low latency for smaller messages
Off the shelf hardware – good performance
Weaknesses:
  Multiplexing resources between processes not in kernel
  Specialized NI needed?
![Page 22: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/22.jpg)
U-Net: Design
User level communication architecture independent
Virtualizes network devices
Kernel control of channel set-up and tear-down
![Page 23: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/23.jpg)
U-Net: Design
Remove kernel from critical path: send/recv
![Page 24: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/24.jpg)
U-Net: Control
U-Net:
  Multiplexes NI among all processes accessing network
  Enforces protection boundaries and resource limits
Process:
  Contents of each message and management of send/recv resources (i.e. buffers)
![Page 25: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/25.jpg)
U-Net: Architecture
Main building blocks of U-Net:
  Endpoints
  Communication Segments
  Message Queues
Each process that wishes to access the network
  Creates one or more endpoints
  Associates a communication segment with each endpoint
  Associates set of send, receive and free message queues with each endpoint
![Page 26: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/26.jpg)
U-Net: Send & Receive
![Page 27: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/27.jpg)
U-Net: Send
Prepare packet -> place it in the comm seg
Place descriptor on the Send queue
U-Net takes descriptor from queue
Transfer packet from memory to network
[Figure: packet passing from the U-Net NI to the network. From Itamar Sagi]
![Page 28: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/28.jpg)
U-Net: Receive
U-Net receives message and identifies Endpoint
Takes free space from free queue
Places message in communication segment
Places descriptor in receive queue
Process takes descriptor from receive queue and reads message
[Figure: packet passing from the network to the U-Net NI. From Itamar Sagi]
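The send and receive paths can be walked through with a toy Python model of an endpoint. Everything here is a simplifying assumption for illustration: `Endpoint`, `NI`, and the function names are hypothetical, the "wire" is an in-memory queue, and buffer 0 of each segment is arbitrarily reserved for composing outgoing packets.

```python
from collections import deque

class Endpoint:
    """An endpoint: one communication segment plus send, receive, and free queues."""
    def __init__(self, nbuf, bufsize):
        self.bufsize = bufsize
        self.segment = bytearray(nbuf * bufsize)
        self.send_q = deque()   # descriptors (offset, length) of packets to send
        self.recv_q = deque()   # descriptors of packets that have arrived
        # Assumption: buffer 0 is kept for sends; the rest are offered for receives.
        self.free_q = deque(i * bufsize for i in range(1, nbuf))

class NI:
    """Simulated network interface; `wire` stands in for the network."""
    def __init__(self):
        self.wire = deque()

    def service_send(self, ep):
        while ep.send_q:
            off, length = ep.send_q.popleft()          # take descriptor from queue
            self.wire.append(bytes(ep.segment[off:off + length]))

    def service_receive(self, ep):
        while self.wire and ep.free_q:
            pkt = self.wire.popleft()
            off = ep.free_q.popleft()                  # take free space from free queue
            ep.segment[off:off + len(pkt)] = pkt       # place message in the segment
            ep.recv_q.append((off, len(pkt)))          # place descriptor in receive queue

def send(ep, payload):
    assert len(payload) <= ep.bufsize
    ep.segment[0:len(payload)] = payload               # prepare packet in the comm segment
    ep.send_q.append((0, len(payload)))                # place descriptor on the send queue

def receive(ep):
    if not ep.recv_q:
        return None
    off, length = ep.recv_q.popleft()
    data = bytes(ep.segment[off:off + length])
    ep.free_q.append(off)                              # hand the buffer back for reuse
    return data

a, b = Endpoint(4, 64), Endpoint(4, 64)
ni = NI()
send(a, b"hello")
ni.service_send(a)
ni.service_receive(b)
msg = receive(b)
print(msg)
```

Note that neither `send` nor `receive` involves a kernel call in this model; the process and the NI communicate only through the queues and the segment.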
![Page 29: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/29.jpg)
U-Net: Protection Boundaries
Only owning process can access:
  Endpoints
  Communication Segments
  Message queues
Outgoing messages tagged with the originating endpoint
Incoming messages demultiplexed by U-Net
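A minimal sketch of tag-based demultiplexing with an ownership check, assuming each channel is registered under a tag at set-up time. The `Mux` class, integer tags, and the explicit `pid` comparison are illustrative stand-ins only; the slides do not specify the mechanism, and a real implementation would more likely rely on memory protection than on a runtime check.

```python
class Mux:
    """Toy demultiplexer: maps message tags to per-endpoint receive queues."""
    def __init__(self):
        self.channels = {}   # tag -> (owner pid, receive queue)

    def open_channel(self, tag, pid):
        """Channel set-up (the kernel-controlled path in this sketch)."""
        self.channels[tag] = (pid, [])

    def deliver(self, tag, payload):
        """Route an incoming message by tag; drop it if no channel matches."""
        entry = self.channels.get(tag)
        if entry is None:
            return False
        entry[1].append(payload)
        return True

    def read(self, tag, pid):
        """Only the owning process may read from its endpoint's queue."""
        owner, queue = self.channels[tag]
        if owner != pid:
            raise PermissionError("endpoint belongs to another process")
        return queue.pop(0) if queue else None

mux = Mux()
mux.open_channel(5, pid=100)
ok = mux.deliver(5, b"a")        # known tag: queued for process 100
dropped = mux.deliver(9, b"b")   # unknown tag: not delivered
first = mux.read(5, pid=100)
print(ok, dropped, first)
```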
![Page 30: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/30.jpg)
U-Net: “zero-copy”
Base-level: “zero-copy”
  Comm segment not regarded as memory regions
  1 copy between application data structure and buffer in comm segment
  Small messages held entirely in queue
Direct-access: “true zero copy”
  Comm segments can span entire process address space
  Sender can specify offset within destination comm seg for data
  Difficult to implement on existing workstation hardware
![Page 31: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/31.jpg)
U-Net: “zero-copy”
U-Net implementations support Base-level
  Hardware for direct-access not available
  Copy overhead not a dominant cost
Kernel emulated endpoints
![Page 32: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/32.jpg)
U-Net: Implementation
Implemented on SPARCstations running SunOS 4.1.3
  Fore SBA-100 interface
    ▪ Lack of hardware for CRC computation = overhead
  Fore SBA-200 interface
    ▪ Uses custom firmware to implement base-level architecture
    ▪ i960 processor reprogrammed to implement U-Net directly
Small messages: 65 µs RTT vs. 12 µs for CM-5
Fiber saturated with packet sizes of 800 bytes
![Page 33: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/33.jpg)
UAM: Performance
![Page 34: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/34.jpg)
U-Net: Split-C Benchmarks
![Page 35: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/35.jpg)
U-Net: TCP/IP and UDP/IP
Traditional UDP and TCP over ATM performance disappointing
  < 55% max bandwidth for TCP
Better performance with UDP and TCP over U-Net
  Not bounded by kernel resources
  More state awareness = better application-network relationships
![Page 36: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/36.jpg)
U-Net: TCP/IP and UDP/IP
![Page 37: Networking](https://reader034.fdocuments.in/reader034/viewer/2022051700/568164a1550346895dd6949b/html5/thumbnails/37.jpg)
U-Net: Discussion
Main goals were to achieve low latency communication and flexibility
NetBump