3.3 Cell Switching (ATM)
Asynchronous Transfer Mode (ATM): a connection-oriented, packet-switched technology used in both WAN and LAN settings
Q.2931: signaling (connection setup) protocol
discovers a suitable route across an ATM network
allocates resources at the switches along the circuit (to guarantee the circuit a particular quality of service)
Cells
ATM packets are called cells and are fixed in length
53-byte cell = 5-byte header + 48-byte payload
commonly transmitted over SONET, though other physical layers are possible
Variable vs. fixed-length packets: there is no optimal packet length
if small: high header-to-data overhead
if large: low utilization for small messages
variable-length packets
lower bound: the minimum amount of information that needs to be contained in the packet, typically a header with no optional extensions
upper bound (set by a variety of factors): the maximum FDDI packet size, for example, determines how long each station is allowed to transmit without passing on the token, and thus how long a station might have to wait for the token to reach it
fixed-length packets
it is easier to build hardware to do simple jobs, and the job of processing packets is simpler when you already know how long each one will be
if all packets are the same length, then each switching element takes the same time to do its job
packet vs. cell
most packet-switching technologies use variable-length packets; cells are fixed in length and small in size
Big vs. Small Packets
Small improves queue behavior
cell: finer control over the behavior of queues
examples
cell: cell length = 53 bytes, link speed = 100 Mbps; the longest wait behind one cell ≈ 53 × 8 / 100 Mbps = 4.24 μs
variable-length packets: maximum packet length = 4 KB, link speed = 100 Mbps; transmission time = 4096 × 8 / 100 Mbps = 327.68 μs
a high-priority packet that arrives just after the switch starts to transmit a 4-KB packet has to sit in the queue 327.68 μs waiting for access to the link
cell: shorter queue length than that of packets
when a packet begins to arrive in an empty queue, the switch has to wait for the whole packet to arrive before it can start transmitting it on an outgoing link
this means the link sits idle while the packet arrives
if you imagine a large packet being replaced by a “train” of small cells, then as soon as the first cell in the train has entered the queue, the switch can transmit it
example: variable-length packets
if two 4-KB packets arrive in a queue at about the same time
the link sits idle for 327.68 μs while these two packets arrive
at the end of that period there are 8 KB in the queue
cell: if those same two packets were sent as trains of cells, transmission of the cells could start 4.24 μs after the first train started to arrive
at the end of 327.68 μs, the link would have been active for a little over 323 μs (≈ 327.68 μs − 4.24 μs)
there would be just over 4 KB of data left in the queue, not 8 KB as before
shorter queues mean less delay for all the traffic
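The two calculations above can be reproduced with a few lines of arithmetic. This is an illustrative sketch only; the 53-byte cell, 4 KB packet, and 100 Mbps link speed are the figures assumed in the examples above.

```python
# Illustrative arithmetic: worst-case wait behind a cell vs. a 4 KB packet,
# and the "train of cells" queue effect, all on a 100 Mbps link.
LINK_BPS = 100e6  # 100 Mbps

def transmit_time_us(nbytes, link_bps=LINK_BPS):
    """Time to serialize nbytes onto the link, in microseconds."""
    return nbytes * 8 / link_bps * 1e6

cell_us = transmit_time_us(53)     # 4.24 us for a 53-byte cell
pkt_us = transmit_time_us(4096)    # 327.68 us for a 4 KB packet

# Two 4 KB packets arriving together, store-and-forward: 8 KB queued.
queued_pkts = 2 * 4096

# The same data sent as cell trains: the link starts draining one cell-time in.
active_us = pkt_us - cell_us                # ~323.44 us of link activity
drained = active_us * LINK_BPS / 8 / 1e6    # ~4043 bytes already transmitted
queued_cells = queued_pkts - drained        # just over 4 KB left in the queue

print(f"{cell_us:.2f} us, {pkt_us:.2f} us, {queued_cells:.0f} bytes queued")
```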
Small improves latency (for voice)
voice is digitally encoded at 64 Kbps (8-bit samples at 8 kHz)
a full cell's worth of voice samples is needed before a cell can be transmitted
a sampling rate of 8 kHz means that 1 byte is sampled every 125 μs (= 1/8000 s), so the time it takes to fill an n-byte cell with samples is n × 125 μs
a 1000-byte cell would imply 125 ms to collect a full cell of samples before you even start to transmit it to the receiver
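The fill-time arithmetic above is a one-liner: at an 8 kHz sampling rate one byte arrives every 125 μs, so an n-byte cell takes n × 125 μs to fill.

```python
# Time to collect a full cell of 64 Kbps voice samples (8-bit samples at 8 kHz).
SAMPLE_US = 1e6 / 8000   # 125 us per 8-bit sample

def fill_time_ms(n_bytes):
    """Time to fill an n-byte cell with voice samples, in milliseconds."""
    return n_bytes * SAMPLE_US / 1000

print(fill_time_ms(48))    # 6.0 ms for a 48-byte ATM cell payload
print(fill_time_ms(1000))  # 125.0 ms for a hypothetical 1000-byte cell
```

This is why small cells matter for voice: even ATM's 48-byte payload adds 6 ms of packetization delay before the first bit leaves the sender.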
Cell Format
Two different cell formats
User-Network Interface (UNI) format
host-to-switch format; the interface between a telephone company and one of its customers
Network-Network Interface (NNI) format
switch-to-switch format; the interface between a pair of telephone companies
[Figure: ATM cell format at the UNI — GFC (4 bits) | VPI (8) | VCI (16) | Type (3) | CLP (1) | HEC, CRC-8 (8) | Payload (384 bits = 48 bytes)]
User-Network Interface (UNI)
GFC (4 bits): Generic Flow Control (not widely used)
provides a means to arbitrate access to the link if the local site uses some shared medium to connect to ATM
VPI (8 bits): Virtual Path Identifier
VCI (16 bits): Virtual Circuit Identifier
for now, we can think of them as a single 24-bit identifier used to identify a virtual connection
Type (3 bits): management, user data
CLP (1 bit): Cell Loss Priority
a user or network element may set this bit to indicate cells that should be dropped preferentially in the event of overload
User-Network Interface (UNI), continued
HEC (8 bits): Header Error Check (CRC-8)
protecting the cell header is particularly important because an error in the VCI would cause the cell to be misdelivered
Network-Network Interface (NNI)
the GFC field becomes part of the VPI field (there is no GFC, and the VPI becomes 12 bits)
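The UNI header layout above can be made concrete with a small parser. This is a sketch, not a production parser: the field names and the example header bytes are chosen for illustration, and the HEC is read but not verified.

```python
# A sketch of unpacking the 5-byte ATM UNI cell header using the bit
# widths above: GFC(4) VPI(8) VCI(16) Type(3) CLP(1) HEC(8).
def parse_uni_header(hdr: bytes) -> dict:
    assert len(hdr) == 5, "ATM header is exactly 5 bytes"
    word = int.from_bytes(hdr[:4], "big")   # first 32 bits: GFC..CLP
    return {
        "gfc":  (word >> 28) & 0xF,
        "vpi":  (word >> 20) & 0xFF,
        "vci":  (word >> 4)  & 0xFFFF,
        "type": (word >> 1)  & 0x7,
        "clp":  word & 0x1,
        "hec":  hdr[4],                     # CRC-8 over the header
    }

# An NNI header has the same layout except the top 12 bits are all VPI.
fields = parse_uni_header(bytes([0x12, 0x34, 0x56, 0x78, 0x9A]))
print(fields["gfc"], fields["vpi"], fields["vci"])  # 1 35 17767
```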
[Figure: ATM headers]
[Figure: Architecture of an ATM network]
Segmentation and Reassembly
Segmentation and Reassembly (SAR)
in ATM, the packets handed down from above are often larger than 48 bytes and thus will not fit in the payload of an ATM cell
solution
fragment the high-level message into low-level packets at the source
transmit the individual low-level packets over the network
reassemble the fragments at the destination
Segmentation is not unique to ATM, but it is much more of a problem here than in a network with a maximum packet size of, say, 1500 bytes
to address the issue, the ATM Adaptation Layer (AAL) was added
ATM Adaptation Layer (AAL)
a protocol layer that sits between ATM and the variable-length packet protocols that might use ATM, such as IP
the AAL header contains the information needed by the destination to reassemble the individual cells back into the original message
[Figure: Segmentation and reassembly in ATM — the AAL sits above ATM at both the source and the destination]
Because ATM was designed to support all sorts of services, including voice, video, and data, it was felt that different services would have different AAL needs
ATM Adaptation Layer (AAL)
AAL1 and AAL2: designed for applications that need a guaranteed bit rate (e.g., voice, video)
AAL3/4: designed for packet data
AAL3 was intended for connection-oriented packet services (such as X.25)
AAL4 was intended for connectionless services (such as IP)
AAL5 is an alternative standard for packet data
AAL3 and AAL4 were merged into one AAL known as AAL3/4, so there are now four AALs
[Figure: ATM layers in endpoint devices and switches]
AAL 3/4
AAL3/4 provides enough information to allow variable-length packets to be transported across an ATM network as a series of fixed-length cells
the AAL supports the segmentation and reassembly process
the task of segmentation/reassembly involves two different packet formats: the Convergence Sublayer Protocol Data Unit (CS-PDU) [the AAL3/4 packet format] and the ATM cell format for AAL3/4
Convergence Sublayer Protocol Data Unit (CS-PDU)
PDU (Protocol Data Unit): a new name for packet
the CS-PDU defines a way of encapsulating variable-length PDUs prior to segmenting them into cells
the PDU passed down to the AAL is encapsulated by adding a CS-PDU header and trailer, and the resultant CS-PDU is segmented into ATM cells
[Figure: AAL3/4 CS-PDU format — CPI (8 bits) | Btag (8) | BASize (16) | User data (< 64 KB) | Pad (0–24 bits) | 0 (8) | Etag (8) | Len (16)]
[Figure: Encapsulation and segmentation — a PDU is encapsulated (CS-PDU header and trailer) into an AAL3/4 packet (CS-PDU), which is then segmented into ATM cells]
CS-PDU format
CPI (8 bits): Common Part Indicator (version field); only the value 0 is currently defined
Btag/Etag (8 bits each): beginning and ending tags
BASize (16 bits): Buffer Allocation size, a hint to the reassembly process as to how much buffer space to allocate for the reassembly
Pad: makes the data one byte less than a multiple of 4 bytes
the 0-filled byte then ensures the trailer is 32 bits
Len (16 bits): length of the PDU
[Figure: ATM Adaptation Layer 3/4 packet format — CPI (8 bits) | Btag (8) | BASize (16) | User data (< 64 KB) | Pad (0–24 bits) | 0 (8) | Etag (8) | Len (16)]
In addition to the CS-PDU header and trailer, AAL3/4 specifies a per-cell AAL3/4 header and trailer
the CS-PDU is segmented into 44-byte chunks; an AAL3/4 header and trailer are attached to each one, bringing it up to 48 bytes, which is then carried as the payload of an ATM cell
[Figure: a CS-PDU (header, user data, padding, trailer) segmented into 44-byte chunks, each wrapped with an AAL header and trailer and carried as an ATM cell payload]
ATM Cell Format (AAL3/4)
Type (2 bits): indicates whether the cell is the beginning of a message (BOM), a continuation (COM), the end of a message (EOM), or a single-segment message (SSM)
SEQ (4 bits): sequence number; detects cell loss or misordering
MID (10 bits): multiplexing identifier; multiplexes several PDUs onto a single connection
Payload (352 bits = 44 bytes): the segmented CS-PDU chunk
Length (6 bits): number of bytes of the PDU contained in this cell; it must be 44 for BOM and COM cells
CRC-10 (10 bits): error detection over the 48-byte cell payload
[Figure: ATM cell format for AAL3/4 — ATM header (40 bits) | Type (2) | SEQ (4) | MID (10) | Payload (352 bits = 44 bytes) | Length (6) | CRC-10 (10)]
[Figure: an AAL3/4 cell — 5-byte ATM header + 48-byte payload consisting of a 2-byte AAL3/4 header, a 44-byte chunk, and a 2-byte AAL3/4 trailer]
Encapsulation and segmentation for AAL3/4
the user data is encapsulated with the CS-PDU header and trailer
the CS-PDU is then segmented into 44-byte payloads, which are encapsulated as ATM cells by adding the AAL3/4 header and trailer as well as the 5-byte ATM header
the last cell is only partially filled whenever the CS-PDU is not an exact multiple of 44 bytes
With 44 bytes of data for every 9 bytes of header and trailer, the best possible bandwidth utilization is 44/53 ≈ 83%
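The segmentation step described above can be sketched in a few lines. This is a minimal illustration, not a real AAL3/4 implementation: the per-cell header and trailer are zero placeholders rather than real Type/SEQ/MID and Length/CRC-10 encodings.

```python
# Sketch of AAL3/4 segmentation: the CS-PDU is cut into 44-byte chunks,
# each wrapped by a 2-byte AAL3/4 header and 2-byte trailer to form a
# 48-byte ATM cell payload (header/trailer contents are placeholders).
def segment_aal34(cs_pdu: bytes):
    payloads = []
    for i in range(0, len(cs_pdu), 44):
        chunk = cs_pdu[i:i + 44].ljust(44, b"\x00")   # last chunk is padded
        payloads.append(b"\x00\x00" + chunk + b"\x00\x00")
    return payloads

cells = segment_aal34(bytes(100))
print(len(cells), len(cells[0]))                      # 3 48
print(f"best-case utilization: {44 / 53:.0%}")        # 83% (44 of 53 bytes)
```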
AAL5
[Figure: AAL5 encapsulation and segmentation — a PDU is encapsulated into an AAL5 packet (CS-PDU), which is then segmented into ATM cells]
CS-PDU format
data portion plus an 8-byte trailer: 2-byte Reserved + 2-byte Len + 4-byte CRC-32
Pad (up to 47 bytes): ensures the trailer always falls at the end of an ATM cell
Len: size of the PDU (data only)
CRC-32: detects missing or misordered cells
[Figure: ATM Adaptation Layer 5 packet format — Data (< 64 KB) | Pad (0–47 bytes) | Reserved (16 bits) | Len (16) | CRC-32 (32)]
Encapsulation and segmentation for AAL5
user data is encapsulated to form a CS-PDU
the resulting PDU is then cut into 48-byte chunks, which are carried directly inside the payloads of ATM cells without any further encapsulation
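The padding rule above can be sketched as follows. This is an illustrative sketch only: the CRC-32 is left as a zero placeholder rather than computed, and the Reserved field is assumed to be zero.

```python
# Sketch of AAL5 encapsulation: pad so that data + pad + 8-byte trailer is
# an exact multiple of 48 bytes, guaranteeing the trailer falls at the end
# of the last ATM cell; then cut into 48-byte cell payloads.
def encapsulate_aal5(data: bytes):
    pad_len = -(len(data) + 8) % 48               # 0..47 bytes of padding
    trailer = (b"\x00\x00"                        # 2-byte Reserved
               + len(data).to_bytes(2, "big")     # 2-byte Len (data only)
               + b"\x00\x00\x00\x00")             # 4-byte CRC-32 placeholder
    cs_pdu = data + b"\x00" * pad_len + trailer
    return [cs_pdu[i:i + 48] for i in range(0, len(cs_pdu), 48)]

cells = encapsulate_aal5(bytes(100))
print(len(cells), len(cells[-1]))   # 3 48: 100 data + 36 pad + 8 trailer = 144
```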
[Figure: Encapsulation and segmentation for AAL5 — user data, padding, and the CS-PDU trailer cut into 48-byte ATM cell payloads]
3.3.3 Virtual Path
ATM uses a 24-bit identifier for virtual circuits
8-bit virtual path identifier (VPI)
16-bit virtual circuit identifier (VCI)
Example
a corporation has two sites that connect to a public ATM network, and at each site the corporation has a network of ATM switches
we could establish a virtual path between the two sites using only the VPI field
within the corporate sites, however, the full 24-bit space is used for switching
[Figure: Example of a virtual path — Networks A and B connected across a public network]
Advantage of a virtual path
although there may be thousands or millions of virtual connections across the public network, the switches in the public network behave as if there is only one connection
much less connection-state information needs to be stored in the switches, avoiding the need for big, expensive tables of per-VCI information
TPs, VPs, and VCs
[Figures: Example of VPs and VCs; Connection identifiers; Virtual connection identifiers in UNIs and NNIs; ATM cell; Routing with a switch]
3.3.4 Physical Layers for ATM
The ATM standard assumed that ATM would run on top of a SONET physical layer, and there are many ATM-over-SONET products
actually, the two are entirely separable: you can, e.g., lease a SONET link from a phone company and send whatever you want over it, including variable-length packets
ATM cells can also be sent over many other physical layers instead of SONET, e.g., over Digital Subscriber Line (DSL) links of various types
3.4 Implementation and Performance
A very simple way to build a switch
buy a general-purpose workstation and equip it with a number of network interfaces
run suitable software to receive packets on one of its interfaces, perform any of the switching functions, and send packets out another of its interfaces
[Figure: A workstation used as a packet switch — interfaces 1, 2, and 3 attached to an I/O bus, with CPU and main memory]
The figure shows a workstation with three network interfaces used as a switch, and a path that a packet might take from the time it arrives on interface 1 until it is output on interface 2
we assume the workstation has a mechanism to move data directly from an interface to its main memory, i.e., direct memory access (DMA)
once the packet is in memory, the CPU examines its header to determine on which interface the packet should be sent out
it then uses DMA to move the packet out to the appropriate interface
the packet body does not pass through the CPU, because the CPU inspects only the packet's header
Main problem with using a workstation as a switch
its performance is limited by the fact that all packets must pass through a single point of contention: in the example shown, each packet crosses the I/O bus twice and is written to and read from main memory once
the upper bound on the aggregate throughput of such a device is, thus, either half the main memory bandwidth or half the I/O bus bandwidth, whichever is less (usually the I/O bus bandwidth)
example
a workstation with a 133-MHz, 64-bit-wide I/O bus can transmit data at a peak rate of a little over 8 Gbps (133 × 10^6 × 64 ≈ 8.5 × 10^9 bps)
since forwarding a packet involves crossing the bus twice, the actual limit is about 4 Gbps
this upper bound also assumes that moving data is the only problem: a fair approximation for long packets, but a bad one when packets are short
for short packets, the cost of processing each packet (parsing its header and deciding which output link to transmit it on) is likely to dominate
example: suppose a workstation can perform all the necessary processing to switch 1 million packets per second (a pps rate of 10^6)
if the average packet is short, say, 64 bytes:
throughput = pps × (bits per packet) = 10^6 × 64 × 8 = 512 × 10^6 bps = 512 Mbps
this 512 Mbps would be shared by all users connected to the switch
for example, a 10-port switch with this aggregate throughput could only cope with an average data rate of 51.2 Mbps on each port
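The bounds above are easy to reproduce. This is back-of-the-envelope arithmetic only; the 1 million packets/s processing rate and the 10-port fan-out are the figures assumed in the example.

```python
# Bus-limited vs. processing-limited throughput of a workstation switch.
bus_bps = 133e6 * 64        # 133 MHz x 64-bit I/O bus: ~8.5 Gbps peak
fwd_limit = bus_bps / 2     # each packet crosses the bus twice: ~4.26 Gbps

pps = 1e6                   # assumed packet-processing rate (packets/s)
proc_limit = pps * 64 * 8   # 64-byte packets: 512 Mbps aggregate
per_port = proc_limit / 10  # 10-port switch: 51.2 Mbps per port

print(f"{fwd_limit / 1e9:.2f} Gbps bus-limited")
print(f"{proc_limit / 1e6:.0f} Mbps processing-limited")
print(f"{per_port / 1e6:.1f} Mbps per port")
```

Note that for 64-byte packets the processing limit (512 Mbps), not the bus limit (~4.26 Gbps), is the binding constraint.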
To address this problem a large array of switch designs that reduce the
amount of contention and provide high aggregate throughput
some contention is unavoidable if every input has data to send to a single output, then
they cannot all send it at once if data destined for different outputs is arriving at
different inputs, a well-designed switch will be able to move data from inputs to outputs in parallel, thus increasing the aggregate throughput
3.4.1 Ports
[Figure: A 4 × 4 switch — input ports and output ports connected by a switch fabric, with a control processor]
The 4 × 4 switch in the figure consists of:
ports (input ports and output ports)
communicate with the outside world
contain fiber-optic receivers and buffers to hold packets that are waiting to be switched or transmitted, and often a significant amount of other circuitry that enables the switch to function
switch fabric
when presented with a packet, delivers it to the right output port
control processor (at least one)
in charge of the whole switch
Input port
the first place to look for performance bottlenecks
it has to receive a steady stream of packets, analyze information in the header of each one to determine to which output port (or ports) the packet must be sent, and pass the packet on to the fabric
Another key function of ports: buffering
it can happen in either the input or the output port; it can also happen within the fabric (sometimes called internal buffering)
simple input buffering has some serious limitations
example: an input buffer implemented as a FIFO
as packets arrive at the switch, they are placed in the input buffer
the switch then tries to forward the packets at the front of each FIFO to their appropriate output ports
if the packets at the front of several different input ports are destined for the same output port at the same time, then only one of them can be forwarded; the rest must stay in their input buffers
[Figure: Simple illustration of head-of-line blocking — the packets at the front of both input FIFOs are destined for port 2; only one can be forwarded, and the packet for port 1 queued behind the other is blocked]
drawback of input buffering: head-of-line blocking
a packet left at the front of an input buffer prevents packets further back in the buffer from getting a chance to go to their chosen (possibly idle) outputs
buffering is needed wherever contention is possible
input port (contending for the fabric); internal (contending for an output port); output port (contending for the link)
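Head-of-line blocking can be demonstrated with a toy one-slot model. This is an illustrative sketch, not a real scheduler: each input FIFO offers only its front packet, and arbitration is a simple first-come claim on each output.

```python
from collections import deque

# Toy one-slot model of head-of-line blocking with FIFO input buffers:
# if several fronts want the same output, one wins and the rest stay put,
# blocking everything queued behind them.
def one_slot(input_fifos):
    claimed, forwarded = set(), []
    for fifo in input_fifos:
        if fifo and fifo[0] not in claimed:
            claimed.add(fifo[0])             # this output is taken this slot
            forwarded.append(fifo.popleft())
    return forwarded

# Port 1 holds a packet for output 2; port 2 holds packets for outputs 2, 1.
fifos = [deque([2]), deque([2, 1])]
print(one_slot(fifos))   # [2]: port 1 wins output 2
print(list(fifos[1]))    # [2, 1]: the packet for the idle output 1 is stuck
```

Even though output 1 is idle, the packet destined for it cannot move because it sits behind a blocked packet, which is exactly the head-of-line effect described above.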
3.4.2 Fabrics
A fabric should be able to move packets from input ports to output ports with minimal delay and in a way that meets the throughput goals of the switch; this means fabrics generally display some degree of parallelism
Parallelism
a high-performance fabric with n ports can often move one packet from each of its n input ports to one of the output ports at the same time
Types of fabric: shared bus, shared memory, crossbar, self-routing
Shared bus
found in a conventional workstation used as a switch
the bus bandwidth determines the throughput of the switch, so high-performance switches usually have specially designed busses rather than the standard busses found in PCs
Shared memory
packets are written into a memory location by an input port and then read from memory by the output ports
the memory bandwidth determines switch throughput, so wide and fast memory is typically used in this sort of design, usually with a specially designed, high-speed memory bus
Crossbar
a matrix of pathways that can be configured to connect any input port to any output port
in their simplest form, crossbars require each output port to be able to accept packets from all inputs at once
main problem: each output port would then need a memory bandwidth equal to the total switch throughput
[Figure: A 4 × 4 crossbar switch]
Self-routing
relies on some information in the packet header to direct each packet to its correct output
usually a special "self-routing header" is appended to the packet by the input port after it has determined which output the packet needs to go to
this extra header is removed before the packet leaves the switch
self-routing fabrics are often built from large numbers of very simple 2 × 2 switching elements interconnected in regular patterns (e.g., the banyan switching fabric)
[Figure: A self-routing header is applied to a packet at input to enable the fabric to send the packet to the correct output, where it is removed: (a) packet arrives at input port; (b) input port attaches self-routing header to direct packet to correct output; (c) self-routing header is removed at output port before packet leaves switch]
Banyan network
constructed from simple 2 × 2 switching elements
a self-routing header is attached to each packet
elements are arranged to route based on this header
each element looks at 1 bit in the self-routing header and routes the packet toward the upper output if it is zero or toward the lower output if it is one
if two packets arrive at the same switching element at the same time and both have the routing bit set to the same value, then they want to go to the same output and a collision will occur
the banyan network routes all packets to the correct output without collisions if the packets are presented in ascending order
[Figure: Routing packets through a banyan network — the 3-bit numbers (001, 011, 110, 111) represent values in the self-routing headers of four arriving packets]
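The bit-by-bit routing just described can be simulated. This is a toy sketch: it models a banyan-class (omega/shuffle-exchange) network of 2 × 2 elements as an illustrative stand-in for the figure's exact topology, with each stage examining the next header bit (MSB first).

```python
# Toy simulation of a banyan-class (omega) self-routing fabric of 2x2
# elements: at each stage an element sends the packet to its upper (0)
# or lower (1) output according to the next bit of the routing header.
def route(n_ports, packets):
    """packets maps input port -> destination. Returns output -> destination,
    or raises if two packets collide at a 2x2 element output."""
    stages = n_ports.bit_length() - 1          # n_ports must be a power of 2
    pos = {src: src for src in packets}        # current wire of each packet
    for s in range(stages):
        taken = set()
        for src, dest in packets.items():
            bit = (dest >> (stages - 1 - s)) & 1
            # perfect-shuffle wiring, then the element sets the low bit
            nxt = ((pos[src] << 1) & (n_ports - 1)) | bit
            if nxt in taken:
                raise RuntimeError(f"collision at stage {s}")
            taken.add(nxt)
            pos[src] = nxt
    return {pos[src]: dest for src, dest in packets.items()}

# The figure's four headers, presented in ascending order: no collisions.
print(route(8, {0: 0b001, 1: 0b011, 2: 0b110, 3: 0b111}))
```

Presenting the packets in ascending order on consecutive inputs lets them pass without collisions; two packets aimed at the same destination would collide, raising the exception.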