Network Interface Card (NIC)

Page 1: Network Interface Card (NIC)

Network Interface Card (NIC)
Written by: Rishu

Page 2: Network Interface Card (NIC)

● What is a NIC?

A Network Interface Card (NIC), also commonly referred to as an Ethernet card or network adapter, is an expansion card that enables a computer to connect to a network (such as the Internet) using an Ethernet cable with an RJ-45 connector.

The NIC implements the electronic circuitry required for communication within a network using a specific physical-layer and data-link-layer standard such as Ethernet, Wi-Fi, or Token Ring.

When building a LAN, a network interface card is installed in each computer on the network, and all of them must use the same architecture: for example, all Ethernet cards, all Token Ring cards, or all of some alternative technology. An Ethernet network interface card is installed in an available slot inside the computer, typically on the motherboard.

Page 3: Network Interface Card (NIC)

● Basic Concept

The core idea behind the NIC is an intelligent data bus interface, which has two parts:

One part is for transferring data between the network interface and the host bus interface (and vice versa) when the goal is to move the data as quickly as possible between the network and the host.

The other part is for performing manipulation of the data stream as the data is transferred from the network to the host (or vice versa).

Page 4: Network Interface Card (NIC)

● Data Conversion

Network cards convert data from a parallel format (used inside the computer) to a serial format (necessary for transmission over network cables), and convert received data back again.
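
As a toy illustration of the idea (real NICs do this in hardware with shift registers, not software), the C sketch below serializes a byte into a bit stream and reassembles it:

```c
#include <stdint.h>
#include <stdio.h>

/* Serialize one byte (parallel form) into individual bits, LSB first,
 * the way a shift register in a transmit path would. */
static void byte_to_bits(uint8_t byte, uint8_t bits[8]) {
    for (int i = 0; i < 8; i++)
        bits[i] = (byte >> i) & 1;
}

/* Reassemble the received bit stream back into a byte. */
static uint8_t bits_to_byte(const uint8_t bits[8]) {
    uint8_t byte = 0;
    for (int i = 0; i < 8; i++)
        byte |= (uint8_t)(bits[i] << i);
    return byte;
}

int main(void) {
    uint8_t bits[8];
    byte_to_bits(0xA5, bits);
    printf("round trip: 0x%02X\n", bits_to_byte(bits)); /* prints 0xA5 */
    return 0;
}
```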

Before the sending network card transmits its data, it negotiates electronically with the receiving card to resolve the following issues (a sketch of such a parameter record follows the list):

Maximum size of data blocks that will be sent.

Amount of data to send before confirmation.

Intervals of time between partial data transmissions.

Waiting period before sending confirmation.

Volume of data that each card may build up before releasing it to its CPU.

Data transmission speed.

Both cards must accept and adjust to the other card's settings before data can be sent and received.
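
For illustration only, the negotiated settings above can be pictured as a record that both cards must agree on; the struct and field names below are invented for this sketch, not any real protocol definition:

```c
#include <stdint.h>

/* Hypothetical record of the parameters two cards settle on before
 * data transfer begins (names are illustrative only). */
struct link_params {
    uint32_t max_block_size;   /* maximum size of a data block            */
    uint32_t window_bytes;     /* data to send before confirmation        */
    uint32_t tx_interval_us;   /* time between partial data transmissions */
    uint32_t ack_delay_us;     /* waiting period before confirmation      */
    uint32_t rx_buffer_bytes;  /* data buffered before handoff to the CPU */
    uint64_t link_speed_bps;   /* agreed data transmission speed          */
};
```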

Page 5: Network Interface Card (NIC)

● MAC Address

The NIC gives the machine a unique 48-bit Media Access Control (MAC) address (typically assigned by the card's manufacturer), which is used to direct traffic between the computers on a network.
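
A 48-bit MAC address is conventionally written as six colon-separated octets. A small C sketch (the address value is just an example):

```c
#include <stdint.h>
#include <stdio.h>

/* Print a 48-bit MAC address in the usual aa:bb:cc:dd:ee:ff form. */
static void print_mac(const uint8_t mac[6]) {
    printf("%02x:%02x:%02x:%02x:%02x:%02x\n",
           mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
}

int main(void) {
    uint8_t mac[6] = {0x00, 0x1a, 0x2b, 0x3c, 0x4d, 0x5e}; /* example value */
    print_mac(mac);
    return 0;
}
```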

Page 6: Network Interface Card (NIC)

● Network Interface Processing: Sending a Packet

In step 1, the device driver first creates a buffer descriptor, which contains the starting memory address and length of the packet to be sent, along with additional flags to specify options or commands. If a packet consists of multiple non-contiguous regions of memory, the device driver creates multiple buffer descriptors.

In step 2, the device driver then writes information about the new buffer descriptors to a memory-mapped register on the NIC.

In step 3, the NIC initiates one or more direct memory access (DMA) transfers to retrieve the descriptors.

Page 7: Network Interface Card (NIC)

● Network Interface Processing: Sending a Packet

Then, in step 4, the NIC initiates one or more DMA transfers to move the actual packet data from the main memory into its transmit buffer using the address and length information in the buffer descriptors.

After the packet is transferred, the NIC sends the packet out onto the network through its medium access control (MAC) unit in step 5. The MAC unit is responsible for implementing the link-level protocol for the underlying network such as Ethernet.

Finally, in step 6, the NIC informs the device driver that the descriptor has been processed, possibly by interrupting the CPU.
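
A rough C sketch of steps 1 and 2 from the driver's side, assuming a hypothetical descriptor layout, flag value, and doorbell register (struct tx_desc, TX_FLAG_EOP, and the register pointer are illustrative; real NICs each define their own):

```c
#include <stdint.h>

/* Hypothetical transmit buffer descriptor (step 1). */
struct tx_desc {
    uint64_t addr;   /* physical address of the packet data        */
    uint16_t len;    /* length of this fragment in bytes           */
    uint16_t flags;  /* options/commands, e.g. "end of packet"     */
};

#define TX_FLAG_EOP 0x0001  /* last descriptor of the packet */

/* Step 1: fill in a descriptor for one contiguous region. A packet
 * spanning several non-contiguous regions needs several descriptors,
 * with TX_FLAG_EOP set only on the last one. */
static void fill_desc(struct tx_desc *d, uint64_t phys, uint16_t len,
                      int last) {
    d->addr  = phys;
    d->len   = len;
    d->flags = last ? TX_FLAG_EOP : 0;
}

/* Step 2: notify the NIC by writing its memory-mapped "doorbell"
 * register (modeled here as a volatile pointer into MMIO space).
 * The NIC then DMAs the new descriptors on its own (step 3). */
static void ring_doorbell(volatile uint32_t *doorbell, uint32_t tail) {
    *doorbell = tail;
}
```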

Page 8: Network Interface Card (NIC)

● Network Interface Processing: Receiving a Packet

In step 1, a packet arriving over the network is received by the MAC unit and stored in the NIC’s local receive buffer.

In step 2, the NIC initiates a DMA transfer of the packet into a preallocated main memory buffer.

In step 3, the NIC produces a buffer descriptor with the resulting address and length of the received packet and initiates a DMA transfer of the descriptor to the main memory, where it can be accessed by the device driver.

Finally, in step 4, the NIC notifies the device driver about the new packet and descriptor, typically through an interrupt. The device driver may then check the number of unused receive buffers in the main memory and replenish the pool for future packets.
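
A matching sketch of the receive side from the driver's point of view, again with hypothetical names (struct rx_desc, RX_STATUS_DONE) rather than any real device's layout:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical receive descriptor written back by the NIC (step 3). */
struct rx_desc {
    uint64_t addr;    /* main-memory buffer the packet was DMAed into */
    uint16_t len;     /* length of the received packet                */
    uint16_t status;  /* e.g. "descriptor done" bit set by the NIC    */
};

#define RX_STATUS_DONE 0x0001

/* Step 4, driver side: in the interrupt handler, walk the descriptor
 * ring, hand finished packets to the stack, and recycle the buffers
 * so the NIC's pool of receive buffers stays replenished. */
static void rx_poll(struct rx_desc *ring, size_t n, size_t *head) {
    while (ring[*head].status & RX_STATUS_DONE) {
        /* deliver_packet(ring[*head].addr, ring[*head].len);
         * would pass the buffer up the network stack here. */
        ring[*head].status = 0;     /* give the buffer back to the NIC */
        *head = (*head + 1) % n;    /* advance around the ring         */
    }
}
```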

Page 9: Network Interface Card (NIC)

● Performance with 10GbE

With the introduction of 10-GbE, network I/O re-entered the “fast network, slow host” scenario.

Three major system bottlenecks may limit the efficiency of high-performance I/O adapters:

PCI-X bus bandwidth, CPU utilization, and memory bandwidth.

The PCI-X bus operating at 133 MHz, with a peak bandwidth of 8.5 Gb/s, has been surpassed by the PCI-Express (PCIe) bus, which reaches a peak bandwidth of 20 Gb/s using 8 lanes.
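
For reference, the PCI-X figure follows directly from the bus parameters: 64 bits per transfer × 133 MHz ≈ 8.5 Gb/s. Likewise, first-generation PCIe signals at 2.5 Gb/s per lane, so 8 lanes carry 8 × 2.5 = 20 Gb/s (before 8b/10b encoding overhead).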

CPUs, meanwhile, have entered the multi-core era.

The memory data rate has increased from, e.g., the 533 MT/s (megatransfers per second) of the Front Side Bus (FSB) to the 6.4 GT/s of the Intel QuickPath Interconnect.

Page 10: Network Interface Card (NIC)

● Hardware Offload Functions

Modern network adapters usually implement various kinds of hardware offload functionality, whereby the kernel can delegate heavy parts of its work to the adapter. This is one of the most effective means available to improve performance and free up the CPU.

TOE (TCP Offload Engine) is a technology used in NICs to offload processing of the entire TCP/IP stack to the network controller.

TCP Segmentation Offload (TSO), or Large Send Offload (LSO): when a data packet larger than the MTU (Maximum Transmission Unit) is sent to the network adapter, the data must first be subdivided into MTU-sized packets. With older adapters this task was commonly performed at the kernel level, by the TCP layer of the TCP/IP stack. Offloading this work to the NIC is called TCP segmentation offload (TSO).
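
Conceptually, segmentation is just cutting a large send into MTU-sized pieces; with TSO the loop below runs in the adapter instead of in the kernel's TCP layer. A simplified sketch that ignores headers, sequence numbers, and checksums:

```c
#include <stddef.h>

#define MTU 1500  /* maximum transmission unit in bytes */

/* What segmentation boils down to: cut a large payload into
 * MTU-sized pieces, emitting one on-the-wire packet per piece. */
static void segment(const char *data, size_t len,
                    void (*emit)(const char *, size_t)) {
    while (len > 0) {
        size_t chunk = len < MTU ? len : MTU;
        emit(data, chunk);
        data += chunk;
        len  -= chunk;
    }
}
```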

Large Receive Offload (LRO): on the receive side, another function assists the host in processing incoming TCP packets by aggregating them at the NIC level into fewer, larger packets. This can considerably reduce the number of physical packets actually processed by the kernel, offloading it in a significant way.

Page 11: Network Interface Card (NIC)

● Hardware Offload Functions

Scatter-Gather I/O: the process of creating a packet ready to be transmitted over the network, starting from the transmission requests coming from the TCP layer, generally requires data buffering in order to assemble packets of optimal size, evaluate checksums, and add the TCP, IP and Ethernet headers. This procedure can require a fair amount of data copying into a new buffer to produce the final linear packet, stored in contiguous memory locations.
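
Scatter-gather avoids that copying by letting a single operation work on several non-contiguous buffers. The POSIX writev() call expresses the same idea in software; a scatter-gather-capable NIC's DMA engine gathers the fragments in hardware in the same spirit:

```c
#include <stddef.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send a header and a payload held in two separate buffers with one
 * call, instead of copying them into one contiguous packet first. */
ssize_t send_packet(int fd, void *hdr, size_t hdr_len,
                    void *payload, size_t payload_len) {
    struct iovec iov[2] = {
        { .iov_base = hdr,     .iov_len = hdr_len     },
        { .iov_base = payload, .iov_len = payload_len },
    };
    return writev(fd, iov, 2);  /* the kernel gathers both pieces */
}
```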

Checksum Offload: an IP/TCP/UDP checksum is computed to make sure that a packet is transferred correctly, by comparing, at the receiver side, the value of the checksum field in the packet headers (set by the sender) with the value the receiver calculates from the packet contents. The task of evaluating the TCP checksum can be offloaded to the network card (checksum offload).
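
The calculation being offloaded is the standard 16-bit one's-complement Internet checksum of RFC 1071; a plain software version is shown below for comparison with what the card computes in hardware:

```c
#include <stdint.h>
#include <stddef.h>

/* 16-bit one's-complement Internet checksum (RFC 1071), the
 * calculation that checksum offload moves onto the NIC. */
uint16_t inet_checksum(const void *data, size_t len) {
    const uint8_t *p = data;
    uint32_t sum = 0;

    while (len > 1) {                       /* sum 16-bit words */
        sum += (uint32_t)(p[0] << 8 | p[1]);
        p += 2;
        len -= 2;
    }
    if (len == 1)                           /* pad a trailing odd byte */
        sum += (uint32_t)(p[0] << 8);
    while (sum >> 16)                       /* fold carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}
```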

A TCP offload engine can also take over related protocol work, for example:

Acknowledging packets as they are received by the far end, which adds to the message flow between the endpoints and thus to the protocol load.

Sliding-window calculations for packet acknowledgement and congestion control.

Moving some or all of these functions to dedicated hardware frees the system's main CPU for other tasks.

Page 12: Network Interface Card (NIC)

● Receive Side Scaling (RSS)

The phrase “Receive Side Scaling” (RSS) refers to the idea that all receive-side data processing is shared (scaled) across multiple processors or processor cores. Without RSS, all receive processing is performed by a single processor, resulting in less efficient system cache utilization.

With Receive-side Scaling, incoming traffic will be balanced across multiple processors while preserving the ordered delivery of packets.

Additionally, Receive-side Scaling allows the distribution of incoming traffic to be adjusted dynamically as the system load varies.

As a result, any application with heavy networking traffic running on a multi-processor server will benefit.

RSS is independent of the number of connections, allowing it to scale well. This makes RSS particularly valuable to web servers and file servers handling heavy loads of short-lived traffic.
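
At its core, RSS maps each packet to a CPU with a deterministic hash over the flow's addressing fields, so every packet of one connection lands on the same CPU (preserving per-flow ordering) while different connections spread across CPUs. Real NICs typically use a keyed Toeplitz hash; the sketch below substitutes a trivial mix purely to show the mapping:

```c
#include <stdint.h>

/* Pick a CPU for a packet from its flow 4-tuple. Real RSS uses a
 * keyed Toeplitz hash; a trivial mix stands in here. Same flow ->
 * same hash -> same CPU, so per-flow packet order is preserved. */
static unsigned rss_cpu(uint32_t src_ip, uint32_t dst_ip,
                        uint16_t src_port, uint16_t dst_port,
                        unsigned n_cpus) {
    uint32_t h = src_ip ^ dst_ip ^ ((uint32_t)src_port << 16 | dst_port);
    h ^= h >> 16;        /* mix high bits into low bits */
    h *= 0x45d9f3bu;     /* arbitrary odd multiplier    */
    h ^= h >> 16;
    return h % n_cpus;   /* index into per-CPU receive queues */
}
```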

Page 13: Network Interface Card (NIC)

● Receive Side Scaling (RSS)

RSS can improve network system performance by reducing:

Processing delays by distributing receive processing from a NIC across multiple CPUs. This helps to ensure that no CPU is heavily loaded while another CPU is idle.

Spin lock overhead by increasing the probability that software algorithms that share data execute on the same CPU.

Spin lock overhead occurs, for example, when a function executing on CPU0 possesses a spin lock on data that a function running on CPU1 must access. CPU1 spins (waits) until CPU0 releases the lock.

Reloading of caches and other resources by increasing the probability that software algorithms that share data execute on the same CPU.

Such reloading occurs, for example, when a function that is executing and accessing shared data on CPU0, executes on CPU1 in a subsequent interrupt.

Page 14: Network Interface Card (NIC)

● Integrated NIC

Although TOE reduces the communication overhead between processors and NICs, it lacks scalability due to its limited processing and memory capacity. It also requires extensive modification of the OS and the development of firmware in NICs.

Recently, an alternative approach, integrating NICs onto CPUs, has been shown to be more promising and is gaining popularity.

Integrating NICs not only reduces the latency of accessing I/O registers, but also leverages the extensive resources of multi-core CPUs.

An integrated NIC can eliminate the overhead of device-driver and DMA-descriptor management and of data copying.

Page 15: Network Interface Card (NIC)

● Improving Efficiency

A partitioned memory organization enables low-latency access to control data and high-bandwidth access to frame contents from a high-capacity memory.

A novel distributed task-queue mechanism enables parallelization of frame processing across many low-frequency cores, while using software to maintain total frame ordering.

The addition of new atomic read-modify-write instructions reduces frame ordering overheads by 50%.
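
One plausible reading of the last point, sketched below as a C11 ticket scheme (this mechanism is an assumption for illustration, not taken from the source): each core claims a per-frame sequence number with a single atomic fetch-and-add, processes its frame in parallel, and releases frames strictly in ticket order, so total frame ordering is maintained in software.

```c
#include <stdatomic.h>
#include <stdint.h>

static atomic_uint_fast64_t next_ticket;   /* next frame number to hand out */
static atomic_uint_fast64_t release_turn;  /* next frame allowed to exit    */

/* Each worker core claims a ticket with one atomic read-modify-write. */
uint64_t claim_frame(void) {
    return atomic_fetch_add(&next_ticket, 1);
}

/* After processing its frame in parallel, a core waits for its turn,
 * releases the frame, then passes the turn to the next ticket, so
 * frames leave in the order they arrived. */
void release_frame(uint64_t ticket) {
    while (atomic_load(&release_turn) != ticket)
        ;  /* spin until it is this frame's turn */
    /* ... emit the frame here ... */
    atomic_fetch_add(&release_turn, 1);
}
```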