Cnet Architecture in NetFPGA


CNET Architecture

Last modified: 05/31/06

Greg Watson ([email protected])

This document describes the architecture of the CNET device, one of two FPGAs used on the NetFPGA board.

See http://klamath.stanford.edu/NetFPGA/

Table of Contents

CNET microarchitecture
Clock domains
Write data path
Read Data Path
CNET/CPCI Bus protocol
    Write Transaction
    Read Transaction - Register read
    Read Transaction - Packet read
MAC/Core interface
    MAC Transmit logic
    MAC Receive logic
Ingress FIFO arbitration
SRAM interface
    Write protocol
    Read protocol
DMA FIFO
Ingress FIFO Controller
Diagnostics
    Clock checks
Appendix A CNET Address Map
CNET Registers
Tx FIFOs
PHY Interface
    To write to a PHY register
    To read from a PHY register



CNET microarchitecture

The main block diagram for the CNET device is shown in figure 1 below. Subsequent sections provide more detail on each block.

Figure 1 CNET block diagram
[Blocks shown inside the CNET: MAC 0 to MAC 3, each with a Tx FIFO and an Rx FIFO; Ingress Arbiter; Ingress FIFO Controller; DMA FIFO; SRAM Interface; CPCI Interface and registers.]



Clock domains

The CNET device is complicated by the presence of the four MACs. Each MAC requires five clocks, three of which are common to all MACs and two of which are unique to each MAC. See the section Multiple Cores in the chapter Special Design Considerations of [UG-138].

The clocking structure is shown in figure 2. (Note: the domains show up in color!)


Figure 2 The clock domains used in the CNET device
[Labels shown: GTX_CLK (125MHz) through an IBUFG; CLK (62.5MHz)?; MII_TX_CLK through an IBUF; clock logic blocks; and, for each MAC block, the clock inputs GMII_TX_CLK, GMII_RX_CLK, RXCORECLK, TXCORECLK and HOST_CLK.]



Write data path

Figure 3 shows the write data path for the CNET device.

Write transactions occur through the PCI bus and are terminated within the CPCI device. Note: write transactions may also occur as a result of a PCI DMA read (transferring a packet from kernel memory to the relevant MAC Tx FIFO). The CPCI needs to arbitrate between these accesses, though in practice they should be independent.

Internally, the CNET will separate writes into two types: writes to FIFOs (packet data) and writes to everything else (registers).

Writes to FIFOs should not be initiated by the CPCI unless there is space in the relevant FIFO for a complete packet (via the Programmable Nearly Full signals). Thus writes to FIFOs should simply stream through the interface as there is no conflict. In addition, the CPCI needs to use the address bits to indicate additional information with each FIFO write: this includes the number of valid bytes in the 32-bit word, and whether this is the last word in the packet (EOP). The address map is shown in Appendix A.

Figure 3 Write path in CNET
[Labels shown: CPCI device (terminates PCI write transactions); ADDR, DATA and CLK (62.5MHz) into the CNET; ALMOST FULL (WR_RDY); Registers; MAC 0 to MAC 3; Prog Almost Full (space for >= 1 max pkt).]

Writes will be stored in a FIFO (shown). In general this is not required, however some of the write timing is not yet determined and writes to PHY registers (MDC/MDIO) are very slow. Before passing write data to the CNET the CPCI device should check the Almost Full signal from the CNET. If Almost Full is asserted then the CPCI should discard the write transaction and signal an appropriate error (mechanism TBD: register or interrupt or bus retry).

The actual protocol between CNET and CPCI is described in a later section.

Read Data Path

Figure 4 shows the read data path from the CNET to the CPCI.

There are two read data paths: one for register reads and a separate physical bus for packet DMA transfers to kernel memory.

The register read path uses the same address/data pins as for the write path described in the previous section. The read protocol is a simple request/grant handshake and is described in the following section.

Figure 4 Read data path
[Labels shown: CPCI device and CNET; CLK (62.5MHz); CPCI_ADDR; CPCI_DATA; CPCI_REQ and CPCI_RD_WR_L; CPCI_RD_RDY; address decode; read data from registers, FIFOs, etc.; data from external SRAM; CPCI_DMA_DATA; CPCI_DMA_NEARLY_FULL.]


Packet reads (DMA writes to kernel memory) require a separate path in order to make the transfer efficient. Once the CPCI has started a burst write to kernel memory then it must ensure that it does not underrun (for efficiency purposes). On the basis that Pull protocols are slow and Push protocols are fast, the CNET device effectively pushes the packet to the CPCI. The kernel driver initiates the transfer by writing to a register in the CNET device. This causes the CNET to push the packet into the CPCI's PKT_DATA FIFO.

The actual size of the various FIFOs is TBD but will depend on bus throughput between the devices.

CNET/CPCI Bus protocol

Register reads and writes, and packet writes (transmit packets) are handled through the CNET/CPCI Bus. Packet reads (ingress packets) are handled by a separate interface which is described later.

Write Transaction

Write transactions are optimized for burst writes to the packet Tx FIFOs. A single write is thus just a very short burst write. The waveform is shown in figure 5 below.

All signals are CPCI->CNET except WR_RDY which is CNET->CPCI and DATA which is bi-directional (but always driven by CPCI during a write).

From the writer's viewpoint (CPCI) this looks like a FIFO interface: provided that WR_RDY is high, the CPCI can write data. Data is accepted at every rising clock edge on which REQ and WR_RDY are both high.

Note: WR_RDY may be de-asserted for many cycles.
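The acceptance rule can be illustrated with a small cycle-by-cycle model. This is only a sketch: the function name and the list-based waveforms are hypothetical, and it models nothing beyond the rule stated above (a word is taken on every rising clock edge where REQ and WR_RDY are both high).

```python
def accepted_words(req, wr_rdy, data):
    """Return the words the CNET would accept during a write burst.

    req, wr_rdy: sequences of 0/1, one sample per rising CLK62 edge.
    data: the word driven on CPCI_DATA in the same cycle.
    """
    return [word for r, rdy, word in zip(req, wr_rdy, data) if r and rdy]


# WR_RDY drops for two cycles mid-burst; the writer holds the third word
# until WR_RDY returns, so it is accepted exactly once.
req = [1, 1, 1, 1, 1, 0]
wr_rdy = [1, 1, 0, 0, 1, 1]
data = [0xA0, 0xA1, 0xA2, 0xA2, 0xA2, 0x00]
burst = accepted_words(req, wr_rdy, data)
```

In this example `burst` holds three words, because the two cycles with WR_RDY low transfer nothing.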

Figure 5 Write transactions
[Waveform labels: CLK62, CPCI_RD_WR_L, CPCI_REQ, CPCI_WR_RDY, CPCI_ADDR (A), CPCI_DATA (D); cycles T1 to T5.]

Read Transaction - Register read

A read transaction is simple: a single read request is followed by a 32-bit read response. The REQ and RD_RDY lines provide a full four-way handshake.

All signals are CPCI->CNET except RD_RDY which is CNET->CPCI and DATA which is bi-directional.

There may be many cycles from issuing the request to receiving the data.

The CPCI must not issue another read REQ until it has seen RD_RDY de-asserted at the end of the current transaction.
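One way to picture the four-way handshake is as a fixed cycle of (REQ, RD_RDY) phases. The checker below is a sketch under assumptions not stated in the text: the conventional four-phase ordering (REQ up, RD_RDY up, REQ down, RD_RDY down), and at most one signal changing per sampled clock. The names are hypothetical.

```python
# Phases of the handshake as (REQ, RD_RDY) pairs, in the order they occur.
PHASES = [(0, 0), (1, 0), (1, 1), (0, 1)]


def follows_four_way_handshake(trace):
    """Check a per-clock trace of (REQ, RD_RDY) samples, starting from idle.

    Each sample must either repeat the current phase (any number of wait
    cycles is allowed) or advance to the next phase in PHASES.
    """
    phase = 0
    for sample in trace:
        if sample == PHASES[phase]:
            continue
        next_phase = (phase + 1) % len(PHASES)
        if sample != PHASES[next_phase]:
            return False
        phase = next_phase
    return True


# A legal read with several wait cycles, followed by the start of a second read.
good = [(0, 0), (1, 0), (1, 0), (1, 0), (1, 1), (0, 1), (0, 0), (1, 0)]
# REQ is re-asserted before RD_RDY has been seen de-asserted.
bad = [(0, 0), (1, 0), (1, 1), (0, 1), (1, 1)]
```

The second trace breaks the rule in the last paragraph above, so the checker rejects it.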

The complete set of signals used for the CPCI bus interface is in the table below.

CPCI     Signal          CNET     Width   Description
In       CLK62           In       1       System Clock (62.5MHz)
Out      CPCI_REQ        In       1       Request
Out      CPCI_RD_WR_L    In       1       Read (1) or Write (0)
Out      CPCI_ADDR       In       24      Address
InOut    CPCI_DATA       InOut    32      Data
In       CPCI_TX_FULL    Out      4       Indicates if Tx FIFO has space for a max packet (0 = space, 1 = not enough space)
In       CPCI_RD_RDY     Out      1       Read data is ready
In       CPCI_WR_RDY     Out      1       Write is accepted

Figure 6 Read Transaction
[Waveform labels: CLK62, CPCI_RD_WR_L, CPCI_REQ, CPCI_RD_RDY, CPCI_ADDR (ADDR), CPCI_DATA (DATA).]

Read Transaction - Packet read

A packet read is initiated by the DMA controller in the CPCI.

See the section DMA FIFO for more details of how packets are transferred from the CNET to the CPCI device.

MAC/Core interface

This section describes the interface exported to the core from each MAC. It is divided into transmit and receive sections. Note: this is the interface to be used in both Control and User applications.

The Management interface (stats, configuration, PHY, etc) is not shown but will be driven from the CNET/CPCI bus interface.

MAC Transmit logic

The transmit logic consists of an asymmetric FIFO and a controlling state machine as shown in figure 7.

The FIFO is written to from the core side with data 36 bits wide, with bits 35,26,17,8 being 1 iff the corresponding byte is the final byte of the packet. See NetFPGA Architecture document for more details.

Figure 7 Transmit interface
[Labels shown: core side (CORE_CLK_62.5): DIN[35:0], DIN[35,26,17,8] (EOP), WR_EN, NEARLY FULL, FULL; MAC side (Tx CLK 125): DOUT[8:0], DOUT[8] (EOP), RD_EN, UNDERFLOW, TX data vld; MAC_Tx_SM with outputs PKT_SENT_OK and PKT_UNDERRUN; MAC.]

The NEARLY_FULL signal is asserted (high) when there remains insufficient space in the FIFO for a maximum sized packet. The actual value at which NEARLY_FULL is asserted is not yet decided, but will be a maximum sized packet plus some extra to allow for latency between the CNET and CPCI, so it will be 1518 + latency_clocks*4 bytes. The NEARLY_FULL signals must be synchronized to the PCI clock domain inside the CNET (adding some latency).

The Transmit state machine (MAC_Tx_SM) will initiate packet transmission to the actual MAC once the EOP has been observed on the ingress side of the FIFO. Data will be read out until either EOP is observed on the egress side, or else an underrun occurs.

Once a packet has been transmitted then either PKT_SENT_OK or PKT_UNDERRUN will be pulsed high for one clock.

MAC Receive logic

The receive logic is shown below. It requires as large a FIFO as possible because the MAC will deliver an incoming packet as soon as it starts to arrive: the MAC has no buffering.

The MAC_Rx_SM state machine manages the receive FIFO. The same format is used as in the transmit direction: each byte is associated with an extra bit which, if set to one, indicates the last byte of the packet.

Figure 8 Receive Interface
[Labels shown: RX MAC outputs DATA[7:0], VALID, GOOD, BAD, RX_CLK; FIFO write side DIN[7:0], DIN[8], WR_EN, ALMOST FULL; FIFO read side (CORE_CLK_62.5) DOUT[35:0], DOUT[35,26,17,8] (EOP), RD_EN, EMPTY, UNDERRUN; MAC_Rx_SM with output PKT_AVAIL.]

Notes:
1. SM MUST store multiple of 4 bytes.
2. SM MUST store EOP even in overrun condition.
3. SM MUST NOT start to store packet unless ALMOST_FULL == 0
4. SM MUST always store one extra 36-bit word at the end of each packet. 0 = GOOD 1 = BAD.

Again the FIFO is asymmetrical: it is 9 bits on the write side and 36 bits on the read side. Consequently the state machine must always write bytes in groups of four. If the last byte of the packet is byte 65, then three additional pad bytes must be written.

Also, one extra full 36-bit status word is always stored after the last data word. This serves two purposes:

1. Bit 0 of the status word indicates if the packet was good (0) or bad (1). The read side must always read out an entire packet and the status word will then indicate whether the packet should be kept or discarded.

2. It provides the read process with one extra clock cycle in which to de-assert RD_EN after seeing the EOP bit.

So, for example, if a bad 65-byte packet was received then the read process would see the last few words as:

           bit 35   34..27   bit 26   25..18   bit 17   16..9   bit 8   7..0
word 16      0                 0                 0                0
word 17      0                 0                 0                1
word 18      0                 0                 0                0     < 0x1 >

On the read side the reading process should check PKT_AVAIL to see if there is at least one packet available. The PKT_AVAIL signal will go low (invalid) on the cycle after any of the four EOP bits go high, and will then be valid on the following cycle, as shown in figure 9.

Note: there are two reasons for the status word indicating a bad packet. The first is that the MAC saw a bad FCS. The second is that there was insufficient space in the FIFO to store the packet and so some of the packet was discarded.
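The padding and status-word rules can be sketched as a packing function. This is an illustration only, not the state machine itself: the function name is hypothetical, and the byte-lane order (first byte in bits 7:0) is an assumption chosen so that the 65-byte example above ends with its EOP flag in bit 8.

```python
EOP_MASK = (1 << 8) | (1 << 17) | (1 << 26) | (1 << 35)


def pack_rx_packet(payload, bad=False):
    """Pack packet bytes into 36-bit Rx FIFO words plus the status word.

    Each byte occupies a 9-bit lane: 8 data bits plus an EOP flag in the
    lane's top bit (bits 8, 17, 26 and 35 of the word).
    Assumption: the first byte of each group goes in bits 7:0.
    """
    if not payload:
        raise ValueError("empty packet")
    lanes = list(payload)
    lanes[-1] |= 0x100                     # flag the final packet byte (EOP)
    lanes += [0] * (-len(lanes) % 4)       # pad to a multiple of four bytes
    words = []
    for i in range(0, len(lanes), 4):
        word = 0
        for lane, value in enumerate(lanes[i:i + 4]):
            word |= value << (9 * lane)
        words.append(word)
    words.append(1 if bad else 0)          # status word: bit 0 = 0 good, 1 bad
    return words


# The bad 65-byte packet from the example: 17 data words plus one status word.
words = pack_rx_packet(bytes(range(65)), bad=True)
```

Here `words[16]` (word 17 in the table) carries the last byte with its EOP flag in bit 8, and `words[17]` (word 18) is the status word 0x1.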

Ingress FIFO arbitration

The four ingress MACs contend for access to the SRAM, and so an arbiter is needed. The arbiter will service the MACs in round-robin order, with no distinction to packet length: the arbiter will always service a complete packet.

Figure 9 PKT_AVAIL timing
[Waveform labels: CLK, EOP, PKT_AVAIL.]

Figure 10 shows the main signals used by the arbiter. A separate Ingress FIFO Controller manages the various queue pointers and provides relevant signals for each queue, where <N> represents the queue number (N=0..3).

The arbiter reads a packet from the RX MAC FIFO and stores it in the appropriate location in the SRAM. The last word indicates if the packet was good or bad. If bad, then the WR_INCR signal is not pulsed.

The arbiter will store the packet in the SRAM at the offset specified by WR_PTR_<N>. The first word stored will be the length (in bytes) of the packet. This word will also contain the ID of the MAC from which it was received. The last word stored will contain the last byte of the packet (the good/bad indicator is not stored).
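The round-robin selection can be sketched as follows. This is one plausible reading, not the arbiter's actual logic: the function name is hypothetical, and skipping MACs whose PKT_AVAIL is low is an assumption the text does not spell out.

```python
def next_mac(last_served, pkt_avail):
    """Pick the next MAC to service, searching round-robin after last_served.

    pkt_avail: one 0/1 flag per MAC (modelled on PKT_AVAIL_0..3).
    Returns the MAC index, or None if no MAC has a complete packet waiting.
    """
    count = len(pkt_avail)
    for step in range(1, count + 1):
        candidate = (last_served + step) % count
        if pkt_avail[candidate]:
            return candidate
    return None


# MAC 0 was served last; MACs 0 and 3 both have packets, so MAC 3 goes next.
choice = next_mac(0, [1, 0, 0, 1])
```

Because the search starts one past the MAC served last, a MAC with a steady stream of packets cannot starve the others; each grant covers one complete packet.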

The FIFO SRAM is organized for simplicity rather than maximum utilization. It is split into 4 sections, one per MAC. Each section is then divided into chunks of 2KBytes, large enough to hold a maximum sized packet. The SRAM size is TBD, but assuming 2MB then each MAC should be able to store 256 packets.

Figure 10 The arbiter services each Rx MAC in round-robin order
[Labels shown: RX MAC 0 to RX MAC 3, each with RD_EN_<N>, DOUT_<N>[35:0] and PKT_AVAIL_<N>; Ingress Arbiter; to the Ingress FIFO Controller: WR_INCR_<N>, WR_PTR_<N>[X:0], FULL_<N>; to the SRAM Interface: WR_DATA[35:0], WR_ADDR[X:0], WR_REQ, WR_RDY.]
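The 256-packet figure follows directly from the sizes given. The sketch below repeats the arithmetic and adds a hypothetical helper for locating a packet slot; the contiguous, MAC-ordered layout of the four sections is an assumption, since the text only says there is one section per MAC.

```python
SRAM_BYTES = 2 * 1024 * 1024   # assumed 2MB, as in the text (actual size TBD)
NUM_MACS = 4
CHUNK_BYTES = 2 * 1024         # one 2KByte chunk per packet

SECTION_BYTES = SRAM_BYTES // NUM_MACS
PACKETS_PER_MAC = SECTION_BYTES // CHUNK_BYTES


def chunk_byte_offset(mac, slot):
    """Byte offset of packet slot `slot` in the section for MAC `mac`.

    Assumption: sections are laid out back to back in MAC order.
    """
    if not (0 <= mac < NUM_MACS and 0 <= slot < PACKETS_PER_MAC):
        raise ValueError("mac or slot out of range")
    return mac * SECTION_BYTES + slot * CHUNK_BYTES
```

With these values each section is 512KBytes, which gives 256 slots per MAC.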

SRAM interface

The SRAM is managed by the SRAM interface logic. This acts as an arbiter between two write ports and two read ports as shown in figure 11.

The write and read ports are independent, though in practice the CPU interface will use both one read and one write port.

In order to achieve high throughput the interface logic might implement an internal FIFO of requests (reads and writes). Consequently read data might be delayed by several clocks after the read request is accepted.

Figure 11 SRAM Interface
[Labels shown: write ports 0 and 1, each with WR_<n>_DATA[35:0], WR_<n>_ADDR[X:0], WR_<n>_REQ, WR_<n>_RDY; read ports 0 and 1, each with RD_<n>_ADDR[X:0], RD_<n>_REQ, RD_<n>_RDY, RD_<n>_DATA[35:0], RD_<n>_VLD; SRAM.]

Write protocol

The write protocol is shown in figure 12. The writer issues a request. The interface logic will store the write information on each rising clock edge on which WR_REQ and WR_RDY are both asserted (T2, T3, T5, and T6 in the figure).

The actual write operation to SRAM may happen several clocks later. Writes and reads are performed in order.

Read protocol

The SRAM read protocol is shown in figure 13 below.

Note: the latency from when the address is latched to the data valid is shown as 2 cycles in the figure, but in practice may be longer: RD_VLD must be used to determine when the read data is valid.

Figure 12 SRAM Interface - write protocol
[Waveform labels: CLK, WR_REQ, WR_RDY, ADDR (A), WR_DATA (D); cycles T1 to T7.]

Figure 13 SRAM Interface - read protocol
[Waveform labels: CLK, RD_REQ, RD_RDY, ADDR (A), RD_DATA (D), RD_VLD.]

DMA FIFO

The Linux driver is notified (or can read) when a packet has been stored in the external SRAM (via the CPCI_DMA_PKT_AVAIL signal). The driver then must set up the DMA transfer from NetFPGA to kernel memory. The driver must write to internal CPCI registers, specifying which queue should be read from (it is always the packet at the head of the queue).

The CPCI must initiate the packet transfer by asserting the appropriate CPCI_DMA_SEND[X] signal. The DMA FIFO unit in the CNET will then transfer the packet at the head of queue X from the SRAM to the FIFO in the CPCI. The entire packet will be transferred, including the length (in bytes) in the first word.

The CPCI can continue to assert CPCI_DMA_SEND[X] until it sees the first word transferred (CPCI_DMA_WR_EN = 1), at which time it should de-assert CPCI_DMA_SEND[X].

Once the CPCI has started to receive the packet from the CNET, then the CPCI can start to DMA the packet into kernel memory. The details of the DMA operation are described in [CPCI-ARCH].

If the CPCI asserts CPCI_DMA_NEARLY_FULL then the CNET will de-assert CPCI_DMA_WR_EN within two clocks (so the CPCI needs to allow for this pipeline delay). See figure 15 for details.

Figure 14 DMA FIFO Controller
[Labels shown: DMA FIFO; to the Ingress FIFO Controller: RD_INCR_<N>, RD_PTR_<N>[X:0], EMPTY_<N>; to the SRAM Interface: RD_ADDR[X:0], RD_REQ, RD_RDY, RD_DATA[35:0], RD_VLD; to the CPCI: CPCI_DMA_NEARLY_FULL, CPCI_DMA_SEND[3:0], CPCI_DMA_PKT_AVAIL[3:0], CPCI_DMA_DATA[31:0], CPCI_DMA_WR_EN.]

Ingress FIFO Controller

The Ingress FIFO Controller acts as a central queue manager for the ingress FIFOs. It provides full/empty and pointer information to the write and read controllers, and also queue occupancy information for the CPU. The CPU might need to read the RD_PTR in order to determine the address of the first word of the head-of-queue packet (to obtain the length of that packet).

The protocol used for the various signals is shown below in figure 16. Note that the RD_PTR and WR_PTR point to the same location for both full and empty conditions.

The NUM_IN_Q signal indicates the actual number of packets in the FIFO.
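Because equal pointers mean either full or empty, a separate occupancy count is what tells the two apart. The toy model below shows this; it is a sketch with hypothetical class and method names (modelled on the signal names above), and the default depth of 256 assumes the 2MB SRAM case.

```python
class IngressQueue:
    """Toy model of one ingress queue's pointers and occupancy count."""

    def __init__(self, depth=256):
        self.depth = depth
        self.wr_ptr = 0
        self.rd_ptr = 0
        self.num_in_q = 0

    @property
    def empty(self):
        return self.num_in_q == 0

    @property
    def full(self):
        return self.num_in_q == self.depth

    def wr_incr(self):
        """A good packet was stored: advance the write pointer."""
        if self.full:
            raise OverflowError("queue full")
        self.wr_ptr = (self.wr_ptr + 1) % self.depth
        self.num_in_q += 1

    def rd_incr(self):
        """The head-of-queue packet was read out: advance the read pointer."""
        if self.empty:
            raise IndexError("queue empty")
        self.rd_ptr = (self.rd_ptr + 1) % self.depth
        self.num_in_q -= 1
```

Filling a queue of depth 4 with four `wr_incr()` calls leaves `wr_ptr == rd_ptr` with `full` true; draining it with four `rd_incr()` calls leaves the pointers equal again with `empty` true.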

Figure 15 Timing diagram for ingress DMA interface
[Waveform labels: CLK, CPCI_DMA_PKT_AVAIL, CPCI_DMA_PKT_SEND, CPCI_DMA_WR_EN, CPCI_DMA_NEARLY_FULL.]

Figure 16 Ingress FIFO Controller protocol
[Waveform labels: CLK, FULL_N, WR_INCR_N, WR_PTR_N, EMPTY_N, RD_INCR_N, RD_PTR_N, NUM_IN_Q.]

Diagnostics

Clock checks

There are 7 clocks on the board:

PCI clock (33.33 MHz): can be checked via the registers in CPCI.

sysclk (62.5 MHz): can be checked via the registers in CPCI.

Then there are 5 clocks used by the Ethernet MACs:

1 common GTX_CLK used by all four MACs.

1 receive clock per MAC.

The MAC clocks and sysclk can be checked via some diagnostic registers in the CNET: MAC_CLK_CTRL and MAC_CLK_COUNTER.

The CNET has a clock checker module (cnet_mac_clk_checker.v) that has a counter connected to each of these 6 clocks. Before trying to check these clocks you must verify that the PCI clock is functioning (it should be obvious if it isn't: you will not be able to access the board!)

Then, to check the clocks do the following:

1. Stop the counters (set RUN bits to 0)

2. Clear the counters (set CLEAR to 1 and then 0)

3. Start the counters (set RUN bits to 1)

4. Stop the counters (set RUN bits to 0 again)

5. Read the counters by setting the SELECT bits to the counter you want to read (see the Counter read select field of MAC_CLK_CTRL in Appendix A: 0 = SYSCLK, 1 = TX MAC CLK, 2 = RX MAC 0, 3 = RX MAC 1, etc.)

e.g. run the counters for 1msec and you should see a value in the counters of about 125,000 for the MAC clocks and about 62,500 for the sysclk. Allow for things such as clock precision (typically +/- 200ppm) and operating system timings.
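The steps above can be sketched as a list of register writes plus a check of the expected count. This is an illustration under assumptions: the helper names are hypothetical, the field positions are taken from the MAC_CLK_CTRL entry in Appendix A, and the actual register access and the timed wait between start and stop are left to the caller.

```python
MAC_CLK_CTRL = 0xF00       # register offsets from Appendix A
MAC_CLK_COUNTER = 0xF04
ALL_COUNTERS = 0x3F        # six counters: SYSCLK, TX MAC CLK, RX MAC 0..3


def clk_ctrl_word(select=0, clear=0, run=0):
    """Assemble a MAC_CLK_CTRL value: 18:16 select, 13:8 clear, 5:0 run."""
    return ((select & 0x7) << 16) | ((clear & 0x3F) << 8) | (run & 0x3F)


def check_sequence(select):
    """Register writes for steps 1 to 5 as (offset, value) pairs."""
    return [
        (MAC_CLK_CTRL, clk_ctrl_word(run=0)),               # 1. stop
        (MAC_CLK_CTRL, clk_ctrl_word(clear=ALL_COUNTERS)),  # 2. clear: set to 1...
        (MAC_CLK_CTRL, clk_ctrl_word(clear=0)),             #    ...and back to 0
        (MAC_CLK_CTRL, clk_ctrl_word(run=ALL_COUNTERS)),    # 3. start
        (MAC_CLK_CTRL, clk_ctrl_word(run=0)),               # 4. stop
        (MAC_CLK_CTRL, clk_ctrl_word(select=select)),       # 5. select a counter
    ]


def expected_count(freq_hz, run_seconds):
    """Nominal counter value after running for run_seconds."""
    return round(freq_hz * run_seconds)


def within_tolerance(count, freq_hz, run_seconds, ppm=200):
    """True if count is within +/- ppm of nominal (clock precision only)."""
    nominal = freq_hz * run_seconds
    return abs(count - nominal) <= nominal * ppm / 1e6
```

In practice the operating system's timing error over a 1msec run will usually be far larger than 200ppm, so a looser bound than `within_tolerance` is likely needed when the run time is measured in software.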



Appendix A CNET Address Map

Each NetFPGA board occupies a 16MB memory space. Within this space the CNET sub-divides the address space into different areas:

Address range         Size   Function
00_0000 to 3F_FFFF    4MB    CPCI
40_0000 to 4F_FFFF    1MB    CNET registers
50_0000 to 5F_FFFF    1MB    CNET Tx FIFOs
60_0000 to 6F_FFFF    1MB    PHY interface (MDC/MDIO)
70_0000 to 7F_FFFF    1MB    Not used
80_0000 to BF_FFFF    4MB    SRAM1 (Queue SRAM - only 2MB present)
C0_0000 to FF_FFFF    4MB    SRAM2 (Scratch SRAM - only 2MB present)
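As a small illustration, the address map can be written as a lookup table. The table contents come from the map above; the function name and the use of plain 24-bit integer addresses are hypothetical.

```python
ADDRESS_MAP = [
    (0x000000, 0x3FFFFF, "CPCI"),
    (0x400000, 0x4FFFFF, "CNET registers"),
    (0x500000, 0x5FFFFF, "CNET Tx FIFOs"),
    (0x600000, 0x6FFFFF, "PHY interface (MDC/MDIO)"),
    (0x700000, 0x7FFFFF, "Not used"),
    (0x800000, 0xBFFFFF, "SRAM1"),
    (0xC00000, 0xFFFFFF, "SRAM2"),
]


def region_for(addr):
    """Return the name of the region containing a 24-bit board address."""
    for low, high, name in ADDRESS_MAP:
        if low <= addr <= high:
            return name
    raise ValueError("address outside the 16MB board space")
```

For example, `region_for(0x400010)` returns "CNET registers", which is where the Enable register at offset 0x010 lives.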

CNET Registers

The CNET registers are located in address range 40_0000 to 4F_FFFF within the 16MB address space allocated to the board.

Register (byte address)    Access    Value on reset
    Function

ID (0x000)    RO    c4e7
    Identifier.
    31:16 = Version.
    15:0 = 0xc4e7

Control (0x004)    RW    0
    Scratchpad (32 bit read/write)

Reset (0x008)    RW    0
    MAC reset. Write a 1 to the MAC you want to reset [3:0]. The reset will automatically clear: you do not need to write a zero after a one.
    This will read as zero.

Error (0x00C)    RW    0
    Indicates when a hardware error occurred in the Tx FIFO:
    7:4 = 1 indicates a Tx packet underrun error for that MAC
    3:0 = 1 indicates a Tx FIFO overrun for that MAC.
    These bits are sticky and will remain 1 until you overwrite with a zero. Note: the error must be cleared (by resetting the MAC) before clearing this bit.
    The OR of this register is propagated to the CPCI via pin CNET_ERR.

Enable (0x010)    RW    0xFF07
    Enable various subsystems within the CNET.
    15:12 = Enable RX FIFO output (If 0 then packets will remain in the RX FIFO)
    11:8 = Enable Tx MAC transmission. (If 0 then packets will remain in the Tx FIFO)
    2 = Enable Debug bus tri-state.
    1 = Enable Ingress Arbiter
    0 = Enable Rx DMA

WR_SRAM1_EOP (0x0F0)    RW    0
    The SRAMs are 36 bits wide. Write to this register [3:0] to specify the data that will be written to bits 35:32 of SRAM1 whenever the CPU writes to SRAM1.

RD_SRAM1_EOP (0x0F4)    RO    -
    3:0 contain the data from bits 35:32 of the last read from SRAM1.


WR_SRAM2_EOP (0x0F8)    RW    0
    The SRAMs are 36 bits wide. Write to this register [3:0] to specify the data that will be written to bits 35:32 of SRAM2 whenever the CPU writes to SRAM2.

RD_SRAM2_EOP (0x0FC)    RO    -
    3:0 contain the data from bits 35:32 of the last read from SRAM2.

MF_STATUS_0 (0x100)    RO    -
    MAC FIFO Status for MAC 0:
    25 = 1 if the Rx FIFO for this MAC is empty
    24 = 1 if at least one packet is available to be read out of the Rx FIFO.
    23:16 = number of packets waiting in Rx FIFO.
    9 = 1 if the Tx FIFO for this MAC is completely full.
    8 = 1 if the Tx FIFO for this MAC cannot accept a maximum sized packet (1518B).
    7:0 = number of packets waiting in Tx FIFO.

MF_TX_PKTS_SENT_0 (0x104)    W CoR    0
    Number of packets transmitted on this MAC (read and clear).
    You can write to this counter.

MF_RX_PKTS_RCVD_0 (0x108)    W CoR    0
    Number of packets received by this MAC (read and clear).

MF_RX_PKTS_LOST_0 (0x10C)    W CoR    0
    Number of packets lost at ingress by this MAC due to FIFO full (read and clear).

MAC_CONFIG_0 (0x110)    RW    2
    NOTE: ALL four MACs use bits 1:0 of this register to specify the data rate.
    Configuration for MAC 0.
    5 = 0 for full duplex, 1 for half duplex. DEFAULT: 0
    4 = 1 if you supply FCS bytes on Tx side, else 0. DEFAULT: 0
    3 = 1 if you want Rx to provide the FCS bytes, else 0. DEFAULT: 0
    2 = 1 if you want to enable Jumbo frames, else 0. DEFAULT: 0
    1:0 = MAC rate:
        00 = 10 Mbit/s
        01 = 100 Mbit/s
        10 = 1000 Mbit/s
        (DEFAULT is 10 = 1Gb/s)


CNET_REG_MF_RX_PKTS_LOST_BAD_FCS_0 (0x114)    W CoR    0
    Number of packets lost due to bad FCS

CNET_REG_MF_RX_PKTS_LOST_FULL_FIFO_0 (0x118)    W CoR    0
    Number of packets lost due to the receive buffer being full

CNET_REG_MF_RX_GOOD_PKTS_RCVD_0 (0x11C)    W CoR    0
    Number of packets received without any errors, i.e. the real number of packets received

CNET_REG_MF_RX_GOOD_BYTES_RCVD_0 (0x120)    W CoR    0
    Number of useful data bytes received

CNET_REG_MF_TX_BYTES_SENT_0 (0x124)    W CoR    0
    Number of bytes sent

MF_STATUS_1 (0x140)    RO    -
    See description for MAC 0

MF_TX_PKTS_SENT_1 (0x144)    W CoR    0
    See description for MAC 0

MF_RX_PKTS_RCVD_1 (0x148)    W CoR

MF_RX_PKTS_LOST_1 (0x14C)    W CoR

MAC_CONFIG_1 (0x150)
    NOTE: MAC data rate set by bits 1:0 of MAC_CONFIG_0
    5 = 0 for full duplex, 1 for half duplex. DEFAULT: 0
    4 = 1 if you supply FCS bytes on Tx side, else 0. DEFAULT: 0
    3 = 1 if you want Rx to provide the FCS bytes, else 0. DEFAULT: 0
    2 = 1 if you want to enable Jumbo frames, else 0. DEFAULT: 0
    1:0 not used.

(0x154)    W CoR

(0x158)    W CoR


(0x15C)    W CoR

(0x160)    W CoR

(0x164)    W CoR

MF_STATUS_2 (0x180)

MF_TX_PKTS_SENT_2 (0x184)    W CoR

MF_RX_PKTS_RCVD_2 (0x188)    W CoR

MF_RX_PKTS_LOST_2 (0x18C)    W CoR

MAC_CONFIG_2 (0x190)
    NOTE: MAC data rate set by bits 1:0 of MAC_CONFIG_0
    5 = 0 for full duplex, 1 for half duplex. DEFAULT: 0
    4 = 1 if you supply FCS bytes on Tx side, else 0. DEFAULT: 0
    3 = 1 if you want Rx to provide the FCS bytes, else 0. DEFAULT: 0
    2 = 1 if you want to enable Jumbo frames, else 0. DEFAULT: 0
    1:0 not used.

(0x194)    W CoR

(0x198)    W CoR

(0x19C)    W CoR

(0x1A0)    W CoR



(0x1A4)    W CoR

MF_STATUS_3 (0x1C0)

MF_TX_PKTS_SENT_3 (0x1C4)    W CoR

MF_RX_PKTS_RCVD_3 (0x1C8)    W CoR

MF_RX_PKTS_LOST_3 (0x1CC)    W CoR

MAC_CONFIG_3 (0x1D0)    RW    2
    NOTE: MAC data rate set by bits 1:0 of MAC_CONFIG_0
    Configuration for MAC 3.
    5 = 0 for full duplex, 1 for half duplex. DEFAULT: 0
    4 = 1 if you supply FCS bytes on Tx side, else 0. DEFAULT: 0
    3 = 1 if you want Rx to provide the FCS bytes, else 0. DEFAULT: 0
    2 = 1 if you want to enable Jumbo frames, else 0. DEFAULT: 0
    1:0 not used.

(0x1D4)    W CoR

(0x1D8)    W CoR

(0x1DC)    W CoR

(0x1E0)    W CoR

(0x1E4)    W CoR

RXQ_NUM_PKTS_0 (0x200)    RO    0
    8:0 = Number of packets in SRAM from MAC 0 (0-256).


RXQ_POINTERS_0 (0x204)    RO    0
    23:16 = Write pointer
    7:0 = Read pointer

RXQ_NUM_PKTS_1 (0x240)

RXQ_POINTERS_1 (0x244)
    7:0 = Read pointer

RXQ_NUM_PKTS_2 (0x280)

RXQ_POINTERS_2 (0x284)
    7:0 = Read pointer

RXQ_NUM_PKTS_3 (0x2C0)    RO    0
    8:0 = Number of packets in SRAM from MAC 3 (0-256).

RXQ_POINTERS_3 (0x2C4)
    7:0 = Read pointer

MAC_CLK_CTRL (0xf00)    RW    0x3f00 (cleared)
    18:16 = Counter read select: choose which counter's current value will be read via the MAC_CLK_COUNTER address.
        5 = RX MAC 3
        4 = RX MAC 2
        3 = RX MAC 1
        2 = RX MAC 0
        1 = TX MAC CLK
        0 = SYSCLK
    13:8 = Clear counter. 1 = clear counter, 0 = no effect. This overrides the run bit.
    5:0 = Run counter. 1 = counter is running. 0 = stopped. This is overridden by the clear counter bit above.

MAC_CLK_COUNTER (0xf04)    RO    0
    23:0 = current value of counter selected by the Counter read select bits in the MAC_CLK_CTRL register.
    NOTE: counter 0 counts at 62.5MHz; the others count at 125MHz. They are only 24 bits wide, so be aware that they will overflow quickly: 2^24 counts take roughly 134 msecs at 125MHz.
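As an illustration of the bit layouts in the register table, the following sketch decodes MF_STATUS_n and MAC_CONFIG_n values. The field positions come from the entries for MAC 0; the function and dictionary key names are hypothetical, and the label used for the unlisted rate code 11 is an assumption.

```python
def decode_mf_status(value):
    """Split an MF_STATUS_n value into its documented fields."""
    return {
        "rx_empty": bool((value >> 25) & 1),
        "rx_pkt_avail": bool((value >> 24) & 1),
        "rx_pkts_waiting": (value >> 16) & 0xFF,
        "tx_full": bool((value >> 9) & 1),
        "tx_no_room_for_max_pkt": bool((value >> 8) & 1),
        "tx_pkts_waiting": value & 0xFF,
    }


MAC_RATES = {0b00: "10 Mbit/s", 0b01: "100 Mbit/s", 0b10: "1000 Mbit/s"}


def decode_mac_config(value):
    """Split a MAC_CONFIG_n value into its documented fields.

    Bits 1:0 are only meaningful in MAC_CONFIG_0; the code 11 is not
    listed in the table, so it is reported here as "unspecified".
    """
    return {
        "half_duplex": bool((value >> 5) & 1),
        "tx_fcs_supplied": bool((value >> 4) & 1),
        "rx_fcs_provided": bool((value >> 3) & 1),
        "jumbo_frames": bool((value >> 2) & 1),
        "rate": MAC_RATES.get(value & 0x3, "unspecified"),
    }
```

Decoding the reset value 2 of MAC_CONFIG_0 gives full duplex, no FCS options, no jumbo frames, and a rate of 1000 Mbit/s, matching the defaults listed in the table.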



Tx FIFOs

Address bits are used to indicate metadata about each word when the CPCI transfers packets to the Tx FIFOs (for transmission by the CNET):

Bit 7       1 = EOP else 0
Bits 5:4    MAC Id (0-3)
Bits 3:2    Number of bytes in final word (0=1, 1=2, 2=3, 3=4) (only valid when bit 7 EOP is set)
Bits 1:0    Always 0 (no byte addressing)
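A sketch of how a write address for the Tx FIFOs might be assembled from these fields. Assumptions, not taken from the text: the metadata bits are simply ORed onto the start of the Tx FIFO window (50_0000 in the address map), all other address bits are zero, and bits 3:2 are left at zero for words that are not the last in the packet. The function name is hypothetical.

```python
TX_FIFO_BASE = 0x500000  # start of the CNET Tx FIFO window in the address map


def tx_fifo_addr(mac, eop=False, valid_bytes=4):
    """Build the address for one 32-bit word written to a MAC's Tx FIFO.

    mac: MAC Id, 0-3 (bits 5:4).
    eop: True for the last word of the packet (bit 7).
    valid_bytes: bytes used in the final word, 1-4 (encoded minus one in bits 3:2).
    """
    if not 0 <= mac <= 3:
        raise ValueError("mac must be 0-3")
    if not 1 <= valid_bytes <= 4:
        raise ValueError("valid_bytes must be 1-4")
    addr = TX_FIFO_BASE | (mac << 4)
    if eop:
        addr |= (1 << 7) | ((valid_bytes - 1) << 2)
    return addr
```

For example, the final word of a packet for MAC 2 with one valid byte would use `tx_fifo_addr(2, eop=True, valid_bytes=1)`, which evaluates to 0x5000A0 under these assumptions.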

PHY Interface

The CNET device drives the MDC/MDIO pins that control the quad PHY device. The PHY itself has many internal registers; this section explains how to access these registers via the CNET. Access to the PHY is achieved via two registers: a command register (CMD) and a status register (STATUS). All PHY registers are 16 bits wide.

The sequence of operations is:

To write to a PHY register:

Write the appropriate address and data to the CMD register.

To read from a PHY register:

Write the appropriate address to the CMD register.

Read the STATUS register until the m.s.bit (bit 31) is 1. Then the 16 bitdata is in bits 15:0.

Note: the PHY interface is slow: it can take 30 µsecs to actually perform a read. If you issue a read command and then read the STATUS register and see that the msbit is 0 then you will also see a value in bits 20:16: this is a field that will count down from 31 to 0. Once the counter reaches zero then the data should be valid.
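The read and write sequences can be sketched as below, using the CMD layout given in this section (bit 31 command, bits 25:24 PHY channel, bits 20:16 PHY register, bits 15:0 write data) and the STATUS bits described above. The function names are hypothetical, and because the CMD and STATUS register addresses are not given here, register access is passed in as caller-supplied functions. Leaving the data field at zero for a read is an assumption.

```python
def phy_cmd(channel, reg, write=False, data=0):
    """Assemble a CMD word: 31 = command (0=READ, 1=WRITE), 25:24 = channel,
    20:16 = PHY register, 15:0 = write data."""
    if not 0 <= channel <= 3:
        raise ValueError("channel must be 0-3")
    if not 0 <= reg <= 31:
        raise ValueError("reg must be 0-31")
    if not 0 <= data <= 0xFFFF:
        raise ValueError("data must fit in 16 bits")
    return (int(write) << 31) | (channel << 24) | (reg << 16) | data


def phy_read(write_cmd, read_status, channel, reg, max_polls=1000):
    """Issue a PHY read, then poll STATUS until bit 31 is set.

    write_cmd(value) and read_status() are supplied by the caller and
    perform the actual accesses to the CMD and STATUS registers.
    """
    write_cmd(phy_cmd(channel, reg))
    for _ in range(max_polls):
        status = read_status()
        if status & (1 << 31):
            return status & 0xFFFF
    raise TimeoutError("PHY read did not complete")
```

A write is a single `write_cmd(phy_cmd(channel, reg, write=True, data=value))`; only reads need the polling loop.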

The format of commands that you write to the CMD register is shown below.

Bits     Field
31       Command: 0=READ 1=WRITE
25:24    PHY Channel (0-3)
20:16    PHY Register (0-31)
15:0     Write Data

Please see the appropriate PHY document for details of the PHY registers.

Bibliography

UG-138: Xilinx, Tri-Mode Ethernet MAC User Guide

CPCI-ARCH: Glen Gibb, CPCI Architecture