Hardware Design
This material exempt per Department of Commerce license exception TSU
Objectives
After completing this module, you will be able to:
• List the functionality that defines an arbiter, a master, and a slave
• List various buses available for the PowerPC processor
• Discuss the JTAG interface in Virtex-II Pro devices
Outline
• Supported Buses
  – Processor Local Bus (PLB)
  – On-Chip Peripheral Bus (OPB)
  – Device Control Register (DCR)
  – On-Chip Memory (OCM)
• Processor Use Models
• PowerPC Processor Programmer’s Model
• Reset Logic in PowerPC
• JTAG Configurations in Virtex-II Pro
PowerPC Bus Example
[Figure: block diagram of an example PPC405 system. Labeled blocks: PPC405 (ISOCM, DSOCM, ISPLB, and DSPLB interfaces), BRAM, DDR, SDRAM, PLB ARB, DCR, PLB2OPB, OPB2PLB, OPB ARB, INTC, IIC, GPIO, UART, Ethernet, LCD.]
Bus widths shown in the figure:
• ISOCM bus: 64-bit data, 32-bit address
• DSOCM bus: 32-bit data, 32-bit address
• PLB bus: 64-bit data, 32-bit address
• OPB bus: 32-bit data, 32-bit address
• DCR bus: 32-bit data, 10-bit address
CoreConnect
• The IBM CoreConnect standard provides three buses for interconnecting cores, library macros, and custom logic:
  – Processor Local Bus (PLB)
  – On-Chip Peripheral Bus (OPB)
  – Device Control Register (DCR) bus
• IBM offers a no-fee, royalty-free CoreConnect architectural license
  – Licensees receive the PLB arbiter, OPB arbiter, and PLB/OPB bridge designs along with bus-model toolkits and bus-functional compilers for the PLB, OPB, and DCR buses
  – The license is required only if you create your own CoreConnect peripheral or use the Bus Functional Model (BFM)
PLB Bus
– Connection infrastructure for high-bandwidth master and slave devices
– Fully synchronous to one clock
– Centralized bus arbitration (PLB arbiter)
– 64-bit data bus
– Addresses high-performance, low-latency, and design-flexibility issues through:
  • Decoupled address, read-data, and write-data buses with split transaction capability
  • Concurrent read and write transfers yielding a maximum bus utilization of two data transfers per clock
  • Address pipelining that reduces bus latency by overlapping a new write request with an ongoing write transfer and up to three read requests with an ongoing read transfer
  • Ability to overlap the bus request and grant protocol with an ongoing transfer
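The "two data transfers per clock" figure translates directly into a theoretical peak bandwidth. The following is a minimal sketch of that arithmetic; the function name and the 100 MHz PLB clock are assumptions chosen for illustration, and sustained throughput in a real system will be lower.

```python
PLB_DATA_WIDTH_BITS = 64      # from the slide: 64-bit data bus
MAX_TRANSFERS_PER_CLOCK = 2   # one read and one write transfer, concurrently


def plb_peak_bandwidth_bytes_per_s(plb_clock_hz: int) -> int:
    """Theoretical peak: every clock carries one read and one write transfer."""
    bytes_per_transfer = PLB_DATA_WIDTH_BITS // 8
    return plb_clock_hz * MAX_TRANSFERS_PER_CLOCK * bytes_per_transfer


# Assumed 100 MHz PLB clock (illustrative value only).
peak = plb_peak_bandwidth_bytes_per_s(100_000_000)
print(f"{peak / 1e6:.0f} MB/s")  # prints "1600 MB/s"
```

A bus that could not overlap reads and writes would reach only half of this figure at the same clock rate.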
PLB Interconnect / Architecture
• One to 16 PLB masters each connect all of their signals to the PLB arbiter
• The PLB arbiter multiplexes signals from masters onto a shared bus to which all the inputs of the slaves are connected
• One to n PLB slaves OR together their outputs to drive a shared bus back to the PLB arbiter
• The PLB arbiter handles bus arbitration and the movement of data and control signals between masters and slaves
PLB Bridge
• The PLB-to-OPB bridge translates PLB transactions into OPB transactions
• This bridge functions as a slave on the PLB side and a master on the OPB side
• The bridge contains a DCR slave interface to provide access to its bus error status registers
• The bridge is necessary in systems where a PLB master device, such as a CPU, requires access to OPB peripherals
OPB Bus
• The OPB bus decouples lower bandwidth devices from the PLB
• It is a less complex protocol than PLB
  – No split transaction or address pipelining capability
• Centralized bus arbitration (OPB arbiter)
• Connection infrastructure for the master and slave peripheral devices
• The OPB bus is designed to alleviate system performance bottlenecks by reducing capacitive loading on the PLB
  – Fully synchronous to one clock
  – Shared 32-bit address bus, shared 32-bit data bus
  – Supports single-cycle data transfers between the master and the slaves
  – Supports multiple masters, determined by arbitration implementation
  – The bridge function can be the master on the PLB or OPB
OPB Bus
• Supports 16 masters and an unlimited number of slaves (limited by the expected performance)
• The OPB arbiter receives bus requests from the OPB masters and grants the bus to one of them
– Fixed and dynamic (LRU) priorities
• Bus logic is implemented with AND-OR logic. Inactive devices drive zeros
• Read and write data buses can be separated to reduce loading on the OPB_DBus signal
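The difference between fixed and dynamic (LRU) priority can be illustrated with a small behavioral model. This is a simplified sketch, not the OPB arbiter's actual logic: the class name is hypothetical, and it assumes that in fixed mode master 0 has the highest priority.

```python
from typing import List, Optional


class ArbiterModel:
    """Toy arbiter: grants one requesting master per arbitration cycle."""

    def __init__(self, num_masters: int, dynamic: bool) -> None:
        if not 1 <= num_masters <= 16:
            raise ValueError("the OPB supports 1 to 16 masters")
        self.dynamic = dynamic
        # Priority order, highest first. Fixed mode never reorders it.
        self.order: List[int] = list(range(num_masters))

    def grant(self, requests: List[bool]) -> Optional[int]:
        """Return the granted master, or None if nobody is requesting."""
        for master in self.order:
            if requests[master]:
                if self.dynamic:
                    # LRU: the winner drops to the lowest priority.
                    self.order.remove(master)
                    self.order.append(master)
                return master
        return None


always = [True, True, True]  # all three masters request continuously
fixed = ArbiterModel(3, dynamic=False)
lru = ArbiterModel(3, dynamic=True)
print([fixed.grant(always) for _ in range(4)])  # [0, 0, 0, 0]
print([lru.grant(always) for _ in range(4)])    # [0, 1, 2, 0]
```

With fixed priority, a busy high-priority master can starve the others; the LRU scheme rotates the grant among all requesters.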
DCR Bus
• Device-control register bus
  – IBM CoreConnect standard
  – Used to talk to control registers (1024 total)
  – 32-bit-wide data, all cycles word-oriented
  – Supports read and write only, no burst cycles
  – Simple acknowledgement termination
  – CPU supports special privileged instructions for access to the DCR
• Normal DCR requires special CPU assembly code to access
  – There is a “fixed” 1024-word I/O space
  – Must be in privileged mode to access registers
  – Requires macros or inline assembly
DCR Bus
[Figure: DCR bus connections between the PPC405 and DCR devices. PPC405 signals: C405DCRABUS, C405DCRDBUSOUT, C405DCRREAD, C405DCRWRITE, DCRC405ACK, DCRC405DBUSIN. DCR device signals: dcr_ABus, dcr_WrData, dcr_RdData, dcr_Read, dcr_Write, dcr_Ack, dcr_Clk.]
Memory Mapped DCR
• DCR bridges allow memory mapping of DCR space anywhere within the system memory
  – OPB DCR bridge
  – Allows DCR devices to exist within 4 KB of contiguous space
  – Must be accessed on word boundaries and one word at a time
  – Easier to use, but it requires a PLB and OPB transaction
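The 4 KB figure follows from 1024 DCR registers of one 32-bit word each. The sketch below shows the resulting address arithmetic, assuming the bridge maps consecutive DCR numbers to consecutive words and sits on a 4 KB boundary; the function name and the base address are hypothetical.

```python
DCR_REGISTER_COUNT = 1024  # 10-bit DCR address
WORD_BYTES = 4             # each DCR is one 32-bit word
DCR_WINDOW_BYTES = DCR_REGISTER_COUNT * WORD_BYTES  # 4096 bytes = 4 KB


def dcr_to_memory_address(bridge_base: int, dcr_number: int) -> int:
    """Byte address of a DCR register behind a memory-mapped bridge."""
    if not 0 <= dcr_number < DCR_REGISTER_COUNT:
        raise ValueError("DCR number must fit in 10 bits")
    if bridge_base % DCR_WINDOW_BYTES != 0:
        # Assumption for this sketch: the window is 4 KB aligned.
        raise ValueError("bridge base must be 4 KB aligned")
    return bridge_base + dcr_number * WORD_BYTES


BRIDGE_BASE = 0xD000_0000  # hypothetical base address
print(hex(dcr_to_memory_address(BRIDGE_BASE, 0x3FF)))  # 0xd0000ffc
```

Every address produced this way is word aligned, which matches the requirement that the space be accessed on word boundaries.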
OCM Bus
• 405 OCM I/Fs
  – PPC405 has a separate interface used for high-speed access of on-chip memory
  – PPC405 presents address on both the PLB bus and the OCM bus
    • Addresses cannot exist in both PLB and OCM space
  – OCM addresses are non-cacheable, leaving the cache resources for the PLB accesses
• The processor block contains the OCM controllers
  – The processor block contains dedicated controllers to interface between the OCM I/F and FPGA BRAM
  – There are separate independent controllers for the I-side and D-side to provide higher performance
• All signals are in big-endian format
OCM Bus
• Features
  – Independent 16-MB logical space for each of the DSOCM and ISOCM
    • 16 MB must be reserved regardless of actual memory used
  – 64-bit ISOCM and 32-bit DSOCM
  – Up to 128 KB / 64 KB (ISOCM / DSOCM) using programmable BRAM aspect ratios
    • Programmable processor versus BRAM clock ratio
  – DSBRAM load: BRAM initialization (Data2MEM), CPU, and FPGA using dual-port BRAM
  – ISBRAM load: BRAM initialization (Data2MEM) and DCR
    • CPU DCR-accessible registers
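Because each OCM side reserves a full 16 MB window and addresses cannot exist in both PLB and OCM space, an address map can be checked for conflicts before the system is built. The sketch below is one way to do that; the function names and all base addresses are hypothetical, and it assumes each OCM window is aligned to 16 MB.

```python
from typing import Tuple

OCM_WINDOW_BYTES = 16 * 1024 * 1024  # 16 MB reserved per OCM side

Range = Tuple[int, int]  # inclusive (first_address, last_address)


def ocm_window(base: int) -> Range:
    """Address range reserved by an OCM controller at the given base."""
    if base % OCM_WINDOW_BYTES != 0:
        # Assumption for this sketch: windows are 16 MB aligned.
        raise ValueError("OCM base must be 16 MB aligned")
    return (base, base + OCM_WINDOW_BYTES - 1)


def overlaps(a: Range, b: Range) -> bool:
    """True if two inclusive address ranges share at least one address."""
    return a[0] <= b[1] and b[0] <= a[1]


dsocm = ocm_window(0x4000_0000)          # hypothetical DSOCM base
plb_bram = (0xFFFF_0000, 0xFFFF_FFFF)    # hypothetical PLB memory
bad_plb = (0x40FF_0000, 0x4100_FFFF)     # hypothetical conflicting region

print(overlaps(dsocm, plb_bram))  # False: no conflict
print(overlaps(dsocm, bad_plb))   # True: falls inside the reserved 16 MB
```

The second case conflicts even if only a few kilobytes of BRAM are attached to the DSOCM, since the whole 16 MB window is reserved.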
OCM Bus
• Benefits
  – Avoids loads into cache, reducing pollution and thrashing
  – Has a fast, fixed execution latency
  – On the D-side, dual-port BRAM enables a bidirectional data connection with the processor
• Sample uses
  – I-side: Interrupt service routines, boot-code storage
  – D-side: Scratch-pad memory, bidirectional data transfer
Bus Timing
• Use timing constraints to determine which ratio to use

Clock        Transaction synchronous with   Clock ratio    Example
PLB CLK      Processor clock                1:1 to 16:1    Processor clock at 300 MHz, PLB at 100 MHz
OPB CLK      PLB clock                      1:1 to 4:1     PLB at 100 MHz, OPB at 50 MHz
DCR CLK      Processor clock                1:1 to 8:1     Processor clock at 300 MHz, DCR at 100 MHz
OCM CLK *    Processor clock                1:1 to 4:1     Processor clock at 300 MHz, OCM at 150 MHz

* There are two independent clocks, one for each OCM controller:
  – BRAMDSOCMCLK
  – BRAMISOCMCLK
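The ratio limits lend themselves to a quick sanity check of a proposed clocking scheme. The following sketch encodes the maximum ratio for each bus and validates the example frequencies; the function name is hypothetical, and it assumes that only integer ratios are allowed.

```python
# Maximum reference-to-bus clock ratio for each bus (N in "1:1 to N:1").
MAX_RATIO = {"PLB": 16, "OPB": 4, "DCR": 8, "OCM": 4}


def ratio_is_valid(bus: str, reference_hz: int, bus_hz: int) -> bool:
    """Check a bus clock against its reference clock.

    The reference is the processor clock for PLB, DCR, and OCM,
    and the PLB clock for OPB. Integer ratios are assumed.
    """
    if bus_hz <= 0 or reference_hz % bus_hz != 0:
        return False
    return 1 <= reference_hz // bus_hz <= MAX_RATIO[bus]


MHZ = 1_000_000
print(ratio_is_valid("PLB", 300 * MHZ, 100 * MHZ))  # True  (3:1)
print(ratio_is_valid("OPB", 100 * MHZ, 50 * MHZ))   # True  (2:1)
print(ratio_is_valid("DCR", 300 * MHZ, 100 * MHZ))  # True  (3:1)
print(ratio_is_valid("OCM", 300 * MHZ, 150 * MHZ))  # True  (2:1)
print(ratio_is_valid("OCM", 300 * MHZ, 50 * MHZ))   # False (6:1 exceeds 4:1)
```

Passing this check only confirms that the ratio is in range; timing constraints still decide which ratio the design can actually meet.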
Review Questions
• What is the advantage of using the memory-mapped DCR component?
• What is the disadvantage of using the memory-mapped DCR component?
• Which buses are included in the CoreConnect standard?
Answers
• What is the advantage of using the memory-mapped DCR component?
  – Does not require inline ASM instructions to access the bus
• What is the disadvantage of using the memory-mapped DCR component?
  – Requires a PLB and an OPB transaction
• Which buses are included in the CoreConnect standard?
  – PLB, OPB, and DCR
Processor Use Models
Range of Use Models
1. State Machine
   • Lowest Cost, No Peripherals, No RTOS & No Bus Structures
   • VGA & LCD Controllers
   • Low/High Performance
2. Microcontroller
   • Medium Cost, Some Peripherals, Possible RTOS & Bus Structures
   • Control & Instrumentation
   • Moderate Performance
3. Custom Embedded
   • Highest Integration, Extensive Peripherals, RTOS & Bus Structures
   • Networking & Wireless
   • High Performance
PowerPC Processor
Note: The OCM bus does not connect to the cache controller
PowerPC Processor
• A 32-bit implementation of the PowerPC embedded-environment architecture
• Support for embedded-systems applications
  – Flexible memory management
  – Multiply and accumulate instructions for computationally intensive applications
  – Enhanced debug capabilities
  – 64-bit time base
  – Programmable interval (PIT), fixed interval (FIT), and watchdog timers
• Performance-enhancing features
  – Static branch prediction
  – Five-stage pipeline
  – Hardware multiply/divide for faster integer arithmetic
  – Enhanced string and multiple-word handling
  – Minimized interrupt latency
PowerPC
• Memory and peripherals
  – PPC405 uses 32-bit addresses
• Special addresses
  – Every PowerPC system should have the boot section starting at 0xFFFFFFFC
  – The default program space occupies a contiguous address space from 0xFFFF0000 to 0xFFFFFFFF
  – If interrupt handlers are present, the vector table must start at a 64 KB boundary
[Figure: memory map marking addresses 0x0000_0000, 0xFFFF_0000, and 0xFFFF_FFFC, with regions labeled PLB/OPB Memory, Peripherals, PLB/OPB Memory, and Reset Address.]
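The special addresses can be expressed as simple checks on a 32-bit address. The sketch below is illustrative only; the constant and function names are hypothetical, and the candidate vector-table addresses are made-up examples.

```python
RESET_VECTOR = 0xFFFF_FFFC         # boot section starts here
PROGRAM_SPACE_START = 0xFFFF_0000  # default program space, first address
PROGRAM_SPACE_END = 0xFFFF_FFFF    # default program space, last address
ALIGN_64K = 0x1_0000               # 64 KB


def in_default_program_space(address: int) -> bool:
    """True if the address lies in the default program space."""
    return PROGRAM_SPACE_START <= address <= PROGRAM_SPACE_END


def is_valid_vector_table_base(address: int) -> bool:
    """True if the address sits on a 64 KB boundary."""
    return address % ALIGN_64K == 0


print(in_default_program_space(RESET_VECTOR))   # True
print(is_valid_vector_table_base(0xFFFF_0000))  # True
print(is_valid_vector_table_base(0xFFFF_8000))  # False: only 32 KB aligned
```

Note that only four bytes remain between the reset vector and the top of the address space, which leaves room for a single instruction there.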
Reset Sequence
• Sequencing of reset signals coming out of reset is managed by PROC_SYS_RESET:
  – First: bus structures come out of reset
    • PLB and OPB arbiters and bridges, for example
  – Second: peripherals come out of reset 16 clocks later
    • UART, SPI, and IIC, for example
  – Third: the CPUs come out of reset 16 clocks after the peripherals
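The staggered release can be modeled as a small schedule calculation. This is a behavioral sketch of the ordering only, not of the PROC_SYS_RESET core itself; the function names and the choice of clock 0 as the starting point are assumptions.

```python
PERIPHERAL_DELAY_CLOCKS = 16  # peripherals follow the buses by 16 clocks
CPU_DELAY_CLOCKS = 16         # CPUs follow the peripherals by 16 clocks


def reset_release_schedule(bus_release_clock: int) -> dict:
    """Return the clock number at which each group leaves reset."""
    peripheral_release = bus_release_clock + PERIPHERAL_DELAY_CLOCKS
    cpu_release = peripheral_release + CPU_DELAY_CLOCKS
    return {
        "bus": bus_release_clock,
        "peripherals": peripheral_release,
        "cpu": cpu_release,
    }


def groups_in_reset(schedule: dict, clock: int) -> list:
    """List the groups still held in reset at the given clock."""
    return [name for name, release in schedule.items() if clock < release]


schedule = reset_release_schedule(bus_release_clock=0)
print(schedule)                       # {'bus': 0, 'peripherals': 16, 'cpu': 32}
print(groups_in_reset(schedule, 20))  # ['cpu']
```

Releasing the CPU last means the buses and peripherals are already operational when the processor fetches its first instruction.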
JTAG TAP Options
• At design time, you control whether each PowerPC JTAG TAP (there may be several, one per PowerPC) is incorporated into the FPGA JTAG TAP chain after FPGA configuration or remains a separate chain
• This is accomplished by instantiating or not instantiating the dedicated JTAGPPC block
Virtex-II Pro Split JTAG Chains
• The isolated chain supports embedded development and debug tools
• User-defined JTAG pins
  – Provides a direct and isolated connection to the PPC405 JTAG I/F
  – JTAGPPC block is not used in this configuration
[Figure: split JTAG chains. Labels: PPC405, TDI, TDO, CPU JTAG debug port, FPGA JTAG config port, user-defined JTAG pins on the FPGA, fixed/dedicated JTAG pins on the FPGA.]
Virtex-II Pro Combined JTAG Chains
• The combined chain supports development and debug tools
  – ChipScope Pro (PC4)
  – iMPACT (PC4)
  – GDB (PC4)
  – SingleStep XE (visionPROBEII)
• Using the JTAGPPC block
  – Integrates the PPC405 with the FPGA fabric JTAG chain (dedicated JTAG pins)
[Figure: combined JTAG chain. Labels: PPC405, JTAGPPC, TDI, TDO, CPU JTAG debug port, FPGA JTAG config port, user-defined JTAG pins on the FPGA, fixed/dedicated JTAG pins on the FPGA.]
Review Questions
• What connections must be made to debug software on the IBM PowerPC processor?
• Where does the reset vector reside in the PowerPC processor?
Answers
• What connections must be made to debug software on the IBM PowerPC processor?
  – PowerPC JTAG ports connected to either the JTAGPPC component or the external pins
• Where does the reset vector reside in the PowerPC processor?
  – 0xFFFFFFFC
Where Can I Learn More?
• Tool documentation
  – Processor IP Reference Guide
• Processor documentation
  – PowerPC™ Processor Reference Guide
  – PowerPC 405 Processor Block Reference Guide
  – MicroBlaze™ Processor Reference Guide
• Support Website
  – EDK Website: www.xilinx.com/edk