NS9750 - Training Hardware. NS9750 System Overview.
NS9750 - A NET+ARM™ Processor
The highest performance Network Attached processor available in the market
• The most advanced ARM9 processor
• Rich set of components & peripherals
• Deterministic performance
• Low latency
[Block diagram: NS9750]
• Package: 352-pin BGA, 35mm x 35mm, 1.27mm pitch
• CPU: ARM926EJ-S at 200, 162, or 125 MHz
– 8kB I-Cache, 4kB D-Cache
– JTAG Test and Debug
• AMBA AHB Bus: 100, 81, or 62.5 MHz, 32b-D, 32b-A
– 10/100 Ethernet MAC, MII/RMII
– Multiple Bus Master/Distributed DMA
– Memory Controller
– Ext. Peripheral Controller
– PCI/CardBus Bridge, 33 MHz
– LCD Controller
– Power Manager, CLK Generation, Interrupt Controller, AHB Arbiter
– 16 General Purpose Timers/Counters
– Peripheral Bus Bridge
• Peripheral Bus: 50, 40.5, or 31 MHz, 32b-D, 32b-A
– 27-Channel DMA
– USB, 1284, I2C
– Serial Module x4: UART, SPI, HDLC
– GPIO (50 Pins)
Mercury System Overview
• Two Internal System Buses
– Main System Bus: 100 MHz AHB Bus
– Peripheral Bus: 50 MHz B-Bus
• Two External Buses
– 100 MHz Memory Bus: Supports SDRAM, FLASH, SRAM, Peripheral w/ SRAM interface
– 33 MHz PCI Bus: Supports 3 external PCI devices
Mercury System Overview
• System Block Diagram
[Block diagram labels: CPU, Printer Interface & JBIG, Memory Controller, PCI, LCD, Ethernet, System Controller, BBus Bridge, USB Host, USB Device, 1284, I2C, SPI/UART/HDLC, PWM, BBus DMA, BBus Utility (GPIO), connected by the AHB and the BBus]
Mercury System Overview
• Main System Bus Components
– CPU: 200 MHz ARM 926EJ-S, Master
– Ethernet Controller w/o PHY, Master/Slave
– Color LCD Controller, Master/Slave
– PCI Host/Device Bridge, Master/Slave
– B-Bus Bridge, Master/Slave
– Color Laser Printer Controller, Master/Slave
– Memory Controller, Slave Only
Mercury System Overview
• Main System Bus Utilities
– System Clock Generation
– System Control Timers and General Purpose Timers
– Prioritized Vectored Interrupt Controller
– Bus Bandwidth Configurable Bus Arbiter
– System Sleep/Wake-up Processor
Mercury System Overview
• Peripheral Bus Components
– USB Host/Device: 1.5 Mbytes/sec.
– I2C Controller: 400 Kbits/sec., Clock Stretching, Bus Arbitration
– IEEE1284 Device Interface: 1 Mbyte/sec.
– Serial Module: 4 Independent Ports, Software selectable UART (Standard Rates), HDLC (6 Mbits/sec.), SPI (6 Mbits/sec.)
– GPIO
Mercury System Overview
• Peripheral Bus Utilities
– DMA Controller
– Bus Monitor Timers
Mercury System Performance
• System Performance is Memory Centric
• Operating Frequencies
• Memory System Throughput
• Data Bandwidth Allocation
• Interrupt Latency
• Power Consumption
Mercury System Performance
• Operating Frequencies
– CPU: 200 MHz
– All AHB Bus Components: 100 MHz
– Printer Data Clock: 100 MHz
– All B-Bus Components: 50 MHz
– USB: 12 Mbits/sec.
– Async Serial Comm: Standard Rates up to 1.9 Mbits/sec
– Sync Serial Comm: up to 6 Mbits/sec
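The three clock sets listed for the NS9750 (CPU 200/162/125 MHz, AHB 100/81/62.5 MHz, peripheral bus 50/40.5/31 MHz) are consistent with fixed divide-by-2 and divide-by-4 ratios. The sketch below assumes those ratios hold; `derived_clocks` is an illustrative name, not part of any NS9750 software interface.

```python
def derived_clocks(cpu_mhz):
    """Return (AHB, BBus) clocks in MHz.

    Assumption: AHB = CPU / 2 and BBus = CPU / 4, inferred from the
    three clock sets given in the slides.
    """
    return cpu_mhz / 2, cpu_mhz / 4


for cpu in (200, 162, 125):
    ahb, bbus = derived_clocks(cpu)
    print(f"CPU {cpu} MHz -> AHB {ahb} MHz, BBus {bbus} MHz")
```

For the 125 MHz option the divide-by-4 result is 31.25 MHz, which the slides appear to round to 31 MHz.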
Mercury System Performance
• Memory System (SDRAM) Performance
– Worst-case Data Bandwidth: 200 Mbytes/sec. with burst-of-8 accesses
– Average Data Bandwidth: 230 Mbytes/sec. with burst-of-8 accesses
– 50% Bandwidth Prioritized to CPU
– 50% Bandwidth Prioritized to all other AHB Masters
– Bandwidth Allocation Controlled by AHB Bus Arbiter
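To put these figures in context, they can be compared with the raw transfer rate of a 32-bit bus clocked at 100 MHz. This is a rough sketch, assuming a 32-bit data path to memory (the AHB is listed as 32b-D) and an even 50/50 split of the worst-case figure; the variable names are illustrative.

```python
BUS_CLOCK_HZ = 100_000_000     # 100 MHz memory/AHB clock (from the slides)
BUS_WIDTH_BYTES = 4            # assumption: 32-bit data path
WORST_CASE_BPS = 200_000_000   # worst-case SDRAM bandwidth (from the slides)
AVERAGE_BPS = 230_000_000      # average SDRAM bandwidth (from the slides)

# Raw rate if one 32-bit word moved on every clock.
peak_bps = BUS_CLOCK_HZ * BUS_WIDTH_BYTES

# Fraction of the raw rate that the quoted figures represent.
worst_case_efficiency = WORST_CASE_BPS / peak_bps
average_efficiency = AVERAGE_BPS / peak_bps

# 50% prioritized to the CPU, 50% to all other AHB masters.
cpu_share_bps = WORST_CASE_BPS * 0.5
other_masters_share_bps = WORST_CASE_BPS * 0.5

print(f"peak {peak_bps / 1e6:.0f} Mbytes/sec, "
      f"worst case {worst_case_efficiency:.1%}, average {average_efficiency:.1%}")
print(f"CPU share {cpu_share_bps / 1e6:.0f} Mbytes/sec, "
      f"other masters {other_masters_share_bps / 1e6:.0f} Mbytes/sec")
```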
Mercury System Performance
• Guaranteed Data Bandwidth Allocation
– Worst-case Bandwidth Calculation Formula
• Bytes/sec = (100 Mclks/sec / 2) / (16 clks x # of non-CPU masters) x 32 bytes x # of slots occupied
– Total Of 16 non-CPU Master Slots
– A Master Can Occupy More Than One Slot
– A Master Can Occupy a Fraction of One Slot
– A Master May Occupy No Slot (disabled)
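The worst-case formula above can be evaluated directly. This is a minimal sketch under one reading of the slide: half of the 100 MHz AHB clocks go to non-CPU masters, each burst of 8 takes 16 clocks and moves 32 bytes, and "# of non-CPU masters" is taken to be the 16 slots in the rotation. The function and parameter names are illustrative.

```python
def guaranteed_bandwidth(slots_occupied,
                         non_cpu_masters=16,
                         ahb_clock_hz=100_000_000,
                         clocks_per_burst=16,
                         bytes_per_burst=32):
    """Worst-case guaranteed bandwidth in bytes/sec for one master.

    Assumption: non_cpu_masters is the number of slots in the rotation
    (16 on the slide). slots_occupied may be fractional, or 0 for a
    disabled master.
    """
    non_cpu_clocks = ahb_clock_hz / 2   # 50% of bandwidth goes to the CPU
    bursts_per_slot = non_cpu_clocks / (clocks_per_burst * non_cpu_masters)
    return bursts_per_slot * bytes_per_burst * slots_occupied


print(guaranteed_bandwidth(1))     # one slot
print(guaranteed_bandwidth(0.5))   # half a slot
print(guaranteed_bandwidth(16))    # all 16 slots
```

Under these assumptions one slot is worth 6.25 Mbytes/sec, and a master holding all 16 slots gets 100 Mbytes/sec, which matches half of the 200 Mbytes/sec worst-case SDRAM figure.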
Mercury System Performance
• Interrupt Latency
– Latency is measured from interrupt assertion to the execution of the 1st instruction of the ISR
– Latency Calculation Formula:
• Clks of current instruction + clks of reading interrupt vector + clks of jump to top level ISR + clks of parse thru interrupt sources + clks of jump to final ISR
– Total Of 32 Interrupt Vectors (entries) to minimize # of interrupt sources per vector
– Deterministic Interrupt Latency Achievable
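The latency formula is a sum of clock counts, so it converts to time once a CPU clock is chosen. The sketch below shows only the arithmetic; the cycle counts in the example call are made-up placeholders, not measured NS9750 values, and the function name is illustrative.

```python
def interrupt_latency(cpu_hz, current_instruction, read_vector,
                      jump_top_isr, parse_sources, jump_final_isr):
    """Return (total clocks, latency in ns) for the five formula terms."""
    total_clks = (current_instruction + read_vector + jump_top_isr
                  + parse_sources + jump_final_isr)
    return total_clks, total_clks * 1e9 / cpu_hz


# Hypothetical cycle counts, chosen only to demonstrate the calculation.
clks, ns = interrupt_latency(200_000_000,
                             current_instruction=4,
                             read_vector=3,
                             jump_top_isr=3,
                             parse_sources=10,
                             jump_final_isr=3)
print(f"{clks} clks = {ns:.0f} ns at 200 MHz")
```

With few interrupt sources per vector, the "parse thru interrupt sources" term stays small and bounded, which is what makes the total latency predictable.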
Mercury System Performance
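• Power Consumption

CPU Clock: 200 MHz
Mode                                Total     Core      I/O
Operation: Full                     1.7 W     1.05 W    0.65 W
Operation: No PCI                   1.55 W    1 W       0.55 W
Operation: No PCI, LCD              1.5 W     1 W       0.5 W
Sleep, wake up on all ports         350 mW    260 mW    90 mW
Sleep, wake up on BBUS ports        285 mW    210 mW    75 mW
Sleep, wake up on AHB Bus ports     240 mW    220 mW    20 mW
Sleep, no wake up ports             180 mW    170 mW    10 mW

CPU Clock: 162 MHz
Mode                                Total     Core      I/O
Operation: Full                     1.4 W     0.9 W     0.5 W
Operation: No PCI                   1.25 W    0.8 W     0.45 W
Operation: No PCI, LCD              1.2 W     0.8 W     0.4 W
Sleep, wake up on all ports         285 mW    210 mW    75 mW
Sleep, wake up on BBUS ports        235 mW    170 mW    65 mW
Sleep, wake up on AHB Bus ports     200 mW    180 mW    20 mW
Sleep, no wake up ports             145 mW    140 mW    5 mW

CPU Clock: 125 MHz
Mode                                Total     Core      I/O
Operation: Full                     1.05 W    0.65 W    0.4 W
Operation: No PCI                   1 W       0.65 W    0.35 W
Operation: No PCI, LCD              950 mW    640 mW    310 mW
Sleep, wake up on all ports         220 mW    160 mW    60 mW
Sleep, wake up on BBUS ports        180 mW    130 mW    50 mW
Sleep, wake up on AHB Bus ports     150 mW    140 mW    10 mW
Sleep, no wake up ports             110 mW    105 mW    5 mW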
ARM 926EJ-S CPU
• Reduced Instruction Set Computer (RISC)
• Five-Stage Pipeline
• Harvard Architecture
• 8K I-Cache, 4K D-Cache
• Memory Management Unit
• JAVA Accelerator
• DSP Extension
• Thumb Mode
ARM 926EJ-S CPU
• Reduced Instruction Set
– The 32-bit ARM instruction set is a superset of the 16-bit Thumb instruction set
– 32-bit ARM Instructions
• Move, Arithmetic, Logical, Branch, Load, Store, Cache Hint, Swap, Software Interrupt, Software Breakpoint
– 16-bit Thumb Instructions
• Move, Arithmetic, Logical, Shift/Rotate, Branch, Load, Store, Push/Pop, Software Interrupt, Software Breakpoint
– Thumb mode keeps the full 32-bit register advantage
ARM 926EJ-S CPU
• Five-Stage Pipeline
ARM 926EJ-S CPU
• Harvard Architecture
– Separate Data and Instruction Paths
– Separate Data and Instruction Caches
– Workloads with balanced CPU access to data and instructions benefit the most
– Example: DSP Processor
ARM 926EJ-S CPU
• 8K I-Cache, 4K D-Cache
– 4-Way Set Associative Cache
– D-Cache Write-Thru mode is recommended due to no bus snooping
– Programmable Pseudo-random or Round-Robin Replacement
– Write buffer to improve system performance
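The cache sizes and 4-way associativity determine how an address is split into tag, set index, and line offset. The sketch below assumes 32-byte (8-word) cache lines, which the slides do not state; the function names are illustrative.

```python
LINE_BYTES = 32   # assumption: 8-word (32-byte) cache lines
WAYS = 4          # 4-way set associative (from the slides)


def cache_sets(size_bytes):
    """Number of sets for a cache of the given total size."""
    return size_bytes // (WAYS * LINE_BYTES)


def split_address(addr, size_bytes):
    """Split an address into (tag, set index, byte offset within the line)."""
    sets = cache_sets(size_bytes)
    offset = addr % LINE_BYTES
    index = (addr // LINE_BYTES) % sets
    tag = addr // (LINE_BYTES * sets)
    return tag, index, offset


print(cache_sets(8 * 1024))   # I-Cache sets
print(cache_sets(4 * 1024))   # D-Cache sets
print(split_address(0x12345678, 8 * 1024))
```

Under this assumption the 8K I-Cache has 64 sets and the 4K D-Cache has 32 sets, so two addresses compete for the same set only when their index bits match.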
ARM 926EJ-S CPU
• Memory Management Unit
– Required by Symbian OS, WindowsCE, and Linux
– Two level page tables in main memory to control
• Address translation
• Permission checks
• Memory region attributes
– Use Translation Lookaside Buffer (TLB) to cache page tables
– TLB entries can be locked down
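A two-level table walk can be illustrated with a toy model. The sketch below assumes 4 KB small pages reached through a coarse second-level table, so the virtual address splits into a 12-bit first-level index, an 8-bit second-level index, and a 12-bit page offset. The tables are plain Python dicts, not real descriptor formats, and permission checks and region attributes are left out.

```python
def split_virtual_address(va):
    """Split a 32-bit virtual address for a 4 KB small-page mapping."""
    l1_index = (va >> 20) & 0xFFF    # selects a first-level entry (1 MB region)
    l2_index = (va >> 12) & 0xFF     # selects a 4 KB page within that region
    page_offset = va & 0xFFF         # byte offset within the page
    return l1_index, l2_index, page_offset


def translate(va, l1_table):
    """Walk the toy two-level table and return the physical address."""
    l1_index, l2_index, page_offset = split_virtual_address(va)
    l2_table = l1_table[l1_index]    # first-level lookup
    page_base = l2_table[l2_index]   # second-level lookup
    return page_base | page_offset


# Hypothetical mapping: virtual page 0xC0012000 -> physical page 0x00045000.
toy_l1_table = {0xC00: {0x12: 0x00045000}}
print(hex(translate(0xC0012345, toy_l1_table)))
```

A TLB caches the result of this walk so that most accesses skip both table lookups.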
ARM 926EJ-S CPU
• JAVA Accelerator
– Efficient JAVA Byte Code Execution
– Similar JAVA performance to JIT w/o associated code overhead
ARM 926EJ-S CPU
• DSP Extension
– Combines system control and signal processing (DSP) into one processor
– Intel has adopted ARM’s DSP extension in their XScale Microarchitecture
– New powerful Multiply instructions
– New Saturation extension for stable control loops and bit-exact algorithms
– Cache Preload instruction
– New instructions to load and store pairs of registers
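Saturation matters in control loops because ordinary 32-bit addition wraps around on overflow, flipping the sign of the result. The sketch below models both behaviours in Python for signed 32-bit values, in the spirit of a saturating add instruction such as QADD; it is an illustration of the arithmetic, not of how the hardware implements it.

```python
INT32_MAX = 2**31 - 1
INT32_MIN = -2**31


def saturating_add(a, b):
    """Add two signed 32-bit values, clamping the result to the int32 range."""
    return max(INT32_MIN, min(INT32_MAX, a + b))


def wrapping_add(a, b):
    """Add two signed 32-bit values with two's-complement wraparound."""
    s = (a + b) & 0xFFFFFFFF
    return s - 2**32 if s >= 2**31 else s


print(saturating_add(INT32_MAX, 1))   # stays at the positive limit
print(wrapping_add(INT32_MAX, 1))     # wraps to the negative limit
```

With wraparound, a controller output that overflows jumps from full positive to full negative; with saturation it simply holds at the limit.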
ARM 926EJ-S CPU
• Thumb Mode
– 16-bit instruction set improves Code Density
• Thumb code is typically 65% of the size of ARM code
– Full 32-bit register advantage
– Interchangeable with ARM mode dynamically
– All exception handling is in ARM mode
– Power up in ARM mode
System Boot
• Low cost boot from serial EEPROM through SPI port
• High speed boot from 8-bit, 16-bit, or 32-bit ROM or Flash
[Diagram labels: NS9750 with Memory CTL, AHB, BBUS to AHB Bridge, and SPI; External System Memory and Flash or ROM on the Memory Bus; Serial EEPROM on SPI]