MIPS R4000 Technical Overview - Hot Chips
MIPS R4000 Technical Overview
64 Bits/100 MHz or Bust
Earl Killian
August 2, 1991
Overview
• Integrated I and D primary caches (8K->32K).
• Improved pipeline (<1/2 # of gates/cycle).
• Flexible system and secondary cache interface.
• Integrated floating point.
• Multi-processor support.
• 64 bit Integer Datapath and 64 bit TLB.
R4000 Block Diagram
[Figure: R4000 block diagram. Legible labels: SCache (128 data, Tag), System/SCache Control, 64 addr/data, ICache, PCache, Decode, Pipeline Control, PC Incrementer, Register File, FP Register File, FP Pipeline Bypass, FP Status Register, FP Multiply, FP Divide, FP Add/Sqrt/Cvt.]
R4000 Technology
• 1.0 micron CMOS technology.
• 2 Layer Metal technology.
• 1.3 Million transistors.
• 100 MHz internal clock.
MIPS R3000 Pipeline
Instruction Fetch
Register Access
ALU
Memory (cache)
Write (register)
MIPS R3000 Pipeline
[Figure: R3000 pipeline timing, marking 1 R3000 System Clock (~30 ns, 1990).]
MIPS R4000 Pipeline
Instruction Fetch
Register Fetch
Execution
Data Access
Tag Check
Write Back (register)
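To make the overlap between instructions concrete, the sketch below prints a staggered occupancy chart. It uses the eight stage mnemonics that appear on the later pipeline-detail slides (IF IS RF EX DF DS TC WB) and assumes ideal issue: one instruction enters per cycle with no stalls. The function names and the chart layout are illustrative and are not taken from the slides.

```python
# Stage mnemonics as they appear on the later pipeline-detail slides.
R4000_STAGES = ["IF", "IS", "RF", "EX", "DF", "DS", "TC", "WB"]
R3000_STAGES = ["I", "R", "A", "M", "W"]


def pipeline_chart(stages, n_instructions):
    """Return one text row per instruction, each shifted right by one cycle.

    Assumption: ideal issue, one instruction per cycle, no stalls.
    """
    width = max(len(s) for s in stages) + 1  # column width for one cycle
    rows = []
    for i in range(n_instructions):
        cells = [" " * width] * i + [s.ljust(width) for s in stages]
        rows.append("".join(cells).rstrip())
    return rows


def cycles_to_complete(stages, n_instructions):
    """Cycles until the last of n back-to-back instructions leaves the pipe."""
    return len(stages) + n_instructions - 1


if __name__ == "__main__":
    for row in pipeline_chart(R4000_STAGES, 4):
        print(row)
    print(cycles_to_complete(R4000_STAGES, 4), "cycles for 4 instructions")
```

Under these assumptions the deeper pipeline still completes one instruction per cycle once it is full; the extra stages only lengthen the fill time.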
MIPS R4000 Pipeline
[Figure: R4000 pipeline timing, marking 1 R4000 System Clock (~20 ns, 1991).]
R3000 vs. R4000 Pipeline Details
[Figure: timing of the operations within each pipeline stage. R3000 operation labels: ICache, IDec, OP, DCache, WB, IT, RF, DA, DT, TC, IA. R4000 stages: IF, IS, RF, EX, DF, DS, TC, WB, with operation labels ICache, IDec, OP, DCache, IT, RF, IA, DT, TC, WB, ITC.]
R3000 vs. R4000 Cache Access
[Figure: cache access timing in the two pipelines. Legible labels: R, A, M, W, ICache, TC, IDec, RF, IA, DA, OP, DT, DCache, WB.]
R3000 vs. R4000: Address Translation
[Figure: address translation timing in the two pipelines. Legible labels: A, M, W, OP, DA, DT; IF, IS, RF, EX, WB, ICache, IDec, OP, IT, RF, IA, TC, WB, ITC.]
R3000 vs. R4000: Other operations
[Figure: timing of other operations in the two pipelines. Legible labels: R, A, M, W; IF, IS, ICache, IT, TC, OP, DA, DT, WB.]
R3000 vs. R4000: ALU operations
[Figure: ALU operation timing in the two pipelines. Legible labels: IF, ICache, IDec, OP, IA, DCache, DT.]
Hypothetical MIPS Superscalar Pipeline
MIPS R3000 Branch Delay
• One Branch Delay Cycle
• MIPS Architectural Branch Delay
BD = 1
R4000 vs. Superscalar: Branch Delay
[Figure: overlapped pipeline diagrams for the R4000 (stages IF IS RF EX DF DS TC WB) and for the superscalar design (stages I R A M W).]
R4000: BD = 3
Superscalar: BD = 2 or 3
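The BD figures can be turned into a rough cost estimate. The sketch below is a simplified model, not something presented in the talk: it assumes the single MIPS architectural delay slot holds useful work some fraction of the time, and that on a taken branch every delay cycle beyond that slot is lost. The `taken` and `slot_filled` fractions are hypothetical workload inputs, not measurements.

```python
ARCH_DELAY_SLOTS = 1  # the MIPS architectural branch delay (BD = 1 on the R3000 slide)


def branch_cycles_lost(bd, taken, slot_filled):
    """Average cycles lost per branch under an assumed, simplified model.

    bd          -- branch delay in cycles (the BD value on the slides)
    taken       -- fraction of branches that are taken (hypothetical input)
    slot_filled -- fraction of delay slots holding useful work (hypothetical)
    """
    # A delay slot that holds no useful work costs one cycle, taken or not.
    unfilled_slot = ARCH_DELAY_SLOTS * (1.0 - slot_filled)
    # Delay cycles beyond the architectural slot are lost on taken branches.
    extra_bubbles = taken * max(0, bd - ARCH_DELAY_SLOTS)
    return unfilled_slot + extra_bubbles


if __name__ == "__main__":
    # BD values from the slides: R3000 = 1, R4000 = 3, superscalar = 2 or 3.
    for name, bd in [("R3000", 1), ("R4000", 3),
                     ("Superscalar (best)", 2), ("Superscalar (worst)", 3)]:
        print(name, branch_cycles_lost(bd, taken=0.5, slot_filled=0.5))
```

Because the superscalar delay is quoted as a range, the model is simply evaluated at both ends of it.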
MIPS R3000 Load Delay
LD = 1
• One Load Delay Cycle
• MIPS Architectural Load Delay
R4000 vs. Superscalar: Load Delay
[Figure: overlapped pipeline diagrams for the R4000 and for the superscalar design (stages I R A M W).]
R4000: LD = 2
Superscalar: LD = 2 or 3
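How much a given LD value costs depends on how far the first use of a loaded value sits from the load. The sketch below counts the cycles lost in a straight-line instruction sequence for a chosen load delay. It is an illustrative model under stated assumptions: single issue, ALU results usable immediately (AD = 0), and a lost cycle counted the same whether it comes from a hardware interlock or from a compiler-inserted no-op. The tuple format and register names are hypothetical.

```python
def total_load_stalls(program, ld):
    """Count cycles lost waiting on loads in a straight-line program.

    program -- list of (dest, sources, is_load) tuples (hypothetical format)
    ld      -- load delay in cycles (the LD value on the slides)
    """
    stalls = 0
    ready_at = {}  # register -> earliest issue cycle at which it can be read
    cycle = 0
    for dest, sources, is_load in program:
        earliest = max([ready_at.get(r, 0) for r in sources], default=0)
        if earliest > cycle:
            stalls += earliest - cycle  # wait until the loaded value arrives
            cycle = earliest
        if is_load:
            ready_at[dest] = cycle + 1 + ld  # usable ld cycles after the next slot
        else:
            ready_at.pop(dest, None)  # ALU result: no delay assumed (AD = 0)
        cycle += 1
    return stalls


if __name__ == "__main__":
    back_to_back = [("r1", [], True), ("r2", ["r1"], False)]
    one_apart = [("r1", [], True), ("r3", ["r4"], False), ("r2", ["r1"], False)]
    for ld in (1, 2, 3):
        print(ld, total_load_stalls(back_to_back, ld),
              total_load_stalls(one_apart, ld))
```

Each independent instruction scheduled between the load and its first use hides one delay cycle, which is why a longer load delay puts more pressure on instruction scheduling.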
R3000, R4000 vs. Superscalar: Dependent ALU ops
[Figure: overlapped pipeline diagrams for two dependent ALU operations in each design.]
R3000: AD = 0
R4000: AD = 0
Superscalar: AD = 0 or 1
R4000 vs. Superscalar: Other Issues
• R4000 can issue twice as many load/store instructions.
• Fewer functional units required.
• Simpler pipeline controller.
• Fewer requirements on compiler.
R4000 Configurations
[Figure: three system configurations.]
• SMALL: R4000, 167 pin.
• MEDIUM: R4000, 447 pin, with SCache (128).
• LARGE: two R4000s, 447 pin, with SCache (128), DRAM, and I/O.
Flexible System Interface
• 64 bit wide System Interface for Addr/Data.
• Configurable clock divisors.
• Configurable transmit/receive data patterns.
• Overlapped operation for write-back secondary cache systems.
Configurable Clock Rates for 50 MHz interface
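[Figure: SysAD bus cycles (an address cycle A followed by data cycles D) drawn against the R4000 clock for each divisor.]
• ÷2 → 100 MHz R4000 clock.
• ÷3 → 150 MHz R4000 clock.
• ÷4 → 200 MHz R4000 clock.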
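The relationship on this slide is a simple ratio: the system interface runs at the internal clock divided by the configured divisor. A minimal sketch, taking the 50 MHz interface rate from the slide title; the function and constant names are illustrative only.

```python
INTERFACE_MHZ = 50  # system interface rate from the slide title


def internal_clock_mhz(divisor, interface_mhz=INTERFACE_MHZ):
    """Internal clock rate that a given divisor maps onto the interface rate."""
    return interface_mhz * divisor


def cycle_time_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


if __name__ == "__main__":
    for divisor in (2, 3, 4):
        mhz = internal_clock_mhz(divisor)
        print(f"divide by {divisor}: {mhz} MHz internal, "
              f"{cycle_time_ns(mhz):.2f} ns per internal cycle")
```

At 50 MHz the interface period works out to 20 ns, which is consistent with the system clock period of about 20 ns quoted on the R4000 pipeline slide.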
Configurable Transmit/Receive Patterns
[Figure: two example SysAD patterns, each an address cycle (A) followed by data cycles (D).]
Overlapped Operation vs. Non-Overlapped Operation
[Figure: bus timelines for overlapped and non-overlapped operation, each marking its Read Latency. Legend: Read Address, Write Address, Read Data (block), Write Data (block).]
Conclusion
• Third Generation RISC: Integrated Caches, Integrated Floating Point, 64-Bit Datapath and TLB.
• Pipeline chosen for performance and economy.
• Flexible system interface for wide range of applications.