Interactions between Processor Design and Memory System Design
Transcript of Interactions between Processor Design and Memory System Design
David E. Culler
CS61CL
Nov 25, 2009
Lecture 12
UCB CS61CL F09 Lec 12
A Processor Centric View
[Figure: block diagram with Memory, Datapath, and Control]
Fundamental Mem. Design concepts
• Caches
• Virtual memory
• Without these, processing as we know it would not be possible
A more balanced view
• “Princeton Architecture” – common instruction and data memory
[Figure: Processor connected to a single Memory]
A more balanced view
• “Harvard Architecture” – separate instruction and data memory
[Figure: Processor connected to an Instruction Memory and a Data Memory]
Or really
• Memory systems are extremely sophisticated
• Parallelism, caching, controllers, protocols, …
[Figure: Processor and Memory]
Pipeline design: I-miss handling
• Insert a no-op “bubble” till i-fetch completes
[Figure: pipelined datapath showing PC, imem, IR, A, B, IR_ex, IR_mem, IR_wb, and Dmem]
Pipeline Design: D-miss
• Stall entire pipeline behind mem stage for data miss penalty
• Bubble the remainder (WB)
[Figure: pipelined datapath showing PC, imem, IR, A, B, IR_ex, IR_mem, IR_wb, and Dmem]
Performance “Iron Triangle”
• Execution Time = Seconds / Program
= (Seconds / Cycle) x (Cycles / Instruction) x (Instructions / Program)
= CycleTime x CPI x Inst.Count
• What primarily determines…
– Cycle Time?
– Instruction Count?
– CPI?
[Figure: triangle with corners CPI, CycleTime, and Inst.Count]
Bringing Cache into the Picture
• Recall MAT = Time_hit + P_miss x Penalty_miss
• Time_hit < Cycle Time
• Penalty_miss = Pipeline Stalls/Bubbles during miss
• Ideal CPI is CPI with perfect memory system
• CPI = Ideal_CPI + P_miss x Penalty_miss
– counted separately for instruction fetches and data accesses, each weighted by its accesses per instruction
Example
• Instruction Mix:
– 50% arith, 30% load/store, 20% jumps/branches
• Pipeline hazards
– Ideal CPI = 1.2
• Cache behavior
– 0.2% instruction miss rate (99.8% hit rate)
– 3% data miss rate (97% hit rate)
– 100 cycle miss penalty
• Without Cache: CPI = 1.2 + 1 x 100 + 0.30 x 100 = 131.2
– processor pipeline is less than 1% utilized!
• Cache: CPI = 1.2 + 1 x 0.002 x 100 + 0.30 x 0.03 x 100
= 1.2 + 0.2 + 0.9 = 2.3
– on average ~half the time is spent waiting for mem.
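The arithmetic in this example can be reproduced with a short script. This is a minimal sketch in Python; the function name `cpi_with_memory` and its parameter names are illustrative choices, not part of the lecture, and the no-cache case is modelled by assuming every access pays the full 100-cycle penalty.

```python
def cpi_with_memory(ideal_cpi, data_refs_per_inst, i_miss_rate, d_miss_rate, miss_penalty):
    """CPI = ideal CPI + memory stall cycles per instruction."""
    # Every instruction is fetched once, so fetch misses are weighted by 1.
    fetch_stalls = 1 * i_miss_rate * miss_penalty
    # Only loads/stores touch data memory (30% of instructions in the example).
    data_stalls = data_refs_per_inst * d_miss_rate * miss_penalty
    return ideal_cpi + fetch_stalls + data_stalls

# Without a cache, every access behaves like a miss (miss rate 1.0).
no_cache = cpi_with_memory(1.2, 0.30, 1.0, 1.0, 100)
# With the cache behavior given in the example.
with_cache = cpi_with_memory(1.2, 0.30, 0.002, 0.03, 100)
# Fraction of cycles spent waiting on memory when the cache is present.
stall_fraction = (with_cache - 1.2) / with_cache

print(round(no_cache, 1), round(with_cache, 1), round(stall_fraction, 2))
```

The stall fraction comes out a little under one half, which matches the "~half the time" remark.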
Administration
• Midterm II results
– Max: 99 Mean: 75.2 (without bonus)
– Max: 105.5 Mean: 77
• HW 8 due 12/7 midnight
• Project 4 due 12/9 midnight
• Review Week
– review in Tu/W lab + optional threads lab
– review in lecture
• Final Exam: Dec 15 12:30-3:30
Virtual Memory
• Each Program runs in its own Virtual Address Space (VAS)
• Distinct from the Physical Address Space (PAS) of the machine
• Hardware transparently maps the Virtual Address Spaces onto physical resources
• Only a small fraction of each VAS is in physical memory at any time!
Timesharing, MultiProcessing, Multitasking
Multiple Process Address Spaces in Mem
[Figure: several process address spaces laid out in Physical Memory; address labels shown: 00000000, 00FD0000, FFFFFFFF]
With Virtual Memory
[Figure: process address spaces mapped onto Physical Memory; address labels shown: 00000 and FFFFF next to Physical Memory, and 00000000, 00FD0000, FFFFFFFF]
A Processor Supporting Virtual Memory
• Is able to access a Page Table to translate Virtual Page Number => Physical Frame
• on EVERY memory reference
• Page Table lives in memory
• How many memory accesses per instruction?
– Instruction Fetch VA Translation
» PF = Mem[ PTbase + PC_page]
– Fetch the Actual Instructions
» IR = Mem[ PF + PC_offset]
– Load/Store VA Translation
» PF = Mem[ PTbase + (R[rs]+Sx)_page ]
– Load/Store the actual location
» R[rt] = Mem[ PF + (R[rs]+Sx)_offset ]
• How many cache accesses?
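The four steps listed above can be counted directly with a toy model. This is a sketch only, in Python: the `Memory` class, the 16-bit page offset, the page table base `PT_BASE`, and the choice to store frame numbers (rather than frame base addresses) in the table are all assumptions made for illustration, with page and frame values borrowed from the page-fault walkthrough later in the lecture.

```python
PAGE_BITS = 16  # assumed page-offset width (4 hex digits)

class Memory:
    """Word-addressed memory that counts every read."""
    def __init__(self):
        self.words = {}
        self.reads = 0

    def read(self, addr):
        self.reads += 1
        return self.words[addr]

def translate(mem, pt_base, va):
    """PF = Mem[PTbase + page]; physical address = PF combined with offset."""
    page, offset = va >> PAGE_BITS, va & ((1 << PAGE_BITS) - 1)
    frame = mem.read(pt_base + page)          # one memory access for translation
    return (frame << PAGE_BITS) | offset

def fetch_and_load(mem, pt_base, pc, data_va):
    """Instruction fetch plus one load, with no TLB and no cache."""
    ir = mem.read(translate(mem, pt_base, pc))          # translate, then fetch
    value = mem.read(translate(mem, pt_base, data_va))  # translate, then load
    return ir, value

mem = Memory()
PT_BASE = 0x200000                      # hypothetical page table location
mem.words[PT_BASE + 0x0040] = 0x07      # page 0040 -> frame 07
mem.words[PT_BASE + 0x0053] = 0x14      # page 0053 -> frame 14
mem.words[0x070010] = "lw $3 20($4)"    # the instruction at VA 0040 0010
mem.words[0x141020] = 432               # the data word at VA 0053 1020

ir, value = fetch_and_load(mem, PT_BASE, 0x00400010, 0x00531020)
print(ir, value, mem.reads)
```

In this model a single load instruction costs four memory reads, two of which are page table lookups.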
TLB ????
• Translation Lookaside Buffer is a specialized cache for the page table
• It was invented (by Sir Maurice Wilkes) to make virtual memory possible
• He then realized it could be used to make all memory accesses faster.
• Should TLBs and caches be different?
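The idea of a TLB as a cache for page table entries can be sketched in a few lines of Python. This is a simplified model, not a description of real hardware: the `TLB` class is hypothetical, it has unlimited capacity and no replacement policy, and a plain dictionary stands in for the page table that would live in memory.

```python
class TLB:
    """Caches page -> frame translations in front of a page table."""
    def __init__(self, page_table):
        self.page_table = page_table   # dict standing in for the in-memory table
        self.entries = {}              # cached translations
        self.hits = 0
        self.misses = 0

    def lookup(self, page):
        if page in self.entries:
            self.hits += 1             # no page table access needed
        else:
            self.misses += 1           # would cost a memory read of the page table
            self.entries[page] = self.page_table[page]
        return self.entries[page]

tlb = TLB({0x0040: 0x07, 0x0053: 0x14})
frames = [tlb.lookup(p) for p in (0x0040, 0x0040, 0x0053, 0x0040, 0x0053)]
print(frames, tlb.hits, tlb.misses)
```

Only the first reference to each page goes to the page table; repeated references to the same page are served from the TLB.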
What must happen in the processor on a Page Fault?
• It could happen in instruction fetch, LW or SW
• The translation fails
• The actual page is out on disk
– 10 ms @ 3 GHz => 30 Million cycles to access it!
• We need to run a special program (The Operating System) to go and get it
– allocate a frame in memory
– read the page from disk
» seek
» transfer, …
– update the page table
• But we are in the middle of an instruction…
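The disk latency quoted above can be converted into cycles, and into an average cost per instruction, with a short calculation. A small sketch in Python follows; the function names and the fault rate of one per million instructions are assumptions chosen only to illustrate the scale.

```python
def latency_in_cycles(latency_ms, clock_hz):
    """Convert a latency in milliseconds to processor cycles (integer arithmetic)."""
    return latency_ms * clock_hz // 1000

def added_cpi(faults_per_million_instructions, cycles_per_fault):
    """Average extra cycles per instruction caused by page faults."""
    return faults_per_million_instructions * cycles_per_fault / 1_000_000

cycles = latency_in_cycles(10, 3_000_000_000)   # 10 ms at 3 GHz
extra = added_cpi(1, cycles)                    # hypothetical: 1 fault per million instructions
print(cycles, extra)
```

Even at that hypothetical rate, faults would add about 30 cycles to every instruction on average, so faults must be rare and the processor is better off running other instructions than idling the pipeline.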
Page Fault
• Cannot just stall the pipeline
• Must “trap” the current instruction
• Put it aside and start executing other (OS) instructions
[Figure: pipelined datapath showing PC, imem, IR, A, B, IR_ex, IR_mem, IR_wb, and Dmem]
More Key Concepts
• Exception: unprogrammed transfer of control
• Interrupt
– asynchronous
– occurs between instructions
– used for efficient I/O
• Fault
– synchronous
– occurs within an instruction
• Preserve state associated with trap in special registers
– EPC + BadVAddr + Cause in MIPS
• Modify PC register to be exception handler
– PC := trapHandlerAddr
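The state saving described above can be modelled in a few lines. This is a sketch in Python under stated assumptions: the `CPU` class, the function names, and the value of `TRAP_HANDLER_ADDR` are hypothetical, and a real MIPS processor encodes the cause as a numeric code rather than a string.

```python
from dataclasses import dataclass

TRAP_HANDLER_ADDR = 0x80000180  # assumed handler address, for illustration only

@dataclass
class CPU:
    pc: int = 0
    epc: int = 0         # PC of the offending instruction
    bad_vaddr: int = 0   # the offending address
    cause: str = ""      # cause-related info

def take_fault(cpu, bad_vaddr, cause):
    """Preserve trap state in special registers, then redirect the PC."""
    cpu.epc = cpu.pc
    cpu.bad_vaddr = bad_vaddr
    cpu.cause = cause
    cpu.pc = TRAP_HANDLER_ADDR   # PC := trapHandlerAddr

def return_from_exception(cpu):
    """Resume at the saved PC so the faulting instruction is retried."""
    cpu.pc = cpu.epc

cpu = CPU(pc=0x00400010)
take_fault(cpu, 0x00531020, "page fault")
pc_in_handler = cpu.pc
return_from_exception(cpu)
print(hex(pc_in_handler), hex(cpu.epc), hex(cpu.bad_vaddr), hex(cpu.pc))
```

Because the fault is synchronous, returning to the saved PC re-executes the same instruction once the cause has been repaired.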
What information must be recorded on a page fault?
• The PC of offending instruction
• The offending address
• other cause-related info
Page Fault in Action
[Figure sequence: Processor (PC, ePC, badVA, IR, Regs, TLB, PTB), Physical Memory (Page Table, page frames), Disk, and Program Virtual Address Space. Each step below is one frame of the sequence.]
• Inst Fetch: VA 0040xxxx => PA 07xxxx
– PC = 0040 0010; TLB holds 0040 => 07
• Inst Fetch: mem[07 0010] => IR
• Exec: lw $3 20($4), EA = 0053 1000 + 20
• Exec: VA 00531020 => ??? TLB miss
• Exec: PT lookup(0053) => ??? Fault
– the page table entry for page 0053 is marked N (not valid)
• Exec: Trap to OS Page Fault Handler
• Fetch and execute OS instructions
– ePC = 0040 0010 and badVA = 0053 1020 have been saved
– j flt_hndlr, fetched from an OS page
• Load page from Disk to Memory
• Update Page Table
– entry for page 0053 becomes v: 14 (page 0053 now at 14 0000)
• ReturnFromException (RFE)
• Exec: lw $3 20($4) again: TLB Miss, PT lookup
• Exec: Read physical address
– final frame: the figure shows the value 432 (presumably the loaded word)
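The sequence above can be sketched as a small simulation. This is a toy model in Python, not the lecture's implementation: the `Machine` class, the 16-bit page offset, and the single free frame are assumptions, while the page numbers, frame 14, and the value 432 are taken from the walkthrough.

```python
PAGE_BITS = 16  # assumed page-offset width (4 hex digits)

class Machine:
    def __init__(self):
        self.tlb = {0x0040: 0x07}             # cached translations
        self.page_table = {0x0040: 0x07}      # valid entries only
        self.disk = {0x0053: {0x1020: 432}}   # page contents, keyed by offset
        self.frames = {}                      # frame number -> page contents
        self.free_frames = [0x14]
        self.log = []

    def handle_fault(self, page):
        """OS work: allocate a frame, read the page from disk, update the page table."""
        frame = self.free_frames.pop()
        self.frames[frame] = self.disk[page]
        self.page_table[page] = frame

    def load(self, va):
        page, offset = va >> PAGE_BITS, va & ((1 << PAGE_BITS) - 1)
        while True:                            # each pass models one attempt at the lw
            if page in self.tlb:
                return self.frames[self.tlb[page]][offset]
            self.log.append("TLB miss")
            if page in self.page_table:
                self.tlb[page] = self.page_table[page]   # refill the TLB, then retry
                continue
            self.log.append("page fault")
            self.handle_fault(page)            # trap to the OS, then return and retry

m = Machine()
value = m.load(0x00531020)
print(value, m.log)
```

The log shows the same order as the frames: a TLB miss, a page fault, and then a second TLB miss that the updated page table can satisfy.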
Paging the Page Table?
• 2^64 byte virtual address space
• 2^14 byte pages (16 kB)
• => 2^50 page table entries
• Large address spaces are used sparsely
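The entry count follows from dividing a 2^64 byte address space into 2^14 byte pages. A quick check in Python is below; the 8-byte page table entry size is an assumption added only to show how large a flat table would be.

```python
VA_BITS = 64       # virtual address width
PAGE_BITS = 14     # 2**14 byte (16 kB) pages
PTE_BYTES = 8      # assumed size of one page table entry

entries = 2 ** (VA_BITS - PAGE_BITS)   # one entry per virtual page
table_bytes = entries * PTE_BYTES      # size of a flat, fully populated table

print(entries == 2 ** 50, table_bytes == 2 ** 53)
```

Under that assumption a flat table would need 2^53 bytes (8 PiB) per address space, which is why the page table itself has to be paged and why sparse use of the address space matters.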
Summary
• Caches are essential to performance
• Virtual Address translation permits modern operating systems and applications
• Requires caching
• Also requires special processor hardware support
• Also requires operating system support
• Works as long as page faults are rare
• Next Time: Andy lectures on “What’s an OS”