15-213 Recitation 7
Greg Reshko
Office Hours: Wed 2:00-3:00PM
March 31st, 2003
Outline
  Virtual Memory
    Paging
    Page faults
    TLB
    Address translation
  Malloc Lab
    Lots of hints and ideas
Virtual Memory
Reasons
  Use RAM as a cache for disk
  Easier memory management
  Protection
  Enable ‘partial swapping’
  Share memory efficiently
Paging: Purpose
Solves two problems
  External memory fragmentation
  Long delay to swap a whole process
Divide memory more finely
  Page: small logical memory region
  Frame: small physical memory region
Any page can map to any frame
Paging: Address Mapping
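[Figure: a logical address is split into a page number and an offset. The page table maps the page number to a frame (entries such as f29, f34), and the physical address is that frame number followed by the same offset.]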
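The mapping in the figure can be sketched in C. This is only an illustration: the 64-byte page size, the four-entry table, and the frame numbers are assumptions picked for the sketch, and `logical_to_physical` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_BITS 6u                     /* assumed: 64-byte pages */
#define PAGE_SIZE (1u << PAGE_BITS)
#define NUM_PAGES 4u                     /* assumed: tiny logical address space */

/* Hypothetical page table: index = page number, value = frame number. */
static const uint32_t page_table[NUM_PAGES] = { 0x29, 0x34, 0x07, 0x08 };

/* Map a logical address to a physical address with a one-level page table. */
static uint32_t logical_to_physical(uint32_t logical)
{
    uint32_t page   = logical >> PAGE_BITS;       /* upper bits select the page */
    uint32_t offset = logical & (PAGE_SIZE - 1);  /* lower bits pass through */

    assert(page < NUM_PAGES);
    return (page_table[page] << PAGE_BITS) | offset;
}
```

For instance, `logical_to_physical(0x45)` selects page 1 with offset 5, so with the table above it yields frame 0x34 followed by offset 5.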
Paging: Multi-Level
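[Figure: a logical address is split into P1, P2, and an offset. P1 indexes the page directory to select a page table, P2 indexes that page table to select a frame (entries such as f29, f34, f25, f99, f87, f07, f08), and the physical address is that frame number followed by the offset.]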
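A two-level walk might look like the sketch below. The field widths, the table contents, and the name `walk` are assumptions made for illustration. A NULL directory entry stands for a region with no page table at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OFFSET_BITS 6u   /* assumed field widths, chosen only for illustration */
#define P2_BITS     2u
#define P1_BITS     2u
#define ENTRIES(bits) (1u << (bits))

/* Second-level page tables: each entry holds a frame number. */
static const uint32_t table_a[ENTRIES(P2_BITS)] = { 0x29, 0x34, 0x25, 0x10 };
static const uint32_t table_b[ENTRIES(P2_BITS)] = { 0x99, 0x87, 0x07, 0x08 };

/* Page directory: each entry points to a page table, or NULL if unmapped. */
static const uint32_t *const page_directory[ENTRIES(P1_BITS)] = {
    table_a, table_b, NULL, NULL
};

/* Returns 0 and stores the physical address on success, -1 if unmapped. */
static int walk(uint32_t logical, uint32_t *physical)
{
    uint32_t offset = logical & (ENTRIES(OFFSET_BITS) - 1);
    uint32_t p2 = (logical >> OFFSET_BITS) & (ENTRIES(P2_BITS) - 1);
    uint32_t p1 = (logical >> (OFFSET_BITS + P2_BITS)) & (ENTRIES(P1_BITS) - 1);
    const uint32_t *table = page_directory[p1];

    assert(physical != NULL);
    if (table == NULL)
        return -1;                       /* no page table for this region */
    *physical = (table[p2] << OFFSET_BITS) | offset;
    return 0;
}
```

Only the page tables that are actually in use need to exist, which is the space saving a multi-level scheme is after.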
Page Faults
Virtual address not in memory
  This means it is on a disk
  Go to disk, fetch the page, load it into memory, get back to the process
[Figure, shown before and after the fault: the CPU issues virtual addresses; the page table maps them to physical addresses in memory or to pages on disk.]
Copy-on-Write
“Simulated” copy
  Copy page table entries to new process
  Mark PTEs read-only in old and new
What really happens
  Process writes to page
  Page fault handler is called
    Copy page into empty frame
    Mark read-write in both PTEs
Result
  Faster and less work
Relevance to Fork
Why is paging good for fork and exec?
  Fork produces two very similar processes
    Same code, data, and stack
  Copying all pages is expensive
  Many will never be modified (especially if the child calls exec)
  Share pages instead
    i.e. just mark them as read-only and duplicate when necessary
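The visible effect of this sharing can be checked from user code. The sketch below only uses POSIX `fork` and `waitpid`; whether the kernel really defers the copy until the first write is an implementation detail the program cannot observe directly, and `parent_value_after_child_write` is a hypothetical name.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the parent's value after the child has written to its own copy. */
static int parent_value_after_child_write(void)
{
    int value = 1;
    pid_t pid = fork();

    assert(pid >= 0);
    if (pid == 0) {
        value = 2;                 /* the write lands in the child's private copy */
        _exit(value == 2 ? 0 : 1);
    }

    int status = 0;
    waitpid(pid, &status, 0);
    assert(WIFEXITED(status) && WEXITSTATUS(status) == 0);
    return value;                  /* the parent's copy is untouched */
}
```

On a copy-on-write system the pages of both processes stay shared until one of them writes, so a child that calls exec right away copies almost nothing.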
Address Translation: General Idea
  Mapping between virtual and physical addresses
[Figure: the processor issues a virtual address (V). The hardware address translation mechanism, part of the on-chip memory management unit (MMU), turns it into a physical address (P) for main memory. On a miss it raises a page fault, and the OS fault handler transfers the page from secondary memory to main memory (only if miss).]
Address Translation: In terms of address itself
  Higher bits of the address get mapped from virtual address to physical.
  Lower bits (page offset) stay the same.
  Virtual address:  bits n-1..p = virtual page number,  bits p-1..0 = page offset
  Physical address: bits m-1..p = physical page number, bits p-1..0 = page offset
TLB
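Translation Lookaside Buffer
  Small hardware cache in MMU
  Maps virtual page numbers to physical page numbers
[Figure: the CPU sends a VA to the TLB lookup. On a TLB hit the PA goes straight to the cache; on a TLB miss the translation is performed first. On a cache hit the data returns to the CPU; on a cache miss main memory is accessed.]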
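Address Translation with TLB
[Figure: the virtual page number (bits n-1..p of the virtual address) is compared against the valid and tag fields of the TLB. On a TLB hit the TLB supplies the physical page number, which is joined with the page offset (bits p-1..0) to form the physical address. The physical address is split into tag, index, and byte offset; the cache compares the tag of the indexed line (valid, tag, data) and returns the data on a cache hit.]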
Example Motivation:
  A detailed example of end-to-end address translation
  Same as in the book and lecture
    I just want to make sure it makes perfect sense
  Do practice problems at home
    Ask questions if anything is unclear
Example: Description
  Memory is byte addressable
  Accesses are to 1-byte words
  Virtual addresses are 14 bits
  Physical addresses are 12 bits
  Page size is 64 bytes
  TLB is 4-way set associative with 16 total entries
  L1 d-cache is physically addressed and direct mapped, with 4-byte line size and 16 total sets
Example: Addresses
14-bit virtual addresses, 12-bit physical addresses, page size = 64 bytes
  Virtual address:  bits 13-6 = VPN (Virtual Page Number),  bits 5-0 = VPO (Virtual Page Offset)
  Physical address: bits 11-6 = PPN (Physical Page Number), bits 5-0 = PPO (Physical Page Offset)
Example: Page Table
  VPN  PPN  Valid      VPN  PPN  Valid
  00   28   1          08   13   1
  01   –    0          09   17   1
  02   33   1          0A   09   1
  03   02   1          0B   –    0
  04   –    0          0C   –    0
  05   16   1          0D   2D   1
  06   –    0          0E   11   1
  07   –    0          0F   0D   1
  …
Example: TLB
  16 entries, 4-way associative
  Virtual address: bits 13-8 = TLBT (TLB tag), bits 7-6 = TLBI (TLB index), bits 5-0 = VPO
  TLBT and TLBI together form the VPN

  Set  Tag  PPN  Valid   Tag  PPN  Valid   Tag  PPN  Valid   Tag  PPN  Valid
  0    03   –    0       09   0D   1       00   –    0       07   02   1
  1    03   2D   1       02   –    0       04   –    0       0A   –    0
  2    02   –    0       08   –    0       06   –    0       03   –    0
  3    07   –    0       03   0D   1       0A   34   1       02   –    0
Example: Cache
  16 lines, 4-byte line size, direct mapped
  Physical address: bits 11-6 = CT (cache tag), bits 5-2 = CI (cache index), bits 1-0 = CO (cache offset)
  CT covers the PPN; CI and CO together cover the PPO

  Index  Tag  Valid  B0  B1  B2  B3     Index  Tag  Valid  B0  B1  B2  B3
  0      19   1      99  11  23  11     8      24   1      3A  00  51  89
  1      15   0      –   –   –   –      9      2D   0      –   –   –   –
  2      1B   1      00  02  04  08     A      2D   1      93  15  DA  3B
  3      36   0      –   –   –   –      B      0B   0      –   –   –   –
  4      32   1      43  6D  8F  09     C      12   0      –   –   –   –
  5      0D   1      36  72  F0  1D     D      16   1      04  96  34  15
  6      31   0      –   –   –   –      E      13   1      83  77  1B  D3
  7      16   1      11  C2  DF  03     F      14   0      –   –   –   –
Example: Address Translation
Virtual address 0x03D4
  Split into offset and page number
    0x03D4 = 00001111010100
    VPO = 010100 = 0x14
    VPN = 00001111 = 0x0F
  Let's see if this is in the TLB
    0x03D4 = 00001111010100
    TLBI = 11 = 0x3
    TLBT = 000011 = 0x03
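The same split can be written with shifts and masks. This is a small sketch for the example's parameters (6 offset bits, 2 TLB index bits); `struct va_fields` and `split_va` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define VA_BITS   14u
#define VPO_BITS  6u   /* 64-byte pages */
#define TLBI_BITS 2u   /* 16 entries / 4 ways = 4 sets */

struct va_fields {
    uint32_t vpn, vpo, tlbi, tlbt;
};

/* Split a 14-bit virtual address into the fields used in the example. */
static struct va_fields split_va(uint32_t va)
{
    struct va_fields f;

    assert(va < (1u << VA_BITS));
    f.vpo  = va & ((1u << VPO_BITS) - 1);      /* low 6 bits */
    f.vpn  = va >> VPO_BITS;                   /* high 8 bits */
    f.tlbi = f.vpn & ((1u << TLBI_BITS) - 1);  /* low 2 bits of the VPN */
    f.tlbt = f.vpn >> TLBI_BITS;               /* remaining 6 bits of the VPN */
    return f;
}
```

Calling `split_va(0x03D4)` gives VPN 0x0F, VPO 0x14, TLBI 0x3, and TLBT 0x03, matching the values worked out by hand.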
Example: Address Translation
Virtual address 0x03D4
  TLB lookup
    This address is in the TLB (second entry, set 0x3)
    PPN = 0x0D = 001101
    PPO = VPO = 0x14 = 010100
    PA = PPN followed by PPO = 001101010100
  Cache
    PA = 0x354 = 001101010100
    CT = 001101 = 0x0D
    CI = 0101 = 0x5
    CO = 00 = 0x0
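The TLB lookup can be sketched as a search over the four ways of one set. The table below copies the example TLB, with PPN set to 0 where the slide shows no value; `struct tlb_entry` and `tlb_translate` are hypothetical names, and the miss path (walking the page table) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WAYS      4
#define SETS      4
#define VPO_BITS  6u
#define TLBI_BITS 2u

struct tlb_entry {
    uint8_t tag, ppn, valid;
};

/* Contents of the example TLB; PPN is 0 where the slide shows no value. */
static const struct tlb_entry tlb[SETS][WAYS] = {
    { {0x03, 0x00, 0}, {0x09, 0x0D, 1}, {0x00, 0x00, 0}, {0x07, 0x02, 1} },
    { {0x03, 0x2D, 1}, {0x02, 0x00, 0}, {0x04, 0x00, 0}, {0x0A, 0x00, 0} },
    { {0x02, 0x00, 0}, {0x08, 0x00, 0}, {0x06, 0x00, 0}, {0x03, 0x00, 0} },
    { {0x07, 0x00, 0}, {0x03, 0x0D, 1}, {0x0A, 0x34, 1}, {0x02, 0x00, 0} },
};

/* Returns 1 and stores the physical address on a TLB hit, 0 on a miss. */
static int tlb_translate(uint32_t va, uint32_t *pa)
{
    uint32_t vpo = va & ((1u << VPO_BITS) - 1);
    uint32_t vpn = va >> VPO_BITS;
    uint32_t set = vpn & (SETS - 1);
    uint32_t tag = vpn >> TLBI_BITS;

    assert(pa != NULL);
    for (int way = 0; way < WAYS; way++) {
        const struct tlb_entry *e = &tlb[set][way];
        if (e->valid && e->tag == tag) {
            *pa = ((uint32_t)e->ppn << VPO_BITS) | vpo;  /* PPN followed by PPO */
            return 1;
        }
    }
    return 0;
}
```

For 0x03D4 the search hits the second way of set 3 and produces 0x354.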
Example: Address Translation
Virtual address 0x03D4
  Cache hit
    Tag in set 0x5 matches CT
    Data at offset CO is 0x36
    Data returned to MMU
    Data returned to CPU
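The cache step can be sketched the same way. The array copies the example cache, with data bytes set to 0 for invalid lines; `struct cache_line` and `cache_read` are hypothetical names, and the miss path to main memory is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct cache_line {
    uint8_t tag, valid;
    uint8_t data[4];
};

/* Contents of the example cache, indexed by CI. */
static const struct cache_line cache[16] = {
    {0x19, 1, {0x99, 0x11, 0x23, 0x11}}, {0x15, 0, {0}},
    {0x1B, 1, {0x00, 0x02, 0x04, 0x08}}, {0x36, 0, {0}},
    {0x32, 1, {0x43, 0x6D, 0x8F, 0x09}}, {0x0D, 1, {0x36, 0x72, 0xF0, 0x1D}},
    {0x31, 0, {0}},                      {0x16, 1, {0x11, 0xC2, 0xDF, 0x03}},
    {0x24, 1, {0x3A, 0x00, 0x51, 0x89}}, {0x2D, 0, {0}},
    {0x2D, 1, {0x93, 0x15, 0xDA, 0x3B}}, {0x0B, 0, {0}},
    {0x12, 0, {0}},                      {0x16, 1, {0x04, 0x96, 0x34, 0x15}},
    {0x13, 1, {0x83, 0x77, 0x1B, 0xD3}}, {0x14, 0, {0}},
};

/* Returns 1 and stores the byte on a cache hit, 0 on a miss. */
static int cache_read(uint32_t pa, uint8_t *byte)
{
    uint32_t co = pa & 0x3;          /* bits 1-0  */
    uint32_t ci = (pa >> 2) & 0xF;   /* bits 5-2  */
    uint32_t ct = pa >> 6;           /* bits 11-6 */
    const struct cache_line *line = &cache[ci];

    assert(byte != NULL);
    if (!line->valid || line->tag != ct)
        return 0;
    *byte = line->data[co];
    return 1;
}
```

Reading physical address 0x354 selects line 5, matches tag 0x0D, and returns byte B0, which is 0x36.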
Lab 6 Hints and Ideas
Due April 16
40 points for performance
20 points for correctness
5 points for style
Get the correctness points this week
  Get a feel for how hard the lab is
  You'll probably need the time
  Starting a couple days before is a BAD idea!
How to get the correctness points
We provide mm-helper.c, which contains the code from the book
  malloc works
  free works (with coalescing)
  Heap checking doesn't work
  realloc doesn't work
Implement a dumb version of realloc
  malloc new block, memcpy, free old block, return new block
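A minimal sketch of the dumb realloc follows. It is not the lab's code: `toy_malloc`, `toy_free`, and `toy_realloc` are hypothetical stand-ins built on libc `malloc` so the example runs on its own, and the old size is kept in a small header in front of the payload. In the lab the old size would come from the block's own header instead.

```c
#include <stdlib.h>
#include <string.h>

/* Header in front of each payload; the extra members only force alignment. */
typedef union {
    size_t size;
    long double pad_ld;
    long long pad_ll;
    void *pad_ptr;
} toy_header;

static void *toy_malloc(size_t size)
{
    toy_header *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    return h + 1;                          /* payload starts after the header */
}

static void toy_free(void *ptr)
{
    if (ptr != NULL)
        free((toy_header *)ptr - 1);
}

/* malloc new block, memcpy, free old block, return new block */
static void *toy_realloc(void *ptr, size_t size)
{
    if (ptr == NULL)
        return toy_malloc(size);
    if (size == 0) {
        toy_free(ptr);
        return NULL;
    }

    void *new_ptr = toy_malloc(size);
    if (new_ptr == NULL)
        return NULL;                       /* the old block stays valid */

    size_t old_size = ((toy_header *)ptr - 1)->size;
    memcpy(new_ptr, ptr, old_size < size ? old_size : size);
    toy_free(ptr);
    return new_ptr;
}
```

The copy length is the smaller of the old and new sizes, so shrinking a block does not write past the new payload.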
How to get the correctness points
Implement heap checking
  Have to add a request id field to each allocated block (tricky)
  Hint: need padding to maintain 8-byte alignment of user pointer
In the book's code
  bp is always the same as the user pointer
  The 4 bytes immediately before bp contain the size of the payload
  The 3 least significant bits of size are unused (because of alignment)
  The lowest bit indicates whether the block is allocated or not
Block layout (bp points to the start of the payload):
  | Size+a | Payload… | Footer |
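Reading the header word can be sketched as below. The helper names (`header_word`, `block_size`, `is_allocated`, `pack`) are hypothetical and are not the macros from mm-helper.c; the sketch only assumes a 4-byte word immediately before bp, with the allocated flag in the lowest bit.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define WSIZE 4u   /* header size in bytes */

/* Combine a size (multiple of 8) and an allocated flag into one header word. */
static uint32_t pack(uint32_t size, int alloc)
{
    assert((size & 0x7u) == 0);
    return size | (alloc ? 1u : 0u);
}

/* Read the 4-byte header stored immediately before bp. */
static uint32_t header_word(const void *bp)
{
    uint32_t word;
    memcpy(&word, (const char *)bp - WSIZE, sizeof word);
    return word;
}

/* Size field: the header with its 3 low bits masked off. */
static uint32_t block_size(const void *bp)
{
    return header_word(bp) & ~0x7u;
}

/* Allocated flag: the lowest bit of the header. */
static int is_allocated(const void *bp)
{
    return (int)(header_word(bp) & 0x1u);
}
```

Because sizes are multiples of 8, the three low bits of the size are always zero, which is what frees them up for flags.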
How to get the correctness points
Need to change block layout to look like this (bp still points to the start of the payload):
  | ID | Size+a | Payload… | Footer |
This changes how the implicit list has to be traversed
But size is at the same place relative to bp
How to get the correctness points
Or change block layout to look like this (bp still points to the start of the payload):
  | Size+a | ID | Payload… | Footer |
All accesses to what was size now access id
  but can be clever and make size 4 bytes larger
Could even make bp point to id
  Most code would just work
How to get the correctness points
Once malloc, free, and realloc work with the id field, write heapcheck
  Iterate over the whole heap and print out allocated blocks
  Need to read the id field…
That's it for correctness
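One possible shape for heapcheck is sketched below over a fake heap held in an array. Several things here are assumptions rather than the lab's actual layout: each block is `[size|alloc][id][payload…][footer]` in 4-byte words, the size field holds the total block size in bytes, and a header with size 0 ends the heap. Payload alignment and padding are not modeled.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

static uint32_t size_of(uint32_t header)   { return header & ~0x7u; }
static int      allocated(uint32_t header) { return (int)(header & 0x1u); }

/* Walk the blocks until a zero-size header; print and count allocated ones. */
static int heapcheck(const uint32_t *heap)
{
    int count = 0;
    const uint32_t *block = heap;

    assert(heap != NULL);
    while (size_of(block[0]) != 0) {
        uint32_t size = size_of(block[0]);
        if (allocated(block[0])) {
            /* block[1] is the request id in this assumed layout */
            printf("id %u: %u bytes\n", (unsigned)block[1], (unsigned)size);
            count++;
        }
        block += size / sizeof(uint32_t);  /* jump to the next header */
    }
    return count;
}
```

A fake heap with a 16-byte allocated block (id 7), a 24-byte free block, another 16-byte allocated block (id 9), and a zero-size end marker makes `heapcheck` print two lines and return 2.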
Hints
  Remember that pointer arithmetic behaves differently depending on the type of pointer
  Consider using structs/unions to eliminate some messy pointer code
  Get things working with the short trace file first:
    ./mdriver -f short1-bal.rep
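The pointer arithmetic hint is easy to see in a few lines. In this sketch `advance_as_char` and `advance_as_int` are hypothetical helpers: both add the same n to the same address, but the cast decides how many bytes each step covers.

```c
#include <assert.h>
#include <stddef.h>

/* Adding n to a char * moves n bytes. */
static void *advance_as_char(void *p, size_t n)
{
    assert(p != NULL);
    return (char *)p + n;
}

/* Adding n to an int * moves n * sizeof(int) bytes. */
static void *advance_as_int(void *p, size_t n)
{
    assert(p != NULL);
    return (int *)p + n;
}
```

When stepping through the heap by a size stored in bytes, cast to `char *` first; adding a byte count to a wider pointer type skips too far.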
To get the best performance
  Red-Black trees
  Ternary trees
  Other interesting data structures