15-213 Recitation 7

Greg Reshko

Office Hours: Wed 2:00-3:00PM

March 31st, 2003

Outline

Virtual Memory
    Paging
    Page faults
    TLB
    Address translation

Malloc Lab
    Lots of hints and ideas

Virtual Memory

Reasons
    Use RAM as a cache for disk
    Easier memory management
    Protection
    Enable ‘partial swapping’
    Share memory efficiently

Physical memory

[Figure: the CPU sends physical addresses directly to memory, whose locations are numbered 0 to N-1.]

Virtual Memory

[Figure: the CPU issues virtual addresses; the page table maps each one to a physical address in memory or to a location on disk.]

Paging: Purpose

Solves two problems
    External memory fragmentation
    Long delay to swap a whole process

Divide memory more finely
    Page – small logical memory region
    Frame – small physical memory region

Any page can map to any frame

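Paging: Address Mapping

[Figure: a logical address is split into a page number and an offset. The page table (entries such as f29, f34) maps the page number to a frame number, and the frame number together with the unchanged offset forms the physical address.]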
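
The mapping in the figure can be sketched in a few lines of C. This is only an illustration: the 4 KiB page size, the 8-entry table, and the frame numbers in it are assumptions, not values from the recitation, and the sketch has no fault handling.

```c
#include <assert.h>
#include <stdint.h>

#define OFFSET_BITS 12u                       /* assumed 4 KiB pages */
#define NUM_PAGES   8u                        /* assumed tiny logical address space */
#define OFFSET_MASK ((1u << OFFSET_BITS) - 1u)

/* page_table[page] holds the frame number for that page (hypothetical data). */
static const uint32_t page_table[NUM_PAGES] = {29, 34, 7, 8, 99, 87, 25, 3};

/* Translate a logical address: look up the frame, keep the offset unchanged. */
uint32_t translate(uint32_t logical)
{
    uint32_t page   = logical >> OFFSET_BITS;
    uint32_t offset = logical & OFFSET_MASK;
    assert(page < NUM_PAGES);                 /* this sketch has no fault handling */
    return (page_table[page] << OFFSET_BITS) | offset;
}
```

Only the upper bits change; the offset is copied through, which is why any page can live in any frame.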

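Paging: Multi-Level

[Figure: the logical address is split into P1, P2, and Offset. P1 indexes the page directory to select a page table, P2 indexes that page table to select a frame, and the frame number together with the offset forms the physical address.]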
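
A two-level walk might look like the sketch below. The 10/10/12 split of a 32-bit address and the type names are assumptions chosen for illustration; the slides do not give field widths.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed split of a 32-bit address: 10 bits P1, 10 bits P2, 12 bits offset. */
#define P2_BITS     10u
#define OFFSET_BITS 12u
#define ENTRIES     1024u

typedef struct {
    uint32_t frame[ENTRIES];   /* frame number for each page */
    uint8_t  valid[ENTRIES];   /* 1 if the page is mapped */
} page_table_t;

typedef struct {
    page_table_t *table[ENTRIES];   /* NULL means no page table for this range */
} page_directory_t;

/* Returns 1 and stores the physical address on success, 0 if unmapped. */
int translate2(const page_directory_t *dir, uint32_t va, uint32_t *pa)
{
    uint32_t p1     = va >> (P2_BITS + OFFSET_BITS);
    uint32_t p2     = (va >> OFFSET_BITS) & ((1u << P2_BITS) - 1u);
    uint32_t offset = va & ((1u << OFFSET_BITS) - 1u);
    const page_table_t *pt;

    assert(dir != NULL && pa != NULL);
    pt = dir->table[p1];                   /* first level: page directory */
    if (pt == NULL || !pt->valid[p2])      /* second level: page table */
        return 0;                          /* a real system would page-fault here */
    *pa = (pt->frame[p2] << OFFSET_BITS) | offset;
    return 1;
}
```

Page tables for unused regions are simply never allocated, so a sparse address space costs little memory.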

Page Faults

Virtual address not in memory
    This means it is on a disk
    Go to disk, fetch the page, load it into memory, get back to the process

[Figures: before and after a page fault. The CPU issues virtual addresses; the page table maps them to physical addresses in memory or to pages on disk.]

Copy-on-Write

“Simulated” Copy
    Copy page table entries to new process
    Mark PTEs read-only in old and new

What really happens
    Process writes to page
    Page fault handler is called
        Copy page into empty frame
        Mark read-write in both PTEs

Result
    Faster and less work
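
The copy-on-write steps can be mimicked in user space with a small simulation. Everything here is hypothetical (the `pte_t` layout, the reference counts, the 64-byte page); a real kernel does this inside its page fault handler. One simplification is labeled in the code: the sketch makes only the writing PTE writable, and the other PTE becomes writable on its own next write.

```c
#include <assert.h>
#include <string.h>

#define PAGE_SIZE  64    /* assumed small page for illustration */
#define NUM_FRAMES 4

typedef struct { int frame; int writable; } pte_t;

static unsigned char frames[NUM_FRAMES][PAGE_SIZE];
static int refcount[NUM_FRAMES];    /* how many PTEs reference each frame */

/* Return a free frame index; this sketch assumes one is always available. */
static int alloc_frame(void)
{
    for (int f = 0; f < NUM_FRAMES; f++)
        if (refcount[f] == 0) { refcount[f] = 1; return f; }
    assert(!"out of frames");
    return -1;
}

/* "Simulated" copy: share the frame and mark both PTEs read-only. */
void cow_share(pte_t *parent, pte_t *child)
{
    parent->writable = 0;
    *child = *parent;
    refcount[parent->frame]++;
}

/* Write one byte; a write through a read-only PTE plays the fault handler. */
void cow_write(pte_t *pte, int offset, unsigned char value)
{
    assert(offset >= 0 && offset < PAGE_SIZE);
    if (!pte->writable) {
        if (refcount[pte->frame] > 1) {       /* still shared: copy the page */
            int fresh = alloc_frame();
            memcpy(frames[fresh], frames[pte->frame], PAGE_SIZE);
            refcount[pte->frame]--;
            pte->frame = fresh;
        }
        pte->writable = 1;                    /* simplification: only this PTE */
    }
    frames[pte->frame][offset] = value;
}

unsigned char cow_read(const pte_t *pte, int offset)
{
    assert(offset >= 0 && offset < PAGE_SIZE);
    return frames[pte->frame][offset];
}
```

No page is copied until someone writes to it, which is where the "faster and less work" comes from.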

Relevance to Fork

Why is paging good for fork and exec?
    Fork produces two very similar processes
        Same code, data, and stack
    Copying all pages is expensive
    Many will never be modified (especially in exec)
    Share pages instead
        i.e. just mark them as read only and duplicate when necessary

Address Translation: General Idea

Mapping between virtual and physical addresses

[Figure: the processor sends a virtual address to the hardware address translation mechanism, part of the on-chip memory management unit (MMU). On a hit, the resulting physical address goes to main memory. On a miss, a page fault invokes the fault handler, and the OS performs the transfer from secondary memory to main memory.]

Address Translation: In terms of address itself

Higher bits of the address get mapped from virtual address to physical.
Lower bits (page offset) stay the same.

    virtual address:   bits n–1..p = virtual page number,   bits p–1..0 = page offset
    physical address:  bits m–1..p = physical page number,  bits p–1..0 = page offset

TLB

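Translation Lookaside Buffer
    Small hardware cache in MMU
    Maps virtual page numbers to physical page numbers

[Figure: the CPU sends a VA to the TLB lookup. On a TLB hit the PA goes straight to the cache; on a TLB miss the translation is performed first. A cache hit returns the data to the CPU; a cache miss goes to main memory.]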

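Address Translation with TLB

[Figure: the virtual page number (bits n–1..p of the virtual address) is looked up in the TLB, whose entries hold a valid bit, a tag, and a physical page number. On a TLB hit the physical page number is joined with the page offset (bits p–1..0) to form the physical address. The cache splits the physical address into tag, index, and byte offset; a valid line with a matching tag is a cache hit and supplies the data.]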

Example

Motivation:
    A detailed example of end-to-end address translation
    Same as in the book and lecture
        I just want to make sure it makes perfect sense
    Do practice problems at home
        Ask questions if anything is unclear

Example: Description

Memory is byte addressable
Accesses are to 1-byte words
Virtual addresses are 14 bits
Physical addresses are 12 bits
Page size is 64 bytes
TLB is 4-way set associative with 16 total entries
L1 d-cache is physically addressed and direct mapped, with 4-byte line size and 16 total sets
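
Every field width used in the rest of the example follows from these parameters. The sketch below derives them; the function and struct names are made up for illustration.

```c
#include <assert.h>

/* Number of bits needed to index n items; n must be a power of two. */
unsigned log2u(unsigned n)
{
    unsigned bits = 0;
    assert(n != 0 && (n & (n - 1)) == 0);
    while (n > 1) { n >>= 1; bits++; }
    return bits;
}

typedef struct {
    unsigned vpo, vpn, ppo, ppn;   /* page offset / page number widths */
    unsigned tlbi, tlbt;           /* TLB index / tag widths */
    unsigned co, ci, ct;           /* cache offset / index / tag widths */
} field_widths_t;

/* Derive all field widths from the parameters on the description slide. */
field_widths_t example_widths(void)
{
    const unsigned va_bits = 14, pa_bits = 12, page_size = 64;
    const unsigned tlb_entries = 16, tlb_ways = 4;
    const unsigned line_size = 4, cache_sets = 16;
    field_widths_t w;

    w.vpo  = log2u(page_size);               /* offset within a page */
    w.vpn  = va_bits - w.vpo;
    w.ppo  = w.vpo;                          /* offset is not translated */
    w.ppn  = pa_bits - w.ppo;
    w.tlbi = log2u(tlb_entries / tlb_ways);  /* 16 entries / 4 ways = 4 sets */
    w.tlbt = w.vpn - w.tlbi;
    w.co   = log2u(line_size);
    w.ci   = log2u(cache_sets);
    w.ct   = pa_bits - w.ci - w.co;
    return w;
}
```

With these inputs the widths come out as VPO 6, VPN 8, PPO 6, PPN 6, TLBI 2, TLBT 6, CO 2, CI 4, CT 6.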

Example: Addresses

14-bit virtual addresses
12-bit physical addresses
Page size = 64 bytes

Virtual address    bits 13..6: VPN (Virtual Page Number)     bits 5..0: VPO (Virtual Page Offset)
Physical address   bits 11..6: PPN (Physical Page Number)    bits 5..0: PPO (Physical Page Offset)

Example: Page Table

VPN  PPN  Valid      VPN  PPN  Valid
00   28   1          08   13   1
01   –    0          09   17   1
02   33   1          0A   09   1
03   02   1          0B   –    0
04   –    0          0C   –    0
05   16   1          0D   2D   1
06   –    0          0E   11   1
07   –    0          0F   0D   1

Example: TLB

16 entries
4-way associative

Virtual address   bits 13..8: TLBT (TLB tag)   bits 7..6: TLBI (TLB index)   bits 5..0: VPO
(TLBT and TLBI together make up the VPN)

Set  Tag  PPN  Valid   Tag  PPN  Valid   Tag  PPN  Valid   Tag  PPN  Valid
0    03   –    0       09   0D   1       00   –    0       07   02   1
1    03   2D   1       02   –    0       04   –    0       0A   –    0
2    02   –    0       08   –    0       06   –    0       03   –    0
3    07   –    0       03   0D   1       0A   34   1       02   –    0

Example: Cache

16 lines
4-byte line size
Direct mapped

Physical address   bits 11..6: CT (cache tag)   bits 5..2: CI (cache index)   bits 1..0: CO (cache offset)
(CT is the PPN; CI and CO together make up the PPO)

Index  Tag  Valid  B0  B1  B2  B3      Index  Tag  Valid  B0  B1  B2  B3
0      19   1      99  11  23  11      8      24   1      3A  00  51  89
1      15   0      –   –   –   –       9      2D   0      –   –   –   –
2      1B   1      00  02  04  08      A      2D   1      93  15  DA  3B
3      36   0      –   –   –   –       B      0B   0      –   –   –   –
4      32   1      43  6D  8F  09      C      12   0      –   –   –   –
5      0D   1      36  72  F0  1D      D      16   1      04  96  34  15
6      31   0      –   –   –   –       E      13   1      83  77  1B  D3
7      16   1      11  C2  DF  03      F      14   0      –   –   –   –

Example: Address Translation

Virtual Address 0x03D4

Split into offset and page number
    0x03D4 = 00001111010100
    VPO = 010100 = 0x14
    VPN = 00001111 = 0x0F

Let's see if this is in the TLB
    0x03D4 = 00001111010100
    TLBI = 11 = 0x03
    TLBT = 000011 = 0x03
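
The same split can be written with shifts and masks. This is a small sketch for the example's 14-bit address format; the struct and function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { unsigned vpn, vpo, tlbi, tlbt; } va_fields_t;

/* Split a 14-bit virtual address using the example's field widths. */
va_fields_t split_va(uint16_t va)
{
    va_fields_t f;
    assert(va < (1u << 14));
    f.vpo  = va & 0x3F;       /* low 6 bits: page offset */
    f.vpn  = va >> 6;         /* high 8 bits: virtual page number */
    f.tlbi = f.vpn & 0x3;     /* low 2 bits of the VPN select the TLB set */
    f.tlbt = f.vpn >> 2;      /* remaining 6 bits are the TLB tag */
    return f;
}
```

For 0x03D4 this yields VPN 0x0F, VPO 0x14, TLBI 0x3, and TLBT 0x03, matching the hand calculation.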

Example: Address Translation

Virtual Address 0x03D4

TLB lookup
    This address is in the TLB (second entry, set 0x3)
    PPN = 0x0D = 001101
    PPO = VPO = 0x14 = 010100
    PA = PPN followed by PPO = 001101010100 = 0x354

Cache
    PA = 0x354 = 001101010100
    CT = 001101 = 0x0D
    CI = 0101 = 0x05
    CO = 00 = 0x0
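
The TLB lookup can be sketched as code, with the TLB contents copied from the example's table. The types and function name are hypothetical, and invalid entries (shown as "–" in the table) are stored with a PPN of 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint8_t tag, ppn, valid; } tlb_entry_t;

/* TLB contents from the example: 4 sets x 4 ways. */
static const tlb_entry_t tlb[4][4] = {
    {{0x03, 0x00, 0}, {0x09, 0x0D, 1}, {0x00, 0x00, 0}, {0x07, 0x02, 1}},
    {{0x03, 0x2D, 1}, {0x02, 0x00, 0}, {0x04, 0x00, 0}, {0x0A, 0x00, 0}},
    {{0x02, 0x00, 0}, {0x08, 0x00, 0}, {0x06, 0x00, 0}, {0x03, 0x00, 0}},
    {{0x07, 0x00, 0}, {0x03, 0x0D, 1}, {0x0A, 0x34, 1}, {0x02, 0x00, 0}},
};

/* Returns 1 on a TLB hit and stores the 12-bit physical address in *pa. */
int tlb_translate(uint16_t va, uint16_t *pa)
{
    unsigned vpo = va & 0x3F, vpn = va >> 6;
    unsigned set = vpn & 0x3, tag = vpn >> 2;

    assert(va < (1u << 14) && pa != NULL);
    for (int way = 0; way < 4; way++) {
        if (tlb[set][way].valid && tlb[set][way].tag == tag) {
            /* PPN becomes the high bits; the page offset is unchanged. */
            *pa = (uint16_t)((tlb[set][way].ppn << 6) | vpo);
            return 1;
        }
    }
    return 0;   /* miss: the page table would be consulted next */
}
```

Note that the PPN and PPO are concatenated with a shift and OR, not added as numbers.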

Example: Address Translation

Virtual Address 0x03D4

Cache Hit
    Tag in set 0x5 matches CT
    Data at offset CO is 0x36
    Data returned to MMU
    Data returned to CPU
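
The cache step can be checked the same way. The sketch below hard-codes the example's direct-mapped cache; names are hypothetical, and bytes of invalid lines (shown as "–") are stored as 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint8_t tag, valid, data[4]; } cache_line_t;

/* Cache contents from the example: 16 direct-mapped lines of 4 bytes. */
static const cache_line_t cache[16] = {
    {0x19, 1, {0x99, 0x11, 0x23, 0x11}}, {0x15, 0, {0}},
    {0x1B, 1, {0x00, 0x02, 0x04, 0x08}}, {0x36, 0, {0}},
    {0x32, 1, {0x43, 0x6D, 0x8F, 0x09}}, {0x0D, 1, {0x36, 0x72, 0xF0, 0x1D}},
    {0x31, 0, {0}},                      {0x16, 1, {0x11, 0xC2, 0xDF, 0x03}},
    {0x24, 1, {0x3A, 0x00, 0x51, 0x89}}, {0x2D, 0, {0}},
    {0x2D, 1, {0x93, 0x15, 0xDA, 0x3B}}, {0x0B, 0, {0}},
    {0x12, 0, {0}},                      {0x16, 1, {0x04, 0x96, 0x34, 0x15}},
    {0x13, 1, {0x83, 0x77, 0x1B, 0xD3}}, {0x14, 0, {0}},
};

/* Returns 1 on a cache hit and stores the addressed byte in *byte. */
int cache_read(uint16_t pa, uint8_t *byte)
{
    unsigned co = pa & 0x3;          /* byte within the line */
    unsigned ci = (pa >> 2) & 0xF;   /* which line */
    unsigned ct = pa >> 6;           /* tag to compare */

    assert(pa < (1u << 12) && byte != NULL);
    if (cache[ci].valid && cache[ci].tag == ct) {
        *byte = cache[ci].data[co];
        return 1;
    }
    return 0;   /* miss: the line would be fetched from main memory */
}
```

Reading physical address 0x354 hits line 5 and returns 0x36, as on the slide.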

Lab 6 Hints and Ideas

Due April 16

40 points for performance
20 points for correctness
5 points for style

Get the correctness points this week
    Get a feel for how hard the lab is
    You'll probably need the time
    Starting a couple days before is a BAD idea!

How to get the correctness points

We provide mm-helper.c which contains the code from the book
    malloc works
    free works (with coalescing)
    Heap checking doesn't work
    realloc doesn't work

Implement a dumb version of realloc
    malloc new block, memcpy, free old block, return new block
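
The "dumb" realloc recipe is sketched below on top of a stand-in allocator, so the example compiles on its own. The `toy_*` functions and the header layout are hypothetical and are not the code in mm-helper.c; in the lab you would call your own malloc and free and read the old size from the block header.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in allocator: a small header in front of the payload records its size. */
typedef union { size_t size; double align; } hdr_t;

void *toy_malloc(size_t size)
{
    hdr_t *h = malloc(sizeof(hdr_t) + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    return h + 1;                 /* payload starts right after the header */
}

void toy_free(void *p)
{
    if (p != NULL)
        free((hdr_t *)p - 1);
}

size_t toy_size(const void *p)
{
    assert(p != NULL);
    return ((const hdr_t *)p - 1)->size;
}

/* Dumb realloc: malloc new block, memcpy, free old block, return new block. */
void *toy_realloc(void *old, size_t size)
{
    void *fresh;
    size_t old_size;

    if (old == NULL)
        return toy_malloc(size);
    if (size == 0) {
        toy_free(old);
        return NULL;
    }
    fresh = toy_malloc(size);
    if (fresh == NULL)
        return NULL;              /* the old block stays valid on failure */
    old_size = toy_size(old);
    memcpy(fresh, old, old_size < size ? old_size : size);
    toy_free(old);
    return fresh;
}
```

The copy length is the smaller of the old and new sizes; copying the new size when growing would read past the end of the old block.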

How to get the correctness points

Implement heap checking
    Have to add a request id field to each allocated block (tricky)
        Hint: need padding to maintain 8 byte alignment of user pointer
    In the book's code
        bp is always the same as the user pointer
        The 4 bytes immediately before bp contain the size of the payload
        The 3 lsb of size are unused (because of alignment)
            The first bit indicates whether the block is allocated or not

    | Size+a | Payload… | Footer |
             ^
             bp
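
Reading and writing the 4-byte word in front of bp might look like the sketch below. It follows the layout described above (size with the allocated flag in the lowest bit), but the function names are made up; mm-helper.c may use macros instead.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define WSIZE 4   /* size of the header word in bytes */

/* memcpy avoids alignment and aliasing problems when touching raw heap bytes. */
static uint32_t get_word(const void *p)
{
    uint32_t w;
    memcpy(&w, p, sizeof w);
    return w;
}

static void put_word(void *p, uint32_t w)
{
    memcpy(p, &w, sizeof w);
}

/* Combine a size (multiple of 8) with the allocated flag in the lowest bit. */
uint32_t pack(uint32_t size, int alloc)
{
    assert((size & 0x7u) == 0);
    return size | (alloc ? 1u : 0u);
}

/* The header sits in the 4 bytes immediately before bp. */
void set_header(void *bp, uint32_t size, int alloc)
{
    put_word((char *)bp - WSIZE, pack(size, alloc));
}

uint32_t block_size(const void *bp)
{
    return get_word((const char *)bp - WSIZE) & ~0x7u;   /* drop the 3 low bits */
}

int block_alloc(const void *bp)
{
    return (int)(get_word((const char *)bp - WSIZE) & 0x1u);
}
```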

How to get the correctness points

Need to change block layout to look like this:

    | ID | Size+a | Payload… | Footer |
                  ^
                  bp

This changes how the implicit list has to be traversed
But size is at same place relative to bp

How to get the correctness points

Or change block layout to look like this:

    | Size+a | ID | Payload… | Footer |
                  ^
                  bp

All accesses to what was size now access id
    but can be clever and make size 4 bytes larger
Could even make bp point to id..
    Most code would just work

How to get the correctness points

Once malloc, free, and realloc work with the id field, write heapcheck
    Iterate over the whole heap and print out allocated blocks
    Need to read the id field…

That's it for correctness
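
A heap walk of this kind is sketched below on a toy heap. The layout is an assumption made for the example: each block is [header | id | payload… | footer], the header and footer hold the total block size in bytes with the allocated flag in the lowest bit, and a zero-size header ends the heap. The toy ignores payload alignment, and your own layout and list-termination rule may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Walk the implicit list, print every allocated block, return how many. */
int heapcheck(const uint32_t *heap)
{
    int allocated = 0;
    const uint32_t *hdr = heap;

    while ((*hdr & ~0x7u) != 0) {                   /* zero size marks the end */
        uint32_t size = *hdr & ~0x7u;               /* block size in bytes */
        const uint32_t *ftr = hdr + size / 4 - 1;   /* last word of the block */

        assert(*ftr == *hdr);                       /* header and footer must agree */
        if (*hdr & 0x1u) {
            printf("allocated block: id=%u size=%u\n",
                   (unsigned)hdr[1], (unsigned)size);
            allocated++;
        }
        hdr += size / 4;                            /* step to the next header */
    }
    return allocated;
}
```

Note that `hdr += size / 4` advances by words, not bytes, because `hdr` is a `uint32_t` pointer.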

Hints

Remember that pointer arithmetic behaves differently depending on type of pointer
Consider using structs/unions to eliminate some messy pointer code
Get things working with the short trace file first:
    ./mdriver -f short1-bal.rep
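
The first two hints can be made concrete with a short sketch. The names are hypothetical, and `header_t` is just one possible way to wrap the size word in a struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes moved by p + n when p is a char pointer: exactly n. */
ptrdiff_t bytes_advanced_char(size_t n)
{
    char buf[64];
    char *p = buf;
    assert(n <= sizeof buf);
    return (p + n) - p;
}

/* Bytes moved by p + n when p is a uint32_t pointer: 4 * n. */
ptrdiff_t bytes_advanced_word(size_t n)
{
    uint32_t buf[16];
    uint32_t *p = buf;
    assert(n <= 16);
    return (char *)(p + n) - (char *)p;
}

/* A struct gives the header word a name instead of raw casts everywhere. */
typedef struct {
    uint32_t size_alloc;   /* size in the upper bits, allocated flag in bit 0 */
} header_t;

header_t *header_of(void *bp)
{
    return (header_t *)bp - 1;   /* one header (4 bytes) before bp */
}

uint32_t header_size(const header_t *h)
{
    return h->size_alloc & ~0x7u;
}

int header_is_alloc(const header_t *h)
{
    return (int)(h->size_alloc & 0x1u);
}
```

Adding 3 to a `char *` moves 3 bytes, while adding 3 to a `uint32_t *` moves 12, which is a common source of off-by-a-factor bugs in allocator code.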

To get the best performance
    Red-Black trees
    Ternary trees
    Other interesting data structures

That’s it for hints…

Good Luck!