Overview


Page 1: Overview

Overview
  Assignment 5: hints
    Garbage collection
  Assignment 4: solution

Page 2: Overview

A5 Ex1 - Barriers
  Explain the difference between a read and a write barrier.
  Show the instrumented code generated by a compiler for p.next = q.
  Which barrier to use for:
    Copying GC
    Mark & Sweep GC
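
As a hint for this exercise, here is a minimal Python sketch of what the two kinds of barrier might look like. The class `Obj`, the colour field, the `grey_worklist`, and the helper names `write_barrier` and `read_barrier` are illustrative assumptions, not the course's solution. The write barrier follows a Dijkstra-style incremental marking scheme; the read barrier follows the forwarding-pointer scheme used by incremental copying collectors.

```python
WHITE, GREY, BLACK = "white", "grey", "black"

class Obj:
    def __init__(self, name):
        self.name = name
        self.next = None
        self.color = WHITE      # tri-colour marking state (assumed)
        self.forward = None     # new location if a copying GC moved the object

grey_worklist = []              # objects the marker still has to scan

def write_barrier(p, q):
    """Instrumented form of `p.next = q`: runs on every pointer store."""
    if q is not None and p.color == BLACK and q.color == WHITE:
        q.color = GREY          # never let a black object point to a white one
        grey_worklist.append(q)
    p.next = q

def read_barrier(ref):
    """Instrumented form of a pointer load: runs on every pointer read."""
    if ref is not None and ref.forward is not None:
        return ref.forward      # redirect to the moved copy
    return ref
```

Under these assumptions, one common pairing is a write barrier for an incremental Mark & Sweep collector and a read barrier for an incremental copying collector.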

Page 3: Overview

A5 Ex2 – Copying collectors

Compacting and copying GC cause the object address to change at each collection step.

Show how to solve the movement problem (for the 2 GC types).

Page 4: Overview

A5 Ex2 – Copying collectors
Mark & Sweep vs. Copying GCs:
  Give a rough implementation of the collection and allocation for the Copying GC.
  Which collector has the fastest allocation?
  Give an estimate of the collection cycle cost (M = heap size, R = live objects).
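
A rough sketch of collection and allocation for a semispace copying collector, written in Python for illustration. The names `Obj`, `SemiSpaceHeap`, `alloc`, and `collect` are assumptions, and each object counts as one unit of heap space. Allocation is a simple bump of the free pointer (modelled as a list append), and collection uses a Cheney-style scan that leaves a forwarding pointer in every moved object.

```python
class Obj:
    def __init__(self, name, nfields=1):
        self.name = name
        self.fields = [None] * nfields
        self.forward = None            # forwarding pointer, set once copied

class SemiSpaceHeap:
    def __init__(self, capacity):
        self.capacity = capacity       # objects per semispace (1 unit each)
        self.space = []                # current allocation space
        self.roots = []                # root references, updated by collect()

    def alloc(self, name, nfields=1):
        # Allocation is a bump of the free pointer (here: list append).
        if len(self.space) >= self.capacity:
            self.collect()
            if len(self.space) >= self.capacity:
                raise MemoryError("heap exhausted")
        obj = Obj(name, nfields)
        self.space.append(obj)
        return obj

    def _copy(self, obj, to_space):
        if obj is None:
            return None
        if obj.forward is None:        # first visit: copy and leave a forward
            clone = Obj(obj.name, len(obj.fields))
            clone.fields = list(obj.fields)
            obj.forward = clone
            to_space.append(clone)
        return obj.forward

    def collect(self):
        to_space = []
        self.roots = [self._copy(r, to_space) for r in self.roots]
        scan = 0
        while scan < len(to_space):    # Cheney scan: to_space is the queue
            obj = to_space[scan]
            obj.fields = [self._copy(f, to_space) for f in obj.fields]
            scan += 1
        self.space = to_space          # flip: old space is discarded wholesale
```

In this sketch the collector touches only live objects, so a cycle costs on the order of R, whereas the sweep phase of Mark & Sweep has to visit the whole heap, on the order of M.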

Page 5: Overview

A5 Ex3 - Mark & Sweep
  Phase 1: mark every reachable object
  Phase 2: remove non-reachable objects
[Figure: example object graph with nodes 0 to 7]
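
The two phases can be sketched in a few lines of Python. The graph below is a hypothetical example (the edges of the slide's figure are not recoverable from the transcript), and the function names `mark` and `sweep` are illustrative.

```python
def mark(roots, edges):
    """Phase 1: return the set of nodes reachable from the roots."""
    marked = set()
    stack = list(roots)               # explicit stack instead of recursion
    while stack:
        node = stack.pop()
        if node in marked:
            continue
        marked.add(node)
        stack.extend(edges[node])
    return marked

def sweep(heap, marked):
    """Phase 2: split the heap into surviving and freed objects."""
    survivors = [n for n in heap if n in marked]
    freed = [n for n in heap if n not in marked]
    return survivors, freed

# Hypothetical heap of 8 objects; edges[i] lists the objects that i points to.
edges = {0: [1, 2], 1: [3], 2: [0], 3: [], 4: [5], 5: [4], 6: [], 7: [6]}
live = mark([0], edges)
survivors, freed = sweep(range(8), live)
print(survivors, freed)   # [0, 1, 2, 3] [4, 5, 6, 7]
```

Note that the cycle between objects 4 and 5 is collected, since neither is reachable from the root.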

Page 6: Overview

Pointer Rotation - Introduction
Recursive traversal is very expensive.
Heap: list with 10’000 elements

PROCEDURE Traverse(root: Node);
VAR cnt: INTEGER;

10’000 calls * 16 bytes per stack frame = 160’000 bytes of stack

Page 7: Overview

Pointer Rotation - Generic Case
[Figure: generic pointer rotation step, showing pointers p and q before and after the rotation]

Page 8: Overview

Pointer rotation example
[Figure: pointers P, Q, R rotated over nodes 0 to 3]

Page 9: Overview

Pointer Rotation
Deutsch-Schorr-Waite (1967)
Stores information in the data structure
  + memory efficient
  + iterative
  – structures are temporarily inconsistent
  – non-concurrent
  – non-incremental
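
The following Python sketch shows one textbook formulation of Deutsch-Schorr-Waite marking for nodes with two pointer fields. It is not necessarily the variant used in the course; the `Node` class and its `marked` and `flag` fields are assumptions. The path back to the root is stored in the nodes themselves by temporarily reversing pointers, so no stack is needed.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.left = None
        self.right = None
        self.marked = False   # mark bit
        self.flag = False     # True once the left subtree is finished

def schorr_waite_mark(root):
    t, p = root, None         # t: current node, p: reversed path to the root
    while p is not None or (t is not None and not t.marked):
        if t is None or t.marked:
            if p.flag:
                # Pop: right subtree done, restore p.right and climb up.
                q = t
                t = p
                p = p.right
                t.right = q
            else:
                # Swing: left subtree done, restore p.left, descend right.
                q = t
                t = p.right
                p.right = p.left
                p.left = q
                p.flag = True
        else:
            # Push: descend left, storing the parent in the left field.
            q = p
            p = t
            t = t.left
            p.left = q
            p.marked = True
            p.flag = False

# A list with 10'000 elements (linked through `left`) is marked
# without any recursion and with its pointers restored afterwards.
nodes = [Node(i) for i in range(10000)]
for a, b in zip(nodes, nodes[1:]):
    a.left = b
schorr_waite_mark(nodes[0])
print(all(n.marked for n in nodes))
```

While the loop runs, the `left` and `right` fields on the current path point to parents rather than children, which is why the structure is temporarily inconsistent and the traversal cannot run concurrently with the program.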

Page 10: Overview

Input Grammar
EBNF:
  Graph := noOfNodes { Node }.
  Node := noOfEdges { destination }.
Implicit: each node is numbered starting from 0.
Example:
  8 3 1 2 3 3 4 5 6 2 0 6 1 7 0 0 0 1 2
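
A small Python reader for this grammar, as a sketch. The function name `parse_graph` and the adjacency-list representation are assumptions; the input is taken to be whitespace-separated integers.

```python
def parse_graph(text):
    """Parse 'noOfNodes { noOfEdges { destination } }' into adjacency lists."""
    tokens = iter(int(tok) for tok in text.split())
    no_of_nodes = next(tokens)
    graph = []
    for _ in range(no_of_nodes):      # node numbers are implicit: 0, 1, 2, ...
        no_of_edges = next(tokens)
        graph.append([next(tokens) for _ in range(no_of_edges)])
    return graph

example = "8 3 1 2 3 3 4 5 6 2 0 6 1 7 0 0 0 1 2"
for node, destinations in enumerate(parse_graph(example)):
    print(node, "->", destinations)
```

For the example input this yields node 0 with edges to 1, 2, 3; node 1 to 4, 5, 6; node 2 to 0 and 6; node 3 to 7; nodes 4, 5, 6 with no edges; and node 7 to 2.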

Page 11: Overview

Example
8 3 1 2 3 3 4 5 6 2 0 6 1 7 0 0 0 1 2
[Figure: graph with nodes 0 to 7 built from the input above]

Page 12: Overview

Overview
  Assignment 5: hints
    Garbage collection
  Assignment 4: solution

Page 13: Overview

A4 Ex.1 – Loading Page Tables
The process's whole page table is loaded into hardware when the process is scheduled.
Advantage: During process execution, no more memory references are needed for the page table.
Disadvantage: If the page table is large, loading the whole page table at every context switch can also hurt performance, as shown in our example.

Page 14: Overview

Ex.1 – Loading Page Tables
Compute the fraction of the CPU time devoted to loading the page tables if:
  32-bit address space, 8 KB pages
  each process runs for 100 msec

8 KB pages: 13 bits for the offset, so 2^19 entries in the page table
T_Load = 2^19 · 100nsec = 52.4288msec
T_Load / T = 0.52
52% of the CPU time is devoted to loading the page tables.
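
The arithmetic can be checked with a few lines of Python. The 100 nsec per entry is the load time used in the solution above; the constant names are illustrative.

```python
ADDRESS_BITS = 32
PAGE_SIZE = 8 * 1024          # 8 KB pages
LOAD_TIME_NS = 100            # per page-table entry, as used in the solution
RUN_TIME_NS = 100 * 10**6     # each process runs for 100 msec

offset_bits = PAGE_SIZE.bit_length() - 1          # 13
entries = 2 ** (ADDRESS_BITS - offset_bits)       # 2^19
t_load_ns = entries * LOAD_TIME_NS
fraction = t_load_ns / RUN_TIME_NS
print(offset_bits, entries, t_load_ns / 1e6, fraction)
```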

Page 15: Overview

A4 Ex.2 – Using TLBs
The time to read a word from
  the page table is 50 nsec
  the TLB is 10 nsec
What hit rate is needed to have a mean access time of 20 nsec?

10nsec + (1 - p) · 50nsec = 20nsec
p = 4 / 5 = 0.80
TLB hit rate = 80%
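
The same equation solved for p, as a short Python sketch. The function name `required_hit_rate` is illustrative; the model is the one used above, where every access pays the TLB time and a miss additionally pays the page-table read.

```python
def required_hit_rate(t_tlb, t_page_table, t_target):
    # mean = t_tlb + (1 - p) * t_page_table, solved for p
    return 1 - (t_target - t_tlb) / t_page_table

p = required_hit_rate(t_tlb=10, t_page_table=50, t_target=20)
print(f"TLB hit rate = {p:.0%}")
```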

Page 16: Overview

Ex.2 – Using TLBs (cont)

How does a TLB function in a system with multiple processes?

Some systems have an instruction which clears all the validity bits. Linux uses this machine instruction to invalidate all TLB entries at a context switch.

Extend the TLB entries with a process identifier field, and add a register to hold the PID of the current process.
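
The two approaches can be contrasted in a small Python model. The class and method names (`FlushingTLB`, `TaggedTLB`, `context_switch`, `insert`, `lookup`) are assumptions made for this sketch and do not correspond to any real hardware interface.

```python
class FlushingTLB:
    """Approach 1: invalidate every entry at a context switch."""
    def __init__(self):
        self.entries = {}                 # virtual page -> physical frame

    def context_switch(self):
        self.entries.clear()              # models "clear all validity bits"

    def insert(self, vpn, frame):
        self.entries[vpn] = frame

    def lookup(self, vpn):
        return self.entries.get(vpn)      # None means TLB miss

class TaggedTLB:
    """Approach 2: tag each entry with the owning process identifier."""
    def __init__(self):
        self.entries = {}                 # (pid, virtual page) -> frame
        self.current_pid = None           # models the current-PID register

    def context_switch(self, pid):
        self.current_pid = pid            # no flush: entries stay valid

    def insert(self, vpn, frame):
        self.entries[(self.current_pid, vpn)] = frame

    def lookup(self, vpn):
        return self.entries.get((self.current_pid, vpn))
```

With tagged entries, a process that is scheduled again may still find its translations in the TLB, whereas the flushing variant always restarts with an empty TLB.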

Page 17: Overview

A4 Ex.3 – Memory Size
The time to execute an instruction is 1 µsec, or 2001 µsec if a page fault occurs.
A program has 15.000 page faults and an execution time of 60 sec.
We double the memory size: the interval between the page faults is doubled.

T = N_instr · 1µsec + 15.000 · 2000µsec = 60sec
N_instr · 1µsec = 60.000.000µsec - 30.000.000µsec = 30.000.000µsec
T' = 30.000.000µsec + 7.500 · 2000µsec = 30.000.000µsec + 15.000.000µsec = 45.000.000µsec = 45sec
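
The same calculation as a Python check, assuming (as the solution does) that doubling the interval between page faults halves their number. Constant names are illustrative.

```python
T_TOTAL_US = 60 * 10**6        # 60 sec in microseconds
FAULTS = 15_000
FAULT_OVERHEAD_US = 2000       # extra cost of a page fault (2001 - 1)

instr_time_us = T_TOTAL_US - FAULTS * FAULT_OVERHEAD_US
new_total_us = instr_time_us + (FAULTS // 2) * FAULT_OVERHEAD_US
print(instr_time_us, new_total_us / 10**6)
```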

Page 18: Overview

A4 Ex.5 – The Aging Algorithm
  Page0: 01101110
  Page1: 01001001
  Page2: 00110111
  Page3: 10001011
Problems with this algorithm?
  We lose the ability to distinguish references early in the tick interval from those occurring later.
  Because the counters have a finite number of bits, it may happen that two pages both have a counter value of 0, and we have no way of telling which of the two was referenced last.
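
A sketch of the counter update in Python. The R-bit sequence in `ticks` is an assumption: it is not given on the slide and was chosen because it reproduces the four counter values shown above (one string per clock tick, one bit per page). The function name `age` is illustrative.

```python
def age(counters, r_bits, width=8):
    """One clock tick: shift each counter right, put the R bit on the left."""
    return [(c >> 1) | (r << (width - 1)) for c, r in zip(counters, r_bits)]

# Assumed R bits for pages 0..3 at eight consecutive clock ticks.
ticks = ["0111", "1011", "1010", "1101", "0010", "1010", "1100", "0001"]

counters = [0, 0, 0, 0]
for tick in ticks:
    counters = age(counters, [int(b) for b in tick])

for page, c in enumerate(counters):
    print(f"Page{page}: {c:08b}")

# The page with the smallest counter is the eviction candidate.
victim = min(range(len(counters)), key=lambda i: counters[i])
```

Each tick contributes only a single bit per page, which is why two references within the same tick interval cannot be ordered.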

Page 19: Overview

A4 Ex.6 – Program Run Time
Application:
  TLB hit rate is 75%
  number of memory accesses is 55.500.000
  page fault rate is 0.005
System performance for this application:
  average TLB miss penalty is 130 nsec
  average DRAM access time is 50 nsec
  average disk access time is 9 msec
What is the application run time
  on this system?
  on a system with a better disk with an access time of 6 msec?

Page 20: Overview

Ex.6 – Program Run Time
T = p_TLB · N_acc · T_TLBmiss + N_acc · T_DRAM + p_PF · N_acc · T_Disk
(p_TLB is the TLB miss rate, 1 - 0.75 = 0.25)

T = 4.578.750.000nsec + 2.497.500msec
T = 2.502.078,75msec = 2.502,07875sec = 41,7min

T' = 4.578,75msec + 0.005 · 55.500.000 · 6msec
T' = 4.578,75msec + 1.665.000msec
T' = 1.669.578,75msec = 1.669,57875sec = 27,8min

Reducing the disk access time by 33% (from 9 msec to 6 msec) reduces the run time by about 33% as well (for this scenario), because page faults dominate the run time.
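
The run-time formula can be evaluated directly in Python. The constant and function names are illustrative; the TLB term uses the miss rate 1 - 0.75, which is what reproduces the 4.578,75 msec figure above.

```python
N_ACC = 55_500_000            # number of memory accesses
TLB_MISS_RATE = 1 - 0.75      # TLB hit rate is 75%
PF_RATE = 0.005               # page fault rate
T_TLB_MISS_NS = 130           # average TLB miss penalty
T_DRAM_NS = 50                # average DRAM access time

def run_time_ms(t_disk_ms):
    memory_ns = TLB_MISS_RATE * N_ACC * T_TLB_MISS_NS + N_ACC * T_DRAM_NS
    disk_ms = PF_RATE * N_ACC * t_disk_ms
    return memory_ns / 1e6 + disk_ms

for t_disk in (9, 6):
    t = run_time_ms(t_disk)
    print(f"disk {t_disk} msec: {t:.2f} msec = {t / 60000:.1f} min")
```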