Barriers: Friend or Foe?


Friday, April 21, 2023

International Symposium on Memory Management

Vancouver BC, October 2004

Barriers: Friend or Foe?

Steve Blackburn
Department of Computer Science
Australian National University

Tony Hosking
Department of Computer Sciences
Purdue University


Read & Write Barrier Costs

Are r/w barrier costs significant?


Read and Write Barriers

• Algorithmically powerful mechanisms
  – Extend semantics of each read/write
• Particularly useful to GC
• Untested assumption: “read/write barriers are expensive”
  – Curtails creativity in GC algorithm development
  – Encourages (unnecessary?) work on avoidance
• Prior work
  – [Zorn 1990] (used simulation & traces)
  – [Blackburn & McKinley 2002] (compilation & inlining)


Our Contributions

• Methodology for measurement
• Evaluate mutator overhead
  – 5 common write barriers (w/b), 2 read barriers (r/b)
  – 9 benchmarks
  – 3 architectures (AMD, P4, PPC)
  – Exclude compiler, GC from measurements


Methodology

• Want to remove barrier
  – Compare with and without barrier
• Add full trace to generational collector
  – Remembered objects irrelevant
  – Can include/exclude barrier
• MMTk, Jikes RVM
  – Hardware performance counters
  – Pseudo-adaptive (realistic, deterministic)
  – Second iteration (avoid compiler overhead)
  – Best of 5 (least disturbed)
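The with/without comparison and the best-of-5 rule can be sketched in a few lines of Java. This is only an illustration, not the authors' harness: the class name, method names and timings are hypothetical, and it assumes the mutator times for each configuration have already been collected.

```java
import java.util.Arrays;

final class BarrierOverhead {
    /** Best (minimum) of the measured times, i.e. the least-disturbed run. */
    static double best(double[] times) {
        return Arrays.stream(times).min().getAsDouble();
    }

    /** Barrier overhead as a fraction of the no-barrier time. */
    static double overhead(double[] withBarrier, double[] withoutBarrier) {
        double base = best(withoutBarrier);
        return (best(withBarrier) - base) / base;
    }

    public static void main(String[] args) {
        // Made-up mutator times in ms, five runs per configuration.
        double[] withBarrier    = {1030, 1025, 1041, 1020, 1033};
        double[] withoutBarrier = {1002, 1000, 1010, 1005, 1001};
        System.out.printf("overhead = %.1f%%%n",
                          100 * overhead(withBarrier, withoutBarrier));
    }
}
```

Taking the minimum of each set, rather than the mean, matches the "least disturbed" reasoning on the slide: noise from the rest of the system can only make a run slower.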


Write Barrier Code

    public final void writeBarrier(ObjectReference src, Address slot,
                                   ObjectReference tgt, int mode)
        throws InlinePragma {
      // insert write barrier code here
      slot.store(tgt);
    }


Write Barrier Code cont.

Each Java fragment replaces the placeholder comment in writeBarrier.

Boundary (Slot)

  Java:
    if (slot.LT(NURSERY_START)
        && tgt.GE(NURSERY_START))
      remSlots.insert(slot);

  PPC asm:
    liu   R3,0x6e10
    cmplW cr1,R30,R3
    bge   1 54
    liu   R3,0x6e10
    cmplW cr1,R31,R3
    bge   1 7c

  x86 asm:
    cmp  edi 0xa0200000
    jlge 0
    cmp  ebx 0xa0200000
    jlge 0

Object

  Java:
    if (getHeader(src)
        .and(LOGGING_MASK)
        .EQ(UNLOGGED))
      rememberObject(src);

  PPC asm:
    lwz   R4,-8(R5)
    rlinm R4,R4,0x0,0x1d,0x1d
    cmpiW cr1,R4,0x4
    beq   1 78

  x86 asm:
    mov ecx -8[edx]
    and ecx 4
    cmp ecx 4
    jeq 0

Card

  Java:
    int card=src.rshl(LOG_CARD_SIZE);
    cardTable.add(card).store((byte) 1);

  PPC asm:
    lwz   R5,0x1664(JT)
    rlinm R6,R3,0x16,0xa,0x1f
    lil   R7,0x1
    stbx  R7,R5,R6

  x86 asm:
    mov ebx [0x290279a]
    shr eax 10
    mov [0+ebx+eax<<0] 1

Zone

  Java:
    if (slot.xor(tgt).GE(ZONE_SIZE))
      remSlots.insert(slot);

  PPC asm:
    xor   R3,R30,R31
    liu   R5,0x40
    cmplW cr1,R3,R5
    bge   1 74

  x86 asm:
    mov  edi eax
    mov  eax edi
    xor  eax ebx
    cmp  eax 0x400000
    jlge 0
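To make the four barrier tests concrete, here is a minimal, self-contained Java sketch that models addresses as long values instead of MMTk's Address and ObjectReference types. The constants are read off the x86 listings above and should be treated as illustrative. The class name, the heap base and size used to index the card table, and the way the Object barrier clears the logging bit are assumptions made for this sketch, not MMTk code.

```java
import java.util.ArrayList;
import java.util.List;

final class WriteBarrierSketch {
    // Values taken from the x86 listings; illustrative only.
    static final long NURSERY_START = 0xA0200000L;
    static final long ZONE_SIZE     = 0x400000L;
    static final int  LOG_CARD_SIZE = 10;
    static final int  LOGGING_MASK  = 0x4;
    static final int  UNLOGGED      = 0x4;

    // Hypothetical heap bounds, needed only to size and index the card table.
    static final long HEAP_START = 0x90000000L;
    static final long HEAP_BYTES = 0x20000000L;

    final List<Long> remSlots   = new ArrayList<>();
    final List<Long> remObjects = new ArrayList<>();
    final byte[] cardTable = new byte[(int) (HEAP_BYTES >>> LOG_CARD_SIZE)];

    /** Boundary (slot): remember slots below the nursery that point into it. */
    void boundary(long slot, long tgt) {
        if (slot < NURSERY_START && tgt >= NURSERY_START) remSlots.add(slot);
    }

    /**
     * Object: remember the source object while its header says "unlogged".
     * Returns the new header; clearing the bit is an assumed encoding.
     */
    int object(long src, int header) {
        if ((header & LOGGING_MASK) == UNLOGGED) {
            remObjects.add(src);
            return header & ~LOGGING_MASK;
        }
        return header;
    }

    /** Card: unconditionally mark the card that contains the source object. */
    void card(long src) {
        cardTable[(int) ((src - HEAP_START) >>> LOG_CARD_SIZE)] = 1;
    }

    /** Zone: remember slots whose target lies in a different aligned zone. */
    void zone(long slot, long tgt) {
        if ((slot ^ tgt) >= ZONE_SIZE) remSlots.add(slot);
    }
}
```

Note the structural difference the listings also show: Boundary, Object and Zone each guard their slow path with a test and branch, while Card is a short unconditional store with no branch at all.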


Experiments: Hardware

• 3 platforms:
  – 1.9GHz AMD Athlon XP 2600, 1GB
  – 2.6GHz Pentium 4, 1GB
  – 1.6GHz PowerPC 970, 768MB
• AMD and Intel performance counters
  – cycles
  – instructions retired
  – L1/L2 cache misses
  – TLB misses
  – both mutator and collector, separately


Experiments: Software

• MMTk in Jikes RVM version 2.3.2+CVS
  – ignore remsets GC configuration (now in MMTk)
  – patched to support performance counters
  – pseudo-adaptive compilation
  – read barriers
• Debian Linux 2.6.0 kernel + x86 perfctr
• Standalone mode


Write Barrier Overhead (mean of SPECjvm98 & SPECjbb)

[Bar chart: overhead (axis 0% to 6%) of the Boundary, Object, Hybrid, Zone and Card barriers on amd, p4 and ppc]


AMD Athlon 2600+ 1.9GHz: Write Barrier

[Bar chart: per-benchmark overhead (axis -4% to 12%) of the Boundary, Object, Hybrid, Zone and Card barriers. Benchmarks: _201_compress, _202_jess, _205_raytrace, _209_db, _213_javac, _222_mpegaudio, _227_mtrt, _228_jack, pseudojbb, mean]


Intel P4 2.6GHz: Write Barrier

[Bar chart: per-benchmark overhead (axis -4% to 10%) of the Boundary, Object, Hybrid, Zone and Card barriers. Benchmarks: _201_compress, _202_jess, _205_raytrace, _209_db, _213_javac, _222_mpegaudio, _227_mtrt, _228_jack, pseudojbb, mean]


G5 PowerPC 970 1.6GHz: Write Barrier

[Bar chart: per-benchmark overhead (axis -2% to 14%) of the Boundary, Object, Hybrid, Zone and Card barriers. Benchmarks: _201_compress, _202_jess, _205_raytrace, _209_db, _213_javac, _222_mpegaudio, _227_mtrt, _228_jack, pseudojbb, mean]


Performance Counters


Intel P4 2.6GHz: Write Barrier Retired Instructions

[Bar chart: per-benchmark change in retired instructions (axis -2% to 12%) for the Boundary, Object, Hybrid, Zone and Card barriers. Benchmarks: _201_compress, _202_jess, _205_raytrace, _209_db, _213_javac, _222_mpegaudio, _227_mtrt, _228_jack, pseudojbb, mean]


Intel P4 2.6GHz: Write Barrier L1 Misses

[Bar chart: per-benchmark change in L1 misses (axis -50% to 20%) for the Boundary, Object, Hybrid, Zone and Card barriers. Benchmarks: _201_compress, _202_jess, _205_raytrace, _209_db, _213_javac, _222_mpegaudio, _227_mtrt, _228_jack, pseudojbb, mean]


Intel P4 2.6GHz: Write Barrier L2 Misses

[Bar chart: per-benchmark change in L2 misses (axis -60% to 140%) for the Boundary, Object, Hybrid, Zone and Card barriers. Benchmarks: _201_compress, _202_jess, _205_raytrace, _209_db, _213_javac, _222_mpegaudio, _227_mtrt, _228_jack, pseudojbb, mean]


Intel P4 2.6GHz: Write Barrier DTLB Misses

[Bar chart: per-benchmark change in DTLB misses (axis -15% to 25%) for the Boundary, Object, Hybrid, Zone and Card barriers. Benchmarks: _201_compress, _202_jess, _205_raytrace, _209_db, _213_javac, _222_mpegaudio, _227_mtrt, _228_jack, pseudojbb, mean]


Read Barrier Code

    public final ObjectReference readBarrier(ObjectReference obj,
                                             Address slot, int mode)
        throws InlinePragma {
      ObjectReference value = slot.loadObjectReference();
      return value; // insert read barrier code here
    }


Read Barrier Code cont.

Each Java fragment replaces the final return statement in readBarrier.

Unconditional

  Java:
    return value.and(~3);

  PPC asm:
    rlinm R3,R3,0x0,0x0,0x1d

  x86 asm:
    and eax -4

Conditional

  Java:
    if (value.and(1).NE(1))
      return value;
    else
      return 0;

  PPC asm:
    rlinm R4,R3,0x0,0x1f,0x1f
    cmpiW cr1,R4,0x1
    bne   1 3c

  x86 asm:
    mov    edx eax
    and    edx 1
    cmp    edx 1
    mov    edx 0
    cmovne edx eax
    mov    eax edx
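The two read barriers reduce to simple bit operations on the loaded word. A minimal Java sketch, assuming 32-bit references modelled as int values (the class and method names are hypothetical, not MMTk's):

```java
final class ReadBarrierSketch {
    /** Unconditional: always clear the two low-order bits of the loaded value. */
    static int unconditional(int value) {
        return value & ~3;
    }

    /** Conditional: pass the value through unless its low bit is set, else return 0. */
    static int conditional(int value) {
        return (value & 1) != 1 ? value : 0;
    }
}
```

The unconditional form is a single mask with no control flow, whereas the conditional form needs a test plus either a branch (PPC listing) or a conditional move (x86 listing).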


Read Barrier Overhead (mean of SPECjvm98 & SPECjbb)

[Bar chart: overhead (axis 0% to 25%) of the Unconditional and Conditional barriers on amd, p4 and ppc]


AMD Athlon 2600+ 1.9GHz: Read Barrier

[Bar chart: per-benchmark overhead (axis 0% to 40%) of the Unconditional and Conditional barriers. Benchmarks: _201_compress, _202_jess, _205_raytrace, _209_db, _213_javac, _222_mpegaudio, _227_mtrt, _228_jack, pseudojbb, mean]


Intel P4 2.6GHz: Read Barrier

[Bar chart: per-benchmark overhead (axis 0% to 35%) of the Unconditional and Conditional barriers. Benchmarks: _201_compress, _202_jess, _205_raytrace, _209_db, _213_javac, _222_mpegaudio, _227_mtrt, _228_jack, pseudojbb, mean]


G5 PowerPC 970 1.6GHz: Read Barrier

[Bar chart: per-benchmark overhead (axis -4% to 16%) of the Unconditional and Conditional barriers. Benchmarks: _201_compress, _202_jess, _205_raytrace, _209_db, _213_javac, _222_mpegaudio, _227_mtrt, _228_jack, pseudojbb, mean]


Conclusions

• New methodology: available in MMTk
  – Specific barrier patches at: http://cs.anu.edu.au/~Steve.Blackburn/pubs/wb-ismm-2004.tgz
• Barrier costs (often) surprisingly low
• Barrier costs very architecturally sensitive
  – GC developers: think about your target arch.
  – GC papers: what architecture did they use?
  – Architects: choices impact OO languages in surprising ways.