Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8...


Transcript of Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8...

Page 1: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

Good and Wicked Fairies
The Tragedy of the Commons

Understanding the Performance of Java 8 Streams

Kirk Pepperdine @kcpeppe
Maurice Naftalin @mauricenaftalin

#java8fairies

Page 2: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

About Kirk

• Specialises in performance tuning
  • speaks frequently about performance
  • author of performance tuning workshop

• Co-founder jClarity
  • performance diagnostic tooling

• Java Champion (since 2006)

Page 4: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

Kirk’s quiz: what is this?

Page 8: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

About Maurice

Co-author Author

Developer, designer, architect, teacher, learner, writer

Page 9: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

@mauricenaftalin

The Lambda FAQ

www.lambdafaq.org

Page 10: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

First Computer I Used

Varian 620/i

Page 19: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

Agenda

• Define the problem
• Implement a solution
• Analyse performance
  – find the bottleneck
• Fork/Join parallelism in the real world

Page 22: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

Case Study: grep -b

grep -b: “The offset in bytes of a matched pattern is displayed in front of the matched line.”

rubai51.txt:

    The Moving Finger writes; and, having writ,
    Moves on: nor all thy Piety nor Wit
    Shall bring it back to cancel half a Line
    Nor all thy Tears wash out a Word of it.

    $ grep -b 'W.*t' rubai51.txt
    44:Moves on: nor all thy Piety nor Wit
    122:Nor all thy Tears wash out a Word of it.

Page 25: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

Case Study: grep -b

Obvious iterative implementation
- process file line by line
- maintain byte displacement in an accumulator variable

Objective here is to implement it in stream code
- no accumulators allowed!
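The "obvious iterative implementation" is not shown on the slides. A minimal sketch of what it might look like follows, assuming UTF-8 text and a single '\n' after every line; the class name IterativeGrepB and the method grepB are hypothetical, chosen only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: the name and signature are illustrative, not from the talk.
final class IterativeGrepB {

    // Returns "offset:line" for every line containing a match of the regex.
    static List<String> grepB(List<String> lines, String regex) {
        Pattern pattern = Pattern.compile(regex);
        List<String> matches = new ArrayList<>();
        long displacement = 0;                        // the accumulator variable
        for (String line : lines) {
            if (pattern.matcher(line).find()) {
                matches.add(displacement + ":" + line);
            }
            // Assumption: each line is terminated by exactly one '\n' byte.
            displacement += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> rubai51 = Arrays.asList(
            "The Moving Finger writes; and, having writ,",
            "Moves on: nor all thy Piety nor Wit",
            "Shall bring it back to cancel half a Line",
            "Nor all thy Tears wash out a Word of it.");
        grepB(rubai51, "W.*t").forEach(System.out::println);
    }
}
```

The displacement of each line depends on every line before it, which is what makes this loop awkward to express as a stream and impossible to run in parallel as written.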

Page 28: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

Streams – Why?

• Intention: replace loops for aggregate operations
• more concise, more readable, composable operations, parallelizable

instead of writing this:

    List<Person> people = …
    Set<City> shortCities = new HashSet<>();
    for (Person p : people) {
        City c = p.getCity();
        if (c.getName().length() < 4) {
            shortCities.add(c);
        }
    }

we’re going to write this:

    List<Person> people = …
    Set<City> shortCities = people.stream()
        .map(Person::getCity)
        .filter(c -> c.getName().length() < 4)
        .collect(toSet());

Page 30: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

Streams – Why?

we’re going to write this:

    List<Person> people = …
    Set<City> shortCities = people.parallelStream()
        .map(Person::getCity)
        .filter(c -> c.getName().length() < 4)
        .collect(toSet());
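To make the comparison concrete, here is a self-contained sketch in which the only difference between the sequential and parallel versions is how the stream is obtained. Person and City are minimal stand-ins (the slides do not show their definitions), and the sketch collects city names rather than City objects so that the result can be compared by value.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Person and City are hypothetical stand-ins for the classes used on the slides.
final class ShortCities {

    static final class City {
        private final String name;
        City(String name) { this.name = name; }
        String getName() { return name; }
    }

    static final class Person {
        private final City city;
        Person(City city) { this.city = city; }
        City getCity() { return city; }
    }

    // Same pipeline for both modes; only the stream source call differs.
    static Set<String> shortCityNames(List<Person> people, boolean parallel) {
        Stream<Person> stream = parallel ? people.parallelStream() : people.stream();
        return stream
            .map(Person::getCity)
            .map(City::getName)
            .filter(name -> name.length() < 4)
            .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
            new Person(new City("Ely")), new Person(new City("London")),
            new Person(new City("Ayr")), new Person(new City("Ely")));
        System.out.println(shortCityNames(people, false));
        System.out.println(shortCityNames(people, true));
    }
}
```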

Page 34: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

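Visualising Sequential Streams

“Values in Motion”

[Diagram: elements x0, x1, x2, x3 travel one at a time from the Source through the Intermediate Operations (Map, Filter) to the Terminal Operation (Reduction); the Filter passes some elements (✔) and rejects others (❌).]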

Page 45: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

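Journey’s End, JavaOne, October 2015

Simple Collector – toSet()

people.stream().collect(Collectors.toSet())

Collectors.toSet(): Collector<Person,?,Set<Person>>

[Diagram: the elements bill, jon and amy of a Stream<Person> pass one at a time into the collector, which accumulates them into a Set<Person> { amy, bill, jon }.]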

Page 47: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

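Classical Reduction

intStream.reduce(0,(a,b) -> a+b)

[Diagram: the elements 0 1 2 3 and 4 5 6 7 are each reduced with +, starting from the identity 0; the two partial results are then combined with +.]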
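As a runnable illustration of the reduction on this slide, the sketch below sums the eight values 0 to 7 from the diagram, sequentially and in parallel. The class and method names are hypothetical. Because + is associative and 0 is its identity, both modes give the same result.

```java
import java.util.stream.IntStream;

// Hypothetical wrapper around the reduce call shown on the slide.
final class ClassicalReduction {

    static int sum(boolean parallel) {
        IntStream intStream = IntStream.rangeClosed(0, 7);   // 0 1 2 3 4 5 6 7
        if (parallel) {
            intStream = intStream.parallel();
        }
        // identity 0, associative operator +
        return intStream.reduce(0, (a, b) -> a + b);
    }

    public static void main(String[] args) {
        System.out.println(sum(false));
        System.out.println(sum(true));
    }
}
```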

Page 49: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

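Mutable Reduction

a: accumulator
c: combiner

[Diagram: a Supplier ()->[] creates an empty container for each half of the stream; the accumulator a adds e0 e1 e2 e3 to one container and e4 e5 e6 e7 to the other; the combiner c merges the two containers.]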
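The three roles in the diagram map directly onto the three-argument form of Stream.collect. The sketch below names each function explicitly; the class name and the use of a List<String> container are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.Supplier;
import java.util.stream.Stream;

// Hypothetical example: collects stream elements into a list by mutable reduction.
final class MutableReduction {

    static List<String> collectToList(Stream<String> elements) {
        Supplier<List<String>> supplier = ArrayList::new;                 // ()->[]
        BiConsumer<List<String>, String> accumulator = List::add;         // a: adds one element
        BiConsumer<List<String>, List<String>> combiner = List::addAll;   // c: merges two containers
        return elements.collect(supplier, accumulator, combiner);
    }

    public static void main(String[] args) {
        System.out.println(collectToList(Stream.of("e0", "e1", "e2", "e3").parallel()));
    }
}
```

In a sequential stream the combiner is never needed; in a parallel stream each thread fills its own container and the combiner merges them in encounter order.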

Page 51: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

grep -b: Collector combiner

[Diagram: each line is represented by its byte displacement, its text and its length in bytes. The required result is
[ (0, The …, 44), (44, Moves …, 36), (80, Shall …, 42), (122, Nor …, 41) ].
In parallel, each half of the file is collected with displacements relative to its own start:
[ (0, The …, 44), (44, Moves …, 36) ] and [ (0, Shall …, 42), (42, Nor …, 41) ].]

grep -b: Collector solution

[Diagram: the combiner adds 80, the total length of the left-hand list (44 + 36), to every displacement in the right-hand list, giving 80 and 122.]

grep -b: Collector combiner

[Diagram: the same combining step applies at every level of the split, starting from the single-line lists
[ (0, The …, 44) ], [ (0, Moves …, 36) ], [ (0, Shall …, 42) ], [ (0, Nor …, 41) ].]

Page 59: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

grep -b: Collector accumulator

[Diagram: the Supplier creates an empty list [ ]. The accumulator adds “The moving … writ,” at displacement 0 with length 44. The accumulator then adds “Moves on: … Wit” at displacement 44 with length 36.]
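The supplier, accumulator and combiner described on these slides can be put together as a custom collector. The following is a sketch under stated assumptions, not the code demonstrated in the talk: the names GrepBCollector, DispLine and toDispLines are hypothetical, lines are assumed to be UTF-8 with one '\n' terminator each, and matching is done after collection for simplicity.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collector;
import java.util.stream.Collectors;

// Hypothetical sketch of a displacement-computing collector.
final class GrepBCollector {

    // A line together with its byte displacement and its length in bytes.
    static final class DispLine {
        final long disp;
        final String line;
        final int length;
        DispLine(long disp, String line) {
            this.disp = disp;
            this.line = line;
            // Assumption: one '\n' byte follows every line.
            this.length = line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
    }

    // Total number of bytes covered by a list: end of its last line.
    static long totalLength(List<DispLine> list) {
        if (list.isEmpty()) return 0;
        DispLine last = list.get(list.size() - 1);
        return last.disp + last.length;
    }

    static Collector<String, List<DispLine>, List<DispLine>> toDispLines() {
        return Collector.of(
            ArrayList::new,                                                   // supplier
            (list, line) -> list.add(new DispLine(totalLength(list), line)),  // accumulator
            (left, right) -> {                                                // combiner
                long shift = totalLength(left);   // e.g. 44 + 36 = 80 for the first two lines
                for (DispLine dl : right) {
                    left.add(new DispLine(dl.disp + shift, dl.line));
                }
                return left;
            });
    }

    static List<String> grepB(List<String> lines, String regex) {
        Pattern pattern = Pattern.compile(regex);
        return lines.parallelStream()
            .collect(toDispLines())
            .stream()
            .filter(dl -> pattern.matcher(dl.line).find())
            .map(dl -> dl.disp + ":" + dl.line)
            .collect(Collectors.toList());
    }
}
```

Note that the combiner rewrites every element of the right-hand list, so the cost of combining grows with the amount of data collected.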

Page 61: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

Why Shouldn’t We Optimize Code?

Because we don’t have a problem
- No performance target!

Else there is a problem, but not in our process
- The OS is struggling!

Demo…

Else there’s a problem in our process, but not in the code
- GC is using all the cycles!

Demo…

Now we can consider the code

Else there’s a problem in the code… somewhere
- now we can go and profile it

Demo…

Page 74: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

So, streaming IO is slow

If only there was a way of getting all that data into memory…

Demo…

Page 76: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

Agenda

• Define the problem
• Implement a solution
• Analyse performance
  – find the bottleneck OR
  – have a bright idea!
• Fork/Join parallelism in the real world

Page 77: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

Intel Xeon E5 2600 10-core

Page 78: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

Parallelism – Why?

The Free Lunch Is Over

http://www.gotw.ca/publications/concurrency-ddj.htm

Page 79: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

The Transistor, Blessed at Birth

Page 87: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

What’s Happened?

Physical limitations of the technology:
• signal leakage
• heat dissipation
• speed of light!
  – 30cm = 1 light-nanosecond

We’re not going to get faster cores, we’re going to get more cores!

Page 91: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

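Visualizing Parallel Streams

[Diagram: the elements x0, x1, x2, x3 are divided between threads and flow through the pipeline stages at the same time.]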

Page 93: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

A Parallel Solution for grep -b

• Parallel streams need splittable sources
• Streaming I/O makes you subject to Amdahl’s Law:
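For reference, Amdahl’s Law in its usual form is given below. The symbols are assumptions, since the formula itself did not survive in this transcript: P is the fraction of the run time that can be parallelized, N is the number of processors, and S(N) is the resulting speedup.

```latex
S(N) = \frac{1}{(1 - P) + \dfrac{P}{N}},
\qquad
\lim_{N \to \infty} S(N) = \frac{1}{1 - P}
```

As an illustrative calculation with made-up numbers: if half the run time is spent in sequential streaming I/O (P = 0.5), four processors give S(4) = 1 / (0.5 + 0.125) = 1.6, and no number of processors can give more than 2.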

Page 94: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

Blessing – and Curse – on the Transistor

Page 95: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

Stream Sources for Parallel Processing

Implemented by a Spliterator

Page 109: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

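LineSpliterator

[Diagram: a MappedByteBuffer holds the file contents “The moving Finger … writ\n Moves … Wit\n Shall … Line\n Nor all thy … it\n”. The spliterator’s coverage is divided at a line boundary near its midpoint (mid) into the spliterator coverage and the new spliterator coverage.]

Demo…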
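The code of LineSpliterator is not included in this transcript. The sketch below shows one way such a spliterator could work, assuming UTF-8 text with '\n' line terminators; all names are hypothetical. It operates on any ByteBuffer, so it can be given a MappedByteBuffer obtained from FileChannel.map(FileChannel.MapMode.READ_ONLY, 0, size) or, as in the test data here, a buffer wrapped around a byte array.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Spliterator;
import java.util.function.Consumer;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

// A sketch only: not the LineSpliterator demonstrated in the talk.
final class LineSpliteratorSketch implements Spliterator<LineSpliteratorSketch.DispLine> {

    static final class DispLine {
        final int disp;
        final String line;
        DispLine(int disp, String line) { this.disp = disp; this.line = line; }
    }

    private final ByteBuffer buffer; // read only via absolute get(int); position is never moved
    private int lo;                  // start of coverage, always the first byte of a line
    private final int hi;            // end of coverage (exclusive)

    LineSpliteratorSketch(ByteBuffer buffer, int lo, int hi) {
        this.buffer = buffer;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    public boolean tryAdvance(Consumer<? super DispLine> action) {
        if (lo >= hi) return false;
        int end = lo;
        while (end < hi && buffer.get(end) != '\n') end++;    // find end of current line
        byte[] bytes = new byte[end - lo];
        for (int i = 0; i < bytes.length; i++) bytes[i] = buffer.get(lo + i);
        // The displacement is simply the buffer index: no accumulation is needed.
        action.accept(new DispLine(lo, new String(bytes, StandardCharsets.UTF_8)));
        lo = end + 1;                                         // skip the '\n'
        return true;
    }

    @Override
    public Spliterator<DispLine> trySplit() {
        int mid = (lo + hi) >>> 1;
        while (mid < hi && buffer.get(mid) != '\n') mid++;    // advance mid to a line end
        if (mid + 1 >= hi) return null;                       // nothing would remain: do not split
        LineSpliteratorSketch prefix = new LineSpliteratorSketch(buffer, lo, mid + 1);
        lo = mid + 1;                                         // this spliterator keeps the rest
        return prefix;
    }

    @Override
    public long estimateSize() { return Math.max(0, hi - lo); }  // bytes: an upper bound on lines

    @Override
    public int characteristics() { return ORDERED | NONNULL | IMMUTABLE; }

    static List<String> grepB(ByteBuffer buffer, String regex) {
        Pattern pattern = Pattern.compile(regex);
        return StreamSupport.stream(new LineSpliteratorSketch(buffer, 0, buffer.limit()), true)
            .filter(dl -> pattern.matcher(dl.line).find())
            .map(dl -> dl.disp + ":" + dl.line)
            .collect(Collectors.toList());
    }
}
```

Because every DispLine takes its displacement from its position in the buffer, the downstream collector can be an ordinary toList() with no displacement arithmetic in the combiner.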

Page 110: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

Parallelizing grep -b

• Splitting action of LineSpliterator is O(log n)
• Collector no longer needs to compute index
• Result (relatively independent of data size):
  - sequential stream ~2x as fast as iterative solution
  - parallel stream >2.5x as fast as sequential stream
    - on 4 hardware threads

Page 111: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

Parallelizing Streams

Parallel-unfriendly intermediate operations:

stateful ones
– need to store some or all of the stream data in memory
– sorted()

those requiring ordering
– limit()

Page 112: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

Collectors Cost Extra!

Depends on the performance of accumulator and combiner functions
• toList(), toSet(), toCollection() – performance normally dominated by accumulator
  • but allow for the overhead of managing multithread access to non-threadsafe containers for the combine operation
• toMap(), toConcurrentMap() – map merging is slow. Resizing maps, especially concurrent maps, is very expensive. Whenever possible, presize all data structures, maps in particular.
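One way to follow the presizing advice is the four-argument form of Collectors.toMap, which accepts a map supplier. The sketch below is illustrative only: the class name, the word-length example and the capacity calculation (sized for HashMap’s default 0.75 load factor) are assumptions, not figures from the talk.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical example of supplying a presized map to a collector.
final class PresizedMapCollector {

    static Map<String, Integer> lengthsByWord(List<String> words) {
        // Large enough to hold words.size() entries without resizing
        // at the default load factor of 0.75.
        int capacity = (int) (words.size() / 0.75f) + 1;
        return words.stream().collect(Collectors.toMap(
            Function.identity(),                              // key: the word
            String::length,                                   // value: its length
            (first, second) -> first,                         // merge function for duplicate keys
            () -> new HashMap<String, Integer>(capacity)));   // presized map supplier
    }

    public static void main(String[] args) {
        System.out.println(lengthsByWord(Arrays.asList("amy", "bill", "jon", "amy")));
    }
}
```

In a parallel stream the supplier is called once per partial result, so each partial map is created at this capacity.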

Page 114: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

Simulated Server Environment

    threadPool.execute(() -> {
        try {
            double value = logEntries.parallelStream()
                .map(applicationStoppedTimePattern::matcher)
                .filter(Matcher::find)
                .map(matcher -> matcher.group(2))
                .mapToDouble(Double::parseDouble)
                .summaryStatistics().getSum();
        } catch (Exception ex) {}
    });

Page 115: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

How Does It Perform?

• Total run time: 261.7 seconds
• Max: 39.2 secs, Min: 9.2 secs, Median: 22.0 secs

Page 116: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

Tragedy of the Commons

Garrett Hardin, ecologist (1968):

Imagine the grazing of animals on a common ground. Each flock owner gains if they add to their own flock. But every animal added to the total degrades the commons a small amount.

Page 120: Good and Wicked Fairies, and the Tragedy of the Commons: Understanding the Performance of Java 8 Streams

Tragedy of the Commons

You have a finite amount of hardware
– it might be in your best interest to grab it all
– but if everyone behaves the same way…

With many parallelStream() operations running concurrently, performance is limited by the size of the common thread pool and the number of cores you have

Be a good neighbor

Configuring Common Pool

Size of common ForkJoinPool is
• Runtime.getRuntime().availableProcessors() - 1

-Djava.util.concurrent.ForkJoinPool.common.parallelism=N
-Djava.util.concurrent.ForkJoinPool.common.threadFactory
-Djava.util.concurrent.ForkJoinPool.common.exceptionHandler
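A minimal sketch for checking what the common pool looks like on a given machine, using only standard JDK calls. The class name CommonPoolInfo is made up for illustration, and the reported parallelism will differ from cores - 1 whenever the parallelism property is set.

```java
import java.util.concurrent.ForkJoinPool;

// Hypothetical helper class; only the JDK calls inside it are real API.
class CommonPoolInfo {

    // Parallelism the common pool was actually created with.
    static int commonParallelism() {
        return ForkJoinPool.getCommonPoolParallelism();
    }

    // Number of logical processors visible to the JVM.
    static int logicalCores() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        System.out.println("logical cores      = " + logicalCores());
        System.out.println("common parallelism = " + commonParallelism());
        // null unless the JVM was started with the -D option
        System.out.println("override property  = "
                + System.getProperty("java.util.concurrent.ForkJoinPool.common.parallelism"));
    }
}
```

Running it once with and once without the -D option is a quick way to confirm that an override has taken effect.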

Fork-Join

Support for Fork-Join added in Java 7
• difficult coding idiom to master

Used internally by parallel streams
• uses a spliterator to segment the stream
• each segment is processed by a ForkJoinWorkerThread

How fork-join works and performs is important to latency
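To make the contrast concrete, here is a sketch of the same sum written twice: once as a hand-coded RecursiveTask in the Java 7 idiom, and once as a parallel stream. The class names and the THRESHOLD value are illustrative assumptions, not tuned settings.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.stream.LongStream;

class ForkJoinSum {

    // Hand-written fork/join: sums the range [from, to) of the array.
    static class SumTask extends RecursiveTask<Long> {
        private static final int THRESHOLD = 10_000; // illustrative chunk size
        private final long[] data;
        private final int from;
        private final int to;

        SumTask(long[] data, int from, int to) {
            this.data = data;
            this.from = from;
            this.to = to;
        }

        @Override
        protected Long compute() {
            if (to - from <= THRESHOLD) {          // small enough: sum sequentially
                long sum = 0;
                for (int i = from; i < to; i++) sum += data[i];
                return sum;
            }
            int mid = (from + to) >>> 1;
            SumTask left = new SumTask(data, from, mid);
            SumTask right = new SumTask(data, mid, to);
            left.fork();                           // schedule the left half asynchronously
            long rightSum = right.compute();       // process the right half in this thread
            return left.join() + rightSum;         // wait for the forked half
        }
    }

    static long forkJoinSum(long[] data) {
        return ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.length));
    }

    // The stream version: splitting and joining are handled by the library.
    static long streamSum(long[] data) {
        return LongStream.of(data).parallel().sum();
    }

    public static void main(String[] args) {
        long[] data = LongStream.rangeClosed(1, 1_000_000).toArray();
        System.out.println(forkJoinSum(data));
        System.out.println(streamSum(data));
    }
}
```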

ForkJoinPool invoke

ForkJoinPool.invoke(ForkJoinTask) uses the submitting thread as a worker
• If 100 threads all call invoke(), we would have 100+ ForkJoin threads exhausting the limiting resource, e.g. CPUs, IO, etc.
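One way to observe which threads take part is to record the thread name for every element of a parallel stream. This is a sketch only: the class name is hypothetical, and which names show up depends on the JDK build and the machine.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

class WhoDoesTheWork {

    // Runs a parallel stream and records the name of every thread that processed an element.
    static Set<String> threadsUsed(int elements) {
        Set<String> names = ConcurrentHashMap.newKeySet();
        IntStream.range(0, elements)
                 .parallel()
                 .forEach(i -> names.add(Thread.currentThread().getName()));
        return names;
    }

    public static void main(String[] args) {
        System.out.println("caller: " + Thread.currentThread().getName());
        threadsUsed(100_000).forEach(name -> System.out.println("worked: " + name));
    }
}
```

If the caller's name appears in the list next to the common-pool workers, the submitting thread has been doing stream work itself.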

ForkJoinPool submit/get

ForkJoinPool.submit(Callable).get() suspends the submitting thread
• If 100 threads all call submit(), the work queue can become very long, thus adding latency

Fork-Join Performance

Fork-Join comes with significant overhead
• each chunk of work must be large enough to amortize the overhead

C/P/N/Q Performance Model

C - number of submitters
P - number of CPUs
N - number of elements
Q - cost of the operation

When to go Parallel

The workload of the intermediate operations must be great enough to outweigh the overheads (~100µs):
– initializing the fork/join framework
– splitting
– concurrent collection

Often quoted as N x Q
– N: size of data set (typically > 10,000)
– Q: processing cost per element
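The N x Q rule can be written down as a tiny check. This is a sketch under assumptions: the class and method names are hypothetical, Q is an estimate in abstract cost units, and the 10,000 cutoff simply mirrors the figure quoted above. A positive answer is a reason to measure, not a guarantee of a speedup.

```java
class ParallelHeuristic {

    // Assumed cutoff for N x Q: a rule-of-thumb figure, not a measured value.
    static final long MIN_WORK = 10_000;

    // n: number of elements; q: estimated cost per element in abstract units.
    static boolean worthTryingParallel(long n, long q) {
        return n * q >= MIN_WORK;
    }

    public static void main(String[] args) {
        System.out.println(worthTryingParallel(1_000, 1));       // small data, cheap operation
        System.out.println(worthTryingParallel(1_000, 100));     // small data, costly operation
        System.out.println(worthTryingParallel(1_000_000, 1));   // large data, cheap operation
    }
}
```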

Kernel Times

CPU will not be the limiting factor when
• CPU is not saturated
• kernel times exceed 10% of user time

More threads will decrease performance
• predicted by Little’s Law

Common Thread Pool

Fork-Join by default uses a common thread pool
• default number of worker threads == number of logical cores - 1
• always contains at least one thread

Performance is tied to whichever you run out of first
• availability of the constraining resource
• number of ForkJoinWorkerThreads/hardware threads

Everyone is working together to get it done!

All Hands on Deck!!!

Little’s Law

Fork-Join is a work queue
• work queue behavior is typically modeled using Little’s Law

Number of tasks in a system equals the arrival rate times the amount of time it takes to clear an item

Task is submitted every 500ms, or 2 per second
Number of tasks = 2/sec * 2.8 seconds = 5.6 tasks
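The arithmetic above, written as a small function so other arrival rates and service times can be tried. The class and method names are hypothetical; the inputs in main are the slide's own figures.

```java
class LittlesLaw {

    // Little's Law: mean tasks in the system = arrival rate x time to clear one task.
    static double tasksInSystem(double arrivalsPerSecond, double secondsPerTask) {
        return arrivalsPerSecond * secondsPerTask;
    }

    public static void main(String[] args) {
        double arrivalsPerSecond = 1000.0 / 500.0;   // one task every 500 ms
        double secondsPerTask = 2.8;                 // time to clear one task
        System.out.println(tasksInSystem(arrivalsPerSecond, secondsPerTask));
    }
}
```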

Components of Latency

Latency is time from stimulus to result
• internally, latency consists of active and dead time

Reducing dead time assumes you
• can find it and are able to fill in with useful work

From Previous Example

if there is available hardware capacity then
    make the pool bigger
else
    add capacity or tune to reduce strength of the dependency

ForkJoinPool Observability

In an application where you have many parallel stream operations all running concurrently, performance will be affected by the size of the common thread pool
• too small can starve threads from needed resources
• too big can cause threads to thrash on contended resources

ForkJoinPool comes with no visibility
• no metrics to help us tune
• instrument ForkJoinTask.invoke()
• gather measures that can be fed into Little’s Law

Instrumenting ForkJoinPool

• Collect
  • service times
    • time submitted to time returned
  • inter-arrival times

public final V invoke() {
    ForkJoinPool.common.getMonitor().submitTask(this);
    int s;
    if ((s = doInvoke() & DONE_MASK) != NORMAL)
        reportException(s);
    ForkJoinPool.common.getMonitor().retireTask(this);
    return getRawResult();
}
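The getMonitor() call above exists only in a patched JDK. Similar measures can be gathered from outside, without touching ForkJoinTask, by timing tasks at the point of submission. The sketch below assumes a hypothetical PoolMonitor class; it is not part of the JDK and not the speakers' implementation.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;
import java.util.stream.LongStream;

// Hypothetical monitor: not part of the JDK and not the speakers' implementation.
class PoolMonitor {
    private static final long NONE = Long.MIN_VALUE;   // sentinel: no arrival seen yet
    private final AtomicLong lastArrival = new AtomicLong(NONE);
    private final LongAdder completed = new LongAdder();
    private final LongAdder serviceNanos = new LongAdder();
    private final LongAdder gaps = new LongAdder();
    private final LongAdder gapNanos = new LongAdder();

    // Submits the task, waits for its result, and records arrival and service timings.
    <T> T timed(ForkJoinPool pool, Callable<T> task) {
        long submitted = System.nanoTime();
        long previous = lastArrival.getAndSet(submitted);
        if (previous != NONE) {
            gaps.increment();
            gapNanos.add(Math.max(0L, submitted - previous));
        }
        try {
            return pool.submit(task).join();   // join() rethrows failures unchecked
        } finally {
            serviceNanos.add(System.nanoTime() - submitted);
            completed.increment();
        }
    }

    long completedTasks() {
        return completed.sum();
    }

    // Mean time from submission to result, in seconds.
    double meanServiceSeconds() {
        long n = completed.sum();
        return n == 0 ? 0.0 : serviceNanos.sum() / 1e9 / n;
    }

    // Arrival rate in tasks per second, estimated from the mean inter-arrival time.
    double arrivalsPerSecond() {
        long n = gaps.sum();
        long nanos = gapNanos.sum();
        return (n == 0 || nanos == 0) ? 0.0 : n / (nanos / 1e9);
    }

    // Little's Law estimate of the number of tasks in the system.
    double estimatedTasksInSystem() {
        return arrivalsPerSecond() * meanServiceSeconds();
    }

    public static void main(String[] args) {
        PoolMonitor monitor = new PoolMonitor();
        ForkJoinPool pool = new ForkJoinPool(4);
        for (int i = 0; i < 5; i++) {
            monitor.timed(pool, () -> LongStream.rangeClosed(1, 1_000_000).parallel().sum());
        }
        pool.shutdown();
        System.out.println("completed tasks     = " + monitor.completedTasks());
        System.out.println("mean service (s)    = " + monitor.meanServiceSeconds());
        System.out.println("arrivals per second = " + monitor.arrivalsPerSecond());
        System.out.println("tasks in system     = " + monitor.estimatedTasksInSystem());
    }
}
```

Measuring at the submission point includes queueing time in the service time, which matches the "time submitted to time returned" definition above.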

Performance

Submit log parsing to our own ForkJoinPool

new ForkJoinPool(16).submit(() -> ……… ).get()
new ForkJoinPool(8).submit(() -> ……… ).get()
new ForkJoinPool(4).submit(() -> ……… ).get()
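A runnable sketch of the submit/get idiom, with a simple sum standing in for the log-parsing pipeline elided above. That a parallel stream started inside a ForkJoinPool task runs on that pool's workers is widely relied upon, but it is an implementation detail rather than a documented guarantee of the Streams API. The class name, method name and workload are assumptions for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.LongStream;

class OwnPoolExample {

    // Runs a parallel sum of 1..n inside a private pool of the given size.
    static long sumInOwnPool(int parallelism, long n) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            Callable<Long> work = () -> LongStream.rangeClosed(1, n).parallel().sum();
            return pool.submit(work).get();   // the submitting thread blocks here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();                  // private pools are not reclaimed automatically
        }
    }

    public static void main(String[] args) {
        for (int size : new int[] {16, 8, 4}) {
            long start = System.nanoTime();
            long sum = sumInOwnPool(size, 10_000_000L);
            long millis = (System.nanoTime() - start) / 1_000_000;
            System.out.println(size + " workers: sum=" + sum + " in " + millis + " ms");
        }
    }
}
```

Timings from a single run like this are only indicative; a harness such as JMH gives more trustworthy numbers.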

[Figure: results with 16, 8 and 4 worker threads]

[Figure: chart comparing Stream, Parallel, Flood Stream and Flood Parallel; axis from 0 to 300,000]

Conclusions

Performance mostly doesn’t matter
But if you must…
• sequential streams normally beat iterative solutions
• parallel streams can utilize all cores, providing
  - the data is efficiently splittable
  - the intermediate operations are sufficiently expensive and are CPU-bound
  - there isn’t contention for the processors

Resources

http://gee.cs.oswego.edu/dl/html/StreamParallelGuidance.html
http://shipilev.net/talks/devoxx-Nov2013-benchmarking.pdf
http://openjdk.java.net/projects/code-tools/jmh/
