Map(), flatmap() and reduce() are your new best friends: simpler collections, concurrency, and big...
![Page 1: Map(), flatmap() and reduce() are your new best friends: simpler collections, concurrency, and big data (jax, jax2014)](https://reader033.fdocuments.in/reader033/viewer/2022060108/554d1fbdb4c905ca208b4ab1/html5/thumbnails/1.jpg)
@crichardson
Map(), flatMap() and reduce() are your new best friends:
Simpler collections, concurrency, and big data
Chris Richardson
Author of POJOs in Action
Founder of the original CloudFoundry.com
http://plainoldobjects.com
Presentation goal
How functional programming simplifies your code
Show that map(), flatMap() and reduce() are remarkably versatile functions
About Chris
Founder of a buzzword compliant (stealthy, social, mobile, big data, machine learning, ...) startup
Consultant helping organizations improve how they architect and deploy applications using cloud, micro services, polyglot applications, NoSQL, ...
Agenda
Why functional programming?
Simplifying collection processing
Eliminating NullPointerExceptions
Simplifying concurrency with Futures and Rx Observables
Tackling big data problems with functional programming
What’s functional programming?
It’s a programming paradigm
It’s a kind of programming language
Functions as the building blocks of the application
Functions as first class citizens
Assign functions to variables
Store functions in fields
Use and write higher-order functions:
Pass functions as arguments
Return functions as values
Avoids mutable state
Use:
Immutable data structures
Single assignment variables
Some functional languages such as Haskell don’t allow side-effects
There are benefits to immutability
Easier concurrency
More reliable code
But be pragmatic
Why functional programming?
"the highest goal of programming-language design to enable good ideas to be elegantly expressed"
http://en.wikipedia.org/wiki/Tony_Hoare
Why functional programming?
More expressive
More concise
More intuitive - solution matches problem definition
Elimination of error-prone mutable state
Easy parallelization
An ancient idea that has recently become popular
Mathematical foundation:
λ-calculus
Introduced by Alonzo Church in the 1930s
Lisp = an early functional language invented in 1958
http://en.wikipedia.org/wiki/Lisp_(programming_language)
[Timeline figure: 1940 to 2010]
garbage collection, dynamic typing, self-hosting compiler, tree data structures
(defun factorial (n) (if (<= n 1) 1 (* n (factorial (- n 1)))))
My final year project in 1985: Implementing SASL
sieve (p:xs) = p : sieve [x | x <- xs, rem x p > 0];
primes = sieve [2..]
A list of integers starting with 2
Filter out multiples of p
Mostly an Ivory Tower technology
Lisp was used for AI
FP languages: Miranda, ML, Haskell, ...
“Side-effects kills kittens and puppies”
http://steve-yegge.blogspot.com/2010/12/haskell-researchers-announce-discovery.html
But today FP is mainstream
Clojure - a dialect of Lisp
Scala - a hybrid OO/functional language
F# - a hybrid OO/FP language for .NET
Java 8 has lambda expressions
Java 8 lambda expressions are functions
x -> x * x
x -> {
    for (int i = 2; i <= Math.sqrt(x); i = i + 1) {
        if (x % i == 0) return false;
    }
    return true;
}
(x, y) -> x * x + y * y
Java 8 lambdas are a shorthand* for an anonymous inner class
* not exactly. See http://programmers.stackexchange.com/questions/177879/type-inference-in-java-8
Java 8 functional interfaces
Interface with a single abstract method
e.g. Runnable, Callable, Spring’s TransactionCallback
A lambda expression is an instance of a functional interface.
You can use a lambda wherever a functional interface “value” is expected
The type of the lambda expression is determined from its context
Example Functional Interface
Function<Integer, Integer> square = x -> x * x;
BiFunction<Integer, Integer, Integer> sumSquares = (x, y) -> x * x + y * y;
Predicate<Integer> makeIsDivisibleBy(int y) { return x -> x % y == 0;}
Predicate<Integer> isEven = makeIsDivisibleBy(2);
Assert.assertTrue(isEven.test(8));
Assert.assertFalse(isEven.test(11));
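The snippets above can be assembled into a small runnable sketch. The class name `FunctionalInterfaceExample` and the helper `applyTwice` are hypothetical names added for illustration; only `square`, `sumSquares` and `makeIsDivisibleBy` come from the slide.

```java
import java.util.function.BiFunction;
import java.util.function.Function;
import java.util.function.Predicate;

class FunctionalInterfaceExample {
    // Higher-order function that returns a function (from the slide).
    static Predicate<Integer> makeIsDivisibleBy(int y) {
        return x -> x % y == 0;
    }

    // Higher-order function that takes a function as an argument
    // (hypothetical helper, not on the slide).
    static int applyTwice(Function<Integer, Integer> f, int value) {
        return f.apply(f.apply(value));
    }

    public static void main(String[] args) {
        Function<Integer, Integer> square = x -> x * x;
        BiFunction<Integer, Integer, Integer> sumSquares = (x, y) -> x * x + y * y;
        Predicate<Integer> isEven = makeIsDivisibleBy(2);

        System.out.println(square.apply(5));        // 25
        System.out.println(sumSquares.apply(3, 4)); // 25
        System.out.println(isEven.test(8));         // true
        System.out.println(isEven.test(11));        // false
        System.out.println(applyTwice(square, 3));  // 81
    }
}
```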
Example Functional Interface
ExecutorService executor = ...;
final int x = 999;
Future<Boolean> outcome = executor.submit(() -> {
    for (int i = 2; i <= Math.sqrt(x); i = i + 1) {
        if (x % i == 0) return false;
    }
    return true;
});
This lambda is a Callable
Agenda
Why functional programming?
Simplifying collection processing
Eliminating NullPointerExceptions
Simplifying concurrency with Futures and Rx Observables
Tackling big data problems with functional programming
Lots of application code = collection processing:
Mapping, filtering, and reducing
Social network example
public class Person {
    enum Gender { MALE, FEMALE }
    private Name name;
    private LocalDate birthday;
    private Gender gender;
    private Hometown hometown;
    private Set<Friend> friends = new HashSet<Friend>();
    ....
public class Friend {
    private Person friend;
    private LocalDate becameFriends;
    ...
}
public class SocialNetwork {
    private Set<Person> people;
    ...
Mapping, filtering, and reducing
public class Person {
    public Set<Hometown> hometownsOfFriends() {
        Set<Hometown> result = new HashSet<>();
        for (Friend friend : friends) {
            result.add(friend.getPerson().getHometown());
        }
        return result;
    }
Mapping, filtering, and reducing
public class Person {
    public Set<Person> friendOfFriends() {
        Set<Person> result = new HashSet<>();
        for (Friend friend : friends)
            for (Friend friendOfFriend : friend.getPerson().friends)
                if (friendOfFriend.getPerson() != this)
                    result.add(friendOfFriend.getPerson());
        return result;
    }
Mapping, filtering, and reducing
public class SocialNetwork {
    private Set<Person> people;
    ...
    public Set<Person> lonelyPeople() {
        Set<Person> result = new HashSet<Person>();
        for (Person p : people) {
            if (p.getFriends().isEmpty())
                result.add(p);
        }
        return result;
    }
Mapping, filtering, and reducing
public class SocialNetwork {
    private Set<Person> people;
    ...
    public int averageNumberOfFriends() {
        int sum = 0;
        for (Person p : people) {
            sum += p.getFriends().size();
        }
        return sum / people.size();
    }
Problems with this style of programming
Low level
Imperative (how to do it) NOT declarative (what to do)
Verbose
Mutable variables are potentially error prone
Difficult to parallelize
Java 8 streams to the rescue
A sequence of elements
“Wrapper” around a collection
Streams can also be infinite
Provides a functional/lambda-based API for transforming, filtering and aggregating elements
Much simpler, cleaner code
Using Java 8 streams - mapping
class Person ..
    private Set<Friend> friends = ...;
    public Set<Hometown> hometownsOfFriends() {
        return friends.stream()
            .map(f -> f.getPerson().getHometown())
            .collect(Collectors.toSet());
    }
The map() function
s1 a b c d e ...
s2 f(a) f(b) f(c) f(d) f(e) ...
s2 = s1.map(f)
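As a minimal, self-contained illustration of `s2 = s1.map(f)`, the sketch below maps each word to its length. The class `MapExample` and the word list are made up for illustration; `Stream.map` and `Collectors.toList` are the standard Java 8 APIs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class MapExample {
    // Applies f = String::length to every element; the result has
    // exactly one output element per input element, in the same order.
    static List<Integer> lengths(List<String> words) {
        return words.stream()
                .map(String::length)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(lengths(Arrays.asList("map", "flatMap", "reduce"))); // [3, 7, 6]
    }
}
```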
Using Java 8 streams - filtering
Imperative version:
public class SocialNetwork {
    private Set<Person> people;
    ...
    public Set<Person> peopleWithNoFriends() {
        Set<Person> result = new HashSet<Person>();
        for (Person p : people) {
            if (p.getFriends().isEmpty())
                result.add(p);
        }
        return result;
    }
Stream version:
public class SocialNetwork {
    private Set<Person> people;
    ...
    public Set<Person> lonelyPeople() {
        return people.stream()
            .filter(p -> p.getFriends().isEmpty())
            .collect(Collectors.toSet());
    }
Using Java 8 streams - friend of friends V1
class Person ..
    public Set<Person> friendOfFriends() {
        Set<Set<Friend>> fof = friends.stream()
            .map(friend -> friend.getPerson().friends)
            .collect(Collectors.toSet());
        ...
    }
Using map() => Set of Sets :-(
Somehow we need to flatten
Using Java 8 streams - mapping
class Person ..
    public Set<Person> friendOfFriends() {
        return friends.stream()
            .flatMap(friend -> friend.getPerson().friends.stream())
            .map(Friend::getPerson)
            .filter(f -> f != this)
            .collect(Collectors.toSet());
    }
maps and flattens
The flatMap() function
s1 a b ...
s2 f(a)0 f(a)1 f(b)0 f(b)1 f(b)2 ...
s2 = s1.flatMap(f)
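To make the difference between map() and flatMap() concrete, here is a small sketch that splits lines into words. The class `FlatMapExample` and its inputs are hypothetical; with map() the result stays nested, while flatMap() concatenates the per-element streams into one.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class FlatMapExample {
    // map(): one output element per input element, so the result is a list of lists.
    static List<List<String>> nestedWords(List<String> lines) {
        return lines.stream()
                .map(line -> Arrays.asList(line.split(" ")))
                .collect(Collectors.toList());
    }

    // flatMap(): each input element yields a stream, and the streams are flattened.
    static List<String> words(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.split(" ")))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a b", "c d e");
        System.out.println(nestedWords(lines)); // [[a, b], [c, d, e]]
        System.out.println(words(lines));       // [a, b, c, d, e]
    }
}
```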
Using Java 8 streams - reducing
public class SocialNetwork {
    private Set<Person> people;
    ...
    public long averageNumberOfFriends() {
        return people.stream()
            .map(p -> p.getFriends().size())
            .reduce(0, (x, y) -> x + y) / people.size();
    }
The reduce() call is equivalent to:
int x = 0;
for (int y : inputStream) x = x + y;
return x;
The reduce() function
s1 a b c d e ...
x = s1.reduce(initial, f)
f(f(f(f(f(f(initial, a), b), c), d), e), ...)
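The fold above can be sketched as runnable code. `ReduceExample` is a hypothetical class: `sum` uses `reduce(initial, f)`, `sumWithLoop` is the equivalent loop, and `average` shows one way to avoid integer division by using the standard `IntStream.average()`.

```java
import java.util.Arrays;
import java.util.List;

class ReduceExample {
    // reduce(initial, f) computes f(f(f(initial, a), b), c) ...
    static int sum(List<Integer> numbers) {
        return numbers.stream().reduce(0, (x, y) -> x + y);
    }

    // The same computation written as an explicit loop with a mutable accumulator.
    static int sumWithLoop(List<Integer> numbers) {
        int x = 0;
        for (int y : numbers) {
            x = x + y;
        }
        return x;
    }

    // average() returns an OptionalDouble, which is empty for an empty stream.
    static double average(List<Integer> numbers) {
        return numbers.stream().mapToInt(Integer::intValue).average().orElse(0.0);
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4);
        System.out.println(sum(numbers));         // 10
        System.out.println(sumWithLoop(numbers)); // 10
        System.out.println(average(numbers));     // 2.5
    }
}
```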
Newton's method for finding square roots
public class SquareRootCalculator {
    public double squareRoot(double input, double precision) {
        return Stream.iterate(
                new Result(1.0, false),
                current -> refine(current, input, precision))
            .filter(r -> r.done)
            .findFirst().get().value;
    }
    private static Result refine(Result current, double input, double precision) {
        double value = current.value;
        double newCurrent = value - (value * value - input) / (2 * value);
        boolean done = Math.abs(value - newCurrent) < precision;
        return new Result(newCurrent, done);
    }
    // static so that the static refine() method can create instances
    static class Result {
        final double value;
        final boolean done;
        Result(double value, boolean done) { this.value = value; this.done = done; }
    }
}
Creates an infinite stream: seed, f(seed), f(f(seed)), .....
Don’t panic! Streams are lazy
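A minimal sketch of that laziness, using a hypothetical class `LazyStreamExample`: `Stream.iterate` describes an unbounded sequence, but elements are only computed as a short-circuiting operation such as `limit()` or `findFirst()` asks for them.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyStreamExample {
    // Conceptually infinite: 1, 2, 4, 8, ... but limit(n) stops after n elements.
    static List<Integer> firstPowersOfTwo(int n) {
        return Stream.iterate(1, x -> x * 2)
                .limit(n)
                .collect(Collectors.toList());
    }

    // findFirst() stops the stream at the first match.
    // Assumes a small threshold: int overflow means a match must exist below 2^30.
    static int firstPowerOfTwoAbove(int threshold) {
        return Stream.iterate(1, x -> x * 2)
                .filter(x -> x > threshold)
                .findFirst()
                .get();
    }

    public static void main(String[] args) {
        System.out.println(firstPowersOfTwo(5));       // [1, 2, 4, 8, 16]
        System.out.println(firstPowerOfTwoAbove(100)); // 128
    }
}
```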
Agenda
Why functional programming?
Simplifying collection processing
Eliminating NullPointerExceptions
Simplifying concurrency with Futures and Rx Observables
Tackling big data problems with functional programming
Tony’s $1B mistake
“I call it my billion-dollar mistake. It was the invention of the null reference in 1965.... But I couldn't resist the temptation to put in a null reference, simply because it was so easy to implement...”
http://qconlondon.com/london-2009/presentation/Null+References:+The+Billion+Dollar+Mistake
Coding with null pointers
class Person
    public Friend longestFriendship() {
        Friend result = null;
        for (Friend friend : friends) {
            if (result == null ||
                    friend.getBecameFriends().isBefore(result.getBecameFriends()))
                result = friend;
        }
        return result;
    }
Friend oldestFriend = person.longestFriendship();
if (oldestFriend != null) {
    ...
} else {
    ...
}
Null check is essential yet easily forgotten
Java 8 Optional<T>
A wrapper for nullable references
It has two states:
empty ⇒ throws an exception if you try to get the reference
non-empty ⇒ contains a non-null reference
Provides methods for:
testing whether it has a value
getting the value
...
Return reference wrapped in an instance of this type instead of null
Coding with optionals
class Person
    public Optional<Friend> longestFriendship() {
        Friend result = null;
        for (Friend friend : friends) {
            if (result == null ||
                    friend.getBecameFriends().isBefore(result.getBecameFriends()))
                result = friend;
        }
        return Optional.ofNullable(result);
    }
Optional<Friend> oldestFriend = person.longestFriendship();
// Might throw java.util.NoSuchElementException: No value present
// Friend dangerous = oldestFriend.get();
if (oldestFriend.isPresent()) {
    ...oldestFriend.get()
} else {
    ...
}
Using Optionals - better
Optional<Friend> oldestFriendship = ...;
Friend whoToCall1 = oldestFriendship.orElse(mother);
Friend whoToCall2 = oldestFriendship.orElseGet(() -> lazilyFindSomeoneElse());
Friend whoToCall3 = oldestFriendship.orElseThrow(() -> new LonelyPersonException());
Avoid calling isPresent() and get()
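The three alternatives can be tried out with plain strings. This is a sketch under assumed names: `OptionalDefaultsExample`, `expensiveDefault` and the default values are hypothetical, while `orElse`, `orElseGet` and `orElseThrow` are the standard `java.util.Optional` methods.

```java
import java.util.Optional;

class OptionalDefaultsExample {
    // orElse: the default value is always evaluated, even if it is not needed.
    static String nameOrDefault(Optional<String> name) {
        return name.orElse("nobody");
    }

    // orElseGet: the supplier only runs when the Optional is empty.
    static String nameOrComputed(Optional<String> name) {
        return name.orElseGet(OptionalDefaultsExample::expensiveDefault);
    }

    // orElseThrow: an empty Optional becomes an exception of your choosing.
    static String nameOrThrow(Optional<String> name) {
        return name.orElseThrow(() -> new IllegalStateException("no name"));
    }

    // Hypothetical stand-in for a costly lookup.
    static String expensiveDefault() {
        return "computed";
    }

    public static void main(String[] args) {
        System.out.println(nameOrDefault(Optional.of("alice")));   // alice
        System.out.println(nameOrDefault(Optional.empty()));       // nobody
        System.out.println(nameOrComputed(Optional.empty()));      // computed
    }
}
```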
Using Optional.map()
public class Person {
    public Optional<Friend> longestFriendship() { return ...; }
    public Optional<Long> ageDifferenceWithOldestFriend() {
        Optional<Friend> oldestFriend = longestFriendship();
        return oldestFriend.map(of -> Math.abs(of.getPerson().getAge() - getAge()));
    }
Eliminates messy conditional logic
Using flatMap()
class Person
    public Optional<Friend> longestFriendship() {...}
    public Optional<Friend> longestFriendshipOfLongestFriend() {
        return longestFriendship()
            .flatMap(friend -> friend.getPerson().longestFriendship());
    }
not always a symmetric relationship. :-)
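Optional.map() and Optional.flatMap() can be compared side by side in a self-contained sketch. The class `OptionalChainingExample`, the lookup `emailOf` and the sample address are hypothetical: map() is for a function that returns a plain value, flatMap() for a function that itself returns an Optional.

```java
import java.util.Optional;

class OptionalChainingExample {
    // Hypothetical lookup: only "alice" is known.
    static Optional<String> emailOf(String user) {
        return "alice".equals(user) ? Optional.of("alice@example.com") : Optional.empty();
    }

    // May fail too, so it also returns an Optional.
    static Optional<String> domainOf(String email) {
        int at = email.indexOf('@');
        return at >= 0 ? Optional.of(email.substring(at + 1)) : Optional.empty();
    }

    // map(): String::length returns a plain value, which gets wrapped again.
    static Optional<Integer> emailLength(String user) {
        return emailOf(user).map(String::length);
    }

    // flatMap(): avoids ending up with Optional<Optional<String>>.
    static Optional<String> emailDomain(String user) {
        return emailOf(user).flatMap(OptionalChainingExample::domainOf);
    }

    public static void main(String[] args) {
        System.out.println(emailLength("alice")); // Optional[17]
        System.out.println(emailDomain("alice")); // Optional[example.com]
        System.out.println(emailDomain("bob"));   // Optional.empty
    }
}
```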
Agenda
Why functional programming?
Simplifying collection processing
Eliminating NullPointerExceptions
Simplifying concurrency with Futures and Rx Observables
Tackling big data problems with functional programming
Let’s imagine you are performing a CPU intensive operation
class Person ..
    public Set<Hometown> hometownsOfFriends() {
        return friends.stream()
            .map(f -> cpuIntensiveOperation())
            .collect(Collectors.toSet());
    }
class Person ..
    public Set<Hometown> hometownsOfFriends() {
        return friends.parallelStream()
            .map(f -> cpuIntensiveOperation())
            .collect(Collectors.toSet());
    }
Parallel streams = simple concurrency
Potentially uses N cores ⇒ up to Nx speed up
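A small sketch of the same switch on a CPU-bound task, counting primes. `ParallelStreamExample` and `isPrime` are hypothetical; the sequential and parallel pipelines return the same result, and any speed up depends on the workload and the number of available cores.

```java
import java.util.stream.LongStream;

class ParallelStreamExample {
    // Deliberately simple trial division, used as the CPU-intensive operation.
    static boolean isPrime(long n) {
        if (n < 2) return false;
        for (long i = 2; i * i <= n; i++) {
            if (n % i == 0) return false;
        }
        return true;
    }

    // The only difference between the two modes is the call to parallel().
    static long countPrimes(long limit, boolean parallel) {
        LongStream range = LongStream.rangeClosed(1, limit);
        if (parallel) {
            range = range.parallel();
        }
        return range.filter(ParallelStreamExample::isPrime).count();
    }

    public static void main(String[] args) {
        System.out.println(countPrimes(100, false)); // 25
        System.out.println(countPrimes(100, true));  // 25
    }
}
```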
Let’s imagine that you are writing code to display the products in a user’s wish list
The need for concurrency
Step #1
Web service request to get the user profile including wish list (list of product Ids)
Step #2
For each productId: web service request to get product info
Sequentially ⇒ terrible response time
Need to fetch productInfo concurrently
Futures are a great concurrency abstraction
http://en.wikipedia.org/wiki/Futures_and_promises
How futures work
[Diagram: on the main thread, the Client initiates an Asynchronous operation (worker thread or event-driven) and gets the Outcome from the Future; the Asynchronous operation sets the Outcome]
Benefits
Simple way for two concurrent activities to communicate safely
Abstraction:
Client does not know how the asynchronous operation is implemented
Easy to implement scatter/gather:
Scatter: Client can invoke multiple asynchronous operations and gets a Future for each one.
Gather: Get values from the futures
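Scatter/gather with plain `java.util.concurrent.Future` might look like the sketch below. `ScatterGatherExample` and `fetchProductName` are hypothetical stand-ins for a remote call, and the pool size and one-second timeout are arbitrary choices.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class ScatterGatherExample {
    // Hypothetical stand-in for a slow web service request.
    static String fetchProductName(long productId) {
        return "product-" + productId;
    }

    static List<String> fetchAll(List<Long> productIds) {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            // Scatter: start every request and keep a Future for each one.
            List<Future<String>> futures = new ArrayList<>();
            for (long id : productIds) {
                futures.add(executor.submit(() -> fetchProductName(id)));
            }
            // Gather: block on each Future in turn until its value is available.
            List<String> names = new ArrayList<>();
            for (Future<String> future : futures) {
                names.add(future.get(1, TimeUnit.SECONDS));
            }
            return names;
        } catch (InterruptedException | ExecutionException | TimeoutException e) {
            throw new IllegalStateException("product lookup failed", e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(fetchAll(Arrays.asList(1L, 2L, 3L))); // [product-1, product-2, product-3]
    }
}
```

The gather loop preserves the order of the requests, because it walks the list of futures rather than waiting for whichever completes first.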
Example wish list service
public interface UserService {
    Future<UserProfile> getUserProfile(long userId);
}
public class UserServiceProxy implements UserService {
    private ExecutorService executorService;
    @Override
    public Future<UserProfile> getUserProfile(long userId) {
        return executorService.submit(() ->
            restfulGet("http://uservice/user/" + userId, UserProfile.class));
    }
    ...
}
public interface ProductInfoService {
    Future<ProductInfo> getProductInfo(long productId);
}
Example wish list service
public class WishlistService {
    private UserService userService;
    private ProductInfoService productInfoService;
    public Wishlist getWishlistDetails(long userId) throws Exception {
        // get user info
        Future<UserProfile> userProfileFuture = userService.getUserProfile(userId);
        UserProfile userProfile = userProfileFuture.get(300, TimeUnit.MILLISECONDS);
        // asynchronously get all products
        List<Future<ProductInfo>> productInfoFutures =
            userProfile.getWishListProductIds().stream()
                .map(productInfoService::getProductInfo)
                .collect(Collectors.toList());
        long deadline = System.currentTimeMillis() + 300;
        // wait for product info
        List<ProductInfo> products = new ArrayList<ProductInfo>();
        for (Future<ProductInfo> pif : productInfoFutures) {
            long timeout = deadline - System.currentTimeMillis();
            if (timeout <= 0) throw new TimeoutException(...);
            products.add(pif.get(timeout, TimeUnit.MILLISECONDS));
        }
        ...
        return new Wishlist(products);
    }
It works BUT
Code is very low-level and messy
And, it’s blocking
Better: Futures with callbacks ⇒ no blocking!
Guava ListenableFutures, Spring 4 ListenableFuture, Java 8 CompletableFuture, Scala Futures
def asyncSquare(x : Int) : Future[Int] = ... x * x...
val f = asyncSquare(25)
// Partial function applied to successful outcome
f onSuccess { case x : Int => println(x) }
// Applied to failed outcome
f onFailure { case e : Exception => println("exception thrown") }
But callback-based scatter/gather
⇒ Messy, tangled code (aka callback hell)
Functional futures - map
def asyncPlus(x : Int, y : Int) = ... x + y ...
val future2 = asyncPlus(4, 5).map{ _ * 3 }
assertEquals(27, Await.result(future2, 1 second))
Scala, Java 8 CompletableFuture
Asynchronously transforms future
Functional futures - flatMap()
val f2 = asyncPlus(5, 8).flatMap { x => asyncSquare(x) }
assertEquals(169, Await.result(f2, 1 second))
Scala, Java 8 CompletableFuture (partially)
Calls asyncSquare() with the eventual outcome of asyncPlus()
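In Java 8, the closest equivalent of flatMap() on a CompletableFuture is thenCompose(). The sketch below assumes hypothetical asyncPlus() and asyncSquare() helpers modelled on the Scala ones; it is an illustration, not code from the talk.

```java
import java.util.concurrent.CompletableFuture;

class FlatMapDemo {
    // Hypothetical counterparts of the slide's asyncPlus() and asyncSquare().
    static CompletableFuture<Integer> asyncPlus(int x, int y) {
        return CompletableFuture.supplyAsync(() -> x + y);
    }

    static CompletableFuture<Integer> asyncSquare(int x) {
        return CompletableFuture.supplyAsync(() -> x * x);
    }

    // thenCompose() acts like flatMap(): the function itself returns a future,
    // and the result is flattened into a single CompletableFuture<Integer>.
    static CompletableFuture<Integer> plusThenSquare(int x, int y) {
        return asyncPlus(x, y).thenCompose(FlatMapDemo::asyncSquare);
    }

    public static void main(String[] args) {
        System.out.println(plusThenSquare(5, 8).join());
    }
}
```

Using thenApply() here instead would yield a nested CompletableFuture<CompletableFuture<Integer>>, which is exactly what flatMap() avoids.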
![Page 67: Map(), flatmap() and reduce() are your new best friends: simpler collections, concurrency, and big data (jax, jax2014)](https://reader033.fdocuments.in/reader033/viewer/2022060108/554d1fbdb4c905ca208b4ab1/html5/thumbnails/67.jpg)
flatMap() is asynchronous
f2 = f1 flatMap (someFn)
[Diagram: f1 completes with Outcome1 → someFn(outcome1) returns f3 → f3 completes with Outcome3 → f2 completes with Outcome3]
Implemented using callbacks
![Page 68: Map(), flatmap() and reduce() are your new best friends: simpler collections, concurrency, and big data (jax, jax2014)](https://reader033.fdocuments.in/reader033/viewer/2022060108/554d1fbdb4c905ca208b4ab1/html5/thumbnails/68.jpg)
Scala wishlist service
class WishListService(...) {
  def getWishList(userId : Long) : Future[WishList] = {
    userService.getUserProfile(userId) flatMap { userProfile =>
      val listOfProductFutures : List[Future[ProductInfo]] =
        userProfile.wishListProductIds
          .map { productInfoService.getProductInfo }
      val futureOfProductsList : Future[List[ProductInfo]] =
        Future.sequence(listOfProductFutures)
      val wishlist = futureOfProductsList.map { products =>
        WishList(products) }
      val timeoutFuture = ...
      Future.firstCompletedOf(Seq(wishlist, timeoutFuture))
    }
  }
}
![Page 69: Map(), flatmap() and reduce() are your new best friends: simpler collections, concurrency, and big data (jax, jax2014)](https://reader033.fdocuments.in/reader033/viewer/2022060108/554d1fbdb4c905ca208b4ab1/html5/thumbnails/69.jpg)
Using Java 8 CompletableFutures
public class UserServiceImpl implements UserService {
  @Override
  public CompletableFuture<UserInfo> getUserInfo(long userId) {
    return CompletableFuture.supplyAsync(
      () -> httpGetRequest("http://myuservice/user" + userId, UserInfo.class));
  }
}
Runs in ExecutorService
![Page 70: Map(), flatmap() and reduce() are your new best friends: simpler collections, concurrency, and big data (jax, jax2014)](https://reader033.fdocuments.in/reader033/viewer/2022060108/554d1fbdb4c905ca208b4ab1/html5/thumbnails/70.jpg)
Using Java 8 CompletableFutures
public CompletableFuture<Wishlist> getWishlistDetails(long userId) {
  return userService.getUserProfile(userId).thenComposeAsync(userProfile -> {
    Stream<CompletableFuture<ProductInfo>> s1 =
      userProfile.getWishListProductIds()
        .stream()
        .map(productInfoService::getProductInfo);
    Stream<CompletableFuture<List<ProductInfo>>> s2 =
      s1.map(fOfPi -> fOfPi.thenApplyAsync(pi -> Arrays.asList(pi)));
    CompletableFuture<List<ProductInfo>> productInfos = s2
      .reduce((f1, f2) -> f1.thenCombine(f2, ListUtils::union))
      .orElse(CompletableFuture.completedFuture(Collections.emptyList()));
    return productInfos.thenApply(list -> new Wishlist());
  });
}
Java 8 is missing Future.sequence()
flatMap()!
map()!
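Since Java 8 has no built-in Future.sequence(), one common workaround is a small helper built on CompletableFuture.allOf(). The sketch below is an alternative to the reduce()-based approach on this slide, not the speaker's code; the class name FutureSequence and the method sequence() are hypothetical and not part of the JDK.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

class FutureSequence {
    // Hypothetical helper (not in the JDK): turns a list of futures into a
    // future of a list, preserving the input order.
    static <T> CompletableFuture<List<T>> sequence(List<CompletableFuture<T>> futures) {
        // allOf() completes when every input future has completed.
        CompletableFuture<Void> allDone =
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]));
        // At that point join() on each input returns immediately.
        return allDone.thenApply(ignored ->
            futures.stream().map(CompletableFuture::join).collect(Collectors.toList()));
    }

    public static void main(String[] args) {
        List<CompletableFuture<Integer>> futures = Arrays.asList(
            CompletableFuture.supplyAsync(() -> 1),
            CompletableFuture.supplyAsync(() -> 2));
        System.out.println(sequence(futures).join());
    }
}
```

If any input future fails, the future returned by allOf() fails too, so the combined future completes exceptionally rather than returning a partial list.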
![Page 71: Map(), flatmap() and reduce() are your new best friends: simpler collections, concurrency, and big data (jax, jax2014)](https://reader033.fdocuments.in/reader033/viewer/2022060108/554d1fbdb4c905ca208b4ab1/html5/thumbnails/71.jpg)
Introducing Reactive Extensions (Rx)
The Reactive Extensions (Rx) is a library for composing asynchronous and event-based programs using observable sequences and LINQ-style query operators. Using Rx, developers represent asynchronous data streams with Observables, query asynchronous data streams using LINQ operators, and .....
https://rx.codeplex.com/
![Page 72: Map(), flatmap() and reduce() are your new best friends: simpler collections, concurrency, and big data (jax, jax2014)](https://reader033.fdocuments.in/reader033/viewer/2022060108/554d1fbdb4c905ca208b4ab1/html5/thumbnails/72.jpg)
About RxJava
Reactive Extensions (Rx) for the JVM
Original motivation for Netflix was to provide rich Futures
Implemented in Java
Adaptors for Scala, Groovy and Clojure
https://github.com/Netflix/RxJava
![Page 73: Map(), flatmap() and reduce() are your new best friends: simpler collections, concurrency, and big data (jax, jax2014)](https://reader033.fdocuments.in/reader033/viewer/2022060108/554d1fbdb4c905ca208b4ab1/html5/thumbnails/73.jpg)
RxJava core concepts
trait Observable[T] {
  def subscribe(observer : Observer[T]) : Subscription
  ...
}
trait Observer[T] {
  def onNext(value : T)
  def onCompleted()
  def onError(e : Throwable)
}
Notifies
An asynchronous stream of items
Used to unsubscribe
![Page 74: Map(), flatmap() and reduce() are your new best friends: simpler collections, concurrency, and big data (jax, jax2014)](https://reader033.fdocuments.in/reader033/viewer/2022060108/554d1fbdb4c905ca208b4ab1/html5/thumbnails/74.jpg)
Comparing Observable to...
Observer pattern - similar but adds
Observer.onCompleted()
Observer.onError()
Iterator pattern - mirror image
Push rather than pull
Futures - similar
Can be used as Futures
But Observables = a stream of multiple values
Collections and Streams - similar
Functional API supporting map(), flatMap(), ...
But Observables are asynchronous
![Page 75: Map(), flatmap() and reduce() are your new best friends: simpler collections, concurrency, and big data (jax, jax2014)](https://reader033.fdocuments.in/reader033/viewer/2022060108/554d1fbdb4c905ca208b4ab1/html5/thumbnails/75.jpg)
Fun with observables
val every10Seconds = Observable.interval(10 seconds)
val oneItem = Observable.items(-1L)
val ticker = oneItem ++ every10Seconds
-1 0 1 ...
t=0 t=10 t=20 ...
val subscription = ticker.subscribe { (value: Long) => println("value=" + value) }
...
subscription.unsubscribe()
![Page 76: Map(), flatmap() and reduce() are your new best friends: simpler collections, concurrency, and big data (jax, jax2014)](https://reader033.fdocuments.in/reader033/viewer/2022060108/554d1fbdb4c905ca208b4ab1/html5/thumbnails/76.jpg)
Connecting observables to the outside world
def getTableStatus(tableName: String) : Observable[DynamoDbStatus] =
  Observable { subscriber: Subscriber[DynamoDbMessage] =>
    amazonDynamoDBAsyncClient.describeTableAsync(new DescribeTableRequest(tableName),
      new AsyncHandler[DescribeTableRequest, DescribeTableResult] {
        override def onSuccess(request: DescribeTableRequest, result: DescribeTableResult) = {
          subscriber.onNext(DynamoDbStatus(result.getTable.getTableStatus))
          subscriber.onCompleted()
        }
        override def onError(exception: Exception) = exception match {
          case t: ResourceNotFoundException =>
            subscriber.onNext(DynamoDbStatus("NOT_FOUND"))
            subscriber.onCompleted()
          case _ =>
            subscriber.onError(exception)
        }
      })
  }
![Page 77: Map(), flatmap() and reduce() are your new best friends: simpler collections, concurrency, and big data (jax, jax2014)](https://reader033.fdocuments.in/reader033/viewer/2022060108/554d1fbdb4c905ca208b4ab1/html5/thumbnails/77.jpg)
Transforming observables
val tableStatus = ticker.flatMap { i =>
  logger.info("{}th describe table", i + 1)
  getTableStatus(name)
}
Status1 Status2 Status3 ...
t=0 t=10 t=20 ...
+ Usual collection methods: map(), filter(), take(), drop(), ...
![Page 78: Map(), flatmap() and reduce() are your new best friends: simpler collections, concurrency, and big data (jax, jax2014)](https://reader033.fdocuments.in/reader033/viewer/2022060108/554d1fbdb4c905ca208b4ab1/html5/thumbnails/78.jpg)
Calculating rolling average
class AverageTradePriceCalculator {
  def calculateAverages(trades: Observable[Trade]): Observable[AveragePrice] = { ... }
}
case class Trade(symbol : String, price : Double, quantity : Int ...)
case class AveragePrice(symbol : String, price : Double, ...)
![Page 79: Map(), flatmap() and reduce() are your new best friends: simpler collections, concurrency, and big data (jax, jax2014)](https://reader033.fdocuments.in/reader033/viewer/2022060108/554d1fbdb4c905ca208b4ab1/html5/thumbnails/79.jpg)
Calculating average prices
def calculateAverages(trades: Observable[Trade]): Observable[AveragePrice] = {
  trades.groupBy(_.symbol).map { symbolAndTrades =>
    val (symbol, tradesForSymbol) = symbolAndTrades
    val openingEverySecond =
      Observable.items(-1L) ++ Observable.interval(1 seconds)
    def closingAfterSixSeconds(opening: Any) =
      Observable.interval(6 seconds).take(1)
    tradesForSymbol.window(...).map { windowOfTradesForSymbol =>
      windowOfTradesForSymbol.fold((0.0, 0, List[Double]())) { (soFar, trade) =>
        val (sum, count, prices) = soFar
        (sum + trade.price, count + trade.quantity, trade.price +: prices)
      } map { x =>
        val (sum, length, prices) = x
        AveragePrice(symbol, sum / length, prices)
      }
    }.flatten
  }.flatten
}
![Page 80: Map(), flatmap() and reduce() are your new best friends: simpler collections, concurrency, and big data (jax, jax2014)](https://reader033.fdocuments.in/reader033/viewer/2022060108/554d1fbdb4c905ca208b4ab1/html5/thumbnails/80.jpg)
Agenda
Why functional programming?
Simplifying collection processing
Eliminating NullPointerExceptions
Simplifying concurrency with Futures and Rx Observables
Tackling big data problems with functional programming
![Page 81: Map(), flatmap() and reduce() are your new best friends: simpler collections, concurrency, and big data (jax, jax2014)](https://reader033.fdocuments.in/reader033/viewer/2022060108/554d1fbdb4c905ca208b4ab1/html5/thumbnails/81.jpg)
Let’s imagine that you want to count word frequencies
![Page 82: Map(), flatmap() and reduce() are your new best friends: simpler collections, concurrency, and big data (jax, jax2014)](https://reader033.fdocuments.in/reader033/viewer/2022060108/554d1fbdb4c905ca208b4ab1/html5/thumbnails/82.jpg)
Scala Word Count
val frequency : Map[String, Int] =
  Source.fromFile("gettysburgaddress.txt").getLines()
    .flatMap { _.split(" ") }.toList
    .groupBy(identity)
    .mapValues(_.length)
frequency("THE") should be(11)
frequency("LIBERTY") should be(1)
Map
Reduce
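The same map-then-reduce shape can be expressed with Java 8 Streams. This is a minimal sketch rather than a port of the slide: the class name WordCount and the method frequency() are hypothetical, and it counts words from in-memory lines instead of reading gettysburgaddress.txt.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class WordCount {
    // Hypothetical helper: counts how often each space-separated word occurs.
    static Map<String, Long> frequency(Stream<String> lines) {
        return lines
            // "Map" step: each line becomes a stream of words, flattened into one stream.
            .flatMap(line -> Arrays.stream(line.split(" ")))
            // "Reduce" step: group identical words and count each group.
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<String, Long> freq = frequency(Stream.of("the cat and the dog", "the end"));
        System.out.println(freq.get("the"));
    }
}
```

Like the Scala version, this splits on single spaces only and is case-sensitive, so punctuation and capitalization are not normalized.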
![Page 83: Map(), flatmap() and reduce() are your new best friends: simpler collections, concurrency, and big data (jax, jax2014)](https://reader033.fdocuments.in/reader033/viewer/2022060108/554d1fbdb4c905ca208b4ab1/html5/thumbnails/83.jpg)
But how to scale to a cluster of machines?
![Page 84: Map(), flatmap() and reduce() are your new best friends: simpler collections, concurrency, and big data (jax, jax2014)](https://reader033.fdocuments.in/reader033/viewer/2022060108/554d1fbdb4c905ca208b4ab1/html5/thumbnails/84.jpg)
Apache Hadoop
Open-source software for reliable, scalable, distributed computing
Hadoop Distributed File System (HDFS)
Efficiently stores very large amounts of data
Files are partitioned and replicated across multiple machines
Hadoop MapReduce
Batch processing system
Provides plumbing for writing distributed jobs
Handles failures
...
![Page 85: Map(), flatmap() and reduce() are your new best friends: simpler collections, concurrency, and big data (jax, jax2014)](https://reader033.fdocuments.in/reader033/viewer/2022060108/554d1fbdb4c905ca208b4ab1/html5/thumbnails/85.jpg)
Overview of MapReduce
[Diagram: Input Data → three Mappers, each reading (K,V) and emitting (K,V)* → Shuffle, which groups values by key as (K1,V, ....)*, (K2,V, ....)*, (K3,V, ....)* → three Reducers, each emitting (K,V) → Output Data]
![Page 86: Map(), flatmap() and reduce() are your new best friends: simpler collections, concurrency, and big data (jax, jax2014)](https://reader033.fdocuments.in/reader033/viewer/2022060108/554d1fbdb4c905ca208b4ab1/html5/thumbnails/86.jpg)
MapReduce Word count - mapper
class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
  private final static IntWritable one = new IntWritable(1);
  private Text word = new Text();
  // Emits (word, 1) for every token in the input line.
  public void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    String line = value.toString();
    StringTokenizer tokenizer = new StringTokenizer(line);
    while (tokenizer.hasMoreTokens()) {
      word.set(tokenizer.nextToken());
      context.write(word, one);
    }
  }
}
Four score and seven years ⇒ (“Four”, 1), (“score”, 1), (“and”, 1), (“seven”, 1), ...
http://wiki.apache.org/hadoop/WordCount
![Page 87: Map(), flatmap() and reduce() are your new best friends: simpler collections, concurrency, and big data (jax, jax2014)](https://reader033.fdocuments.in/reader033/viewer/2022060108/554d1fbdb4c905ca208b4ab1/html5/thumbnails/87.jpg)
Hadoop then shuffles the key-value pairs...
![Page 88: Map(), flatmap() and reduce() are your new best friends: simpler collections, concurrency, and big data (jax, jax2014)](https://reader033.fdocuments.in/reader033/viewer/2022060108/554d1fbdb4c905ca208b4ab1/html5/thumbnails/88.jpg)
MapReduce Word count - reducer
class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
  // Sums the counts emitted for one word and writes (word, total).
  public void reduce(Text key, Iterable<IntWritable> values, Context context)
      throws IOException, InterruptedException {
    int sum = 0;
    for (IntWritable val : values) {
      sum += val.get();
    }
    context.write(key, new IntWritable(sum));
  }
}
(“the”, (1, 1, 1, 1, 1, 1, ...)) ⇒ (“the”, 11)
http://wiki.apache.org/hadoop/WordCount
![Page 89: Map(), flatmap() and reduce() are your new best friends: simpler collections, concurrency, and big data (jax, jax2014)](https://reader033.fdocuments.in/reader033/viewer/2022060108/554d1fbdb4c905ca208b4ab1/html5/thumbnails/89.jpg)
About MapReduce
Very simple programming abstraction yet incredibly powerful
By chaining together multiple map/reduce jobs you can process very large amounts of data
e.g. Apache Mahout for machine learning
But
Mappers and Reducers = verbose code
Development is challenging, e.g. unit testing is difficult
It’s disk-based, batch processing ⇒ slow
![Page 90: Map(), flatmap() and reduce() are your new best friends: simpler collections, concurrency, and big data (jax, jax2014)](https://reader033.fdocuments.in/reader033/viewer/2022060108/554d1fbdb4c905ca208b4ab1/html5/thumbnails/90.jpg)
Scalding: Scala DSL for MapReduce
class WordCountJob(args : Args) extends Job(args) {
  TextLine( args("input") )
    .flatMap('line -> 'word) { line : String => tokenize(line) }
    .groupBy('word) { _.size }
    .write( Tsv( args("output") ) )
  def tokenize(text : String) : Array[String] = {
    text.toLowerCase.replaceAll("[^a-zA-Z0-9\\s]", "")
      .split("\\s+")
  }
}
https://github.com/twitter/scalding
Expressive and unit testable
Each row is a map of named fields
![Page 91: Map(), flatmap() and reduce() are your new best friends: simpler collections, concurrency, and big data (jax, jax2014)](https://reader033.fdocuments.in/reader033/viewer/2022060108/554d1fbdb4c905ca208b4ab1/html5/thumbnails/91.jpg)
Apache Spark
Part of the Hadoop ecosystem
Key abstraction = Resilient Distributed Datasets (RDD)
Collection that is partitioned across cluster members
Operations are parallelized
Created from either a Scala collection or a Hadoop supported datasource - HDFS, S3 etc
Can be cached in-memory for super-fast performance
Can be replicated for fault-tolerance
http://spark.apache.org
![Page 92: Map(), flatmap() and reduce() are your new best friends: simpler collections, concurrency, and big data (jax, jax2014)](https://reader033.fdocuments.in/reader033/viewer/2022060108/554d1fbdb4c905ca208b4ab1/html5/thumbnails/92.jpg)
Spark Word Count
val sc = new SparkContext(...)
sc.textFile("s3n://mybucket/...")
  .flatMap { _.split(" ") }
  .groupBy(identity)
  .mapValues(_.length)
  .toArray.toMap
Expressive, unit testable and very fast
![Page 93: Map(), flatmap() and reduce() are your new best friends: simpler collections, concurrency, and big data (jax, jax2014)](https://reader033.fdocuments.in/reader033/viewer/2022060108/554d1fbdb4c905ca208b4ab1/html5/thumbnails/93.jpg)
Summary
Functional programming enables the elegant expression of good ideas in a wide variety of domains
map(), flatMap() and reduce() are remarkably versatile higher-order functions
Use FP and OOP together
Java 8 has taken a good first step towards supporting FP