Java 8 Stream API. A different way to process collections.

Posted on 27-Aug-2014


A look at one of the features of Java 8 hidden behind the lambdas: a different way to iterate over Collections. You'll never see Collections the same way again. These are the slides I used in my talk at the "Tech Thursday" by Oracle in June in Madrid.

Transcript of Java 8 Stream API. A different way to process collections.

Java 8 Stream API
A different way to process collections
David Gómez G.
@dgomezg
dgomezg@autentia.com

Streams? What’s that?

A Stream is…
A convenient way to iterate over collections in a declarative way

    List<Integer> numbers = new ArrayList<Integer>();
    for (int i = 0; i < 100; i++) {
        numbers.add(i);
    }

    List<Integer> evenNumbers = new ArrayList<>();
    for (int i : numbers) {
        if (i % 2 == 0) {
            evenNumbers.add(i);
        }
    }

The same filter, written with a Stream:

    // toList() is a static import of java.util.stream.Collectors.toList
    List<Integer> evenNumbers = numbers.stream()
        .filter(n -> n % 2 == 0)
        .collect(toList());

So… Streams are collections? Not really.

Collections                  Streams
---------------------------  ---------------------------
Sequence of elements         Sequence of elements
Computed at construction     Computed at iteration
In-memory data structure     Traversable only once
External iteration           Internal iteration
Finite size                  Possibly infinite size

Iterating a Collection

    List<Integer> evenNumbers = new ArrayList<>();
    for (int i : numbers) {
        if (i % 2 == 0) {
            evenNumbers.add(i);
        }
    }

External iteration
 - Use a for-each loop or an Iterator
 - Very verbose
Parallelism by manually using Threads
 - Concurrency is hard to get right!
 - Lots of contention, and error-prone
 - Thread-safety

Iterating a Stream

    List<Integer> evenNumbers = numbers.stream()
        .filter(n -> n % 2 == 0)
        .collect(toList());

Internal iteration
 - No manual Iterator handling
 - Concise
 - Fluent API: chain sequence processing
Elements computed only when needed

Iterating a Stream

    List<Integer> evenNumbers = numbers.parallelStream()
        .filter(n -> n % 2 == 0)
        .collect(toList());

Easy parallelism
 - Concurrency is hard to get right!
 - Uses ForkJoin
 - Processing steps should be
    - stateless
    - independent
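As a rough illustration of the "stateless and independent" advice, the sketch below runs the same filter sequentially and in parallel and checks that both give the same list. The class and method names (`ParallelEvens`, `evens`) are made up for this example; they are not from the slides.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class ParallelEvens {

    // The filter only looks at its own argument (stateless, independent),
    // so the pipeline can be split across threads safely.
    // Adding to a shared ArrayList from inside the lambda would NOT be safe.
    static List<Integer> evens(List<Integer> numbers, boolean parallel) {
        Stream<Integer> source = parallel ? numbers.parallelStream() : numbers.stream();
        return source
                .filter(n -> n % 2 == 0)
                .collect(Collectors.toList()); // the collector merges partial results in encounter order
    }

    public static void main(String[] args) {
        List<Integer> numbers = IntStream.range(0, 100).boxed().collect(Collectors.toList());
        System.out.println(evens(numbers, false).equals(evens(numbers, true))); // prints "true"
    }
}
```

Letting `collect` build the result, instead of mutating a shared list, is what keeps the parallel version correct without any explicit locking.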

Lambdas & Method References

@FunctionalInterface

    @FunctionalInterface
    public interface Predicate<T> {

        boolean test(T t);

        default Predicate<T> negate() {
            return (t) -> !test(t);
        }
    }

An interface with exactly one abstract method.
It can have default methods, though!
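A small sketch of both points: a lambda stands in for the single abstract method, and default methods come along for free. `Validator` is a hypothetical interface invented for this example; `Predicate.negate()` is the real JDK default method shown on the slide.

```java
import java.util.function.Predicate;

class FunctionalInterfaceDemo {

    // Hypothetical interface for illustration: one abstract method plus a default one.
    @FunctionalInterface
    interface Validator<T> {
        boolean isValid(T value);

        default Validator<T> not() {
            return value -> !isValid(value);
        }
    }

    static boolean isOdd(int n) {
        Predicate<Integer> even = x -> x % 2 == 0; // the lambda implements test(T)
        return even.negate().test(n);              // negate() is a default method
    }

    static boolean isBlank(String s) {
        Validator<String> nonBlank = str -> !str.trim().isEmpty();
        return nonBlank.not().isValid(s);
    }

    public static void main(String[] args) {
        System.out.println(isOdd(3));      // prints "true"
        System.out.println(isBlank("  ")); // prints "true"
    }
}
```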

Lambda Types
Based on the abstract method signature of the @FunctionalInterface:
(Arguments) -> <return type>

    @FunctionalInterface
    public interface Predicate<T> {
        boolean test(T t);
    }

T -> boolean

    @FunctionalInterface
    public interface Runnable {
        void run();
    }

() -> void

    @FunctionalInterface
    public interface Supplier<T> {
        T get();
    }

() -> T

    @FunctionalInterface
    public interface BiFunction<T, U, R> {
        R apply(T t, U u);
    }

(T, U) -> R

    @FunctionalInterface
    public interface Comparator<T> {
        int compare(T o1, T o2);
    }

(T, T) -> int
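To make those shapes concrete, here is a minimal sketch that assigns one lambda to each of the interfaces above. The class name and constant names are invented for the example.

```java
import java.util.Comparator;
import java.util.function.BiFunction;
import java.util.function.Predicate;
import java.util.function.Supplier;

class LambdaTypesDemo {

    // T -> boolean
    static final Predicate<String> IS_EMPTY = s -> s.isEmpty();

    // () -> T
    static final Supplier<String> GREETING = () -> "hello";

    // (T, U) -> R
    static final BiFunction<String, Integer, Character> CHAR_AT = (s, i) -> s.charAt(i);

    // (T, T) -> int
    static final Comparator<String> BY_LENGTH = (a, b) -> Integer.compare(a.length(), b.length());

    public static void main(String[] args) {
        // () -> void
        Runnable task = () -> System.out.println(GREETING.get());
        task.run();                                         // prints "hello"
        System.out.println(IS_EMPTY.test(""));              // prints "true"
        System.out.println(CHAR_AT.apply("abc", 1));        // prints "b"
        System.out.println(BY_LENGTH.compare("a", "bb") < 0); // prints "true"
    }
}
```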

Method References

Let you use a method name as a lambda.
Usually more readable.

Syntax: <TargetReference>::<MethodName>
TargetReference: an instance or a class

Lambda                                  Method reference
--------------------------------------  ----------------------
phoneCall -> phoneCall.getContact()     PhoneCall::getContact
() -> Thread.currentThread()            Thread::currentThread
(str, c) -> str.indexOf(c)              String::indexOf
(String s) -> System.out.println(s)     System.out::println
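The lambda/method-reference pairs above can be tried out with a sketch like the following. The class and constant names are assumptions made for the example; the referenced methods are all standard JDK methods.

```java
import java.util.function.BiFunction;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

class MethodReferenceDemo {

    // Instance method of an arbitrary object: s -> s.length()
    static final Function<String, Integer> LENGTH = String::length;

    // Instance method with an extra argument: (str, c) -> str.indexOf(c)
    static final BiFunction<String, String, Integer> INDEX_OF = String::indexOf;

    // Static method: () -> Thread.currentThread()
    static final Supplier<Thread> CURRENT = Thread::currentThread;

    // Static method with an argument: s -> Integer.parseInt(s)
    static final Function<String, Integer> PARSE = Integer::parseInt;

    public static void main(String[] args) {
        // Method of one particular instance: s -> System.out.println(s)
        Consumer<String> print = System.out::println;
        print.accept("length of 'four' = " + LENGTH.apply("four"));
        print.accept("index of 'e' in 'stream' = " + INDEX_OF.apply("stream", "e"));
        print.accept("parsed = " + PARSE.apply("42"));
    }
}
```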

From Collections to Streams

Characteristics of a Stream

 • Interface to a sequence of elements
 • Focused on processing (not on storage)
 • Elements computed on demand (or extracted from the source)
 • Can be traversed only once
 • Internal iteration
 • Parallel support
 • Could be infinite

Anatomy of a Stream

[Diagram: Source -> Intermediate operations (filter, map, order, function) -> Final operation. The whole chain is the pipeline.]

Anatomy of Stream Iteration
1. Start from the data source (usually a collection) and create the Stream

    List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
    Stream<Integer> numbersStream = numbers.stream();

Anatomy of Stream Iteration
2. Add a chain of intermediate operations (Stream pipeline)

    Stream<Integer> numbersStream = numbers.stream()
        .filter(new Predicate<Integer>() {
            @Override
            public boolean test(Integer number) {
                return number % 2 == 0;
            }
        })
        .map(new Function<Integer, Integer>() {
            @Override
            public Integer apply(Integer number) {
                return number * 2;
            }
        });

Anatomy of Stream Iteration
2. Add a chain of intermediate operations (Stream pipeline) - better using lambdas

    Stream<Integer> numbersStream = numbers.stream()
        .filter(number -> number % 2 == 0)
        .map(number -> number * 2);

Anatomy of Stream Iteration
3. Close with a terminal operation

    List<Integer> numbersStream = numbers.stream()
        .filter(number -> number % 2 == 0)
        .map(number -> number * 2)
        .collect(Collectors.toList());

 • The terminal operation triggers the Stream iteration.
 • Before that, nothing is computed.
 • Depending on the terminal operation, the stream may or may not be fully traversed.
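The laziness described in those three bullets can be observed by counting how many elements the filter actually inspects. This is only a sketch; `LazyPipelineDemo` and `countInspected` are hypothetical names, and the counts assume a sequential stream.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyPipelineDemo {

    // Builds filter + map over 1..6, hands the pipeline to the given
    // terminal step, and reports how many elements the filter inspected.
    static int countInspected(Consumer<Stream<Integer>> terminalStep) {
        AtomicInteger inspected = new AtomicInteger();
        Stream<Integer> pipeline = Stream.of(1, 2, 3, 4, 5, 6)
                .filter(n -> {
                    inspected.incrementAndGet();
                    return n % 2 == 0;
                })
                .map(n -> n * 2);
        terminalStep.accept(pipeline);
        return inspected.get();
    }

    public static void main(String[] args) {
        // No terminal operation: nothing is computed.
        System.out.println(countInspected(s -> { }));                           // prints "0"
        // Short-circuiting terminal operation: stops at the first match (1, then 2).
        System.out.println(countInspected(s -> s.findFirst()));                 // prints "2"
        // Full traversal.
        System.out.println(countInspected(s -> s.collect(Collectors.toList()))); // prints "6"
    }
}
```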

Stream operations

Operation Types

Intermediate operations
 • Always return a Stream
 • Chain as many as needed (pipeline)
 • Guide the processing of data
 • Do not start processing
 • Can be stateless or stateful

Terminal operations
 • Can return an object, a collection, or void
 • Start the pipeline process
 • After execution, the Stream cannot be revisited

Intermediate Operations

    // T -> boolean
    Stream<T> filter(Predicate<? super T> predicate);

    // T -> R
    <R> Stream<R> map(Function<? super T, ? extends R> mapper);

    // (T, T) -> int
    Stream<T> sorted(Comparator<? super T> comparator);
    Stream<T> sorted();

    // T -> void
    Stream<T> peek(Consumer<? super T> action);

    Stream<T> distinct();
    Stream<T> limit(long maxSize);
    Stream<T> skip(long n);
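A short sketch chaining several of the intermediate operations listed above. The method name `secondToFourthSmallest` and the sample values are made up for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class IntermediateOpsDemo {

    static List<Integer> secondToFourthSmallest(List<Integer> values) {
        return values.stream()
                .distinct()  // stateful: has to remember what it has already seen
                .sorted()    // stateful: needs every element before emitting the first
                .skip(1)     // drop the smallest
                .limit(3)    // keep the next three
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(5, 3, 5, 1, 4, 3, 2);
        System.out.println(secondToFourthSmallest(values)); // prints "[2, 3, 4]"
    }
}
```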

Final Operations

    Object[] toArray();

    // T -> void
    void forEach(Consumer<? super T> action);

    <R, A> R collect(Collector<? super T, A, R> collector);

    java.util.stream.Collectors.toList();
    java.util.stream.Collectors.toSet();
    java.util.stream.Collectors.toMap(keyMapper, valueMapper);
    java.util.stream.Collectors.joining(CharSequence);
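The four collectors named above could be used roughly like this. The word list and method names are assumptions for the example; note that `toMap` throws on duplicate keys, so the sketch applies `distinct()` first.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class CollectorsDemo {

    static final List<String> WORDS = Arrays.asList("stream", "api", "java", "api");

    static List<String> asList() {
        return WORDS.stream().collect(Collectors.toList());
    }

    static Set<String> asSet() {
        return WORDS.stream().collect(Collectors.toSet()); // duplicates collapse
    }

    static Map<String, Integer> lengths() {
        return WORDS.stream()
                .distinct() // toMap(key, value) rejects duplicate keys
                .collect(Collectors.toMap(word -> word, String::length));
    }

    static String joined() {
        return WORDS.stream().collect(Collectors.joining(", "));
    }

    public static void main(String[] args) {
        System.out.println(asList().size());         // prints "4"
        System.out.println(asSet().size());          // prints "3"
        System.out.println(lengths().get("stream")); // prints "6"
        System.out.println(joined());                // prints "stream, api, java, api"
    }
}
```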

Final Operations (II)

    // (T, T) -> T
    Optional<T> reduce(BinaryOperator<T> accumulator);

    // (T, T) -> int
    Optional<T> min(Comparator<? super T> comparator);

    // (T, T) -> int
    Optional<T> max(Comparator<? super T> comparator);

    long count();
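A minimal sketch of these reductions; the class and method names are invented for the example. The `Optional` results are empty when the stream has no elements, which is why these operations do not return a plain value.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class ReduceDemo {

    static Optional<Integer> sum(List<Integer> values) {
        return values.stream().reduce(Integer::sum); // (T, T) -> T
    }

    static Optional<Integer> smallest(List<Integer> values) {
        return values.stream().min(Comparator.naturalOrder());
    }

    static Optional<Integer> largest(List<Integer> values) {
        return values.stream().max(Comparator.naturalOrder());
    }

    static long howMany(List<Integer> values) {
        return values.stream().count();
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(3, 9, 4);
        System.out.println(sum(values).get());      // prints "16"
        System.out.println(smallest(values).get()); // prints "3"
        System.out.println(largest(values).get());  // prints "9"
        System.out.println(howMany(values));        // prints "3"
        System.out.println(sum(Collections.<Integer>emptyList()).isPresent()); // prints "false"
    }
}
```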

Final Operations (and III)

    // T -> boolean
    boolean anyMatch(Predicate<? super T> predicate);
    boolean allMatch(Predicate<? super T> predicate);
    boolean noneMatch(Predicate<? super T> predicate);
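The three matchers might be exercised as in the sketch below. The city list reuses names from the sample contacts on the later slides; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class MatchDemo {

    static final List<String> CITIES = Arrays.asList("Madrid", "Santiago", "Chania", "Munich");

    static boolean phoned(String city) {
        return CITIES.stream().anyMatch(city::equals); // stops at the first match
    }

    static boolean allCapitalized() {
        return CITIES.stream().allMatch(c -> Character.isUpperCase(c.charAt(0)));
    }

    static boolean noneEmpty() {
        return CITIES.stream().noneMatch(String::isEmpty);
    }

    public static void main(String[] args) {
        System.out.println(phoned("Paris"));  // prints "false"
        System.out.println(phoned("Munich")); // prints "true"
        System.out.println(allCapitalized()); // prints "true"
        System.out.println(noneEmpty());      // prints "true"
        // On an empty stream allMatch is true, whatever the predicate.
        System.out.println(Collections.<String>emptyList().stream().allMatch(s -> false)); // prints "true"
    }
}
```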

Usage examples - Context

    public class Contact {
        private final String name;
        private final String city;
        private final String phoneNumber;
        private final LocalDate birth;

        public int getAge() {
            return Period.between(birth, LocalDate.now())
                         .getYears();
        }
        // Constructor and getters omitted
    }

Usage examples - Context

    public class PhoneCall {
        private final Contact contact;
        private final LocalDate time;
        private final Duration duration;
        // Constructor and getters omitted
    }

    Contact me = new Contact("dgomezg", "Madrid", "555 55 55 55", LocalDate.of(1975, Month.MARCH, 26));
    Contact martin = new Contact("Martin", "Santiago", "666 66 66 66", LocalDate.of(1978, Month.JANUARY, 17));
    Contact roberto = new Contact("Roberto", "Santiago", "111 11 11 11", LocalDate.of(1973, Month.MAY, 11));
    Contact heinz = new Contact("Heinz", "Chania", "444 44 44 44", LocalDate.of(1972, Month.APRIL, 29));
    Contact michael = new Contact("michael", "Munich", "222 22 22 22", LocalDate.of(1976, Month.DECEMBER, 8));

    List<PhoneCall> phoneCallLog = Arrays.asList(
        new PhoneCall(heinz, LocalDate.of(2014, Month.MAY, 28), Duration.ofSeconds(125)),
        new PhoneCall(martin, LocalDate.of(2014, Month.MAY, 30), Duration.ofMinutes(5)),
        new PhoneCall(roberto, LocalDate.of(2014, Month.MAY, 30), Duration.ofMinutes(12)),
        new PhoneCall(michael, LocalDate.of(2014, Month.MAY, 28), Duration.ofMinutes(3)),
        new PhoneCall(michael, LocalDate.of(2014, Month.MAY, 29), Duration.ofSeconds(90)),
        new PhoneCall(heinz, LocalDate.of(2014, Month.MAY, 30), Duration.ofSeconds(365)),
        new PhoneCall(heinz, LocalDate.of(2014, Month.JUNE, 1), Duration.ofMinutes(7)),
        new PhoneCall(martin, LocalDate.of(2014, Month.JUNE, 2), Duration.ofSeconds(315)));

People I phoned in June

    phoneCallLog.stream()
        .filter(phoneCall -> phoneCall.getTime().getMonth() == Month.JUNE)
        .map(phoneCall -> phoneCall.getContact().getName())
        .distinct()
        .forEach(System.out::println);

Seconds I talked in May

    Long total = phoneCallLog.stream()
        .filter(phoneCall -> phoneCall.getTime().getMonth() == Month.MAY)
        .map(PhoneCall::getDuration)
        .collect(summingLong(Duration::getSeconds));

Seconds I talked in May

    // reduce(Duration::plus) combines Durations, so the result is an Optional<Duration>
    Optional<Duration> total = phoneCallLog.stream()
        .filter(phoneCall -> phoneCall.getTime().getMonth() == Month.MAY)
        .map(PhoneCall::getDuration)
        .reduce(Duration::plus);

    total.ifPresent(duration -> System.out.println(duration.getSeconds()));
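Both variants can be run end to end with a cut-down model. In the sketch below, `Call` is a hypothetical stand-in for the slides' `PhoneCall` (the contact is left out), and the log reuses the dates and durations of the sample data.

```java
import java.time.Duration;
import java.time.LocalDate;
import java.time.Month;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class MaySecondsDemo {

    // Hypothetical, trimmed-down stand-in for PhoneCall (no contact).
    static final class Call {
        private final LocalDate time;
        private final Duration duration;

        Call(LocalDate time, Duration duration) {
            this.time = time;
            this.duration = duration;
        }

        LocalDate getTime() { return time; }
        Duration getDuration() { return duration; }
    }

    static final List<Call> LOG = Arrays.asList(
            new Call(LocalDate.of(2014, Month.MAY, 28), Duration.ofSeconds(125)),
            new Call(LocalDate.of(2014, Month.MAY, 30), Duration.ofMinutes(5)),
            new Call(LocalDate.of(2014, Month.MAY, 30), Duration.ofMinutes(12)),
            new Call(LocalDate.of(2014, Month.MAY, 28), Duration.ofMinutes(3)),
            new Call(LocalDate.of(2014, Month.MAY, 29), Duration.ofSeconds(90)),
            new Call(LocalDate.of(2014, Month.MAY, 30), Duration.ofSeconds(365)),
            new Call(LocalDate.of(2014, Month.JUNE, 1), Duration.ofMinutes(7)),
            new Call(LocalDate.of(2014, Month.JUNE, 2), Duration.ofSeconds(315)));

    // Variant 1: a collector that sums the seconds directly (0 when nothing matches).
    static long secondsWithCollector(Month month) {
        return LOG.stream()
                .filter(call -> call.getTime().getMonth() == month)
                .map(Call::getDuration)
                .collect(Collectors.summingLong(Duration::getSeconds));
    }

    // Variant 2: reduce the Durations themselves (empty when nothing matches).
    static Optional<Duration> totalWithReduce(Month month) {
        return LOG.stream()
                .filter(call -> call.getTime().getMonth() == month)
                .map(Call::getDuration)
                .reduce(Duration::plus);
    }

    public static void main(String[] args) {
        System.out.println(secondsWithCollector(Month.MAY)); // prints "1780"
        totalWithReduce(Month.MAY)
                .ifPresent(total -> System.out.println(total.getSeconds())); // prints "1780"
    }
}
```

The collector version returns 0 for a month without calls, while the `reduce` version returns an empty `Optional`, so the caller has to decide what "no calls" means.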

Did I phone Paris?

    boolean phonedToParis = phoneCallLog.stream()
        .anyMatch(phoneCall -> "Paris".equals(phoneCall.getContact().getCity()));

Give me the 3 longest phone calls

    // Longest first: sort by duration in descending order
    phoneCallLog.stream()
        .filter(phoneCall -> phoneCall.getTime().getMonth() == Month.MAY)
        .sorted(comparing(PhoneCall::getDuration).reversed())
        .limit(3)
        .forEach(System.out::println);

Give me the 3 shortest ones

    // Shortest first: the natural (ascending) order of Duration
    phoneCallLog.stream()
        .filter(phoneCall -> phoneCall.getTime().getMonth() == Month.MAY)
        .sorted(comparing(PhoneCall::getDuration))
        .limit(3)
        .forEach(System.out::println);

Creating Streams

Streams can be created from
 - Collections
 - Values, directly
 - Generators (infinite Streams)
 - Resources (like files)
 - Stream ranges

From collections

use stream()

    List<Integer> numbers = new ArrayList<>();
    for (int i = 0; i < 10_000_000; i++) {
        numbers.add((int) Math.round(Math.random() * 100));
    }

    Stream<Integer> evenNumbers = numbers.stream();

or parallelStream()

    Stream<Integer> evenNumbers = numbers.parallelStream();

Directly from Values & ranges

    Stream.of("Using", "Stream", "API", "From", "Java8");

Can be converted into a parallel Stream

    Stream.of("Using", "Stream", "API", "From", "Java8")
        .parallel();
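The slide title mentions ranges without showing one. A minimal sketch, assuming the standard `IntStream.range` and `IntStream.rangeClosed` factories are what is meant; the class and method names are invented for the example.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class RangeDemo {

    // range: the upper bound is exclusive
    static List<Integer> zeroToFour() {
        return IntStream.range(0, 5).boxed().collect(Collectors.toList());
    }

    // rangeClosed: the upper bound is inclusive
    static int sumOneToFive() {
        return IntStream.rangeClosed(1, 5).sum();
    }

    // A stream built from values can be switched to parallel mode
    static boolean valuesInParallel() {
        return Stream.of("Using", "Stream", "API", "From", "Java8").parallel().isParallel();
    }

    public static void main(String[] args) {
        System.out.println(zeroToFour());       // prints "[0, 1, 2, 3, 4]"
        System.out.println(sumOneToFive());     // prints "15"
        System.out.println(valuesInParallel()); // prints "true"
    }
}
```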

Generators - Functions

    Stream<Integer> integers = Stream.iterate(0, number -> number + 2);

This is an infinite Stream! It will never be exhausted!

    // Typed as Stream<int[]> so that t[0] compiles in map()
    Stream<int[]> fibonacci = Stream.iterate(new int[]{0, 1},
                                             t -> new int[]{t[1], t[0] + t[1]});
    fibonacci.limit(10)
        .map(t -> t[0])
        .forEach(System.out::println);
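Because these streams never end on their own, every use needs a limiting step such as `limit`. The sketch below collects the first few elements of each generator into a list; the class and method names are made up, and `Stream.generate` is added as a second standard way to build an infinite stream.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class GeneratorDemo {

    static List<Integer> firstEvens(int n) {
        return Stream.iterate(0, number -> number + 2)
                .limit(n) // without this, collect would never finish
                .collect(Collectors.toList());
    }

    static List<Integer> fibonacci(int n) {
        // Each element is a pair {current, next}
        return Stream.iterate(new int[]{0, 1}, t -> new int[]{t[1], t[0] + t[1]})
                .limit(n)
                .map(t -> t[0])
                .collect(Collectors.toList());
    }

    static List<String> repeat(String value, int n) {
        return Stream.generate(() -> value)
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstEvens(5)); // prints "[0, 2, 4, 6, 8]"
        System.out.println(fibonacci(10)); // prints "[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]"
        System.out.println(repeat("x", 3)); // prints "[x, x, x]"
    }
}
```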


From Resources (Files)

    Stream<String> fileContent = Files.lines(Paths.get("readme.txt"));

Count all distinct words in a file

    Files.lines(Paths.get("readme.txt"))
        .flatMap(line -> Arrays.stream(line.split(" ")))
        .distinct()
        .count();
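A runnable sketch of the word count. It goes slightly beyond the slide in two ways that are my own choices: the stream is opened in try-with-resources, since `Files.lines` holds the file open until the stream is closed, and it splits on any whitespace rather than a single space. The temporary file and the names `DistinctWordsDemo` and `distinctWordsInSample` exist only for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.stream.Stream;

class DistinctWordsDemo {

    static long distinctWords(Path file) throws IOException {
        // Closing the stream also closes the underlying file.
        try (Stream<String> lines = Files.lines(file)) {
            return lines
                    .flatMap(line -> Arrays.stream(line.trim().split("\\s+")))
                    .filter(word -> !word.isEmpty()) // blank lines yield one empty token
                    .distinct()
                    .count();
        }
    }

    // Writes a two-line sample to a temporary file and counts its distinct words.
    static long distinctWordsInSample() {
        try {
            Path sample = Files.createTempFile("stream-demo", ".txt");
            try {
                Files.write(sample, Arrays.asList("to be or not", "to be"));
                return distinctWords(sample);
            } finally {
                Files.deleteIfExists(sample);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(distinctWordsInSample()); // prints "4"
    }
}
```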

Parallelism

Parallel Streams

use stream()

    List<Integer> numbers = new ArrayList<>();
    for (int i = 0; i < 10_000_000; i++) {
        numbers.add((int) Math.round(Math.random() * 100));
    }

    // This will use just a single thread
    Stream<Integer> evenNumbers = numbers.stream();

or parallelStream()

    // Automatically selects the optimum number of threads
    Stream<Integer> evenNumbers = numbers.parallelStream();

Let's test it

use stream()

    for (int i = 0; i < 100; i++) {
        long start = System.currentTimeMillis();
        List<Integer> even = numbers.stream()
            .filter(n -> n % 2 == 0)
            .sorted()
            .collect(toList());
        System.out.printf(
            "%d elements computed in %5d msecs with %d threads\n",
            even.size(), System.currentTimeMillis() - start, Thread.activeCount());
    }

    5001983 elements computed in 828 msecs with 2 threads
    5001983 elements computed in 843 msecs with 2 threads
    5001983 elements computed in 675 msecs with 2 threads
    5001983 elements computed in 795 msecs with 2 threads

Let's test it

use parallelStream()

    for (int i = 0; i < 100; i++) {
        long start = System.currentTimeMillis();
        List<Integer> even = numbers.parallelStream()
            .filter(n -> n % 2 == 0)
            .sorted()
            .collect(toList());
        System.out.printf(
            "%d elements computed in %5d msecs with %d threads\n",
            even.size(), System.currentTimeMillis() - start, Thread.activeCount());
    }

    4999299 elements computed in 225 msecs with 9 threads
    4999299 elements computed in 230 msecs with 9 threads
    4999299 elements computed in 250 msecs with 9 threads
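For anyone repeating the measurement, a self-contained sketch along the same lines follows. It is not the author's benchmark: the data is generated deterministically, the `millis` helper is hypothetical, and the loop is far too crude for serious benchmarking. It also prints the target parallelism of the common ForkJoin pool, which is where parallel streams run by default. Timings depend on the machine, so no expected output is given.

```java
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class ParallelTimingSketch {

    static List<Integer> sortedEvens(List<Integer> numbers, boolean parallel) {
        Stream<Integer> source = parallel ? numbers.parallelStream() : numbers.stream();
        return source
                .filter(n -> n % 2 == 0)
                .sorted()
                .collect(Collectors.toList());
    }

    // Wall-clock time of one run, in milliseconds (hypothetical helper).
    static long millis(Runnable work) {
        long start = System.nanoTime();
        work.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        List<Integer> numbers = IntStream.range(0, 1_000_000)
                .map(i -> (i * 31) % 101) // deterministic pseudo-data in 0..100
                .boxed()
                .collect(Collectors.toList());
        System.out.println("common pool parallelism: " + ForkJoinPool.getCommonPoolParallelism());
        for (int i = 0; i < 5; i++) { // the first iterations also warm up the JIT
            long sequential = millis(() -> sortedEvens(numbers, false));
            long parallel = millis(() -> sortedEvens(numbers, true));
            System.out.printf("sequential %d ms, parallel %d ms%n", sequential, parallel);
        }
    }
}
```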

Enough for now. But this is just the beginning.

Thank You.

@dgomezg
dgomezg@gmail.com

www.adictosaltrabajlo.com