Java 8 features - Stream API explained

Jun 21, 2017

In the post Java 8 features – Lambda expressions, Interface changes, Stream API, DateTime API I briefly described the most interesting Java 8 features. In this post I give special attention to the Stream API. It is mostly theoretical and lays the foundation for the next posts, Java 8 features – Stream API basic examples and Java 8 features – Stream API advanced examples, which give code examples to explain the theory. The code examples can be found in the GitHub java-samples/java8 repository.

Functional interfaces

Before explaining the Stream API, it is necessary to understand the idea of a functional interface, because functional interfaces are what lambda expressions implement. A functional interface is an interface that has exactly one abstract method to be implemented. It may also have any number of default or static methods. Although not mandatory, it is good practice to annotate a functional interface with @FunctionalInterface. The functional interfaces most used in Stream API operations are explained below. You can also use functional interfaces in a method signature, so that lambda expressions can be passed when calling the method. If the ones below are not suitable, you can always create your own functional interface.
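As a minimal sketch of the idea above, the following defines a custom functional interface and a method that accepts it, so that lambdas can be passed as arguments. The names TextTransformer, andThen, apply and FunctionalInterfaceDemo are made up for this illustration.

```java
// All names below are hypothetical and exist only for this illustration.
@FunctionalInterface
interface TextTransformer {
    // The single abstract method; a lambda supplies its body.
    String transform(String input);

    // A default method does not break the single-abstract-method rule.
    default TextTransformer andThen(TextTransformer next) {
        return input -> next.transform(transform(input));
    }
}

class FunctionalInterfaceDemo {
    // The functional interface in the signature lets callers pass a lambda.
    static String apply(String value, TextTransformer transformer) {
        return transformer.transform(value);
    }

    public static void main(String[] args) {
        TextTransformer trim = s -> s.trim();
        TextTransformer upper = s -> s.toUpperCase();
        System.out.println(apply("  stream  ", trim.andThen(upper))); // prints STREAM
    }
}
```

If a second abstract method were added to TextTransformer, the @FunctionalInterface annotation would turn that mistake into a compile error.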

Predicate

The method to implement is: boolean test(T t). This interface is used to evaluate a condition on an input object and return the result as a boolean.

Supplier

The method to implement is: T get(). This interface is used to supply a result object without taking any input.

Function

The method to implement is: R apply(T t). This interface is used to produce a result object based on a given input object.

Consumer

The method to implement is: void accept(T t). This interface is used to perform an operation on a single input object without producing any result.

BiConsumer

The method to implement is: void accept(T t, U u). This interface is used to perform an operation on two input objects without producing any result.
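The five interfaces above all live in java.util.function. A small sketch of each one implemented with a lambda might look like this; the class name StandardInterfacesDemo and the variable names are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.Supplier;

class StandardInterfacesDemo {
    static List<String> run() {
        List<String> log = new ArrayList<>();

        Predicate<String> isEmpty = s -> s.isEmpty();        // boolean test(T t)
        Supplier<String> greeting = () -> "hello";           // T get()
        Function<String, Integer> length = s -> s.length();  // R apply(T t)
        Consumer<String> record = s -> log.add(s);           // void accept(T t)
        BiConsumer<String, Integer> recordPair =
                (key, value) -> log.add(key + "=" + value);  // void accept(T t, U u)

        String word = greeting.get();
        if (!isEmpty.test(word)) {
            record.accept(word);
            recordPair.accept(word, length.apply(word));
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [hello, hello=5]
    }
}
```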

Method reference

Sometimes all a lambda expression does is call a single method by name. A method reference provides a shorter way to refer to that method, which makes the code more readable. In short, it means writing NumberUtils::isNumber instead of element -> NumberUtils.isNumber(element).
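To keep the following sketch free of external libraries, a hypothetical isDigits helper stands in for NumberUtils.isNumber. It only checks for one or more digits, so it is not equivalent to the library method. The lambda and the method reference behave identically.

```java
import java.util.function.Predicate;

class MethodReferenceDemo {
    // Hypothetical stand-in for a utility method such as NumberUtils.isNumber.
    static boolean isDigits(String text) {
        return text.matches("\\d+");
    }

    static boolean viaLambda(String value) {
        Predicate<String> check = element -> MethodReferenceDemo.isDigits(element);
        return check.test(value);
    }

    static boolean viaReference(String value) {
        // Same behavior as the lambda above, written as a method reference.
        Predicate<String> check = MethodReferenceDemo::isDigits;
        return check.test(value);
    }

    public static void main(String[] args) {
        System.out.println(viaLambda("123"));    // prints true
        System.out.println(viaReference("12a")); // prints false
    }
}
```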

Stream API

The Stream API is used for data processing and supports parallel operations. It enables data processing in a declarative way. Streams are sequences of elements that support different operations. They are computed lazily, on demand, when elements are needed. A stream is like a recipe that gets executed only when the actual result is needed.
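The recipe analogy can be made visible by recording when the filter actually runs. In this sketch (class and variable names are assumptions) nothing is filtered while the pipeline is being built; the work happens only when collect() asks for a result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyStreamDemo {
    static List<String> run() {
        List<String> events = new ArrayList<>();

        // Building the pipeline: the predicate is not called yet.
        Stream<String> recipe = Arrays.asList("a", "bb", "ccc").stream()
                .filter(s -> {
                    events.add("filter " + s);
                    return s.length() > 1;
                });
        events.add("pipeline built");

        // The terminal operation triggers the actual processing.
        List<String> result = recipe.collect(Collectors.toList());
        events.add("result " + result);
        return events;
    }

    public static void main(String[] args) {
        // prints [pipeline built, filter a, filter bb, filter ccc, result [bb, ccc]]
        System.out.println(run());
    }
}
```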

Stream operations

Stream operations are divided into intermediate and terminal operations, which are combined to form stream pipelines. Intermediate operations return a new stream and are always lazy: executing an intermediate operation such as filter() does not actually perform any filtering, but instead creates a new stream. Terminal operations, such as collect(), produce a result or final value. After the terminal operation is performed, the stream pipeline is considered consumed and can no longer be used.

Both intermediate and terminal operations can be short-circuiting, for example limit() and findFirst(): once they achieve their goal they stop further stream processing.

Intermediate operations are further divided into stateless and stateful operations. Stateless operations, such as filter() and map(), retain no state from previously seen elements when processing a new element, so each element can be processed independently of operations on other elements. Stateful operations, such as distinct() and sorted(), may incorporate state from previously seen elements when processing new elements. For example, one cannot produce any results from sorting a stream until one has seen all elements of the stream. As a result, under parallel computation, some pipelines containing stateful intermediate operations may require multiple passes on the data or may need to buffer significant data. Stateful operations should therefore be used with care when constructing a stream pipeline, because they might require significant resources.
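Two of the points above lend themselves to a short sketch: a short-circuiting operation lets a pipeline over an infinite source finish, and a consumed stream cannot be reused. The class and method names are assumptions. The Javadoc says a reused stream may throw IllegalStateException, which is what the standard JDK implementation does.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PipelineDemo {
    // limit() short-circuits, so the infinite source still terminates.
    static List<Integer> firstEvenSquares(int howMany) {
        return Stream.iterate(1, n -> n + 1)    // infinite source: 1, 2, 3, ...
                .filter(n -> n % 2 == 0)         // stateless intermediate
                .map(n -> n * n)                 // stateless intermediate
                .limit(howMany)                  // short-circuiting intermediate
                .collect(Collectors.toList());   // terminal
    }

    // Returns true if a second terminal operation on the same stream is rejected.
    static boolean reuseFails() {
        Stream<String> stream = Stream.of("a", "b");
        stream.count();                          // first terminal operation
        try {
            stream.count();                      // the stream is already consumed
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(firstEvenSquares(3)); // prints [4, 16, 36]
        System.out.println(reuseFails());
    }
}
```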

Stream API methods

Below is a list of most of the methods available in the Stream interface, each with a short description. Code examples with explanations are in the following post.

filter

Stream<T> filter(Predicate<? super T> predicate) - a stateless intermediate operation that returns a stream consisting of the elements of this stream that match the given predicate.

map

<R> Stream<R> map(Function<? super T, ? extends R> mapper) - a stateless intermediate operation that converts each value into another value, possibly of a different type, by applying the given function. The result is exactly one output value for each input value.
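filter and map are often used together. A small sketch, with hypothetical class and method names, that keeps words longer than three characters and maps each one to its length:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class FilterMapDemo {
    static List<Integer> lengthsOfLongWords(List<String> words) {
        return words.stream()
                .filter(word -> word.length() > 3) // keep matching elements only
                .map(String::length)               // one output per input
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("java", "is", "neat", "stream");
        System.out.println(lengthsOfLongWords(words)); // prints [4, 4, 6]
    }
}
```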

distinct

Stream<T> distinct() - a stateful intermediate operation that removes duplicate elements, comparing them with the equals() method.

sorted

Stream<T> sorted() or Stream<T> sorted(Comparator<? super T> comparator) - a stateful intermediate operation that sorts the stream elements according to the given comparator or, if none is given, their natural order.
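A short sketch of the two stateful operations above, using assumed class and method names. The second method passes an explicit comparator to reverse the natural order.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class DistinctSortedDemo {
    static List<String> uniqueSorted(List<String> words) {
        return words.stream()
                .distinct()  // drops duplicates according to equals()
                .sorted()    // natural order
                .collect(Collectors.toList());
    }

    static List<String> uniqueSortedDescending(List<String> words) {
        return words.stream()
                .distinct()
                .sorted(Comparator.reverseOrder()) // explicit comparator
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("b", "a", "c", "a", "b");
        System.out.println(uniqueSorted(words));           // prints [a, b, c]
        System.out.println(uniqueSortedDescending(words)); // prints [c, b, a]
    }
}
```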

peek

Stream<T> peek(Consumer<? super T> action) - a stateless intermediate operation that performs an action on each element as it is consumed from the resulting stream. It does not change the stream and is not meant to alter stream elements. It is mainly used for debugging purposes.
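The debugging use of peek can be sketched as follows, recording what flows past each stage of a sequential pipeline. The names are assumptions. Note that elements travel through the pipeline one at a time, so the two peek stages interleave.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class PeekDemo {
    static List<String> trace(List<Integer> numbers) {
        List<String> seen = new ArrayList<>();
        numbers.stream()
                .peek(n -> seen.add("before filter: " + n))
                .filter(n -> n % 2 == 0)
                .peek(n -> seen.add("after filter: " + n))
                .collect(Collectors.toList()); // terminal operation drives the peeks
        return seen;
    }

    public static void main(String[] args) {
        // prints [before filter: 1, before filter: 2, after filter: 2, before filter: 3]
        System.out.println(trace(Arrays.asList(1, 2, 3)));
    }
}
```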

collect

<R, A> R collect(Collector<? super T, A, R> collector) or <R> R collect(Supplier<R> supplier, BiConsumer<R, ? super T> accumulator, BiConsumer<R, R> combiner) - a terminal operation that performs a mutable reduction on the stream elements, accumulating them into a mutable result container such as an ArrayList. Stream elements are incorporated into the result by updating it rather than replacing it.

toArray

Object[] toArray() - a terminal operation that returns an array containing the elements of this stream.
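The two forms of collect and the toArray method described above might be used as in this sketch. The class and method names are assumptions. The three-argument form spells out what a ready-made collector such as Collectors.toList() does internally: create a container, add an element, merge two containers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CollectDemo {
    static List<String> viaCollector(Stream<String> stream) {
        return stream.collect(Collectors.toList());
    }

    static ArrayList<String> viaThreeArguments(Stream<String> stream) {
        // supplier, accumulator, combiner
        return stream.collect(ArrayList::new, ArrayList::add, ArrayList::addAll);
    }

    static Object[] asArray(Stream<String> stream) {
        return stream.toArray();
    }

    public static void main(String[] args) {
        System.out.println(viaCollector(Stream.of("a", "b")));            // prints [a, b]
        System.out.println(viaThreeArguments(Stream.of("a", "b")));       // prints [a, b]
        System.out.println(Arrays.toString(asArray(Stream.of("a", "b")))); // prints [a, b]
    }
}
```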

flatMap

<R> Stream<R> flatMap(Function<? super T, ? extends Stream<? extends R>> mapper) - a stateless intermediate operation that replaces each value with the contents of a stream produced from it. The result is zero or more output values for each input value.
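A typical use of flatMap is flattening nested data. In this sketch (names are assumptions) each line is replaced by a stream of its words, and the words of all lines end up in a single flat stream.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class FlatMapDemo {
    static List<String> words(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.split(" "))) // one line -> many words
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(words(Arrays.asList("a b", "c"))); // prints [a, b, c]
    }
}
```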

limit

Stream<T> limit(long maxSize) - a short-circuiting stateful intermediate operation that truncates a stream to a given length.

skip

Stream<T> skip(long n) - a stateful intermediate operation that skips the first n elements of a stream.
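Combined, skip and limit give a simple form of paging. The page method below is a hypothetical helper for illustration, with a zero-based page index.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class SkipLimitDemo {
    static List<Integer> page(List<Integer> items, int pageIndex, int pageSize) {
        return items.stream()
                .skip((long) pageIndex * pageSize) // drop the earlier pages
                .limit(pageSize)                   // keep at most one page
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> items = Arrays.asList(1, 2, 3, 4, 5, 6, 7);
        System.out.println(page(items, 1, 3)); // prints [4, 5, 6]
        System.out.println(page(items, 2, 3)); // prints [7]
    }
}
```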

forEach

void forEach(Consumer<? super T> action) - a terminal operation that performs an action for each element in the stream.

reduce

T reduce(T identity, BinaryOperator<T> accumulator) or Optional<T> reduce(BinaryOperator<T> accumulator) or <U> U reduce(U identity, BiFunction<U, ? super T, U> accumulator, BinaryOperator<U> combiner) - terminal operation that performs reduction on the elements in the stream.
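The three reduce variants listed above can be sketched as follows, with assumed class and method names. The variant without an identity returns an Optional because an empty stream has no result to offer.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class ReduceDemo {
    // T reduce(T identity, BinaryOperator<T> accumulator)
    static int sum(List<Integer> numbers) {
        return numbers.stream().reduce(0, Integer::sum);
    }

    // Optional<T> reduce(BinaryOperator<T> accumulator)
    static Optional<Integer> product(List<Integer> numbers) {
        return numbers.stream().reduce((a, b) -> a * b);
    }

    // <U> U reduce(U identity, BiFunction<U, ? super T, U> accumulator, BinaryOperator<U> combiner)
    static int totalLength(List<String> words) {
        return words.stream().reduce(0, (total, word) -> total + word.length(), Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3, 4)));          // prints 10
        System.out.println(product(Arrays.asList(1, 2, 3, 4)));      // prints Optional[24]
        System.out.println(totalLength(Arrays.asList("ab", "cde"))); // prints 5
    }
}
```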

min

Optional<T> min(Comparator<? super T> comparator) - a terminal operation that returns the minimum element of the stream according to the given comparator. It is a special case of the reduce operation.

max

Optional<T> max(Comparator<? super T> comparator) - a terminal operation that returns the maximum element of the stream according to the given comparator. It is a special case of the reduce operation.

count

long count() - a terminal operation that counts elements in a stream.
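A brief sketch of min, max and count on a list of words, comparing by length. The class and method names are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class MinMaxCountDemo {
    static Optional<String> shortest(List<String> words) {
        return words.stream().min(Comparator.comparingInt(String::length));
    }

    static Optional<String> longest(List<String> words) {
        return words.stream().max(Comparator.comparingInt(String::length));
    }

    static long howMany(List<String> words) {
        return words.stream().count();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "fig", "banana");
        System.out.println(shortest(words)); // prints Optional[fig]
        System.out.println(longest(words));  // prints Optional[banana]
        System.out.println(howMany(words));  // prints 3
    }
}
```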

anyMatch

boolean anyMatch(Predicate<? super T> predicate) - a short-circuiting terminal operation that returns true if any element in the stream matches the given predicate. As soon as a matching element is found, processing stops and true is returned.

allMatch

boolean allMatch(Predicate<? super T> predicate) - a short-circuiting terminal operation that returns true if all elements in the stream match the given predicate. As soon as an element fails to match, processing stops and false is returned.

noneMatch

boolean noneMatch(Predicate<? super T> predicate) - a short-circuiting terminal operation that returns true if no element in the stream matches the given predicate. As soon as a matching element is found, processing stops and false is returned.
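The three matching operations can be compared side by side in a small sketch with assumed names. The empty-list case is worth noting: with no elements to inspect, allMatch and noneMatch return true while anyMatch returns false.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class MatchDemo {
    static boolean anyEven(List<Integer> numbers) {
        return numbers.stream().anyMatch(n -> n % 2 == 0);
    }

    static boolean allEven(List<Integer> numbers) {
        return numbers.stream().allMatch(n -> n % 2 == 0);
    }

    static boolean noneNegative(List<Integer> numbers) {
        return numbers.stream().noneMatch(n -> n < 0);
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(2, 4, 5);
        System.out.println(anyEven(numbers));      // prints true
        System.out.println(allEven(numbers));      // prints false
        System.out.println(noneNegative(numbers)); // prints true

        List<Integer> empty = Collections.emptyList();
        System.out.println(anyEven(empty));        // prints false
        System.out.println(allEven(empty));        // prints true
    }
}
```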

findFirst

Optional<T> findFirst() - a short-circuiting terminal operation that returns an Optional with the first element of this stream, or an empty Optional if the stream is empty. If the stream has no encounter order, such as a stream created from a HashSet or from the entries of a HashMap, then any element may be returned.

findAny

Optional<T> findAny() - a short-circuiting terminal operation that returns an Optional with some element of the stream or an empty Optional if the stream is empty.
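The difference between the two find operations shows up mostly in parallel pipelines. In this sketch (names are assumptions) findFirst on an ordered list always returns the first match, while findAny on a parallel stream may return any of the matches.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class FindDemo {
    static Optional<String> firstStartingWith(List<String> words, String prefix) {
        return words.stream()
                .filter(word -> word.startsWith(prefix))
                .findFirst(); // respects the encounter order of the list
    }

    static Optional<String> anyStartingWith(List<String> words, String prefix) {
        return words.parallelStream()
                .filter(word -> word.startsWith(prefix))
                .findAny();   // free to return whichever match is found
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("apple", "avocado", "banana");
        System.out.println(firstStartingWith(words, "a")); // prints Optional[apple]
        System.out.println(firstStartingWith(words, "z")); // prints Optional.empty
        System.out.println(anyStartingWith(words, "a"));   // apple or avocado
    }
}
```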

Conclusion

The Stream API is a very powerful tool introduced in Java 8. It allows data to be processed in a declarative way and in parallel. The resulting code is neat and easy to read.
