Before explaining the Stream API, it is necessary to understand functional interfaces, because they are the types that lambda expressions implement. A functional interface is an interface that has exactly one abstract method to be implemented. It may also have any number of default or static methods. Although not mandatory, it is good practice to annotate a functional interface with @FunctionalInterface. The functional interfaces most commonly used in Stream API operations are explained below. You can also use a functional interface as a parameter type in a method signature, so that a lambda expression can be passed when calling the method. If the ones below are not suitable, you can always create your own functional interface.
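As a minimal sketch of a custom functional interface, the example below declares one and accepts it in a method signature. The names Validator, isValid, ValidatorExample and keepValid are made up for illustration and are not part of the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical functional interface: exactly one abstract method.
@FunctionalInterface
interface Validator<T> {
    boolean isValid(T value);

    // A default method does not count as the single abstract method.
    default Validator<T> negate() {
        return value -> !isValid(value);
    }
}

class ValidatorExample {
    // The functional interface is a parameter type, so callers can pass a lambda.
    static <T> List<T> keepValid(List<T> items, Validator<T> validator) {
        List<T> result = new ArrayList<>();
        for (T item : items) {
            if (validator.isValid(item)) {
                result.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Ann", "", "Bob");
        System.out.println(keepValid(names, s -> !s.isEmpty())); // [Ann, Bob]
    }
}
```

If a second abstract method were added to Validator, the @FunctionalInterface annotation would turn that mistake into a compile error.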
Predicate
The method to implement is: boolean test(T t). This interface is used to evaluate a condition on an input object and return the result as a boolean.
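A short sketch of Predicate in use follows. The class and constant names are illustrative only. The negate() and and() calls are default methods of java.util.function.Predicate.

```java
import java.util.function.Predicate;

class PredicateExample {
    static final Predicate<String> IS_BLANK = s -> s.trim().isEmpty();
    static final Predicate<String> IS_LONG = s -> s.length() > 5;

    // Predicates can be combined with the default methods negate(), and(), or().
    static final Predicate<String> IS_SHORT_TEXT = IS_BLANK.negate().and(IS_LONG.negate());

    public static void main(String[] args) {
        System.out.println(IS_SHORT_TEXT.test("java"));    // true
        System.out.println(IS_SHORT_TEXT.test("   "));     // false
        System.out.println(IS_SHORT_TEXT.test("streams")); // false
    }
}
```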
Supplier
The method to implement is: T get(). This interface takes no input and is used to supply an object as a result.
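One common use of Supplier is to defer work until a value is really needed. The helper below is a hypothetical sketch (orElseGet is our own method here, written in the spirit of Optional.orElseGet).

```java
import java.util.function.Supplier;

class SupplierExample {
    // The fallback supplier is invoked only when value is null.
    static <T> T orElseGet(T value, Supplier<T> fallback) {
        return value != null ? value : fallback.get();
    }

    public static void main(String[] args) {
        String configured = null;
        System.out.println(orElseGet(configured, () -> "default")); // default
    }
}
```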
Function
The method to implement is: R apply(T t). This interface is used to produce a result object of type R from a given input object of type T.
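The sketch below shows two small functions and their composition with the default method andThen(). The constant names are illustrative.

```java
import java.util.function.Function;

class FunctionExample {
    static final Function<String, Integer> LENGTH = s -> s.length();
    static final Function<Integer, String> DESCRIBE = n -> n + " chars";

    // andThen() feeds the result of the first function into the second.
    static final Function<String, String> LENGTH_THEN_DESCRIBE = LENGTH.andThen(DESCRIBE);

    public static void main(String[] args) {
        System.out.println(LENGTH_THEN_DESCRIBE.apply("stream")); // 6 chars
    }
}
```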
Consumer
The method to implement is: void accept(T t). This interface is used to perform an operation on a single input object without producing any result.
BiConsumer
The method to implement is: void accept(T t, U u). This interface is used to perform an operation on two input objects without producing any result.
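Because Consumer and BiConsumer return nothing, they work only through side effects. The sketch below, with illustrative names, uses a Consumer to append to a list and a BiConsumer with Map.forEach.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Consumer;

class ConsumerExample {
    static List<String> describe(Map<String, Integer> stock) {
        List<String> lines = new ArrayList<>();

        // Consumer: one input, no result. The boolean returned by add() is ignored.
        Consumer<String> addLine = lines::add;

        // BiConsumer: two inputs, no result.
        BiConsumer<String, Integer> addEntry =
                (name, count) -> addLine.accept(name + "=" + count);

        stock.forEach(addEntry); // Map.forEach accepts a BiConsumer of key and value
        return lines;
    }

    public static void main(String[] args) {
        Map<String, Integer> stock = new LinkedHashMap<>();
        stock.put("apple", 3);
        stock.put("pear", 5);
        System.out.println(describe(stock)); // [apple=3, pear=5]
    }
}
```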
Method reference
Sometimes all a lambda expression does is call a single existing method. A method reference provides a shorter way to refer to that method by name, which makes the code more readable. In short, it means writing NumberUtils::isNumber instead of element -> NumberUtils.isNumber(element).
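To keep the example free of third-party libraries, the sketch below swaps NumberUtils for the JDK's Integer.parseInt. The two methods are meant to behave identically and differ only in syntax.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class MethodReferenceExample {
    // Lambda form: the body only forwards its argument to one method.
    static List<Integer> parseWithLambda(List<String> raw) {
        return raw.stream().map(s -> Integer.parseInt(s)).collect(Collectors.toList());
    }

    // Method reference form: same behavior, less noise.
    static List<Integer> parseWithReference(List<String> raw) {
        return raw.stream().map(Integer::parseInt).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("1", "20", "300");
        System.out.println(parseWithLambda(raw));    // [1, 20, 300]
        System.out.println(parseWithReference(raw)); // [1, 20, 300]
    }
}
```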
Stream API
The Stream API is used for data processing and supports parallel operations. It enables data processing in a declarative way. Streams are sequences of elements that support different operations. They are computed lazily, on demand, when elements are needed. A stream is like a recipe that gets executed only when the actual result is needed.
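The "recipe" idea can be made visible by logging from inside a pipeline. In this sketch (class and method names are illustrative), nothing is filtered until collect() is called.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyStreamExample {
    static List<String> run() {
        List<String> log = new ArrayList<>();

        // Building the pipeline does not run the filter yet.
        Stream<String> pipeline = Stream.of("a", "b", "c")
                .filter(s -> {
                    log.add("filter " + s);
                    return !s.equals("b");
                });
        log.add("pipeline built");

        // The terminal operation triggers the actual processing.
        List<String> result = pipeline.collect(Collectors.toList());
        log.add("result " + result);
        return log;
    }

    public static void main(String[] args) {
        // [pipeline built, filter a, filter b, filter c, result [a, c]]
        System.out.println(run());
    }
}
```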
Stream operations
Stream operations are divided into intermediate and terminal operations, which are combined to form stream pipelines. Intermediate operations return a new stream and are always lazy. Calling an intermediate operation such as filter() does not perform any filtering. It only creates a new stream that, when traversed, contains the elements matching the predicate. Terminal operations, such as collect(), produce a result or a side effect. After the terminal operation is performed, the stream pipeline is considered consumed and can no longer be used. Both intermediate and terminal operations can be short-circuiting, for example limit() and findFirst(): once they achieve their goal, they stop further stream processing.
Intermediate operations are further divided into stateless and stateful operations. Stateless operations, such as filter() and map(), retain no state from previously seen elements when processing a new element, so each element can be processed independently of the others. Stateful operations, such as distinct() and sorted(), may incorporate state from previously seen elements when processing new elements. For example, sorting cannot produce any result until all elements of the stream have been seen. As a result, under parallel computation, some pipelines containing stateful intermediate operations may require multiple passes over the data or may need to buffer significant amounts of data. Stateful operations should therefore be used with care when constructing a stream pipeline, because they might require significant resources.
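Two of the points above can be sketched in code: a short-circuiting operation lets a pipeline finish on an infinite source, and a consumed stream cannot be reused. The class and method names are illustrative.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PipelineExample {
    static List<Integer> firstEvenSquares(int howMany) {
        return Stream.iterate(1, n -> n + 1)    // infinite source: 1, 2, 3, ...
                .filter(n -> n % 2 == 0)        // stateless intermediate operation
                .map(n -> n * n)                // stateless intermediate operation
                .limit(howMany)                 // short-circuiting intermediate operation
                .collect(Collectors.toList());  // terminal operation
    }

    static boolean reuseFails() {
        Stream<String> stream = Stream.of("a", "b");
        stream.count(); // terminal operation consumes the stream
        try {
            stream.count(); // second terminal operation on the same stream
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(firstEvenSquares(3)); // [4, 16, 36]
        System.out.println(reuseFails());        // true
    }
}
```

Without limit(), the first pipeline would never terminate, because the source has no end.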
Stream API methods
Below is a list of most of the methods available in the Stream interface, each with a short description. Code examples with explanations are in the following post.
filter
Stream<T> filter(Predicate<? super T> predicate) - a stateless intermediate operation that returns a stream consisting of the elements of this stream that match the given predicate.
map
<R> Stream<R> map(Function<? super T, ? extends R> mapper) - a stateless intermediate operation that converts each value into another value, possibly of a different type, by applying the given function. The result is exactly one output value for each input value.
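A minimal sketch combining filter() and map() (illustrative names): empty strings are dropped first, then each remaining string is converted to its length.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class FilterMapExample {
    static List<Integer> lengthsOfNonEmpty(List<String> words) {
        return words.stream()
                .filter(w -> !w.isEmpty())  // keep only elements matching the predicate
                .map(String::length)        // one output value per input value
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(lengthsOfNonEmpty(Arrays.asList("java", "", "stream"))); // [4, 6]
    }
}
```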
distinct
Stream<T> distinct() - a stateful intermediate operation that removes duplicate elements, compared using the equals() method.
sorted
Stream<T> sorted() or Stream<T> sorted(Comparator<? super T> comparator) - a stateful intermediate operation that sorts the stream elements according to their natural order or the given comparator.
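The two stateful operations above are often used together. The sketch below (illustrative names) removes duplicates and then sorts, once by natural order and once with an explicit comparator.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class DistinctSortedExample {
    static List<String> uniqueSorted(List<String> words) {
        return words.stream()
                .distinct()  // duplicates detected with equals()
                .sorted()    // natural order of String
                .collect(Collectors.toList());
    }

    static List<String> uniqueByLengthDescending(List<String> words) {
        return words.stream()
                .distinct()
                .sorted(Comparator.comparingInt(String::length).reversed())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "fig", "apple", "fig");
        System.out.println(uniqueSorted(words));             // [apple, fig, pear]
        System.out.println(uniqueByLengthDescending(words)); // [apple, pear, fig]
    }
}
```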
peek
Stream<T> peek(Consumer<? super T> action) - a stateless intermediate operation that performs an action on each element as it is consumed from the resulting stream. It does not change the stream or replace its elements. It is mainly used for debugging purposes.
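As a debugging sketch (illustrative names), peek() can record what flows through each stage of a sequential pipeline. The log also shows that elements travel through the pipeline one at a time rather than stage by stage.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PeekExample {
    static List<String> trace(List<Integer> numbers) {
        List<String> log = new ArrayList<>();
        numbers.stream()
                .peek(n -> log.add("saw " + n))   // before the filter
                .filter(n -> n > 1)
                .peek(n -> log.add("kept " + n))  // after the filter
                .forEach(n -> { });               // terminal operation that drives the pipeline
        return log;
    }

    public static void main(String[] args) {
        // [saw 1, saw 2, kept 2, saw 3, kept 3]
        System.out.println(trace(Arrays.asList(1, 2, 3)));
    }
}
```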
collect
<R, A> R collect(Collector<? super T, A, R> collector) or <R> R collect(Supplier<R> supplier, BiConsumer<R, ? super T> accumulator, BiConsumer<R, R> combiner) - a terminal operation that performs a mutable reduction on the stream elements, reducing the stream to a mutable result container such as an ArrayList. Stream elements are incorporated into the result by updating it instead of replacing it.
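Both forms of collect() can be sketched side by side. The three-argument version below follows the pattern shown in the Stream Javadoc. The class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class CollectExample {
    // Collector form: a ready-made collector from the Collectors class.
    static String joined(List<String> words) {
        return words.stream().collect(Collectors.joining(", "));
    }

    // Three-argument form: supplier creates the container, accumulator adds
    // one element, combiner merges two partial containers (used in parallel runs).
    static List<String> copy(List<String> words) {
        return words.stream().collect(ArrayList::new, ArrayList::add, ArrayList::addAll);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "c");
        System.out.println(joined(words)); // a, b, c
        System.out.println(copy(words));   // [a, b, c]
    }
}
```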
toArray
Object[] toArray() - a terminal operation that returns an array containing the elements of this stream.
flatMap
<R> Stream<R> flatMap(Function<? super T, ? extends Stream<? extends R>> mapper) - a stateless intermediate operation that replaces each value with the contents of a stream produced from it. The result is an arbitrary number of output values (possibly zero) for each input value.
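A typical use of flatMap() is flattening nested data. In this sketch (illustrative names), each line becomes a stream of its words, and the result is a single flat list.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class FlatMapExample {
    static List<String> words(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.split(" "))) // one line -> many words
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(words(Arrays.asList("a b", "c"))); // [a, b, c]
    }
}
```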
limit
Stream<T> limit(long maxSize) - a short-circuiting stateful intermediate operation that truncates a stream so that it contains no more than maxSize elements.
skip
Stream<T> skip(long n) - a stateful intermediate operation that discards the first n elements of a stream.
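Together, skip() and limit() give a simple form of paging. The sketch below assumes a zero-based page index. The method name page is illustrative.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class PagingExample {
    static List<Integer> page(List<Integer> items, int pageIndex, int pageSize) {
        return items.stream()
                .skip((long) pageIndex * pageSize) // drop the elements of earlier pages
                .limit(pageSize)                   // keep at most one page
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> items = Arrays.asList(1, 2, 3, 4, 5, 6, 7);
        System.out.println(page(items, 1, 3)); // [4, 5, 6]
        System.out.println(page(items, 2, 3)); // [7]
    }
}
```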
forEach
void forEach(Consumer<? super T> action) - a terminal operation that performs an action for each element in the stream.
reduce
T reduce(T identity, BinaryOperator<T> accumulator) or Optional<T> reduce(BinaryOperator<T> accumulator) or <U> U reduce(U identity, BiFunction<U, ? super T, U> accumulator, BinaryOperator<U> combiner) - a terminal operation that performs a reduction on the elements of the stream, combining them into a single value.
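The three overloads can be sketched as follows (illustrative names). Note that the version without an identity returns an Optional, because an empty stream has nothing to reduce.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class ReduceExample {
    // Identity + accumulator: the result is the identity for an empty stream.
    static int sum(List<Integer> numbers) {
        return numbers.stream().reduce(0, Integer::sum);
    }

    // Accumulator only: the result is empty for an empty stream.
    static Optional<Integer> product(List<Integer> numbers) {
        return numbers.stream().reduce((a, b) -> a * b);
    }

    // Identity + accumulator + combiner: the result type differs from the element type.
    static int totalLength(List<String> words) {
        return words.stream().reduce(0, (acc, w) -> acc + w.length(), Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3, 4)));         // 10
        System.out.println(product(Arrays.asList(1, 2, 3, 4)));     // Optional[24]
        System.out.println(totalLength(Arrays.asList("ab", "cde"))); // 5
    }
}
```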
min
Optional<T> min(Comparator<? super T> comparator) - a terminal operation that returns the minimum element of the stream according to the given comparator, or an empty Optional if the stream is empty. It is a special case of a reduction.
max
Optional<T> max(Comparator<? super T> comparator) - a terminal operation that returns the maximum element of the stream according to the given comparator, or an empty Optional if the stream is empty. It is a special case of a reduction.
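A small sketch of min() and max() with a length-based comparator (illustrative names):

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class MinMaxExample {
    static Optional<String> shortest(List<String> words) {
        return words.stream().min(Comparator.comparingInt(String::length));
    }

    static Optional<String> longest(List<String> words) {
        return words.stream().max(Comparator.comparingInt(String::length));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "fig", "apple");
        System.out.println(shortest(words)); // Optional[fig]
        System.out.println(longest(words));  // Optional[apple]
    }
}
```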
count
long count() - a terminal operation that counts elements in a stream.
anyMatch
boolean anyMatch(Predicate<? super T> predicate) - a short-circuiting terminal operation that returns true if any element of the stream matches the given predicate. As soon as a matching element is found, processing stops and true is returned.
allMatch
boolean allMatch(Predicate<? super T> predicate) - a short-circuiting terminal operation that returns true if all elements of the stream match the given predicate. As soon as a non-matching element is found, processing stops and false is returned.
noneMatch
boolean noneMatch(Predicate<? super T> predicate) - a short-circuiting terminal operation that returns true if no element of the stream matches the given predicate. As soon as a matching element is found, processing stops and false is returned.
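The three matching operations can be sketched as below (illustrative names). The last method relies on short-circuiting: it terminates on an infinite source only because anyMatch() stops at the first match.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class MatchExample {
    static boolean hasNegative(List<Integer> numbers) {
        return numbers.stream().anyMatch(x -> x < 0);
    }

    static boolean allPositive(List<Integer> numbers) {
        return numbers.stream().allMatch(x -> x > 0);
    }

    static boolean noZero(List<Integer> numbers) {
        return numbers.stream().noneMatch(x -> x == 0);
    }

    // Infinite source: finishes because anyMatch() short-circuits at 101.
    static boolean infiniteHasLargeValue() {
        return Stream.iterate(1, n -> n + 1).anyMatch(n -> n > 100);
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, -2, 3);
        System.out.println(hasNegative(numbers));       // true
        System.out.println(allPositive(numbers));       // false
        System.out.println(noZero(numbers));            // true
        System.out.println(infiniteHasLargeValue());    // true
    }
}
```

For an empty stream, anyMatch() returns false, while allMatch() and noneMatch() both return true.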
findFirst
Optional<T> findFirst() - a short-circuiting terminal operation that returns an Optional with the first element of this stream, or an empty Optional if the stream is empty. If the stream has no encounter order, for example when it is created from a HashSet, then any element may be returned.
findAny
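Optional<T> findAny() - a short-circuiting terminal operation that returns an Optional with some element of the stream, or an empty Optional if the stream is empty. Which element is returned is deliberately nondeterministic, which gives better performance in parallel pipelines.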
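The difference between the two find operations can be sketched as follows (illustrative names). With findFirst() on an ordered list the result is predictable. With findAny() on a parallel stream only the fact that some matching element comes back is guaranteed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class FindExample {
    static Optional<String> firstWithPrefix(List<String> words, String prefix) {
        return words.stream()
                .filter(w -> w.startsWith(prefix))
                .findFirst(); // first match in encounter order
    }

    static Optional<String> anyWithPrefix(List<String> words, String prefix) {
        return words.parallelStream()
                .filter(w -> w.startsWith(prefix))
                .findAny(); // some match, not necessarily the first
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("apple", "avocado", "banana");
        System.out.println(firstWithPrefix(words, "a")); // Optional[apple]
        System.out.println(firstWithPrefix(words, "z")); // Optional.empty
        System.out.println(anyWithPrefix(words, "a").isPresent()); // true
    }
}
```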
Conclusion
The Stream API is a very powerful tool introduced in Java 8. It allows data to be processed in a declarative way and, when needed, in parallel. The resulting code is neat and easy to read.