Java 8 features – Stream API advanced examples


Post summary: This post explains Java 8 Stream API with more advanced code examples.

In the Java 8 features – Lambda expressions, Interface changes, Stream API, DateTime API post I have briefly described the most interesting Java 8 features. In the current post I will give special attention to the Stream API. This post contains more advanced code examples that elaborate on the basic examples described in the Java 8 features – Stream API basic examples post. Code examples here can be found in the GitHub java-samples/java8 repository.

Memory consumption and better design

Stream API has operations that are short-circuiting, such as limit(). Once their goal is achieved, they stop processing the stream. Most of the operations, however, are not short-circuiting. Here I have prepared an example of a possible pitfall when using non-short-circuiting operations. For testing purposes I have created PeekObject, which outputs a message to the console once its constructor is called.

public class PeekObject {
	private String message;

	public PeekObject(String message) {
		this.message = message;
		System.out.println("Constructor called for: " + message);
	}

	public String getMessage() {
		return message;
	}
}

Assume a situation where there is a stream of many instances of PeekObject, but only a few elements of the stream are needed, so the stream has to be limited. Only 2 constructors are called in this case.

limit the stream

public static List<PeekObject> limit_shortCircuiting(List<String> stringList,
							int limit) {
	return stringList.stream()
		.map(PeekObject::new)
		.limit(limit)
		.collect(Collectors.toList());
}

unit test

@Test
public void test_limit_shortCircuiting() {
	System.out.println("limit_shortCircuiting");

	List<String> stringList = Arrays.asList("a", "b", "a", "c", "d", "a");

	List<PeekObject> result = AdvancedStreamExamples
		.limit_shortCircuiting(stringList, 2);

	assertThat(result.size(), is(2));
}

console output

limit_shortCircuiting
Constructor called for: a
Constructor called for: b

Now the stream has to be sorted before the limit is applied.

code

public static List<PeekObject> sorted_notShortCircuiting(
					List<String> stringList, int limit) {
	return stringList.stream()
		.map(PeekObject::new)
		.sorted((left, right) -> 
			left.getMessage().compareTo(right.getMessage()))
		.limit(limit)
		.collect(Collectors.toList());
}

unit test

@Test
public void test_sorted_notShortCircuiting() {
	System.out.println("sorted_notShortCircuiting");

	List<String> stringList = Arrays.asList("a", "b", "a", "c", "d", "a");

	List<PeekObject> result = AdvancedStreamExamples
		.sorted_notShortCircuiting(stringList, 2);

	assertThat(result.size(), is(2));
}

console output

sorted_notShortCircuiting
Constructor called for: a
Constructor called for: b
Constructor called for: a
Constructor called for: c
Constructor called for: d
Constructor called for: a

Notice that constructors for all objects in the stream are called. This requires Java to allocate enough memory for all the objects. There are 6 objects in this example, but what if there are 6 million? Also, the current objects are very lightweight, but what if they are much bigger? The conclusion is that you have to know the Stream API operations very well and apply them carefully when designing your stream pipeline.

Convert comma separated List to a Map with handling duplicates

There is a List of comma-separated values which needs to be converted to a Map. The list value "11,21" should become a Map entry with key 11 and value 21. Duplicated keys should also be handled: Arrays.asList("11,21", "12,21", "13,23", "13,24").

code

public static Map<Long, Long> splitToMap(List<String> stringsList) {
	return stringsList.stream()
		.filter(StringUtils::isNotEmpty)
		.map(line -> line.split(","))
		.filter(array -> array.length == 2 
			&& NumberUtils.isNumber(array[0])
			&& NumberUtils.isNumber(array[1]))
		.collect(Collectors.toMap(array -> Long.valueOf(array[0]),
			array -> Long.valueOf(array[1]), (first, second) -> first));
}

unit test

@Test
public void test_splitToMap() {
	List<String> stringList = Arrays
			.asList("11,21", "12,21", "13,23", "13,24");

	Map<Long, Long> result = AdvancedStreamExamples.splitToMap(stringList);

	assertThat(result.size(), is(3));
	assertThat(result.get(11L), is(21L));
	assertThat(result.get(12L), is(21L));
	assertThat(result.get(13L), is(23L));
}

The important bit in this conversion is the merge function (first, second) -> first. If it is not present, there will be an error java.lang.IllegalStateException: Duplicate key 23 (a slightly misleading error, as the duplicated key is 13 and 23 is its value). The merge function resolves collisions between values associated with the same key. It evaluates the two values found for the same key – first and second – and the current lambda returns the first one. If overwriting is needed, i.e. keeping the last entered value, then the lambda would be: (first, second) -> second.
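
For completeness, here is how the same conversion would look if the last value should win – a sketch with a hypothetical splitToMapKeepLast method name, where only the merge function differs from the code above:

public static Map<Long, Long> splitToMapKeepLast(List<String> stringsList) {
	return stringsList.stream()
		.filter(StringUtils::isNotEmpty)
		.map(line -> line.split(","))
		.filter(array -> array.length == 2
			&& NumberUtils.isNumber(array[0])
			&& NumberUtils.isNumber(array[1]))
		.collect(Collectors.toMap(array -> Long.valueOf(array[0]),
			array -> Long.valueOf(array[1]),
			// keep the value seen last for a duplicated key
			(first, second) -> second));
}

With the same input the entry for key 13 would be 24 instead of 23.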

Examples with custom object

The examples that follow use a custom object Employee, where Position is an enumeration: public enum Position { DEV, DEV_OPS, QA }.

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class Employee {
	private String firstName;
	private String lastName;
	private Position position;
	private List<String> skills;
	private int salary;

	public Employee() {
	}

	public Employee(String firstName, String lastName,
				Position position, int salary) {
		this.firstName = firstName;
		this.lastName = lastName;
		this.position = position;
		this.salary = salary;
	}

	public void setSkills(String... skills) {
		this.skills = Arrays.stream(skills).collect(Collectors.toList());
	}

	public String getName() {
		return this.firstName + " " + this.lastName;
	}

	... Getters and Setters
}

A company has been created. It consists of 6 developers, 2 QAs and 2 DevOps.

private List<Employee> createCompany() {
	Employee dev1 = new Employee("John", "Doe", Position.DEV, 110);
	dev1.setSkills("C#", "ASP.NET", "React", "AngularJS");
	Employee dev2 = new Employee("Peter", "Doe", Position.DEV, 120);
	dev2.setSkills("Java", "MongoDB", "Dropwizard", "Chef");
	Employee dev3 = new Employee("John", "Smith", Position.DEV, 115);
	dev3.setSkills("Java", "JSP", "GlassFish", "MySql");
	Employee dev4 = new Employee("Brad", "Johston", Position.DEV, 100);
	dev4.setSkills("C#", "MSSQL", "Entity Framework");
	Employee dev5 = new Employee("Philip", "Branson", Position.DEV, 140);
	dev5.setSkills("JavaScript", "React", "AngularJS", "NodeJS");
	Employee dev6 = new Employee("Nathaniel", "Barth", Position.DEV, 99);
	dev6.setSkills("Java", "Dropwizard");
	Employee qa1 = new Employee("Ronald", "Wynn", Position.QA, 100);
	qa1.setSkills("Selenium", "C#", "Java");
	Employee qa2 = new Employee("Erich", "Kohn", Position.QA, 105);
	qa2.setSkills("Selenium", "JavaScript", "Protractor");
	Employee devOps1 = new Employee("Harold", "Jess", Position.DEV_OPS, 116);
	devOps1.setSkills("CentOS", "bash", "c", "puppet", "chef", "Ansible");
	Employee devOps2 = new Employee("Karl", "Madsen", Position.DEV_OPS, 123);
	devOps2.setSkills("Ubuntu", "bash", "Python", "chef");

	return Arrays.asList(dev1, dev2, dev3, dev4, dev5, dev6,
				qa1, qa2, devOps1, devOps2);
}

Company skill set

This method accepts none, one or many positions. If no positions are provided, then information for all positions is returned. The positions array is transferred into a List<Position> because all objects used in a lambda should be effectively final. Converting the array to a stream is done with the Arrays.stream() method. Employees are filtered based on the desired positions. Each skills list is concatenated and flattened to a stream with flatMap(). After this operation there is a stream of strings with all skills. Duplicates are removed with distinct(). Finally, the stream is collected to a list.

code

public static List<String> gatherEmployeeSkills(
		List<Employee> employees, Position... positions) {
	positions = positions == null || positions.length == 0 
		? Position.values() : positions;
	List<Position> searchPositions = Arrays.stream(positions)
			.collect(Collectors.toList());
	return employees == null ? Collections.emptyList()
		: employees.stream()
			.filter(employee 
				-> searchPositions.contains(employee.getPosition()))
			.flatMap(employee -> employee.getSkills().stream())
			.distinct()
			.collect(Collectors.toList());
}

unit test

@Test
public void test_gatherEmployeeSkills() {
	List<Employee> company = createCompany();

	List<String> skills = AdvancedStreamExamples
			.gatherEmployeeSkills(company);

	assertThat(skills.size(), is(25));
}

Skill set per position

This method first receives the list of all skills per position and converts it to a stream. A stream can be collected to a String with the Collectors.joining() method. It accepts a delimiter, a prefix and a suffix.

code

public static String printEmployeeSkills(
		List<Employee> employees, Position position) {
	List<String> skills = gatherEmployeeSkills(employees, position);
	return skills.stream()
		.collect(Collectors.joining("; ",
			"Our " + position + "s have: ", " skills"));
}

unit test

@Test
public void test_printEmployeeSkills() {
	List<Employee> company = createCompany();

	String skills = AdvancedStreamExamples
			.printEmployeeSkills(company, Position.QA);

	assertThat(skills, is("Our employees have: "
		+ "Selenium; C#; Java; JavaScript; Protractor skills"));
}

Salary statistics

This method returns a Map with Position as key and IntSummaryStatistics as value. Collectors.groupingBy() groups employees by the position key, and then Collectors.summarizingInt() is used to get statistics of the employees' salaries.

code

public static Map<Position, IntSummaryStatistics> salaryStatistics(
		List<Employee> employees) {
	return employees.stream()
		.collect(Collectors.groupingBy(Employee::getPosition,
			Collectors.summarizingInt(Employee::getSalary)));
}

unit test

@Test
public void test_salaryStatistics() {
	List<Employee> company = createCompany();

	Map<Position, IntSummaryStatistics> salaries = AdvancedStreamExamples
			.salaryStatistics(company);

	assertThat(salaries.get(Position.DEV).getAverage(), is(114D));
	assertThat(salaries.get(Position.QA).getAverage(), is(102.5D));
	assertThat(salaries.get(Position.DEV_OPS).getAverage(), is(119.5D));
}

Position with lowest average salary

A map with position and salary summary is retrieved, and then with entrySet().stream() the map is converted to a stream of Entry<Position, IntSummaryStatistics> objects. The entries are sorted by average value in ascending order by a custom comparator using Double.compare(). findFirst() returns an Optional<Entry>. The entry itself is obtained with the get() method. The key, which is basically the position, is obtained with the getKey() method. A shorter alternative is shown after the unit test.

code

public static Position positionWithLowestAverageSalary(
		List<Employee> employees) {
	return salaryStatistics(employees)
		.entrySet().stream()
		.sorted((entry1, entry2) 
			-> Double.compare(entry1.getValue().getAverage(),
				entry2.getValue().getAverage()))
		.findFirst()
		.get()
		.getKey();
}

unit test

@Test
public void test_positionWithLowestAverageSalary() {
	List<Employee> company = createCompany();

	Position position = AdvancedStreamExamples
			.positionWithLowestAverageSalary(company);

	assertThat(position, is(Position.QA));
}
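
Sorting the whole entry set is not strictly needed just to find the minimum. A shorter variant – a sketch with a hypothetical method name, assuming java.util.Comparator is imported – uses min() with Comparator.comparingDouble():

public static Position positionWithLowestAverageSalaryViaMin(
		List<Employee> employees) {
	return salaryStatistics(employees)
		.entrySet().stream()
		// min() scans the entries once instead of sorting them all
		.min(Comparator.comparingDouble(
			entry -> entry.getValue().getAverage()))
		.get()
		.getKey();
}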

Employees per each position

Grouping is done per position, and the employees are aggregated to a list with the Collectors.toList() method.

code

public static Map<Position, List<Employee>> employeesPerPosition(
		List<Employee> employees) {
	return employees.stream()
		.collect(Collectors.groupingBy(Employee::getPosition,
				Collectors.toList()));
}

unit test

@Test
public void test_employeesPerPosition() {
	List<Employee> company = createCompany();

	Map<Position, List<Employee>> employees = AdvancedStreamExamples
			.employeesPerPosition(company);

	assertThat(employees.get(Position.QA).size(), is(2));
	assertThat(employees.get(Position.QA).get(0).getName(),
		is("Ronald Wynn"));
	assertThat(employees.get(Position.QA).get(1).getName(),
		is("Erich Kohn"));
}

Employee names per each position

Similar to the method above, but one more mapping is needed here. The employee name should be extracted and converted to a List<String>. This is done with the Collectors.mapping(Employee::getName, Collectors.toList()) collector.

code

public static Map<Position, List<String>> employeeNamesPerPosition(
		List<Employee> employees) {
	return employees.stream()
		.collect(Collectors.groupingBy(Employee::getPosition,
			Collectors.mapping(Employee::getName,
						Collectors.toList())));
}

unit test

@Test
public void test_employeeNamesPerPosition() {
	List<Employee> company = createCompany();

	Map<Position, List<String>> employees = AdvancedStreamExamples
			.employeeNamesPerPosition(company);

	assertThat(employees.get(Position.QA).size(), is(2));
	assertThat(employees.get(Position.QA).get(0), is("Ronald Wynn"));
	assertThat(employees.get(Position.QA).get(1), is("Erich Kohn"));
}

Employee count per position

Getting the count is done with the Collectors.counting() method. It returns a Long by default. If an Integer is needed, then this can be changed to Collectors.reducing(0, e -> 1, Integer::sum), as shown in the sketch after the unit test.

code

public static Map<Position, Long> employeesCountPerPosition(
			List<Employee> employees) {
	return employees.stream()
		.collect(Collectors.groupingBy(Employee::getPosition,
						Collectors.counting()));
}

unit test

@Test
public void test_employeesCountPerPosition() {
	List<Employee> company = createCompany();

	Map<Position, Long> employees = AdvancedStreamExamples
				.employeesCountPerPosition(company);

	assertThat(employees.get(Position.DEV), is(6L));
	assertThat(employees.get(Position.QA), is(2L));
	assertThat(employees.get(Position.DEV_OPS), is(2L));
}
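
For reference, here is a sketch of the Integer variant mentioned above, with a hypothetical method name:

public static Map<Position, Integer> employeesCountPerPositionAsInteger(
			List<Employee> employees) {
	return employees.stream()
		.collect(Collectors.groupingBy(Employee::getPosition,
			// map each employee to 1, then sum the ones up
			Collectors.reducing(0, e -> 1, Integer::sum)));
}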

Employees with duplicated first name

Employees are grouped into a map with the first name as key and List<Employee> as value. This map is converted to a stream and filtered for entries whose list has more than 1 element. The lists are flattened with flatMap() and collected to a List<Employee>.

code

public static List<Employee> employeesWithDuplicateFirstName(
		List<Employee> employees) {
	return employees.stream()
		.collect(Collectors.groupingBy(Employee::getFirstName,
						Collectors.toList()))
		.entrySet().stream()
		.filter(entry -> entry.getValue().size() > 1)
		.flatMap(entry -> entry.getValue().stream())
		.collect(Collectors.toList());
}

unit test

@Test
public void test_employeesWithDuplicateFirstName() {
	List<Employee> company = createCompany();

	List<Employee> employees = AdvancedStreamExamples
			.employeesWithDuplicateFirstName(company);

	assertThat(employees.size(), is(2));
	assertThat(employees.get(0).getName(), is("John Doe"));
	assertThat(employees.get(1).getName(), is("John Smith"));
}

Conclusion

In this post I have just scratched the surface of the Java 8 Stream API. It offers a vast amount of functionality which can be very useful for data processing. Beware when building a stream pipeline, because it might end up consuming too many resources.

Read more...

Java 8 features – Stream API basic examples


Post summary: This post explains Java 8 Stream API with very basic code examples.

In the Java 8 features – Lambda expressions, Interface changes, Stream API, DateTime API post I have briefly described the most interesting Java 8 features. In the current post I will give special attention to the Stream API. This post contains very basic code examples to illustrate the theory described in the Java 8 features – Stream API explained post. Code examples here can be found in the GitHub java-samples/java8 repository.

Example for filter, map, distinct, sorted, peek and collect

I will cover all those operations in one example. The code below takes a list of strings and converts it to a stream with the stream() method. For debug purposes peek() is used at the beginning and at the end of the stream operations. It only prints the elements from the stream to the console. Filtering of the elements is done with the filter() method. A lambda expression is used as the predicate. This lambda expression is a method call that verifies the current element is a number: element -> NumberUtils.isNumber(element). Since it is a single method call, it is substituted with a method reference: NumberUtils::isNumber. All elements that evaluate to false are removed from further processing. It is good practice to use filtering at the beginning of the stream pipeline so the number of stream elements is reduced. The next operation converts the String values in the stream to Long values. This is done with the map() method, again with a method reference. Duplicated elements are removed by calling distinct(). Stream elements are sorted by their natural order, in the current example they are Long values. In the end the stream is materialised into a List by using the collect(Collectors.toList()) method.

If this code had to be written without streams, it would look as shown in the "no stream code" tab. Note that the stream code is much more readable. In the beginning it is not that easy to think in a stream-oriented way, but once you get used to it, you will never want to see non-stream code again.

code

public static List<Long> toLongList(List<String> stringList) {
	return stringList.stream()
		.peek(element -> System.out.println("Before: " + element))
		.filter(NumberUtils::isNumber)
		.map(Long::valueOf)
		.distinct()
		.sorted()
		.peek(element -> System.out.println("After: " + element))
		.collect(Collectors.toList());
}

unit test

@Test
public void test_toLongList() {
	List<String> stringList = Arrays
		.asList(null, "", "aaa", "345", "123", "234", "123");

	List<Long> result = BasicStreamExamples.toLongList(stringList);

	assertEquals(3, result.size());
	assertEquals(123L, (long) result.get(0));
	assertEquals(234L, (long) result.get(1));
	assertEquals(345L, (long) result.get(2));
}

console output

Before: null
Before: 
Before: aaa
Before: 345
Before: 123
Before: 234
Before: 123
After: 123
After: 234
After: 345

no stream code

public static List<Long> toLongListWithoutStream(List<String> stringList) {
	List<Long> result = new ArrayList<>();
	for (String value : stringList) {
		System.out.println("Before: " + value);
		if (NumberUtils.isNumber(value)) {
			Long longValue = Long.valueOf(value);
			if (!result.contains(longValue)) {
				result.add(longValue);
				System.out.println("After: " + value);
			}
		}
	}
	Collections.sort(result);
	return result;
}

Example for toArray

This example is similar to the example above; instead of collecting to a list, here the stream elements are returned in an array.

toArray code

public static Long[] toLongArray(String[] stringArray) {
	return Arrays.stream(stringArray)
		.filter(NumberUtils::isNumber)
		.map(Long::valueOf)
		.toArray(Long[]::new);
}

unit test

@Test
public void test_toLongArray() {
	String[] stringArray = new String[] {null, "", "aaa", "123", "234"};

	Long[] result = BasicStreamExamples.toLongArray(stringArray);

	assertEquals(2, result.length);
	assertEquals(123L, (long) result[0]);
	assertEquals(234L, (long) result[1]);
}

Example for flatMap

This function is pretty complex and hard to understand. In the current example there is a map with String for key and List for value. The example below merges all list values into one result list. Note that the Map interface does not have a stream() method. Instead, first entrySet() is invoked, which returns a Set, and then its stream() method is invoked. Once the stream is created, flatMap() is called, and the result of its Function argument should be a stream: map -> map.getValue().stream(). This resultant stream is a merge of all the list values' streams, which is then collected to a List.

flatMap code

public static List<String> flatMap(Map<String, List<String>> mapToProcess) {
	return mapToProcess.entrySet()
		.stream()
		.flatMap(map -> map.getValue().stream())
		.collect(Collectors.toList());
}

unit test

@Test
public void test_flatMap() {
	Map<String, List<String>> map = new HashMap<>();
	map.put("1", Arrays.asList("a", "b"));
	map.put("2", Arrays.asList("C", "D"));

	List<String> expectedResult = Arrays.asList("a", "b", "C", "D");

	List<String> result = BasicStreamExamples.flatMap(map);

	assertEquals(expectedResult, result);
}

Examples on limit and skip

limit code

public static List<String> limitValues(List<String> stringList, long limit) {
	return stringList.stream()
		.limit(limit)
		.collect(Collectors.toList());
}

limit unit test

@Test
public void test_limitValues() {
	List<String> stringList = Arrays.asList("a", "b", "c", "d");

	List<String> result = BasicStreamExamples.limitValues(stringList, 2);

	assertEquals(2, result.size());
	assertEquals("a", result.get(0));
	assertEquals("b", result.get(1));
}

skip code

public static List<String> skipValues(List<String> stringList, long skip) {
	return stringList.stream()
		.skip(skip)
		.collect(Collectors.toList());
}

skip unit test

@Test
public void test_skipValues() {
	List<String> stringList = Arrays.asList("a", "b", "c", "d");

	List<String> result = BasicStreamExamples.skipValues(stringList, 2);

	assertEquals(2, result.size());
	assertEquals("c", result.get(0));
	assertEquals("d", result.get(1));
}

Example for forEach

forEach code

public static void printEachElement(List<String> stringList) {
	stringList.stream()
		.forEach(element -> System.out.println("Element: " + element));
}

unit test

@Test
public void test_printEachElement() {
	List<String> stringList = Arrays.asList("a", "b", "c", "d");

	BasicStreamExamples.printEachElement(stringList);
}

console output

Element: a
Element: b
Element: c
Element: d

Examples for min and max

min code

public static Optional<Integer> getMin(List<Integer> integerList) {
	return integerList.stream()
		.min(Integer::compare);
}

min unit test

@Test
public void test_getMin() {
	List<Integer> integerList = Arrays.asList(234, 123, 345);

	Optional<Integer> result = BasicStreamExamples.getMin(integerList);

	assertEquals(123, (int) result.get());
}

max code

public static Optional<Integer> getMax(List<Integer> integers) {
	return integers.stream()
		.max(Integer::compare);
}

max unit test

@Test
public void test_getMax() {
	List<Integer> integerList = Arrays.asList(234, 123, 345);

	Optional<Integer> result = BasicStreamExamples.getMax(integerList);

	assertEquals(345, (int) result.get());
}

Example for reduce

This is also a somewhat complex method. The method given below sums all elements in the provided stream.

reduce code

public static Optional<Integer> sumByReduce(List<Integer> integers) {
	return integers.stream()
		.reduce((x, y) -> x + y);
}

unit test

@Test
public void test_sumByReduce() {
	List<Integer> integerList = Arrays.asList(100, 200, 300);

	Optional<Integer> result = BasicStreamExamples.sumByReduce(integerList);

	assertEquals(600, (int) result.get());
}

Example for count

count code

public static long count(List<Integer> integers) {
	return integers.stream()
		.count();
}

unit test

@Test
public void test_count() {
	List<Integer> integerList = Arrays.asList(234, 123, 345);

	long result = BasicStreamExamples.count(integerList);

	assertEquals(3, result);
}

Example for anyMatch, allMatch and noneMatch

anyMatch code

public static boolean isOddElementPresent(List<Integer> integers) {
	return integers.stream()
		.anyMatch(element -> element % 2 != 0);
}

allMatch code

public static boolean areAllElementsOdd(List<Integer> integers) {
	return integers.stream()
		.allMatch(element -> element % 2 != 0);
}

noneMatch code

public static boolean areAllElementsEven(List<Integer> integers) {
	return integers.stream()
		.noneMatch(element -> element % 2 != 0);
}

unit test 1

@Test
public void test_anyMatch_allMatch_noneMatch_allEven() {
	List<Integer> integerList = Arrays.asList(234, 124, 346, 124);

	assertFalse(BasicStreamExamples.isOddElementPresent(integerList));
	assertFalse(BasicStreamExamples.areAllElementsOdd(integerList));
	assertTrue(BasicStreamExamples.areAllElementsEven(integerList));
}

unit test 2

@Test
public void test_anyMatch_allMatch_noneMatch_evenAndOdd() {
	List<Integer> integerList = Arrays.asList(234, 123, 345, 123);

	assertTrue(BasicStreamExamples.isOddElementPresent(integerList));
	assertFalse(BasicStreamExamples.areAllElementsOdd(integerList));
	assertFalse(BasicStreamExamples.areAllElementsEven(integerList));
}

unit test 3

@Test
public void test_anyMatch_allMatch_noneMatch_allOdd() {
	List<Integer> integerList = Arrays.asList(233, 123, 345, 123);

	assertTrue(BasicStreamExamples.isOddElementPresent(integerList));
	assertTrue(BasicStreamExamples.areAllElementsOdd(integerList));
	assertFalse(BasicStreamExamples.areAllElementsEven(integerList));
}

Examples for findFirst

In the case of a List the stream has an order, so it will always return 234 as a result.

findFirst code for List

public static Optional<Integer> getFirstElementList(List<Integer> integers) {
	return integers.stream()
		.findFirst();
}

findFirst unit test for List

@Test
public void test_getFirstElementList() {
	List<Integer> integerList = Arrays.asList(234, 123, 345, 123);

	Optional<Integer> result = BasicStreamExamples
		.getFirstElementList(integerList);

	assertEquals(Integer.valueOf(234), result.get());
}

Since a HashSet has no guaranteed order, there is no guarantee which element will be returned by findFirst(). On my machine with my JVM it is 345, but on another machine with another JVM it might be a different value, so this test will most likely fail for someone else.

findFirst code for Set

public static Optional<Integer> getFirstElementSet(Set<Integer> integers) {
	return integers.stream()
		.findFirst();
}

findFirst unit test for Set

@Test
public void test_getFirstElementSet() {
	Set<Integer> integerSet = new HashSet<>();
	integerSet.add(234);
	integerSet.add(123);
	integerSet.add(345);
	integerSet.add(123);

	Optional<Integer> result = BasicStreamExamples
		.getFirstElementSet(integerSet);

	assertEquals(Integer.valueOf(345), result.get());
}

Examples for findAny

There is no guarantee which element will be returned by findAny(). On my machine with my JVM it is 234, but on another machine with another JVM it might be a different value, so this test will most likely fail for someone else.

findAny code

public static Optional<Integer> getAnyElement(List<Integer> integers) {
	return integers.stream()
		.findAny();
}

findAny unit test

@Test
public void test_getAnyElement() {
	List<Integer> integerList = Arrays.asList(234, 123, 345, 123);

	Optional<Integer> result = BasicStreamExamples
		.getAnyElement(integerList);

	assertEquals(Integer.valueOf(234), result.get());
}

Conclusion

These basic code examples give an idea of how Java 8 Stream API operations work. More advanced examples are shown in the Java 8 features – Stream API advanced examples post.

Read more...

Java 8 features – Stream API explained


Post summary: Theoretical explanation of the Java 8 Stream API, its functional interfaces and operations.

In the Java 8 features – Lambda expressions, Interface changes, Stream API, DateTime API post I have briefly described the most interesting Java 8 features. In the current post I will give special attention to the Stream API. This post is more theoretical and lays the foundation for the next posts: Java 8 features – Stream API basic examples and Java 8 features – Stream API advanced examples, which give code examples to illustrate the theory. Code examples here can be found in the GitHub java-samples/java8 repository.

Functional interfaces

Before explaining the Stream API it is necessary to understand the idea of a functional interface, as they are leveraged for use with lambda expressions. A functional interface is an interface that has only one abstract method that is to be implemented. A functional interface may or may not have default or static methods. Although not mandatory, it is good practice to annotate a functional interface with @FunctionalInterface. The functional interfaces mostly used in Stream API operations are explained below. You can also use functional interfaces in a method signature, so lambda expressions can be passed when calling the method. If the ones below are not suitable, you can always create your own functional interface.

Predicate

The method for implementation is: boolean test(T t). This interface is used to evaluate a condition on an input object to a boolean result.

Supplier

The method for implementation is: T get(). This interface is used to supply an output object without taking any input.

Function

The method for implementation is: R apply(T t). This interface is used to produce a result object based on a given input object.

Consumer

The method for implementation is: void accept(T t). This interface is used to perform an operation on a single input object that does not produce any result.

BiConsumer

The method for implementation is: void accept(T t, U u). This interface is used to perform an operation on two input objects that does not produce any result.
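
As a quick illustration, here is how lambda expressions map to these interfaces – a minimal sketch, not tied to any example in this post; all interfaces are in the java.util.function package:

Predicate<String> isEmpty = value -> value.isEmpty();
Supplier<String> greeting = () -> "hello";
Function<String, Integer> length = value -> value.length();
Consumer<String> printer = value -> System.out.println(value);
BiConsumer<String, Integer> printBoth
	= (value, count) -> System.out.println(value + ": " + count);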

Method reference

Sometimes when using a lambda expression all that is done is calling a single method by name. A method reference provides an easy way to call that method, making the code more readable.
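
For example, the lambda element -> System.out.println(element) only calls a single method by name, so it can be replaced with a method reference:

Consumer<String> lambda = element -> System.out.println(element);
Consumer<String> methodReference = System.out::println;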

Stream API

Stream API is used for data processing and supports parallel operations. It enables data processing in a declarative way. Streams are sequences of elements that support different operations. Streams are computed lazily, on demand, when elements are needed. A stream is like a recipe that gets executed only when the actual result is needed, as the sketch below shows.
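
In the sketch nothing is printed by peek() until the terminal operation collect() is invoked:

Stream<String> recipe = Stream.of("a", "b")
	.peek(element -> System.out.println("Processing: " + element));
// nothing is printed yet – the pipeline is just a description
List<String> result = recipe.collect(Collectors.toList());
// now "Processing: a" and "Processing: b" are printed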

Stream operations

Stream operations are divided into intermediate and terminal operations, combined to form stream pipelines. Intermediate operations return a new stream. They are always lazy. Executing an intermediate operation such as filter() does not actually perform any filtering, but instead creates a new stream. Terminal operations, on the other hand, such as collect(), generate a result or final value. After the terminal operation is performed, the stream pipeline is considered consumed and can no longer be used. Intermediate and terminal operations, such as limit() or findFirst(), can be short-circuiting: once they achieve their goal they stop further stream processing.

Intermediate operations are further divided into stateless and stateful operations. Stateless operations, such as filter() and map(), retain no state from previously seen elements when processing a new element, hence each element can be processed independently of operations on other elements. Stateful operations, such as distinct() and sorted(), may incorporate state from previously seen elements when processing new elements. For example, one cannot produce any results from sorting a stream until one has seen all elements of the stream. As a result, under parallel computation, some pipelines containing stateful intermediate operations may require multiple passes over the data or may need to buffer significant data. Stateful operations should be carefully considered when constructing a stream pipeline because they might require significant resources.

Stream API methods

Below is a list of most of the methods available in the Stream interface with a short description. Code examples with explanations are in the following posts.

filter

Stream<T> filter(Predicate<? super T> predicate) – stateless intermediate operation that returns a stream consisting of the elements of this stream matching the given predicate.

map

<R> Stream<R> map(Function<? super T, ? extends R> mapper) – stateless intermediate operation that converts a value of one type into another by applying a function that does the conversion. The result is one output value for each input value.

distinct

Stream<T> distinct() – stateful intermediate operation that removes duplicated elements using the equals() method.

sorted

Stream<T> sorted() or Stream<T> sorted(Comparator<? super T> comparator) – stateful intermediate operation that sorts stream elements according to the given comparator or their natural order.

peek

Stream<T> peek(Consumer<? super T> action) – stateless intermediate operation that performs the given action on each element as it is consumed from the stream. It does not change the stream or alter the stream elements. It is mainly used for debugging purposes.

collect

<R, A> R collect(Collector<? super T, A, R> collector) or <R> R collect(Supplier<R> supplier, BiConsumer<R, ? super T> accumulator, BiConsumer<R, R> combiner) – terminal operation that performs a mutable reduction operation on the stream elements, reducing the stream to a mutable result container, such as an ArrayList. Stream elements are incorporated into the result by updating it instead of replacing it.
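
The supplier/accumulator/combiner overload can be illustrated with a small sketch that gathers elements into an ArrayList manually:

List<String> letters = Stream.of("a", "b", "c")
	.collect(ArrayList::new,	// supplier creates the result container
		ArrayList::add,		// accumulator adds one element to it
		ArrayList::addAll);	// combiner merges partial containers in parallel runs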

toArray

Object[] toArray() – terminal operation that returns array containing elements of this stream.

flatMap

<R> Stream<R> flatMap(Function<? super T, ? extends Stream<? extends R>> mapper) – stateless intermediate operation that replaces each value with a stream and flattens the result. The result is an arbitrary number of output values for a single input value.

limit

Stream<T> limit(long maxSize) – short-circuiting stateful intermediate operation that truncates a stream to a given length.

skip

Stream<T> skip(long n) – stateful intermediate operation that skips the first n elements of a stream.

forEach

void forEach(Consumer<? super T> action) – terminal operation that performs an action for each element in the stream.

reduce

T reduce(T identity, BinaryOperator<T> accumulator) or Optional<T> reduce(BinaryOperator<T> accumulator) or <U> U reduce(U identity, BiFunction<U, ? super T, U> accumulator, BinaryOperator<U> combiner) – terminal operation that performs reduction on the elements in the stream.
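
The identity-based overload does not return an Optional, because the identity value is the result for an empty stream. A one-line sketch summing integers:

int sum = Stream.of(100, 200, 300)
	.reduce(0, Integer::sum); // 600; an empty stream would give 0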

min

Optional<T> min(Comparator<? super T> comparator) – terminal operation that returns the min element in the stream based on the given comparator. A special case of the reduce operation.

max

Optional<T> max(Comparator<? super T> comparator) – terminal operation that returns the max element in the stream based on the given comparator. A special case of the reduce operation.

count

long count() – terminal operation that counts elements in a stream.

anyMatch

boolean anyMatch(Predicate<? super T> predicate) – short-circuiting terminal operation that returns whether any element of the stream matches the given predicate. Once the result becomes true, processing stops and the result is returned.

allMatch

boolean allMatch(Predicate<? super T> predicate) – short-circuiting terminal operation that returns whether all elements of the stream match the given predicate. Once the result becomes false, processing stops and the result is returned.

noneMatch

boolean noneMatch(Predicate<? super T> predicate) – short-circuiting terminal operation that returns whether no element of the stream matches the given predicate. Once the result becomes false, processing stops and the result is returned.

findFirst

Optional<T> findFirst() – short-circuiting terminal operation that returns an Optional with the first element of this stream, or an empty Optional if the stream is empty. If the stream has no defined encounter order, such as a stream created from a HashSet or a HashMap, then any element may be returned.

findAny

Optional<T> findAny() – short-circuiting terminal operation that returns an Optional with some element of the stream, or an empty Optional if the stream is empty.

Conclusion

Stream API is a very powerful instrument provided in Java 8. It allows data processing in a declarative way and in parallel. The code looks very neat and is easy to read.

Read more...

Java 8 features – Lambda expressions, Interface changes, Stream API, DateTime API


Post summary: Short overview of the most interesting and useful Java 8 features.

More details and code examples are available for Stream API in a post to follow.

Java 8

Java 8 was released in March 2014, more than three years ago, so we should already be familiar with its features, which are really nice and can significantly improve our code. Below are some of them that I find most interesting and important.

Lambda expressions

In math, Lambda calculus is a way of expressing computation based on function abstraction and was first introduced in the 1930s. This is where the name of lambda expressions in Java comes from. A functional interface is another concept that is closely related to lambda expressions. A functional interface is an interface with just one method that is to be implemented. A lambda expression is inline code that implements this interface without creating a concrete or anonymous class. A lambda expression is basically an anonymous method. With lambda expressions code is treated as data, and a lambda expression can be passed as an argument to another method, allowing the code itself to be invoked at a later stage. Sometimes when using a lambda expression all you do is call a single method by name. A method reference is a shortcut for calling that method, making the code more readable. Lambdas, functional interfaces and method references are very much used with the Stream API and will be covered in detail in a separate post.
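
As a minimal illustration – a sketch only, the details are covered in the posts mentioned – here is a comparator implemented with an anonymous class, a lambda expression and a method reference:

Comparator<String> anonymousClass = new Comparator<String>() {
	@Override
	public int compare(String left, String right) {
		return left.compareTo(right);
	}
};
Comparator<String> lambda = (left, right) -> left.compareTo(right);
Comparator<String> methodReference = String::compareTo;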

Method implementation in an Interface

With this feature interfaces are not what they used to be. It is now possible to have method implementations inside an interface. There are two types of such methods – default and static. Default methods have an implementation, and all classes implementing the interface inherit it. It is possible to override an existing default method. Static methods also have an implementation but cannot be overridden. Static methods are accessible from the interfaces only (InterfaceName.methodName()); they are not accessible from classes implementing those interfaces. Having said that, an interface with static methods now seems a good candidate for a utilities class, instead of a final class with a private constructor as is usually done. I will not give code examples for this feature, there are lots of resources online.

Stream API

This might be the most significant feature in the Java 8 release. It is related to lambda expressions, as Stream methods have functional interfaces in their signatures, so it is nice and easy to pass a lambda expression. Stream API was made possible because default methods in interfaces were allowed: the java.util.Collection interface was extended with a stream() method, and if default methods were not allowed, this would have broken a lot of custom implementations, essentially an incompatible change. Stream API provides methods for building pipelines for data processing. Unlike collections, streams are not physical objects; they are abstractions and become physical when they are needed. A huge benefit of streams is that they are designed to facilitate multi-core architectures without the developers having to worry about it. Everything happens behind the scenes. Stream API is explained in more detail in the following posts:

Date and Time API

Prior to Java 8 the date-time classes were not thread-safe, and calculations and date-time manipulations were very hard. Time zone management was also hard. In Java 8 the date-time classes are immutable, which makes them thread-safe. In most of the projects I've seen prior to Java 8, the Joda-Time library was used instead of the default Java time classes. It is an amazing library providing so many features for manipulating date and time. The Java 8 date and time classes follow principles from Joda-Time, which makes the Java 8 Date and Time API very efficient. Actually, the Joda-Time designer was the Java specification lead for JSR 310. In Java 8 there are local and zoned date-time classes. I'm not going to get into details here, there are many tutorials online for Java 8 Date and Time API usage. I just say – start using it! It is located in the java.time.* package.
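
As a small taste of the API, here is a sketch with the local and zoned date-time classes; the zone id is just an example value:

LocalDateTime now = LocalDateTime.now();
LocalDateTime inTwoHours = now.plusHours(2); // immutable – returns a new instance
ZonedDateTime zoned = now.atZone(ZoneId.of("Europe/Sofia"));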

Conclusion

Java 8 has really great features. I assume you are already using it; if not – start right now!

Read more...

Run multiple machines in a single Vagrant file


Post summary: How to run multiple machines on Vagrant described in a single Vagrantfile.

Code below can be found in the GitHub sample-dropwizard-rest-stub repository in the Vagrantfile file. This post is part of the Vagrant series. All other Vagrant-related posts, as well as more theoretical information on what Vagrant is and why to use it, can be found in the What is Vagrant and why to use it post.

Vagrantfile

As described in the Vagrant introduction post, all configurations are done in a single text file called Vagrantfile. Below is a Vagrantfile which can be used to initialise two machines. One is the same as described in the Run Dropwizard Java application on Vagrant post, the other is the one described in the Run Docker container on Vagrant post.

Vagrant.configure('2') do |config|

  config.vm.hostname = 'dropwizard'
  config.vm.box = 'opscode-centos-7.2'
  config.vm.box_url = 'http://opscode-vm-bento.s3.amazonaws.com/vagrant/virtualbox/opscode_centos-7.2_chef-provisionerless.box'

  config.vm.synced_folder './', '/vagrant'

  config.vm.define 'jar' do |jar|
    jar.vm.network :forwarded_port, guest: 9000, host: 9100
    jar.vm.network :forwarded_port, guest: 9001, host: 9101

    jar.vm.provider :virtualbox do |vb|
      vb.name = 'dropwizard-rest-stub-jar'
    end

    jar.vm.provision :shell do |shell|
      shell.inline = <<-SHELL
        sudo service dropwizard stop
        sudo yum -y install java
        sudo mkdir -p /var/dropwizard-rest-stub
        sudo mkdir -p /var/dropwizard-rest-stub/logs
        sudo cp /vagrant/target/sample-dropwizard-rest-stub-1.0-SNAPSHOT.jar /var/dropwizard-rest-stub/dropwizard-rest-stub.jar
        sudo cp /vagrant/config-vagrant.yml /var/dropwizard-rest-stub/config.yml
        sudo cp /vagrant/linux_service_file /etc/init.d/dropwizard
        # Replace CR+LF with LF because of Windows
        sudo sed -i -e 's/\r//g' /etc/init.d/dropwizard
        sudo service dropwizard start
      SHELL
    end
  end

  config.vm.define 'docker' do |docker|
    docker.vm.network :forwarded_port, guest: 9000, host: 9000
    docker.vm.network :forwarded_port, guest: 9001, host: 9001

    docker.vm.provider :virtualbox do |vb|
      vb.name = 'dropwizard-rest-stub-docker'
      vb.customize ['modifyvm', :id, '--memory', '768', '--cpus', '2']
    end
  
    docker.vm.provision :shell do |shell|
      shell.inline = <<-SHELL
        sudo yum -y install epel-release
        sudo yum -y install python-pip
        sudo pip install --upgrade pip
        sudo pip install six==1.4
        sudo pip install docker-py
      SHELL
    end
  
    docker.vm.provision :docker do |docker|
      docker.build_image '/vagrant/.', args: '-t dropwizard-rest-stub'
      docker.run 'dropwizard-rest-stub', args: '-it -p 9000:9000 -p 9001:9001 -e ENV_VARIABLE_VERSION=1.1.1'
    end
  end
  
end

Vagrantfile explanation

The file starts with Vagrant.configure('2') do |config|, which states that version 2 of the Vagrant API will be used and defines a constant with name config to be used below. The guest operating system hostname is set with config.vm.hostname. If you use the vagrant-hostsupdater plugin, it will add it to your hosts file and you can access it from a browser in case you are developing web applications. With config.vm.box you define which will be the guest operating system. Vagrant maintains config.vm.box = 'hashicorp/precise64', which is Ubuntu 12.04 (32 and 64-bit), and they also recommend using Bento's boxes, but I found issues with Vagrant's as well as Bento's boxes, so I've decided to use one I know is working. I specify where it is located with config.vm.box_url. It is CentOS 7.2. With the config.vm.synced_folder command you specify that the Vagrantfile location folder is shared as /vagrant/ in the guest operating system. This makes it easy to transfer files between guest and host operating systems.

Now comes the part where the two different machines are defined. The first one is defined with config.vm.define 'jar' do |jar|, which declares variable jar to be used later in the configurations. All other configurations are well described in the Run Dropwizard Java application on Vagrant post. The specific part here is the port mapping. In order to avoid a port collision, port 9000 from the guest is mapped to port 9100 on the host with the jar.vm.network :forwarded_port, guest: 9000, host: 9100 line. This is because the second machine uses port 9000 on the host. The second machine is defined with config.vm.define 'docker' do |docker|, which declares variable docker to be used in further configurations. All other configurations are described in the Run Docker container on Vagrant post.

Running Vagrant

The command to start the Vagrant machine is vagrant up. Then, in order to invoke the provisioning section with the actual deployment, you have to call vagrant provision. All can be done in one step: vagrant up --provision. To shut down the machine use vagrant halt. To delete the machine: vagrant destroy.

Conclusion

It is very easy to create a Vagrantfile that builds and runs several machines with different applications. It is possible to make those machines communicate with each other, hence simulating a real environment. Once created, the file can be reused by all team members. It is executed over and over again, making provisioning extremely easy.

Read more...

Run Docker container on Vagrant


Post summary: How to run Docker container on Vagrant.

Code below can be found in the GitHub sample-dropwizard-rest-stub repository in the Vagrantfile-docker file. Since Vagrant requires a single Vagrantfile, if you want to run this example you have to rename Vagrantfile-docker to Vagrantfile, then run the Vagrant commands described at the end of this post. This post is part of the Vagrant series. All other Vagrant-related posts, as well as more theoretical information on what Vagrant is and why to use it, can be found in the What is Vagrant and why to use it post.

Vagrantfile

As described in the Vagrant introduction post, all configurations are done in a single text file called Vagrantfile. Below is a Vagrantfile which can be used to deploy and start a Docker container on Vagrant. The example here uses the Dockerised application described in the Run Dropwizard application in Docker with templated configuration using environment variables post.

Vagrant.configure('2') do |config|

  config.vm.hostname = 'dropwizard'
  config.vm.box = 'opscode-centos-7.2'
  config.vm.box_url = 'http://opscode-vm-bento.s3.amazonaws.com/vagrant/virtualbox/opscode_centos-7.2_chef-provisionerless.box'

  config.vm.synced_folder './', '/vagrant'

  config.vm.network :forwarded_port, guest: 9000, host: 9000
  config.vm.network :forwarded_port, guest: 9001, host: 9001

  config.vm.provider :virtualbox do |vb|
    vb.name = 'dropwizard-rest-stub-docker'
    vb.customize ['modifyvm', :id, '--memory', '768', '--cpus', '2']
  end

  config.vm.provision :shell do |shell|
    shell.inline = <<-SHELL
      sudo yum -y install epel-release
      sudo yum -y install python-pip
      sudo pip install --upgrade pip
      sudo pip install six==1.4
      sudo pip install docker-py
    SHELL
  end

  config.vm.provision :docker do |docker|
    docker.build_image '/vagrant/.', args: '-t dropwizard-rest-stub'
    docker.run 'dropwizard-rest-stub', args: '-it -p 9000:9000 -p 9001:9001 -e ENV_VARIABLE_VERSION=1.1.1'
  end

end

Vagrantfile explanation

The file starts with Vagrant.configure('2') do |config|, which states that version 2 of the Vagrant API will be used and defines a constant with name config to be used below. The guest operating system hostname is set with config.vm.hostname. If you use the vagrant-hostsupdater plugin, it will add it to your hosts file and you can access it from a browser in case you are developing web applications. With config.vm.box you define which will be the guest operating system. Vagrant maintains config.vm.box = 'hashicorp/precise64', which is Ubuntu 12.04 (32 and 64-bit), and they also recommend using Bento's boxes. I have found issues with Vagrant's as well as Bento's boxes, so I've decided to use one I know is working. I specify where it is located with config.vm.box_url. It is CentOS 7.2. With the config.vm.synced_folder command you specify that the Vagrantfile location folder is shared as /vagrant/ in the guest operating system. This makes it easy to transfer files between guest and host operating systems. This mount is done by default, but it is good to explicitly state it for better readability. With config.vm.network :forwarded_port a port from the guest OS is forwarded to your host OS. Without exposing any port you will not have access to the guest OS; the only port open by default is 22 for SSH. With config.vm.provider :virtualbox do |vb| you access the VirtualBox provider for more configurations; vb.name = 'dropwizard-rest-stub-docker' sets the name that you see in the Oracle VirtualBox Manager. With vb.customize ['modifyvm', :id, '--memory', '768', '--cpus', '2'] you modify the default hardware settings for the machine: RAM is set to 768MB and 2 CPUs are configured.

Finally, the provisioning part takes place, which is done by shell commands inside the config.vm.provision :shell do |shell| block. This block installs Python as well as docker-py. It is CentOS specific, as it uses YUM, which is the CentOS package manager. The next provisioning part runs the docker provisioner, which builds a Docker image and then runs it, mapping ports and setting an environment variable. For more details on how to build and run Docker containers read the Run Dropwizard application in Docker with templated configuration using environment variables post.

Running Vagrant

The command to start the Vagrant machine is vagrant up. Then, in order to invoke the provisioning section with the actual deployment, you have to call vagrant provision. All can be done in one step: vagrant up --provision. To shut down the machine use vagrant halt. To delete the machine: vagrant destroy.

Conclusion

It is very easy to create a Vagrantfile that builds and runs a Docker container. Once created, the file can be reused by all team members. It is executed over and over again, making provisioning extremely easy.

Read more...

Run Dropwizard Java application on Vagrant


Post summary: How to run Dropwizard or any other Java application on Vagrant.

Code below can be found in the GitHub sample-dropwizard-rest-stub repository in the Vagrantfile-jar file. Since Vagrant requires a single Vagrantfile, if you want to run this example you have to rename Vagrantfile-jar to Vagrantfile, then run the Vagrant commands described at the end of this post. This post is part of the Vagrant series. All other Vagrant-related posts, as well as more theoretical information on what Vagrant is and why to use it, can be found in the What is Vagrant and why to use it post.

Vagrantfile

As described in the Vagrant introduction post, all configurations are done in a single text file called Vagrantfile. Below is a Vagrantfile which can be used to deploy and start as a service the Dropwizard Java application described in the Build a RESTful stub server with Dropwizard post.

Vagrant.configure('2') do |config|

  config.vm.hostname = 'dropwizard'
  config.vm.box = 'opscode-centos-7.2'
  config.vm.box_url = 'http://opscode-vm-bento.s3.amazonaws.com/vagrant/virtualbox/opscode_centos-7.2_chef-provisionerless.box'

  config.vm.synced_folder './', '/vagrant'

  config.vm.network :forwarded_port, guest: 9000, host: 9000
  config.vm.network :forwarded_port, guest: 9001, host: 9001

  config.vm.provider :virtualbox do |vb|
    vb.name = 'dropwizard-rest-stub-jar'
  end

  config.vm.provision :shell do |shell|
    shell.inline = <<-SHELL
      sudo service dropwizard stop
      sudo yum -y install java
      sudo mkdir -p /var/dropwizard-rest-stub
      sudo mkdir -p /var/dropwizard-rest-stub/logs
      sudo cp /vagrant/target/sample-dropwizard-rest-stub-1.0-SNAPSHOT.jar /var/dropwizard-rest-stub/dropwizard-rest-stub.jar
      sudo cp /vagrant/config-vagrant.yml /var/dropwizard-rest-stub/config.yml
      sudo cp /vagrant/linux_service_file /etc/init.d/dropwizard
      # Replace CR+LF with LF because of Windows
      sudo sed -i -e 's/\r//g' /etc/init.d/dropwizard
      sudo service dropwizard start
    SHELL
  end

end

Vagrantfile explanation

The file starts with Vagrant.configure('2') do |config|, which states that version 2 of the Vagrant API will be used and defines a constant with name config to be used below. The guest operating system hostname is set with config.vm.hostname. If you use the vagrant-hostsupdater plugin, it will add it to your hosts file and you can access it from a browser in case you are developing web applications. With config.vm.box you define which will be the guest operating system. Vagrant maintains config.vm.box = 'hashicorp/precise64', which is Ubuntu 12.04 (32 and 64-bit), and they also recommend using Bento's boxes. I have found issues with Vagrant's as well as Bento's boxes, so I've decided to use one I know is working. I specify where it is located with config.vm.box_url. It is CentOS 7.2. With the config.vm.synced_folder command you specify that the Vagrantfile location folder is shared as /vagrant/ in the guest operating system. This makes it easy to transfer files between guest and host operating systems. This mount is done by default, but it is good to explicitly state it for better readability. With config.vm.network :forwarded_port a port from the guest OS is forwarded to your host OS. Without exposing any port you will not have access to the guest OS; the only port open by default is 22 for SSH. With config.vm.provider :virtualbox do |vb| you access the VirtualBox provider for more configurations; vb.name = 'dropwizard-rest-stub-jar' sets the name that you see in the Oracle VirtualBox Manager.

Finally, the deployment part takes place, which is done by shell commands inside the config.vm.provision :shell do |shell| block. The dropwizard service is stopped; if it does not exist an error is shown, but it does not interrupt the provisioning process. The command yum -y install java is CentOS specific and installs Java with YUM, which is the CentOS package manager. For other Linux distributions you have to use a command with their package manager. Folders are created, then the JAR and YML files are copied to the machine. Notice that the files are copied from the /vagrant/ folder; this is actually the folder shared with your host OS. Installing the Java application as a service is done by copying linux_service_file to /etc/init.d/dropwizard. This creates a service with the name dropwizard. See more on how to install a Linux service in the Install Java application as a Linux service post. Since I'm on Windows, whose line endings (CR+LF) are different than on Linux (LF), the service does not work, giving an env: /etc/init.d/dropwizard: No such file or directory error. This is why CR+LF should be replaced with LF with the sudo sed -i -e 's/\r//g' /etc/init.d/dropwizard command. Finally, the script starts the dropwizard service. A nicer way to do all this is to extract the installation steps into a separate batch file and just call that file from the Vagrantfile. I've put them in the Vagrantfile just to have everything in one place.

Running Vagrant

The command to start the Vagrant machine is vagrant up. Then, in order to invoke the provisioning section with the actual deployment, you have to call vagrant provision. All can be done in one step: vagrant up --provision. To shut down the machine use vagrant halt. To delete the machine: vagrant destroy.

Conclusion

It is very easy to create a Vagrantfile that installs a Java application. Once created, the file can be reused by all team members. It is executed over and over again, making provisioning extremely easy.

Read more...

What is Vagrant and why to use it


Post summary: Brief description on Vagrant and when and why to use it.

This post is a preface to other posts where I will describe in detail, with examples, how to configure and run Vagrant.

What is Vagrant

Vagrant is a tool for building and managing virtual machine environments in a single workflow. With an easy-to-use workflow and focus on automation, Vagrant lowers development environment setup time, increases production parity, and makes the "works on my machine" excuse a relic of the past. Vagrant is convenient for sharing virtual environment setups and configurations.

How Vagrant works

Vagrant does not provide virtualisation engines but builds on top of already existing ones, such as VirtualBox, which is the default provider, VMWare, Hyper-V or Docker. Vagrant providers are available as plugins, so they can be easily installed and used. Simply said, Vagrant spins up a virtual machine for you, configures it and installs software on it. All those actions are described in a single text file, called Vagrantfile, that can be shared among team members, allowing everyone to have one and the same setup.

Why to use Vagrant

Vagrant allows us to share setups between team members very easily, allowing a very easy spin-up of a work environment. For me, an important reason to use Vagrant is to test how your deployment works, i.e. provisioning, locally before pushing those changes to other environments. Another important use case I've seen is to create your own development/test environment which is very hard to create on a local machine. This was a huge Tomcat application consisting of tens of configuration files with hundreds of configuration values, which was not possible to provision on a local box; here Vagrant came to the rescue, applying the Chef cookbook used for deployment on physical hosts.

Provisioning

Provisioning is all the tasks related to deployment and configuration of applications, making them ready to use. In the past this was done with many scripts or manual steps, which was quite unreliable and error-prone. Nowadays tools like Chef or Ansible allow automatic deployment and configuration of applications. This is the proper way to do deployments, as it eliminates human error and speeds up deployment. Once you have your Chef cookbook or Ansible playbook ready, you want to test whether they work properly. Here comes the true value of Vagrant: you can test locally changes which otherwise might break some shared environment and stop the work of many people.

Why does this post exist?

This post has no real practical value. Its purpose is to introduce Vagrant and to serve as a preface to the other posts from the Vagrant series.

Conclusion

Vagrant provides an easy way to define and share different application or environment setups in a single text file called Vagrantfile. Vagrant uses virtualisation engines like VirtualBox, VMWare or Hyper-V and builds on top of them. The most valuable usages I've seen are testing your provisioning scripts and provisioning applications which otherwise would be very hard to set up manually on a local machine. Enjoy reading the post with actual configurations and Vagrantfile examples.

Read more...

Build a Dropwizard project with Gradle

Last Updated on by

Post summary: Code examples how to create Dropwizard project with Gradle.

The code sample here can be found as a buildable and runnable project in the GitHub sample-dropwizard-rest-stub repository, in a separate git branch called gradle.

Project structure

All classes have been thoroughly described in the Build a RESTful stub server with Dropwizard post. In this post I will describe how to make the project buildable with Gradle. In order to make it more understandable I will compare with Maven's pom.xml file elements by XPath.

Gradle

Gradle is an open source build automation system that builds upon the concepts of Apache Ant and Apache Maven and introduces a Groovy-based domain-specific language (DSL) instead of the XML form used by Apache Maven for declaring the project configuration. Gradle is much more powerful and more complex than Maven. There is a significant tendency for Java projects to move towards Gradle, so I've decided to write this post.

Gradle artefacts

In order to make your project work with Gradle you need several files. The list below shows how the files are placed in the project's root folder:

  • gradle/wrapper/gradle-wrapper.jar – the Gradle Wrapper allows you to make builds without installing Gradle on your machine. This is very convenient and makes Gradle usage easy. This JAR manages the automatic download and installation of Gradle on first build.
  • gradle/wrapper/gradle-wrapper.properties – configures which Gradle Wrapper version is to be downloaded and installed on first build.
  • build.gradle – the most important file. This is where you configure your project.
  • gradlew – the Gradle Wrapper executable for Linux.
  • gradlew.bat – the Gradle Wrapper executable for Windows.
  • settings.gradle – project settings. Mainly used in case of multi-module projects.

settings.gradle file

This file is mainly used in case of a multi-module project. In it we currently define only the project name: rootProject.name = 'sample-dropwizard-rest-stub'. This is the same value as in /project/name from the pom.xml file.
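For reference, the whole settings.gradle file consists of just that single line:

rootProject.name = 'sample-dropwizard-rest-stub'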

Constructing build.gradle file

This is the main file where you configure your project. You need to define version (/project/version in pom.xml), group (/project/groupId in pom.xml) and optionally description. Since this is a Java project you need to apply plugin: 'java'. You also need to specify the Java version, 1.8 in this case, with the sourceCompatibility and targetCompatibility values. Next is to set repositories. You can use mavenCentral or add a custom one with the following code, which is not shown in the example below: maven { url 'https://plugins.gradle.org/m2/' }. You need to define dependencies (/project/dependencies/dependency in the pom.xml file) to tell Gradle what libraries this project needs. In the current example it is a compile dependency to io.dropwizard:dropwizard-core:0.8.0 and a testCompile dependency to junit:junit:4.12. This is enough to have a fully functional Dropwizard project with the code examples given in the Build a RESTful stub server with Dropwizard post.

version '1.0-SNAPSHOT'
group 'com.automationrhapsody.reststub'
description 'Sample Dropwizard REST Stub'

apply plugin: 'java'

sourceCompatibility = 1.8
targetCompatibility = 1.8

repositories {
	mavenCentral()
}

dependencies {
	compile 'io.dropwizard:dropwizard-core:0.8.0'

	testCompile 'junit:junit:4.12'
}

The beauty of Dropwizard is the ability to pack everything into a single JAR file and then run that file. In Maven this was done by the maven-shade-plugin; in Gradle the best way to do it is the Shadow JAR plugin. You need to define it via the plugins closure. Now let's configure shadowJar. You can specify archiveName or exclude some artefacts from the packed JAR. Optionally you can enhance your MANIFEST.MF file by adding more details to the manifest closure. A nice thing about Gradle is that you can use Groovy as well as pure Java code. Constructing Build-Time requires importing some Java DateTime classes and using them to produce a human-readable time. The next piece that you need to add to your build.gradle file is:

import java.time.ZoneId
import java.time.ZonedDateTime
import java.time.format.DateTimeFormatter

plugins {
	id 'com.github.johnrengelman.shadow' version '1.2.4'
}

mainClassName = 'com.automationrhapsody.reststub.RestStubApp'

shadowJar {
	mergeServiceFiles()
	exclude 'META-INF/*.DSA', 'META-INF/*.RSA', 'META-INF/*.SF'
	manifest {
		attributes 'Implementation-Title': rootProject.name
		attributes 'Implementation-Version': rootProject.version
		attributes 'Implementation-Vendor-Id': rootProject.group
		attributes 'Build-Time': ZonedDateTime.now(ZoneId.of("UTC"))
				.format(DateTimeFormatter.ISO_ZONED_DATE_TIME)
		attributes 'Built-By': InetAddress.localHost.hostName
		attributes 'Created-By': 'Gradle ' + gradle.gradleVersion
		attributes 'Main-Class': mainClassName
	}
	archiveName 'sample-dropwizard-rest-stub.jar'
}

Once you have built your JAR file with the command gradlew shadowJar, you can run it with the java -jar build/sample-dropwizard-rest-stub.jar server config.yml command. Gradle has another option to run your project for testing purposes. It is done by first applying plugin: 'application'. You need to specify the mainClassName to be run and configure the run args. In order to run your project from Gradle with the gradlew run command you just add:

apply plugin: 'application'

mainClassName = 'com.automationrhapsody.reststub.RestStubApp'

run {
	args = ['server', 'config.yml']
}

build.gradle file

The full build.gradle file content is shown below:

import java.time.ZoneId
import java.time.ZonedDateTime
import java.time.format.DateTimeFormatter

plugins {
	id 'com.github.johnrengelman.shadow' version '1.2.4'
}

version '1.0-SNAPSHOT'
group 'com.automationrhapsody.reststub'
description 'Sample Dropwizard REST Stub'

apply plugin: 'java'
apply plugin: 'application'

sourceCompatibility = 1.8
targetCompatibility = 1.8

repositories {
	mavenCentral()
}

dependencies {
	compile 'io.dropwizard:dropwizard-core:0.8.0'

	testCompile 'junit:junit:4.12'
}

mainClassName = 'com.automationrhapsody.reststub.RestStubApp'

run {
	args = ['server', 'config.yml']
}

shadowJar {
	mergeServiceFiles()
	exclude 'META-INF/*.DSA', 'META-INF/*.RSA', 'META-INF/*.SF'
	manifest {
		attributes 'Implementation-Title': rootProject.name
		attributes 'Implementation-Version': rootProject.version
		attributes 'Implementation-Vendor-Id': rootProject.group
		attributes 'Build-Time': ZonedDateTime.now(ZoneId.of("UTC"))
				.format(DateTimeFormatter.ISO_ZONED_DATE_TIME)
		attributes 'Built-By': InetAddress.localHost.hostName
		attributes 'Created-By': 'Gradle ' + gradle.gradleVersion
		attributes 'Main-Class': mainClassName
	}
	archiveName 'sample-dropwizard-rest-stub.jar'
}

Conclusion

This post is an extension to the Build a RESTful stub server with Dropwizard post, in which I have described how to build a REST service with Dropwizard and Maven. In the current post I have shown how to do the same with Gradle.

Read more...

Manage and automatically select needed WebDriver in Java 8 Selenium project

Last Updated on by

Post summary: Example code showing how to efficiently manage and automatically select the needed local WebDriver using a Java 8 method reference used as a lambda expression.

Code examples in the current post can be found in the GitHub selenium-samples-java/design-patterns repository.

Java 8 features

In this example the lambda expression and method reference features of Java 8 are used. More on Java 8 features can be found in the Java 8 features – Lambda expressions, Interface changes, Stream API, DateTime API post.

Functional interface

Before explaining lambdas it is necessary to understand the idea of a functional interface, as they are leveraged for use with lambda expressions. A functional interface is an interface that has only one abstract method that is to be implemented. A functional interface may or may not have default or static methods (again a new Java 8 feature). Although not mandatory, it is good practice to annotate a functional interface with @FunctionalInterface.

Lambda expressions

There is no such term in Java, but you can think of a lambda expression as an anonymous method. A lambda expression is a piece of code that provides an inline implementation of a functional interface, eliminating the need for anonymous classes. Lambda expressions facilitate functional programming and ease development by reducing the amount of code needed.

Method reference

Sometimes when using a lambda expression all you do is call a method by name. A method reference provides an easy way to call the method, making the code more readable.
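To make the relation between the three concepts concrete, here is a small standalone illustration (not part of the project code) showing the same Supplier implemented as an anonymous class, as a lambda expression and as a method reference:

import java.util.function.Supplier;

public class LambdaVsMethodReference {

	public static void main(String[] args) {
		// Anonymous class implementing the Supplier functional interface
		Supplier<String> anonymous = new Supplier<String>() {
			@Override
			public String get() {
				return "hello".toUpperCase();
			}
		};
		// Lambda expression providing the same implementation inline
		Supplier<String> lambda = () -> "hello".toUpperCase();
		// Method reference: the lambda only calls an existing method by name
		Supplier<String> methodReference = "hello"::toUpperCase;

		// All three print HELLO
		System.out.println(anonymous.get());
		System.out.println(lambda.get());
		System.out.println(methodReference.get());
	}
}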

Managing WebDriver

The proposed solution for managing WebDriver has an enumeration called Browser and a class called WebDriverFactory. Another important thing is that the web drivers should be placed in a folder named webdrivers and named with a special pattern.

Browser enum

The code is shown below:

package com.automationrhapsody.designpatterns;

import java.util.Arrays;
import java.util.function.Supplier;

import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.openqa.selenium.ie.InternetExplorerDriver;

public enum Browser {
	FIREFOX("gecko", FirefoxDriver::new),
	CHROME("chrome", ChromeDriver::new),
	IE("ie", InternetExplorerDriver::new);

	private String name;
	private Supplier<WebDriver> driverSupplier;

	Browser(String name, Supplier<WebDriver> driverSupplier) {
		this.name = name;
		this.driverSupplier = driverSupplier;
	}

	public String getName() {
		return name;
	}

	public WebDriver getDriver() {
		return driverSupplier.get();
	}

	public static Browser fromString(String value) {
		for (Browser browser : values()) {
			if (value != null && value.toLowerCase().equals(browser.getName())) {
				return browser;
			}
		}
		System.out.println("Invalid driver name passed as 'browser' property. "
			+ "One of: " + Arrays.toString(values()) + " is expected.");
		return FIREFOX;
	}
}

The enumeration's constructor has the Supplier functional interface as a parameter. When the constructor is called, the method reference FirefoxDriver::new is passed as a lambda expression whose purpose is to instantiate a new Firefox driver. If only a lambda expression were used it would be: () -> new FirefoxDriver(). Notice that the method reference is much shorter and easier to read. The getDriver() method invokes Supplier's get() method, which is implemented by the lambda expression, so the lambda expression is executed, hence instantiating a new web driver. With this approach the Firefox web driver object is created only when the getDriver() method is called.

WebDriverFactory

Code is:

package com.automationrhapsody.designpatterns;

import java.io.File;

import org.openqa.selenium.WebDriver;

class WebDriverFactory {

	private static final String WEB_DRIVER_FOLDER = "webdrivers";

	public static WebDriver createWebDriver() {
		Browser browser = Browser.fromString(System.getProperty("browser"));
		String arch = System.getProperty("os.arch").contains("64") ? "64" : "32";
		String os = System.getProperty("os.name").toLowerCase().contains("win") 
				? "win.exe" : "linux";
		String driverFileName = browser.getName() + "driver-" + arch + "-" + os;
		String driverFilePath = driversFolder(new File("").getAbsolutePath());
		System.setProperty("webdriver." + browser.getName() + ".driver", 
				driverFilePath + driverFileName);
		return browser.getDriver();
	}

	private static String driversFolder(String path) {
		File file = new File(path);
		for (String item : file.list()) {
			if (WEB_DRIVER_FOLDER.equals(item)) {
				return file.getAbsolutePath() + "/" + WEB_DRIVER_FOLDER + "/";
			}
		}
		return driversFolder(file.getParent());
	}
}

This code recursively searches for a folder named webdrivers in the project. This is done because in a multi-module project, running from the IDE and running from Maven have different root folders, and finding the web drivers would otherwise not be possible from both simultaneously. Once the folder is found, the proper web driver is selected based on OS and architecture. The code reads the browser system property, which can be passed from outside, hence making the selection of the web driver easy to configure. The important part is to have web drivers that follow a special naming convention.

Web drivers naming convention

In order for the code above to work, web drivers should be placed in the webdrivers folder in the project and their names should match the pattern {DRIVER_NAME}-{ARCHITECTURE}-{OS}, e.g. geckodriver-64-win.exe for Windows 64-bit and geckodriver-64-linux for Linux 64-bit.
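A minimal usage sketch, called from the same package since the factory class is package-private (the URL is just an illustration):

WebDriver driver = WebDriverFactory.createWebDriver();
driver.get("https://automationrhapsody.com/");
driver.quit();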

Conclusion

The proposed solution is a very elegant way to manage your web drivers and select the proper one just by passing the -Dbrowser={BROWSER} Java system property.

Read more...

Run Dropwizard application in Docker with templated configuration using environment variables

Last Updated on by

Post summary: How to run Dropwizard application inside a Docker container with configuration template file which later get replaced by environment variables.

In the Build a RESTful stub server with Dropwizard post I have described how to build a Dropwizard application and run it. In the current post I will show how this application can be packaged into a Docker container and run with different configurations, based on different environment variables. The code sample here can be found as a buildable and runnable project in the GitHub sample-dropwizard-rest-stub repository.

Dropwizard templated config

Dropwizard supports replacing variables in the config file with environment variables, if such are defined. The first step is to create the config file with a special notation.

version: ${ENV_VARIABLE_VERSION:-0.0.2}

Dropwizard substitution is based on Apache Commons Lang's StrSubstitutor. ${ENV_VARIABLE_VERSION:-0.0.2} means that Dropwizard will search for an environment variable with the name ENV_VARIABLE_VERSION and replace the placeholder with its value. If no such variable is defined then the default value of 0.0.2 is used. If a default is not needed then just ${ENV_VARIABLE_VERSION} can be used.
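For illustration, more values can be templated in the same way; the variable names below are hypothetical and not part of the sample project:

dbHost: ${ENV_VARIABLE_DB_HOST:-localhost}
dbUser: ${ENV_VARIABLE_DB_USER}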

The next step is to make Dropwizard substitute the config file with environment variables on startup, before the file is read. This is done with the following code:

@Override
public void initialize(Bootstrap<RestStubConfig> bootstrap) {
	bootstrap.setConfigurationSourceProvider(new SubstitutingSourceProvider(
			bootstrap.getConfigurationSourceProvider(), 
			new EnvironmentVariableSubstitutor(false)));
}

EnvironmentVariableSubstitutor is used with false in order to suppress throwing of UndefinedEnvironmentVariableException in case an environment variable is not defined. There is an approach to pack the config.yml file into the JAR file and use it from there; then bootstrap.getConfigurationSourceProvider() should be replaced with path -> Thread.currentThread().getContextClassLoader().getResourceAsStream(path).
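A sketch of that classpath variant, combining the snippet above with the lambda just mentioned:

@Override
public void initialize(Bootstrap<RestStubConfig> bootstrap) {
	// Read config.yml from the classpath instead of the file system,
	// still substituting environment variables before parsing
	bootstrap.setConfigurationSourceProvider(new SubstitutingSourceProvider(
			path -> Thread.currentThread().getContextClassLoader()
					.getResourceAsStream(path),
			new EnvironmentVariableSubstitutor(false)));
}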

Docker

Docker is a platform for building software inside containers which can then be deployed on different environments. A container is like a virtual machine, with the significant difference that it does not bring up a full operating system. In this way the host's resources are optimised: a container consumes as much memory as needed by the application, while a virtual machine consumes memory to run an operating system as well. Containers have resource (CPU, RAM, HDD) isolation, so that the application sees the container as a separate operating system.

Dockerfile

A Dockerfile is a text document that contains all the commands a user could call on the command line to assemble an image. Using docker build, users can create an automated build that executes several command-line instructions in succession. In the given example the Dockerfile looks like this:

FROM openjdk:8u121-jre-alpine
MAINTAINER Automation Rhapsody http://automationrhapsody.com/

WORKDIR /var/dropwizard-rest-stub

ADD target/sample-dropwizard-rest-stub-1.0-SNAPSHOT.jar /var/dropwizard-rest-stub/dropwizard-rest-stub.jar
ADD config-docker.yml /var/dropwizard-rest-stub/config.yml

EXPOSE 9000 9001

ENTRYPOINT ["java", "-jar", "dropwizard-rest-stub.jar", "server", "config.yml"]

The Dockerfile starts with FROM openjdk:8u121-jre-alpine, which instructs Docker to use this already prepared image as a base image for the container. At https://hub.docker.com/ there are numerous images maintained by different organisations or individuals that provide different frameworks and tools. They can save you a lot of work, allowing you to skip configuration and installation of general things. In the given case openjdk:8u121-jre-alpine has OpenJDK JRE 1.8.121 already installed on a minimalist Linux. MAINTAINER documents who the author of this file is. WORKDIR sets the working directory of the container. Then ADD copies the JAR file and config.yml to the container. EXPOSE documents the ports the application listens on. ENTRYPOINT configures the container to be run as an executable; this instruction translates to the java -jar dropwizard-rest-stub.jar server config.yml command. More info can be found in the Dockerfile reference link.

Build Docker container

The Docker image can be built with the following command:

docker build -t dropwizard-rest-stub .

dropwizard-rest-stub is the name (tag) of the image; it is later used to run a container from this image. Note that the dot at the end is mandatory: it specifies the build context, in this case the current directory.

Run Docker container

docker run -it -p 9000:9000 -p 9001:9001 -e ENV_VARIABLE_VERSION=1.1.1 dropwizard-rest-stub

The name of the image, dropwizard-rest-stub, is put at the end. With -p 9000:9000, port 9000 from the guest machine is exposed to the host machine, otherwise the container would not be accessible. With -e ENV_VARIABLE_VERSION=1.1.1 an environment variable is set. If not passed, it is substituted with 0.0.2 in the config.yml file, as described above. In case of many environment variables it is more convenient to put them in a file with KEY=VALUE lines and pass this file with --env-file= instead of specifying each individual variable.
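For example, assuming a file named variables.env with the following content:

ENV_VARIABLE_VERSION=1.1.1

the same run command becomes: docker run -it -p 9000:9000 -p 9001:9001 --env-file=variables.env dropwizard-rest-stub.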

Conclusion

Dropwizard has a very easy mechanism for making configuration dependent on environment variables. This makes Dropwizard applications extremely suitable to be built into Docker containers and run on different environments just by changing the environment variables passed to the container.

Read more...

Coloured log files in Linux

Last Updated on by

Post summary: How to colour your log files for better perception under Linux.

I'm far away from being a Linux guru and honestly I like it that way. Still, in order to be effective as QA you need minimal knowledge of how to do certain things under Linux. This post is devoted to working with logs.

Chaining commands

Linux offers the possibility to combine several commands by chaining them. In the current post I will use just one of them, the PIPE (|) operator. By using it, the output of one command is used as input for another.

I would strongly recommend reading the following post if you are interested in chaining Linux commands: 10 Useful Chaining Operators in Linux with Practical Examples.

Useful commands

The commands below are the ones I use on a daily basis when working with logs. I will show basic usage; if you need more detail on a certain command, you can type man <command>, e.g. man cat, in the Linux console and it will display more information.

grep

It is used to search in text files or print lines matching some pattern. Usage is: grep text filename.log. If text contains spaces it should be wrapped in single quotes (') or double quotes ("). If text contains a single quote, then you have to wrap it in double quotes, and vice versa.

cat

Prints file content on the standard output. Usage is: cat filename.log. You can concatenate several files: cat file1.log file2.log. The drawback of this command shows when you have large files. It combines very well with grep to search the output of the file: cat filename.log | grep text.

zcat

Prints the content of a zipped file. Usage: zcat filename.gz. It combines with grep: zcat filename.gz | grep text.

tail

Prints the last 10 lines of a file. Usage: tail filename.log. The most valuable tail usage is with the -f option: tail -f filename.log. This monitors the file in real time and outputs all new text appended to it. You can also monitor several files: tail -f file1.log file2.log.

less

Used for paging through a file. It shows one page, and with the up and down arrow keys you can scroll through the file. Usage: less filename.txt. In order to exit just type q. What is valuable with this command is that you can type a search term /text and then go to the next occurrence with n and to the previous one with N.

Colours

The commands above are nice, but using colours aids much better perception of the information in the files. In order to use colours, the perl -pe command will be used, chained to colour the output of the commands described above. The syntax is: perl -pe 's/^.*INFO.*$/\e[0;36;40m$&\e[0m/g'. It is a quite complex expression and I will try to explain it in detail.

Match text to be highlighted

^.*INFO.*$ is a regular expression that matches the text to be highlighted. The character ^ means from the beginning of the string or line, the character $ means to the end of the string or line. The group .* matches any sequence of characters. So this regular expression means: inspect every string or line and match those that contain INFO.

Text effects

\e[0;36;40m is the colouring part of the expression. 0 is the value for the ANSI escape code. Possible values for the escape code are shown in the table below. Note that not all of them are supported by all OS.

Code Effect
0 Reset / Normal
1 Bold or increased intensity
2 Faint (decreased intensity)
3 Italic: on
4 Underline: Single
5 Blink: Slow
6 Blink: Rapid
7 Image: Negative
8 Conceal
9 Crossed-out

More codes can be found in ANSI escape code wiki.

Text colour

36 in \e[0;36;40m is the colour code of the text. The colour is different based on the escape code. Possible combinations of escape and colour codes are:

Code Colour Code Colour
0;30 Black 1;30 Dark Grey
0;31 Red 1;31 Light Red
0;32 Green 1;32 Lime
0;33 Dark Yellow 1;33 Yellow
0;34 Blue 1;34 Light Blue
0;35 Purple 1;35 Magenta
0;36 Dark Cyan 1;36 Cyan
0;37 Light Grey 1;37 White

Background colour

40m in \e[0;36;40m is the colour code of the background. Background colours are:

Code Colour
40m Black
41m Red
42m Green
43m Yellow
44m Blue
45m Purple
46m Cyan
47m Light Grey

Sample colour scheme for logs

One possible colour scheme I like is:

cat application.log | perl -pe 's/^.*FATAL.*$/\e[1;37;41m$&\e[0m/g; s/^.*ERROR.*$/\e[1;31;40m$&\e[0m/g; s/^.*WARN.*$/\e[0;33;40m$&\e[0m/g; s/^.*INFO.*$/\e[0;36;40m$&\e[0m/g; s/^.*DEBUG.*$/\e[0;37;40m$&\e[0m/g'

which will produce the following output:

(Image: linux-logs-colour)

Conclusion

Coloured logs are much easier to investigate. Linux provides tooling for better visualisation, so it is good to take advantage of it.

Read more...

Create simple REST API client using Jersey

Last Updated on by

Post summary: Code examples how to create REST API client using Jersey.

In the current post I will give code examples of how to build a REST API client using Jersey. The code shown in the examples below is available in the GitHub java-samples/wiremock repository.

REST API client

A REST API client is needed when you want to consume a given REST API, either for production usage or for testing this API. In the latter case the client does not need to be very sophisticated, since it is used just for testing the API with Java code. In the current post I will show how to create a REST API client for the Persons functionality of the Dropwizard Rest Stub described in the Build a RESTful stub server with Dropwizard post.

Jersey 2 and Jackson

Jersey is a framework that allows easier building of RESTful services. It is one of the most used such frameworks nowadays. Jackson is a JSON parser for Java, also one of the most used ones. The first step is to import the libraries you are going to use:

<dependency>
	<groupId>org.glassfish.jersey.core</groupId>
	<artifactId>jersey-client</artifactId>
	<version>2.25.1</version>
</dependency>
<dependency>
	<groupId>org.glassfish.jersey.media</groupId>
	<artifactId>jersey-media-json-jackson</artifactId>
	<version>2.25.1</version>
</dependency>
<dependency>
	<groupId>com.fasterxml.jackson.core</groupId>
	<artifactId>jackson-databind</artifactId>
	<version>2.8.7</version>
</dependency>
<dependency>
	<groupId>com.fasterxml.jackson.core</groupId>
	<artifactId>jackson-annotations</artifactId>
	<version>2.8.7</version>
</dependency>

It is not mandatory, but it is good practice to create an interface for this client and then do implementations. The idea is that you may have different implementations and switch between them.

import com.automationrhapsody.wiremock.model.Person;

import java.util.List;

public interface PersonRestClient {

	List<Person> getAll();

	Person get(int id);

	String save(Person person);

	String remove();
}

Now you can start with the implementation. In the given example the constructor takes the host, which also includes port and scheme. It then creates a ClientConfig object with specific properties; the full list is shown in the ClientProperties Javadoc. In the example I set up timeouts only. Next a WebTarget object is created, used to query the different API endpoints. It could not be simpler than that:

import javax.ws.rs.client.ClientBuilder;
import javax.ws.rs.client.WebTarget;

import org.glassfish.jersey.client.ClientConfig;
import org.glassfish.jersey.client.ClientProperties;
import org.glassfish.jersey.filter.LoggingFilter;

public class JerseyPersonRestClient implements PersonRestClient {

	private final WebTarget webTarget;

	public JerseyPersonRestClient(String host) {
		ClientConfig clientConfig = new ClientConfig()
				.property(ClientProperties.READ_TIMEOUT, 30000)
				.property(ClientProperties.CONNECT_TIMEOUT, 5000);

		webTarget = ClientBuilder
				.newClient(clientConfig)
				.register(new LoggingFilter())
				.target(host);
	}
}

Once WebTarget is instantiated, it is used to query all the endpoints. I will show the implementation of consuming one GET and one POST endpoint:

@Override
public List<Person> getAll() {
	Person[] persons = webTarget
		.path(ENDPOINT_GET_ALL)
		.request()
		.get()
		.readEntity(Person[].class);
	return Arrays.stream(persons).collect(Collectors.toList());
}

@Override
public String save(Person person) {
	if (person == null) {
		throw new RuntimeException("Person entity should not be null");
	}
	return webTarget
		.path(ENDPOINT_SAVE)
		.request()
		.post(Entity.json(person))
		.readEntity(String.class);
}

Full code can be seen in GitHub repo: JerseyPersonRestClient full code.

jersey-media-json-jackson

This is the bonding between Jersey and Jackson. It should be used, otherwise Jersey's readEntity(Class var1) method throws:

Exception in thread "main" org.glassfish.jersey.message.internal.MessageBodyProviderNotFoundException: MessageBodyReader not found for media type=application/json, type=class …

or

Exception in thread "main" org.glassfish.jersey.message.internal.MessageBodyProviderNotFoundException: MessageBodyWriter not found for media type=application/json, type=class …
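Normally having jersey-media-json-jackson on the classpath is enough, as Jersey auto-discovers it. If auto-discovery is disabled for some reason, a hedged sketch of registering the JSON support explicitly on the client (JacksonFeature ships with the jersey-media-json-jackson artifact) is:

import org.glassfish.jersey.jackson.JacksonFeature;

// register Jackson JSON support explicitly when building the client
webTarget = ClientBuilder
		.newClient(clientConfig)
		.register(JacksonFeature.class)
		.register(new LoggingFilter())
		.target(host);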

Client builder

In the code there is a class called PersonRestClientBuilder. In the current case it does not do many things, but in reality it might turn out that a lot of configuration or input is needed to build a REST API client instance. This is where such a builder becomes very useful:

public class PersonRestClientBuilder {

	private String host;

	public PersonRestClientBuilder setHost(String host) {
		this.host = host;
		return this;
	}

	public PersonRestClient build() {
		return new JerseyPersonRestClient(host);
	}
}
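A hypothetical usage of the builder, assuming the stub server runs on localhost port 9000:

PersonRestClient client = new PersonRestClientBuilder()
		.setHost("http://localhost:9000")
		.build();
List<Person> allPersons = client.getAll();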

Unit testing

It is common and best practice that each piece of code is covered by unit tests. In the Mock JUnit tests with Mockito example post I've described how Mockito can be used. The problem in the current example is that if we use Mockito we have to mock the readEntity() method to return some response objects. This is way too much mocking and will not do adequate testing; actually it does no testing at all. We want to test that our REST API client successfully communicates over the wire. In order to do proper testing we need to use a library called WireMock. In a subsequent post I will add more details on how to use it.

Conclusion

Consuming or testing a REST API requires building a client. Jersey is a perfect candidate to be used as the underlying framework. WireMock can be used for unit testing the REST API client you have created.

Read more...

Implement secure API authentication over HTTP with Dropwizard

Last Updated on by

Post summary: Reference implementation of the authentication mechanism suggested in the How to implement secure REST API authentication over HTTP post.

API authentication mechanism

The suggested authentication mechanism consists of the following steps:

  • A secret key that is known only by the API consumer and the API provider is needed, along with an API key.
  • The secret key is used to one-way hash a token which is sent to the server along with the API key in the API call.
  • The token consists of: API key + secret key + current time in seconds, which then gets hashed, preferably with the SHA-256 algorithm.
  • The server recreates all the tokens locally for every second for some time in the future, preferably not too long – 30~120 seconds.
  • The server recreates all the tokens for 30~120 seconds in the past, to take into account the time needed for the request to reach the server.
  • The server compares each of the tokens with the received one.
  • If there is a match the consumer is authenticated and a response is returned.

Dropwizard implementation

The Dropwizard stub introduced in the Build a RESTful stub server with Dropwizard post will be used to create the authentication. The full example can be found in the GitHub sample-dropwizard-rest-stub repository. The implementation consists of the following steps:

  • Implement the javax.ws.rs.container.ContainerRequestFilter interface. The implementation will inspect every request and verify authentication.
  • Create a custom annotation.
  • Annotate the request filter and the Dropwizard resource (API service) on which authentication should be applied.
  • Register the request filter implementation class into the Dropwizard Jersey environment.

Create custom annotation

Starting with the easiest step: creating a custom annotation is pretty easy. It can be applied to a class (ElementType.TYPE) or to a method (ElementType.METHOD). It should live as long as the program runs (RetentionPolicy.RUNTIME). In order to make it possible for the annotated request filter to be applied to specific resources only, the @NameBinding annotation is a must in Jersey; if not specified, the request filter will apply to all resources. The needed annotation is:

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

import javax.ws.rs.NameBinding;

@Target({ElementType.TYPE, ElementType.METHOD})
@Retention(value = RetentionPolicy.RUNTIME)
@NameBinding
public @interface Authenticator {
}

ContainerRequestFilter implementation

The container request filter is applied to incoming requests. If used with the @NameBinding annotation it is applied only where needed, otherwise it is applied globally. It is mandatory to override the filter() method:

import com.automationrhapsody.reststub.persistence.AuthDB;

import java.io.IOException;
import java.util.List;

import javax.ws.rs.container.ContainerRequestContext;
import javax.ws.rs.container.ContainerRequestFilter;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;
import javax.ws.rs.core.UriInfo;

import org.apache.commons.codec.digest.DigestUtils;
import org.apache.commons.collections.CollectionUtils;
import org.apache.commons.lang3.StringUtils;

@Authenticator
public class AuthenticateFilter implements ContainerRequestFilter {

	private static final String PARAM_API_KEY = "apiKey";
	private static final String PARAM_TOKEN = "token";
	private static final long SECONDS_IN_MILLISECOND = 1000L;
	private static final int TTL_SECONDS = 60;

	@Override
	public void filter(ContainerRequestContext context) throws IOException {
		final String apiKey = extractParam(context, PARAM_API_KEY);
		if (StringUtils.isEmpty(apiKey)) {
			context.abortWith(responseMissingParameter(PARAM_API_KEY));
		}

		final String token = extractParam(context, PARAM_TOKEN);
		if (StringUtils.isEmpty(token)) {
			context.abortWith(responseMissingParameter(PARAM_TOKEN));
		}

		if (!authenticate(apiKey, token)) {
			context.abortWith(responseUnauthorized());
		}
	}
}

As seen above, two GET parameters are mandatory in the request: "apiKey" and "token". Those are first extracted and verified. If any of them is missing, a BAD_REQUEST (HTTP status code 400) response is returned with an error message. The methods that extract the params and build the error response are:

private String extractParam(ContainerRequestContext context, String param) {
	final UriInfo uriInfo = context.getUriInfo();
	final List<String> user = uriInfo.getQueryParameters().get(param);
	return CollectionUtils.isEmpty(user) ? null : String.valueOf(user.get(0));
}

private Response responseMissingParameter(String name) {
	return Response.status(Response.Status.BAD_REQUEST)
		.type(MediaType.TEXT_PLAIN_TYPE)
		.entity("Parameter '" + name + "' is required.")
		.build();
}

If both are present then the code tries to authenticate the call by rebuilding all the hashes for 60 seconds in the past, because the request cannot arrive instantly, it takes some time; if the network is slower this time can be increased. It also rebuilds all hashes for 60 seconds in the future; this is the token time to live. The server has access to the secret key for any given API key. In the example above they are stored in a fake DB provider and obtained by AuthDB.getSecretKey(apiKey):

private boolean authenticate(String apiKey, String token) {
	final String secretKey = AuthDB.getSecretKey(apiKey);

	// No need to calculate digest in case of wrong apiKey
	if (StringUtils.isEmpty(secretKey)) {
		return false;
	}

	final long nowSec = System.currentTimeMillis() / SECONDS_IN_MILLISECOND;
	long startTime = nowSec - TTL_SECONDS;
	long endTime = nowSec + TTL_SECONDS;
	for (; startTime < endTime; startTime++) {
		final String toHash = apiKey + secretKey + startTime;
		final String sha256 = DigestUtils.sha256Hex(toHash);
		if (sha256.equals(token)) {
			return true;
		}
	}

	return false;
}

As seen above, the server uses the SHA-256 cryptographic algorithm. It is the best solution in terms of speed and security. In the MD5, SHA-1, SHA-256 and SHA-512 speed performance post a comparison between MD5, SHA-1, SHA-256 and SHA-512 is made. If the authentication cannot be verified then an UNAUTHORIZED (HTTP status code 401) response is returned:

private Response responseUnauthorized() {
	return Response.status(Response.Status.UNAUTHORIZED)
		.type(MediaType.TEXT_PLAIN_TYPE)
		.entity("Unauthorized")
		.build();
}

This was the hardest part. Now this filter has to be registered with Jersey and applied to the needed resources (services). See more on the ContainerRequestFilter interface and the @NameBinding annotation on the Jersey filters and interceptors page.

Apply authentication filter on a resource

Indicating that a given resource should be checked for authentication is done with the custom @Authenticator annotation created previously. If needed just for a specific API call, it can also be applied at method level:

import com.automationrhapsody.reststub.data.Book;
import com.automationrhapsody.reststub.filters.Authenticator;
import com.automationrhapsody.reststub.persistence.BookDB;
import com.codahale.metrics.annotation.Timed;

import java.util.List;

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

@Authenticator
@Path("/secure/books")
public class BooksSecureService {

	@GET
	@Timed
	@Produces(MediaType.APPLICATION_JSON)
	public List<Book> getBooks() {
		return BookDB.getAll();
	}
}

Register in Dropwizard Jersey

The last step is to register the request filter and the resource with Dropwizard's Jersey:

@Override
public void run(RestStubConfig config, Environment env) {

	env.jersey().register(BooksSecureService.class);
	env.jersey().register(AuthenticateFilter.class);

}

Conclusion

This is very easy to implement in Dropwizard and a relatively secure way to provide API authentication over the HTTP protocol. For mission critical applications a more strict consideration and review of this authentication mechanism is definitely needed.

Read more...

How to implement secure REST API authentication over HTTP

Last Updated on by

Post summary: How to implement secure API authentication even over HTTP.

Important: this post is not a complete and expert guide on API security. It is mainly done to test the Postman pre-request hook that is described in the Introduction to Postman with examples post. It does not go into all the details about API security, SSL certificates, encrypting the data, etc. It gives basic information on how you can protect your API's consumers against their network traffic being sniffed and their credentials, API keys, session keys, etc. being stolen.

Authentication vs. Authorisation

Authentication is defined as "who you are". It deals with usernames and passwords. Authorisation is defined as "what you can do". It deals with permissions. Before dealing with permissions an application must know the user, so authorisation comes after authentication.

Basic Authentication

As the name states, it is very basic. The idea is to send a Base64-encoded username and password in the header of the request in the following format:

Authorization: Basic dXNlcjpwYXNz

The server decodes the username and password and uses them to authenticate and authorise the user. The problem with Basic authentication is that it must be used only over HTTPS, where the network traffic is encrypted; over HTTP the request can easily be sniffed. Base64 is reversible and there are numerous tools on the web where you can put dXNlcjpwYXNz and they will return user:pass as plain text.
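A small standalone Java illustration of how trivially reversible Base64 is:

import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64Demo {

	public static void main(String[] args) {
		// Base64 is an encoding, not encryption
		String encoded = Base64.getEncoder()
				.encodeToString("user:pass".getBytes(StandardCharsets.UTF_8));
		System.out.println(encoded); // dXNlcjpwYXNz

		String decoded = new String(Base64.getDecoder().decode(encoded),
				StandardCharsets.UTF_8);
		System.out.println(decoded); // user:pass
	}
}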

Nota bene: Never use this one without HTTPS.

OAuth and OAuth 2.0

OAuth is an authorisation protocol. It is intended mainly for the web, but can be used for API authorisation. The idea is that authentication and authorisation are done by a third party like Microsoft, Google, Facebook, Twitter, etc. This is easy for the API as it does not have to deal with user data. The customer logs in to the third party and an access token is issued. The token has some validity which is not too long, but not too short, usually 1 or 2 days. The API can obtain user details from the third party by this token. The user authenticates to the API with this access token by sending it in the request header:

Authorization: Bearer 66408bd9-2bc0-40c3-9823-e9bec390532a

The problem with OAuth is that it also must be used over HTTPS. Over HTTP the traffic can be sniffed and the token stolen. Although the token has some expiry time, it is long enough for a hacker to use the API on your behalf.

Nota bene: Never use this one without HTTPS.

API keys

API keys have become the standard when consuming an API. An API key is some random hash which uniquely identifies the consumer. API keys have numerous benefits over the username/password mechanism. Still, in case of HTTP, the network traffic can be sniffed and the API key stolen.

HTTPS

Reading the post so far, it turns out there is not a single API authentication protocol that is secure if not used over HTTPS. In the current post a solution is proposed. It is commonly used in public APIs and may exist as a standard whose name I'm just not aware of; it provides secure API authentication even over HTTP.

Implement API security over HTTP

In short, in order to have security over HTTP, the following steps should be done:

  • A secret key that is known only by the API consumer and the API provider is needed, along with an API key.
  • The secret key is used to one-way hash a token which is sent to the server along with the API key in the API call.
  • The token consists of: API key + secret key + current time in seconds, which then gets hashed, preferably with the SHA-256 algorithm (see the sketch after this list).
  • The server recreates all the tokens locally for every second for some time in the future, preferably not too long – 30~120 seconds.
  • The server recreates all the tokens for 30~120 seconds in the past, to take into account the time needed for the request to reach the server.
  • The server compares each of the tokens with the received one.
  • If there is a match the consumer is authenticated and a response is returned.
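A minimal sketch of the client side token generation, using Apache Commons Codec's DigestUtils, the same helper used in the reference implementation; apiKey and secretKey here are hypothetical credentials:

import org.apache.commons.codec.digest.DigestUtils;

public final class TokenGenerator {

	public static String generateToken(String apiKey, String secretKey) {
		// Current time in seconds acts as both salt and expiry mechanism
		long nowInSeconds = System.currentTimeMillis() / 1000L;
		return DigestUtils.sha256Hex(apiKey + secretKey + nowInSeconds);
	}
}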

Cryptographic hash algorithms

The most used hash algorithms nowadays are: MD5, SHA-1, SHA-256, SHA-512. MD5 and SHA-1 are too weak and are not recommended. SHA-512 takes more time to compute the hashes. SHA-256 is the most appropriate solution in terms of security and speed. In the MD5, SHA-1, SHA-256 and SHA-512 speed performance post all 4 algorithms have been tested and compared with Apache's Commons Codec implementation.

Hash with Salt

The time stamp in the hashed token is used as a so-called salt, random data that is used to differentiate the hashed data against dictionary attacks. If just API key + secret key are hashed, then the hash will always be one and the same; an intruder can take the hash and just use it. Using a time stamp makes the hash different every second. Another function of the time stamp is to set an expiration time on the token, so even if stolen it cannot be used for a long period of time.

Conclusion

There are already established standards to secure an API, but all of them are effective only over HTTPS. In the current post a proposal is given for secure API authentication which is very simple and relatively safe even over HTTP. The con of this method is that the server has to recalculate the hash many times, which under massive load would require some caching. In the Implement secure API authentication over HTTP with Dropwizard post there is a reference implementation of the proposed solution with Dropwizard.

Read more...

Mock JUnit tests with Mockito example

Last Updated on by

Post summary: Why mocking is needed in unit testing and how to do it with Mockito.

Unit testing

By definition unit testing is a process in which the smallest testable parts of an application, called units, are individually and independently tested for proper operation. The smallest testable unit in Java is a method. Public methods are the only ones exposed to the outside world, so only they are subject to unit testing.

Mocking

Unit tests focus on a particular piece of code that needs to be exercised. In most cases this code relies on external dependencies. Those dependencies have to be controlled, so that only the code under test is exercised. Removing dependencies is done with test doubles. Test doubles are objects that look and behave like their release-intended counterparts, but are actually simplified versions that reduce complexity and facilitate testing. Test doubles are fakes, stubs and mocks.

Mockito

Mockito is the most famous mocking framework for Java. It provides all mocking features needed for proper unit testing, except mocking of static methods. Static methods can be mocked with PowerMock, a wrapper of Mockito that provides the same API plus static method mocking and other features. In the PowerMock examples and why better not to use them post I have shown how to use PowerMock and what features it has.

Example class for unit test

The code shown in the examples below is available in the GitHub java-samples/junit repository. We are going to unit test a class called Locator that internally uses another class, LocatorService:

public class Locator {

	private final LocatorService locatorService;

	public Locator(LocatorService locatorService) {
		this.locatorService = locatorService;
	}

	public Point locate(int x, int y) {
		if (x < 0 || y < 0) {
			return new Point(Math.abs(x), Math.abs(y));
		} else {
			return locatorService.geoLocate(new Point(x, y));
		}
	}
}

The example above is pretty simple. If we pass a point with some negative coordinates, the locate() method returns a point with positive coordinates. If the coordinates are positive then a search via LocatorService is done. This class represents some external API that our code is calling. Since there is no control over this API and its internal structure is not known, it should be mocked in the unit tests. As stated above, unit tests are focused on a specific piece of code, a unit.

Initialising a mock

As described in Mockito's documentation, a way to mock some object is: List mockedList = mock(List.class); Another way, which is used in the current examples, is to annotate the field that is going to be mocked with @Mock and annotate the JUnit test class with @RunWith(MockitoJUnitRunner.class). In this way the Mockito runner does the initialisation behind the scenes:

@RunWith(MockitoJUnitRunner.class)
public class LocatorTest {

	@Mock
	private LocatorService locatorServiceMock;
}

Control mock’s behaviour

The whole idea of having a mock is to be able to control its behaviour. If the mock is called it should respond in a predictable manner. This is done with the when() method:

when(locatorServiceMock.geoLocate(any(Point.class)))
	.thenReturn(new Point(11, 11)); 

When the mock's geoLocate() method is called with any Point object it always returns a new Point with coordinates X=11 and Y=11. If this is not enough, more elaborate scenarios can be used:

when(locatorServiceMock.geoLocate(new Point(5, 5))).thenReturn(new Point(50, 50));
when(locatorServiceMock.geoLocate(new Point(1, 1))).thenReturn(new Point(11, 11));

If the locator service mock is called with a point with coordinates (5, 5) then a new point with coordinates (50, 50) is returned. If the mock is called with a point with coordinates (1, 1) then a point with (11, 11) is returned. In any other case null is returned by default.

Nota bene: in order to work properly, the object used to call the mocked method (Point in the current example) should have a properly implemented equals() method, otherwise java.lang.Object's equals() method is used, which just compares the references. The examples above will not work, as Point doesn't have equals() properly overridden.
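A minimal sketch of what such an equals() override for Point could look like (in a real implementation hashCode() should be overridden as well):

@Override
public boolean equals(Object other) {
	// Same reference means same object
	if (this == other) {
		return true;
	}
	// Also covers null, as null is not an instance of Point
	if (!(other instanceof Point)) {
		return false;
	}
	Point point = (Point) other;
	return getX() == point.getX() && getY() == point.getY();
}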

Depending on the tests that have to be conducted, more precise control over the mock's response could be needed. This is done with the mock's thenAnswer() method:

when(locatorServiceMock.geoLocate(any(Point.class)))
	.thenAnswer(new Answer<Point>() {
		@Override
		public Point answer(InvocationOnMock invocationOnMock) throws Throwable {
			Object[] args = invocationOnMock.getArguments();
			Point caller = (Point) args[0];
			
			if (caller.getX() == 5 && caller.getY() == 5) {
				return new Point(50, 50);
			} else if (caller.getX() == 1 && caller.getY() == 1) {
				return new Point(11, 11);
			} else {
				return null;
			}
		}
	});

The call to invocationOnMock.getArguments() returns an array with the arguments the mock's geoLocate() method was called with. In the current example it is only one argument of type Point, so it is cast and saved to a new Point object in the caller variable. If the coordinates are (5, 5) then a new point with coordinates (50, 50) is returned. If the coordinates on input are (1, 1) then a new point (11, 11) is returned. In all other cases null is returned.

Verify mock was interacted with

In order to verify the execution path is correct, Mockito provides a way to check whether a certain method on the mock has been called and how many times. This is done with the verify() method. To confirm no more methods are called on this specific mock instance, verifyNoMoreInteractions() is used:

verify(locatorServiceMock, times(1)).geoLocate(new Point(1, 1));

verifyNoMoreInteractions(locatorServiceMock);

The example above verifies that the mock's geoLocate() method was called with a specific point with coordinates (1, 1). If it is not important which object is passed to the method then any(Point.class) can be used.

Nota bene: the example above will not work, as there is no equals() method implemented in the Point class, so Mockito uses java.lang.Object's equals() method by default, which compares only the references. The Point class is intentionally left without an equals() method to demonstrate how such situations can be solved. How to solve this obstacle is shown in the Assert Mockito mock method arguments if no equals() method implemented post.

Putting it all together

All the snippets above are put together in one simple unit test that covers all the possible paths for Locator's locate() method, but obviously not all the test conditions:

import org.junit.Before;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.mockito.Mock;
import org.mockito.runners.MockitoJUnitRunner;

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;
import static org.mockito.Matchers.any;
import static org.mockito.Mockito.when;

@RunWith(MockitoJUnitRunner.class)
public class LocatorTest {

	private static final Point TEST_POINT = new Point(11, 11);

	@Mock
	private LocatorService locatorServiceMock;

	private Locator locatorUnderTest;

	@Before
	public void setUp() {
		when(locatorServiceMock.geoLocate(any(Point.class)))
			.thenReturn(TEST_POINT);

		locatorUnderTest = new Locator(locatorServiceMock);
	}

	@Test
	public void testLocateWithServiceResult() {
		assertEquals(TEST_POINT, locatorUnderTest.locate(1, 1));
	}

	@Test
	public void testLocateLocalResult() {
		Point expected = new Point(1, 1);
		assertTrue(arePointsEqual(expected, locatorUnderTest.locate(-1, -1)));
	}

	private boolean arePointsEqual(Point p1, Point p2) {
		return p1.getX() == p2.getX()
			&& p1.getY() == p2.getY();
	}
}

When locatorServiceMock is called with any Point then TEST_POINT is returned. Despite Point having no equals() method defined, assertEquals() in testLocateWithServiceResult() passes because the code refers to one and the same object. The helper method arePointsEqual() is needed in testLocateLocalResult() though. The code coverage report in IntelliJ IDEA is:

(Image: Mockito-JUnit-results)

Optimise

The next step is to improve test coverage by adding more unit tests. Copy/paste is not an option, so in the Data driven testing with JUnit parameterized tests post I have described how to make data-driven tests in JUnit.

Conclusion

Mocking is mandatory when developing unit tests. Mockito is a convenient mocking library for Java. It is possible to control what the mock returns, whether called with any value or with a specific value. Mockito also allows verification of which of the mock's methods has been called and how many times.

Read more...

Code coverage of manual or automated tests with OpenCover for .NET applications

Last Updated on by

Post summary: Examples of how to do code coverage of manual or automated functional tests with the OpenCover tool for .NET applications.

Code coverage

This topic is how to do code coverage on .NET applications with OpenCover. Theory on what code coverage is and why it is needed can be found in the What about code coverage post.

OpenCover

OpenCover is an open source code coverage tool for .NET 2.0 and above applications, for Windows only. With OpenCover, instrumentation of the code is not needed. The application is started through OpenCover and it collects the coverage results. What is mandatory though is a PDB file along with the executables and assemblies, so the application under test should be built in Debug mode. If the PDB file is not found then no coverage data will be gathered.

The latest version can be downloaded from "Releases" in GitHub. There is an installer and a zip archive. If the installer is used, by default OpenCover is installed in C:\Users\{USER_ACCOUNT}\AppData\Local\Apps\OpenCover. If you want to change this, click the "Advanced" button during installation and then select "Install for all users on this machine".

ReportGenerator

OpenCover produces results in a raw format, which is not for humans. ReportGenerator is used to convert XML reports generated by OpenCover, PartCover, Visual Studio or NCover into human readable reports in various formats. A usage guide can be found on its home page; the most useful options are:

  • -reports:<report>[;<report>] – coverage reports that should be parsed, semicolon separated, wildcards are allowed
  • -targetdir:<target directory> – directory where the generated report should be saved
  • -sourcedirs:<directory>[;<directory>] – directories which contain the corresponding source code, optional, semicolon separated
  • -classfilters:<(+|-)filter>[;<(+|-)filter>][;<(+|-)filter>] – list of classes that should be included or excluded in the report, optional, wildcards are allowed.

How to use OpenCover

A usage guide can be found in the OpenCover usage reference online. Also, along with the OpenCover installation there is a Usage.rtf file which holds all the information about the tool. The most useful options are listed below:

  • -target:<path> – path to the application executable file or name of the service
  • -filter:<filters> – list of filters to apply to selectively include or exclude assemblies and classes from the coverage results
  • -output:<path> – path to the output XML file; if empty then results.xml will be created in the current directory
  • -register[:user] – register and de-register the code coverage profiler
  • -targetargs:<arguments> – arguments to be passed to the target process
  • -targetdir:<path> – path to the target directory or alternative path to the PDB files

Hands on examples on manual code coverage

In order to try it, you need to check out the code samples from the GitHub SampleApp or SampleAppPlus repository to C:\. Telerik Testing Framework needs to be installed, as it copies lots of assemblies into the GAC. OpenCover and ReportGenerator should also be installed to C:\.

With the current setup, the command to start SampleAppPlus.exe through OpenCover is:

C:\OpenCover\OpenCover.Console.exe -target:"C:\SampleAppPlus\SampleAppPlus\bin\Debug\SampleAppPlus.exe" -output:C:\SampleAppPlus\CoverageReports\SampleAppPlus.results.xml -register:user

Now the application is started and manual functional tests can be executed. Once the application is stopped, the coverage results are saved in the SampleAppPlus.results.xml file.

Hands on examples on automated code coverage

Since automation is the future of QA and we have already created automated tests for both SampleApp and SampleAppPlus, we want to measure how our tests perform on code coverage. Automated tests run the application, attach to it and manipulate it. So it seems natural just to use the command from the manual example and start the application with it. It will not work though, since that command starts and returns the OpenCover process, not the underlying SampleAppPlus one. Extra code is needed in the SampleAppPlus.Tests.Framework\Tests\BaseTest.cs file. Instead of:

Application appWhite = Application.Launch(applicationPath);

following code has to be added:

Process sampleAppPlus = StartProcess();
Application appWhite = Application.Attach(sampleAppPlus);

where StartProcess() method is:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;

private Process StartProcess()
{
	string appName = "SampleAppPlus";
	string openCover = @"C:\OpenCover\OpenCover.Console.exe";
	long timeStamp = (long)(DateTime.UtcNow 
						- new DateTime(1970, 1, 1)).TotalMilliseconds;

	List<string> arguments = new List<string>();
	arguments.Add(@"-target:C:\SampleAppPlus\SampleAppPlus\bin\Debug\"
					+ appName + ".exe");
	arguments.Add(@"-output:C:\SampleAppPlus\CoverageReports\SampelAppPlus."
					+ timeStamp + ".xml");
	arguments.Add("-register:user");

	Process process = new Process();
	process.StartInfo.FileName = openCover;
	process.StartInfo.Arguments = string.Join(" ", arguments);
	process.Start();

	Thread.Sleep(5000);
	return Process.GetProcesses().First(proc => appName == proc.ProcessName);
}

The OpenCover arguments "-target", "-output" and "-register" are used. Note that the -output file name is always different, as the current Unix time is added to it; this prevents overwriting of the file. The idea is to run OpenCover, which will start SampleAppPlus. Wait 5 seconds to ensure the application is up, then get all Windows processes with Process.GetProcesses(), iterate them, and find and return the needed SampleAppPlus process, which TestStack.White and Telerik Testing Framework will attach to.

Create report

Once the tests are run and the OpenCover XML report files are generated, it is time to generate the human readable reports. This is done with ReportGenerator using the command:

C:\ReportGenerator\bin\ReportGenerator.exe -reports:C:\SampleAppPlus\CoverageReports\SampleAppPlus.*.xml -targetdir:C:\SampleAppPlus\CoverageReports\html -sourcedirs:C:\SampleAppPlus\ -classfilters:-SampleAppPlus.Properties.*

(Image: OpenCover-report)

Inspect report

The coverage report file from the examples above can be found in the OpenCover code coverage report. Inspecting the report, there is missed code in the SampleAppPlus.MainWindow class: the else branch of the if ((bool)openFileDialog.ShowDialog()) condition is not covered. The documentation of this method states that it returns false if the Cancel button of the dialog window is clicked. In order to increase the coverage, a test that clicks the Cancel button and verifies no upload is done should be added to the test suite.

Code coverage for IIS web application or Windows service

The examples above show how to run a normal Windows application. This is valid for both UI and console applications, as they are started with a single EXE file. OpenCover can also work with IIS web applications, Silverlight applications and Windows service applications. More details can be found in the documentation accompanying the OpenCover installation.

Although it is possible to connect to a running service, I have done code coverage of a Windows service in the manner suggested in the documentation – run the service as a console application. Since debugging a running Windows service is not a straightforward task, developers have most likely already implemented a switch to start the service as a console application. If not, you will ease their lives by asking them to do so.

Conclusion

OpenCover is practically the only open-source code coverage tool for .NET applications. It is really powerful and easy to use. No code instrumentation is needed – just build the code in Debug mode to have the PDB files and run the application through OpenCover. It can also be used for measuring the code coverage of unit tests.


Introduction to Cucumber and BDD with examples

Last Updated on by

Post summary: Code examples and introduction to Cucumber, a framework that runs automated tests written in behaviour driven development (BDD) style.

Cucumber examples can be found in the selenium-samples-java/cucumber-parallel GitHub repository.

Before going into details about how to create tests with Cucumber, it is good to do some context introduction on why this framework was created and why it is getting more popular.

Test driven development (TDD)

TDD (test driven development) is a software development process in which developers first write the unit tests for a feature or module based on the requirements, and then implement the feature or module itself. In short, the TDD cycle is “red > green > refactor” (a minimal Java illustration is shown after the list):

  • Red – phase where tests are implemented according to requirements, but they still fail
  • Green – phase where module or feature is implemented and tests pass
  • Refactor – phase where working code is made more readable and well structured
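
Below is a minimal Java sketch of the cycle, assuming a hypothetical Calculator class that does not exist yet at the moment the test is written:

import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class CalculatorTest {

	// Red phase: this test is written first and fails until add() is implemented
	@Test
	public void test_add() {
		assertEquals(5, new Calculator().add(2, 3));
	}
}

// Green phase: the simplest implementation that makes the test pass
class Calculator {
	public int add(int a, int b) {
		return a + b;
	}
}

Once the test is green, the code can be refactored with confidence, as the test guards the behaviour.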

Behaviour driven development (BDD)

BDD emerged from and extends TDD. Instead of writing unit tests from a specification, why not make the specification a test itself? The main idea is that business analysts, project managers, users or anyone without technical knowledge, but with sufficient business knowledge, can define tests.

Gherkin

BDD ideas sound very nice, but actually they are not so easy to put into practice. In order to make documentation sufficient for testing, it should follow some rules. This is where Gherkin comes into place. It is a Business Readable, Domain Specific Language created especially to describe behaviour without defining how to implement it. Gherkin serves two purposes: it is your project’s documentation and your automated tests.

Cucumber

Cucumber is not a testing tool, it is a BDD tool for collaboration between all members of the team. So if you are using Cucumber just for automated testing, you can do better. Test-wise, Cucumber is the framework that understands Gherkin and runs the automated tests. It sounds like a fairy tale – you get your documentation described in Gherkin and the tests just run. Actually it is not that simple: each step from the documentation should have underlying test code that manipulates the application and should have test conditions. The whole process is described in detail with code samples below.

Setup Maven project

The proper way to do things is by using some build automation tool. The most used ones are Gradle and Maven. Maven is simpler to use and works fine for normal workflows. Gradle can do anything you want, but it comes at the price of complexity, since you have to write Groovy code for custom stuff. In order to use Cucumber-JVM (the Cucumber implementation for the most popular JVM languages: Java, Groovy, Scala, etc.) in Java 8, the POM looks like this:

<properties>
	<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
	<cucumber.version>1.2.4</cucumber.version>
	<selenium.version>2.48.2</selenium.version>
</properties>
<dependencies>
	<dependency>
		<groupId>info.cukes</groupId>
		<artifactId>cucumber-java8</artifactId>
		<version>${cucumber.version}</version>
		<scope>test</scope>
	</dependency>
	<dependency>
		<groupId>info.cukes</groupId>
		<artifactId>cucumber-junit</artifactId>
		<version>${cucumber.version}</version>
		<scope>test</scope>
	</dependency>
	<dependency>
		<groupId>org.seleniumhq.selenium</groupId>
		<artifactId>selenium-server</artifactId>
		<version>${selenium.version}</version>
		<scope>test</scope>
	</dependency>
</dependencies>

In order to build the project with Java 8, this should be specified explicitly in the POM by using the maven-compiler-plugin:

<build>
	<plugins>
		<plugin>
			<groupId>org.apache.maven.plugins</groupId>
			<artifactId>maven-compiler-plugin</artifactId>
			<version>3.3</version>
			<configuration>
				<source>1.8</source>
				<target>1.8</target>
			</configuration>
		</plugin>
	</plugins>
</build>

Once the project is set up, the real Cucumber usage can start.

Feature files

Everything starts with a plain text file with the .feature extension, written in the Gherkin language, that describes only one product feature. It may contain from one to many scenarios. Below is a file with a simple scenario for searching in Wikipedia.

Feature: search Wikipedia

  Scenario: direct search article
    Given Enter search term 'Cucumber'
    When Do search
    Then Single result is shown for 'Cucumber'

In Gherkin, each line that isn’t blank has to start with a Gherkin keyword, followed by any text you like. The main keywords are:

  • Feature
  • Scenario
  • Given, When, Then, And, But (Steps)
  • Background
  • Scenario Outline
  • Examples

More information can be found on the Cucumber reference page.

Don’t repeat yourself

In some features there might be one and the same Given steps before each scenario. In order to avoid copy/paste, it is better to define those steps as a feature prerequisite with the Background keyword.

Feature: search Wikipedia

  Background:
    Given Open http://en.wikipedia.org
    And Do login

  Scenario: direct search article
    Given Enter search term 'Cucumber'
    When Do search
    Then Single result is shown for 'Cucumber'

Data driven testing

Cucumber makes it very easy to handle cases of different business scenarios with different input data and different results based on that input data. The scenario is defined with Scenario Outline, and variables are written as placeholders in angle brackets, e.g. <searchTerm>. Then data is fed to the scenario with an Examples table, where the variables are given concrete values. The example below shows a scenario where a search is done for two keywords and the expected result for each is defined. It is very easy to just add more keywords and expected results, and each Examples row is actually treated by Cucumber reporting as a separate scenario.

Feature:

  Scenario Outline:
    Given Enter search term '<searchTerm>'
    When Do search
    Then Multiple results are shown for '<result>'

    Examples:
      | searchTerm | result                |
      | mercury    | Mercury may refer to: |
      | max        | Max may refer to:     |

Organise features and scenarios

Tags can be used in order to group feature files. A tag is applied as an annotation above the Feature or Scenario keyword in the .feature file, as illustrated below. Having correctly tagged the different scenarios or features, runners can be created per each tag and run when needed. See more about tags on the Cucumber tags page.
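
For illustration, a tagged feature file could look like the one below – the @smoke and @regression tag names are made up for this example:

@smoke
Feature: search Wikipedia

  @regression
  Scenario: direct search article
    Given Enter search term 'Cucumber'
    When Do search
    Then Single result is shown for 'Cucumber'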

Runners

The next step in the process is to add runners. Cucumber supports running tests with JUnit and TestNG. In the current post JUnit is used. It has been imported in the POM project file with cucumber-junit. In order to run a test with JUnit, a special runner class should be created. The very basic form of this file is an empty class with the @RunWith(Cucumber.class) annotation.

import org.junit.runner.RunWith;

import cucumber.api.junit.Cucumber;

@RunWith(Cucumber.class)
public class RunWikipediaTest {
}

This will run the tests and a JUnit results file will be generated. Although it works, it is much better to provide some Cucumber options with the @CucumberOptions annotation.

import org.junit.runner.RunWith;

import cucumber.api.CucumberOptions;
import cucumber.api.junit.Cucumber;

@RunWith(Cucumber.class)
@CucumberOptions(
		format = {
				"json:target/cucumber/wikipedia.json",
				"html:target/cucumber/wikipedia.html",
				"pretty"
		},
		tags = {"~@ignored"}
)
public class RunWikipediaTest {
}

The runner above runs all feature files which do not have the @ignored tag and outputs the results in three different formats: JSON in the target/cucumber/wikipedia.json file, HTML in target/cucumber/wikipedia.html and Pretty – a console output with colours. All formatting options are available on the Cucumber formatters page. All possible Cucumber options can be found on the Cucumber options page.

Runners strategy

Different runners can be created for different purposes. There could be runners based on specific tags, to be invoked in specific cases like smoke testing, or full or partial regression testing – it depends on the needs. A sample tag-based runner is sketched below. It is also possible to create runners for each feature and run tests in parallel. There is a way to automatically generate runners and run tests in parallel. This speeds up execution and keeps the project clean from unessential runners. More details can be found in the Running Cucumber tests in parallel post.
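
As an illustration, a runner that executes only scenarios tagged with the hypothetical @smoke tag, still skipping @ignored ones, could look like this:

import org.junit.runner.RunWith;

import cucumber.api.CucumberOptions;
import cucumber.api.junit.Cucumber;

@RunWith(Cucumber.class)
@CucumberOptions(
		format = {
				"json:target/cucumber/smoke.json",
				"pretty"
		},
		tags = {"@smoke", "~@ignored"}
)
public class RunSmokeTest {
}

Separate entries in the tags array are combined with AND, while comma-separated tags inside one string mean OR, so the runner above picks scenarios that have @smoke and do not have @ignored.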

Runners feature file

If the runner is not linked to any feature file (no features section defined in @CucumberOptions, like the one shown above), then the runner automatically searches for all feature files in the current and all child folders.

The Java package structure is represented as a folder structure. Feature files can be placed in some package structure in the resources folder of the project, e.g. com.automationrhapsody.cucumber.parallel.tests.wikipedia. When compiled, those are copied to the same folder structure in the output. So in order to run all feature files from the package above, the runner class needs to be placed in the same package in the java folder of the project. It is possible to place the runner in a parent package and it will still work, as it searches all child folders, e.g. com.automationrhapsody.cucumber.parallel.tests will work. Placing it in a different package will not work though, e.g. com.automationrhapsody.cucumber.parallel.tests.other will not work. There will be a message:

None of the features at [classpath:com/automationrhapsody/cucumber/parallel/tests/other] matched the filters: [~@ignored]

This means either there are no feature files in the com.automationrhapsody.cucumber.parallel.tests.other resources package or those files have the @ignored tag.

The easiest way is to put just one runner into the default package and it will run all available feature files. In the long run this gives more limitations than benefits though. Alternatively, the runner can be pointed explicitly to the feature files, as sketched below.
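
A runner can be decoupled from the package structure by pointing it explicitly to the feature files and the step definitions with the features and glue options. The paths below just reuse the sample package from this post for illustration:

import org.junit.runner.RunWith;

import cucumber.api.CucumberOptions;
import cucumber.api.junit.Cucumber;

@RunWith(Cucumber.class)
@CucumberOptions(
		features = {"classpath:com/automationrhapsody/cucumber/parallel/tests/wikipedia"},
		glue = {"com.automationrhapsody.cucumber.parallel.tests.wikipedia"}
)
public class RunExplicitFeaturesTest {
}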

Running feature files without step definitions

So far a feature file and a runner have been created. If the tests are started now, the build will succeed, but the tests will be skipped. This is because there is no implementation available for the Given, When, Then commands in the feature file. Those should be implemented, and Cucumber gives hints in the build output how to do it.

You can implement missing steps with the snippets below:

Given("^Enter search term 'Cucumber'$", () -> {
	// Write code here that turns the phrase above into concrete actions
	throw new PendingException();
});

Step definitions code / glue

So far a feature file has been defined with a runner for it. In terms of BDD this is OK, but in terms of testing, step definitions should be created so the tests can actually be executed. As shown in the hint above, a method with the @Given annotation is needed. The annotation text is actually a regular expression, which is why it is good to start it with ^ and end it with $, meaning the whole line from the feature file gets matched. In order to make the step reusable, a regular expression can be added so that different search terms can be passed, not only ‘Cucumber’. If a regular expression is defined, then the method can take arguments, which will be whatever the regex matched in the feature file text. The step definition for the Given command is:

@Given("^Enter search term '(.*?)'$")
public void searchFor(String searchTerm) {
	WebElement searchField = driver.findElement(By.id("searchInput"));
	searchField.sendKeys(searchTerm);
}

(.*?) is a very basic regex, but it does the work. It matches zero or more characters surrounded by quotes at the end of the line, after the “Enter search term ” text. Once matched, this is passed as an argument to the searchFor() method invocation.
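
The matching can be tried out with plain Java regex. The snippet below only demonstrates what Cucumber does internally with the annotation text, it is not part of the test code:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexDemo {

	public static void main(String[] args) {
		Pattern stepPattern = Pattern.compile("^Enter search term '(.*?)'$");
		Matcher matcher = stepPattern.matcher("Enter search term 'Cucumber'");
		if (matcher.matches()) {
			// Prints "Cucumber" - the value Cucumber passes to searchFor()
			System.out.println(matcher.group(1));
		}
	}
}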

Full step definitions for the initial feature file are shown in the class below:

package com.automationrhapsody.cucumber.parallel.tests.wikipedia;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;

import cucumber.api.java.After;
import cucumber.api.java.Before;
import cucumber.api.java.en.Given;
import cucumber.api.java.en.Then;
import cucumber.api.java.en.When;

import static junit.framework.Assert.assertTrue;
import static junit.framework.TestCase.assertFalse;
import static org.junit.Assert.assertEquals;

public class WikipediaSteps {

	private WebDriver driver;

	@Before
	public void before() {
		driver = new FirefoxDriver();
		driver.navigate().to("http://en.wikipedia.org");
	}

	@After
	public void after() {
		driver.quit();
	}

	@Given("^Enter search term '(.*?)'$")
	public void searchFor(String searchTerm) {
		WebElement searchField = driver.findElement(By.id("searchInput"));
		searchField.sendKeys(searchTerm);
	}

	@When("^Do search$")
	public void clickSearchButton() {
		WebElement searchButton = driver.findElement(By.id("searchButton"));
		searchButton.click();
	}

	@Then("^Single result is shown for '(.*?)'$")
	public void assertSingleResult(String searchResult) {
		WebElement results = driver
			.findElement(By.cssSelector("div#mw-content-text.mw-content-ltr p"));
		assertFalse(results.getText().contains(searchResult + " may refer to:"));
		assertTrue(results.getText().startsWith(searchResult));
	}
}

Before and After

As noticeable above, there is a method annotated with @Before. It is executed before each scenario being run. It is used to allocate resources needed for the particular scenario to run. It is a good idea to instantiate the WebDriver in this method. The @After method is run at the end of the scenario, so this is where resources are released, e.g. by quitting the WebDriver.

Page objects

The step definitions code above is NOT well written. It is not a good idea to locate elements directly in the method where they are used. Never do this in real automation! Always use the Page Object pattern, as sketched below. More details on it can be found in the Page objects design pattern post.
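
A minimal sketch of a page object for the search page follows. The SearchPage class name and structure are illustrative, not taken from the sample repository:

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;

public class SearchPage {

	private final WebDriver driver;

	public SearchPage(WebDriver driver) {
		this.driver = driver;
	}

	// All element locators live here, not in the step definitions
	public void enterSearchTerm(String searchTerm) {
		driver.findElement(By.id("searchInput")).sendKeys(searchTerm);
	}

	public void search() {
		driver.findElement(By.id("searchButton")).click();
	}
}

With such a class a step definition shrinks to a single call, e.g. new SearchPage(driver).enterSearchTerm(searchTerm), and a change in the page layout is fixed in one place only.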

Conclusion

BDD as an idea is good, as it is in line with Agile methodologies and story definitions. Cucumber is a very good BDD tool, giving lots of flexibility and features, with a big community supporting it. Cucumber-JVM is very easy to start working with. Be careful though – when the step definitions grow in count, the main challenge will be to keep them reusable. The challenge is to avoid writing new step code for commands that are semantically the same but written in different words.


Performance testing with Gatling – reports

Last Updated on by

Post summary: Some details on the Gatling results report.

The current post is part of the Performance testing with Gatling series, in which the Gatling performance testing tool is explained in detail.

If you have followed the Gatling series so far, you should know how to record a simulation, what a simulation consists of, and how to create a Maven project and make the code well structured and maintainable. Now is the time to run that code and see the results.

Gatling global information

Gatling has a pretty cool looking report. It shows global information about the simulation as well as more detailed information for each request or request group. This is how the global information looks:

[Image: Gatling report – global information]

Shown above is just part of the global information report page. The following sections are on it:

  • Indicators – distribution in specified response time intervals: less than 800 ms, between 800 ms and 1200 ms, more than 1200 ms, and failed. This gives a general overview of the system performance. If the highest percentage of the responses is below 800 ms, this is quite a good performance indication.
  • Number of requests – pie chart showing the different request types. This gives visual information on how many requests of each type are being sent. The type is actually the request name defined in the http() method.
  • STATISTICS – table with very detailed information about how many requests of each type have been sent, the OK count, KO count and KO percentage. There is information about the best, worst and mean time for each request type. Since the worst time could come from a single response, it is not very informative. This is why there is a grouping showing the response time for 95% and 99% of the responses. The last piece of information is requests per second.
  • Active Users along the Simulation – how many virtual users were sending requests at each moment during the simulation. There is also a user count per scenario. A scenario is identified by its name, defined in the scenario() method.
  • Response Time Distribution – detailed response distribution in small time intervals. Pretty similar to Indicators, but much more detailed, as the time intervals are very small. This gives a much better perspective on performance, as you can see what percentage of the requests fall into a given time bucket. This graphic includes error requests as well.
  • Response Time Percentiles over Time (OK) – minimum and maximum request time at each moment during the simulation, only for successful (OK) requests. There is also request grouping into percentiles, showing what percentage of the requests take a given amount of time. Similar to the STATISTICS table, but with much more detailed grouping, distributed over the simulation execution. Additionally, this graphic shows the number of users at each moment during the simulation.
  • Number of requests per second – how many requests are sent to the server at each moment during the simulation. There are separate graphics for all requests, OK requests and KO requests. Additionally, this graphic shows the number of users at each moment during the simulation.
  • Number of responses per second – how many responses are received from the server at each moment during the simulation. There are separate graphics for all responses, OK responses and KO responses. Additionally, this graphic shows the number of users at each moment during the simulation.

Gatling request details

Apart from the global information, there is a detailed report for each request type. Requests are sorted by the name used when defining the HTTP request in the http() method. This is how the request details look:

[Image: Gatling report – request details]

Shown above is just part of the request details report page. The following sections are on it:

  • Indicators – same as in global information
  • STATISTICS – same as in global information, but only the timings for this particular request are shown
  • Response Time Distribution – same as in global information
  • Response Time Percentiles over Time (OK) – same as in global information
  • Latency Percentiles over Time (OK) – same as Response Time Percentiles over Time (OK), but showing the time needed for the server to process the request, although it is incorrectly called latency. By definition, latency + process time = response time, so this graphic is supposed to show the time needed for the request to reach the server. Checking real-life graphics, I think this graphic shows not the latency, but the real process time. You can get an idea of the real latency by taking one and the same second from Response Time Percentiles over Time (OK) and subtracting the value from the current graph for the same second – e.g. a 300 ms response time minus a 280 ms process time leaves roughly 20 ms for latency.
  • Number of requests per second – same as in global information
  • Number of responses per second – same as in global information
  • Response Time against Global RPS – distribution of the current request’s response time related to the total requests per second of the simulation
  • Latency against Global RPS – distribution of the current request’s latency (process time) related to the total requests per second of the simulation

Gatling data in simulation.log file

As seen in the previous two sections, Gatling gathers a very limited amount of data: how many requests are made at any given moment of the execution, whether the responses are OK or KO, and how long each request and response take. All this information is stored in the simulation.log file. Although the file is plain text, the data in it is understandable only by Gatling. In the Performance testing with Gatling – advanced usage post it is shown how to extract more details from the request and response. These get recorded in the simulation.log file, so be careful when doing this, as the file might get enormous. View a sample simulation.log file or a sample Gatling report.

Conclusion

The Gatling report is a valuable source of information for reading the performance data, as it provides details about request and response timings. The report should not be your main tool for finding issues when doing performance testing though. It is a good idea to have a server monitoring tool that gives more precise information about memory consumption and CPU. In case of bottlenecks identified by Gatling, it is mandatory to do some profiling of the application to understand which actions on the server take the longest time.


Performance testing with Gatling – advanced usage

Last Updated on by

Post summary: Code samples and explanation how to do advanced performance testing with Gatling, such as proper scenario structure, checks, feeding test data, session maintenance, etc.

The current post is part of the Performance testing with Gatling series, in which the Gatling performance testing tool is explained in detail.

Code samples are available in the GitHub sample-performance-with-gatling repository.

In the previous post, Performance testing with Gatling – integration with Maven, there is a description how to set up a Maven project. In Performance testing with Gatling – recorded simulation explanation there is information what a simulation consists of. The simulations from those posts are refactored in the current post, and more advanced topics are discussed, showing how to make automated performance testing with Gatling flexible.

What is included

The following topics are included in the current post:

  • Access external configuration data
  • Define single HTTP requests for better re-usability
  • Add checks for content on HTTP response
  • Check and extract data from HTTP response
  • More checks and extract List with values
  • Create HTTP POST request with body from template file
  • Manage session variables
  • Standard CSV feeder
  • Create custom feeder
  • Create unified scenarios
  • Conditional scenario execution
  • Only one HTTP protocol
  • Extract data from HTTP request and response
  • Advanced simulation setUp
  • Virtual users vs requests per second

Refactored code

Below are all the classes after they have been refactored. In order to separate things and make them easier to read, the ProductSimulation and PersonSimulation classes contain only the setUp() method. The requests, scenarios and external configurations are defined in the Constants, Product and Person singleton objects.

Constants

object Constants {
	val numberOfUsers: Int = System.getProperty("numberOfUsers").toInt
	val duration: FiniteDuration = System.getProperty("durationMinutes").toInt.minutes
	val pause: FiniteDuration = System.getProperty("pauseBetweenRequestsMs").toInt.millisecond
	val responseTimeMs = 500
	val responseSuccessPercentage = 99
	private val url: String = System.getProperty("url")
	private val repeatTimes: Int = System.getProperty("numberOfRepetitions").toInt
	private val successStatus: Int = 200
	private val isDebug = System.getProperty("debug").toBoolean

	val httpProtocol = http
		.baseURL(url)
		.check(status.is(successStatus))
		.extraInfoExtractor { extraInfo => List(getExtraInfo(extraInfo)) }

	def createScenario(name: String, feed: FeederBuilder[_], chains: ChainBuilder*): ScenarioBuilder = {
		if (Constants.repeatTimes > 0) {
			scenario(name).feed(feed).repeat(Constants.repeatTimes) {
				exec(chains).pause(Constants.pause)
			}
		} else {
			scenario(name).feed(feed).forever() {
				exec(chains).pause(Constants.pause)
			}
		}
	}

	private def getExtraInfo(extraInfo: ExtraInfo): String = {
		if (isDebug
			|| extraInfo.response.statusCode.get != successStatus
			|| extraInfo.status.eq(Status.apply("KO"))) {
			",URL:" + extraInfo.request.getUrl +
				" Request: " + extraInfo.request.getStringData +
				" Response: " + extraInfo.response.body.string
		} else {
			""
		}
	}
}

Product

object Product {

	private val reqGoToHome = exec(http("Open home page")
		.get("/products")
		.check(regex("Search: "))
	)

	private val reqSearchProduct = exec(http("Search product")
		.get("/products?q=${search_term}&action=search-results")
		.check(regex("Your search for '${search_term}' gave ([\\d]{1,2}) results:").saveAs("numberOfProducts"))
		.check(regex("NotFound").optional.saveAs("not_found"))
	)

	private val reqOpenProduct = exec(session => {
		var numberOfProducts = session("numberOfProducts").as[String].toInt
		var productId = Random.nextInt(numberOfProducts) + 1
		session.set("productId", productId)
	}).exec(http("Open Product")
		.get("/products?action=details&id=${productId}")
		.check(regex("This is 'Product ${productId} name' details page."))
	)
	
	private val csvFeeder = csv("search_terms.csv").circular.random

	val scnSearch = Constants.createScenario("Search", csvFeeder,
		reqGoToHome, reqSearchProduct, reqGoToHome)

	val scnSearchAndOpen = Constants.createScenario("Search and Open", csvFeeder,
		reqGoToHome, reqSearchProduct, reqOpenProduct, reqGoToHome)
}

ProductSimulation

class ProductSimulation extends Simulation {

	setUp(
		Product.scnSearch.inject(rampUsers(Constants.numberOfUsers) over 10.seconds),
		Product.scnSearchAndOpen.inject(atOnceUsers(Constants.numberOfUsers))
	)
		.protocols(Constants.httpProtocol.inferHtmlResources())
		.pauses(constantPauses)
		.maxDuration(Constants.duration)
		.assertions(
			global.responseTime.max.lessThan(Constants.responseTimeMs),
			global.successfulRequests.percent.greaterThan(Constants.responseSuccessPercentage)
		)
}

Person

object Person {

	private val added = "Added"
	private val updated = "Updated"

	private val reqGetAll = exec(http("Get All Persons")
		.get("/person/all")
		.check(regex("\"firstName\":\"(.*?)\"").count.greaterThan(1).saveAs("count"))
		.check(regex("\\[").count.is(1))
		.check(regex("\"id\":([\\d]{1,6})").findAll.saveAs("person_ids"))
	).exec(session => {
		val count = session("count").as[Int]
		val personIds = session("person_ids").as[List[Int]]
		val personId = personIds(Random.nextInt(count)).toString.toInt
		session.set("person_id", personId)
	}).exec(session => {
		println(session)
		session
	})

	private val reqGetPerson = exec(http("Get Person")
		.get("/person/get/${person_id}")
		.check(regex("\"firstName\":\"(.*?)\"").count.is(1))
		.check(regex("\\[").notExists)
	)

	private val reqSavePerson = exec(http("Save Person")
		.post("/person/save")
		.body(ElFileBody("person.json"))
		.header("Content-Type", "application/json")
		.check(regex("Person with id=([\\d]{1,6})").saveAs("person_id"))
		.check(regex("\\[").notExists)
		.check(regex("(" + added + "|" + updated + ") Person with id=").saveAs("action"))
	)

	private val reqGetPersonAferSave = exec(http("Get Person After Save")
		.get("/person/get/${person_id}")
		.check(regex("\"id\":${person_id}"))
		.check(regex("\"firstName\":\"${first_name}\""))
		.check(regex("\"lastName\":\"${last_name}\""))
		.check(regex("\"email\":\"${email}\""))
	)

	private val reqGetPersonAferUpdate = exec(http("Get Person After Update")
		.get("/person/get/${person_id}")
		.check(regex("\"id\":${person_id}"))
	)

	private val uniqueIds: List[String] = Source
		.fromInputStream(getClass.getResourceAsStream("/account_ids.txt"))
		.getLines().toList

	private val feedSearchTerms = Iterator.continually(buildFeeder(uniqueIds))

	private def buildFeeder(dataList: List[String]): Map[String, Any] = {
		Map(
			"id" -> (Random.nextInt(100) + 1),
			"first_name" -> Random.alphanumeric.take(5).mkString,
			"last_name" -> Random.alphanumeric.take(5).mkString,
			"email" -> Random.alphanumeric.take(5).mkString.concat("@na.na"),
			"unique_id" -> dataList(Random.nextInt(dataList.size))
		)
	}

	val scnGet = Constants.createScenario("Get all then one", feedSearchTerms,
		reqGetAll, reqGetPerson)

	val scnSaveAndGet = Constants.createScenario("Save and get", feedSearchTerms, reqSavePerson)
		.doIfEqualsOrElse("${action}", added) {
			reqGetPersonAferSave
		} {
			reqGetPersonAferUpdate
		}
}

PersonSimulation

class PersonSimulation extends Simulation {

	setUp(
		Person.scnGet.inject(atOnceUsers(Constants.numberOfUsers)),
		Person.scnSaveAndGet.inject(atOnceUsers(Constants.numberOfUsers))
	)
		.protocols(Constants.httpProtocol)
		.pauses(constantPauses)
		.maxDuration(Constants.duration)
		.assertions(
			global.responseTime.max.lessThan(Constants.responseTimeMs),
			global.successfulRequests.percent.greaterThan(Constants.responseSuccessPercentage)
		)
}

Access external configuration data

In order to have flexibility, it is mandatory to be able to send different configuration parameters from the command line when invoking the scenario. With the Gatling Maven plugin this is done with <jvmArgs> configuration elements. See more in the Performance testing with Gatling – integration with Maven post.

val numberOfUsers: Int = System.getProperty("numberOfUsers").toInt
val duration: FiniteDuration = System.getProperty("durationMinutes").toInt.minutes
private val url: String = System.getProperty("url")
private val repeatTimes: Int = System.getProperty("numberOfRepetitions").toInt
private val isDebug = System.getProperty("debug").toBoolean

Define single HTTP requests for better re-usability

It is a good idea to define each HTTP request as a separate object. This gives the flexibility to reuse one and the same request in different scenarios. An HTTP GET request is created with http().get(), as the reqGoToHome request in the next section shows.

Add checks for content on HTTP response

On HTTP request creation there is a possibility to add checks that a certain string or regular expression pattern exists in the response. The code below creates an HTTP request and adds a check that the “Search: “ text exists in the response. This is done with the regex() method by passing just a string to it.

private val reqGoToHome = exec(http("Open home page")
	.get("/products")
	.check(regex("Search: "))
)

Check and extract data from HTTP response

Along with the check, it is possible to extract data into a variable that is saved to the session. This is done with the saveAs() method. In some cases the value we are searching for might not be in the response. The optional method specifies that the value is saved in the session only if it exists; if not, it won’t be captured and this will not break the execution. As shown below, session variables can also be used in the checks. A session variable is accessed with ${}, such as ${search_term}.

private val reqSearchProduct = exec(http("Search product")
	.get("/products?q=${search_term}&action=search-results")
	.check(regex("Your search for '${search_term}' gave ([\\d]{1,2}) results:")
		.saveAs("numberOfProducts"))
	.check(regex("NotFound").optional.saveAs("not_found"))
)

More checks and extract List with values

There are many types of checks. In the code below, count.greaterThan(1) and count.is(1) are used. It is possible to search for multiple occurrences of a given regular expression with findAll. In such a case saveAs() saves the results to a “person_ids” List object in the session. More information about checks can be found on the Gatling Checks page.

private val reqGetAll = exec(http("Get All Persons")
	.get("/person/all")
	.check(regex("\"firstName\":\"(.*?)\"").count.greaterThan(1).saveAs("count"))
	.check(regex("\\[").count.is(1))
	.check(regex("\"id\":([\\d]{1,6})").findAll.saveAs("person_ids"))
)

Create HTTP POST request with body from template file

If you need to post data to the server, an HTTP POST request is to be used. The request is created with the http().post() method. Headers can be added to the request with the header() or headers() methods. In the current example, without the Content-Type=application/json header the REST service will throw an error for unrecognised content. The data to be sent is added with the body() method. It accepts a Body object. You can generate the body from a file (RawFileBody method) or from a string (StringBody method).

private val reqSavePerson = exec(http("Save Person")
	.post("/person/save")
	.body(ElFileBody("person.json"))
	.header("Content-Type", "application/json")
	.check(regex("Person with id=([\\d]{1,6})").saveAs("person_id"))
	.check(regex("\\[").notExists)
	.check(regex("(" + added + "|" + updated + ") Person with id=")
		.saveAs("action"))
)

In the current case the body is generated from a file which has variables that are looked up in the session. This is done with the ElFileBody (ELFileBody in 2.0.0) method, and the actual replacement with values is done by Gatling EL (expression language). More about what can be done with EL can be found on the Gatling EL page. The EL body file is shown below, where the variables ${id}, ${first_name}, ${last_name} and ${email} are searched in the session and replaced if found. If a variable is not found, an error is shown in the scenario execution output.

{
	"id": "${id}",
	"firstName": "${first_name}",
	"lastName": "${last_name}",
	"email": "${email}"
}

Manage session variables

Each virtual user has its own session. A scenario can store or read data from the session. Data is saved in the session as key/value pairs, where the key is the variable name. Variables get into the session in three ways: using feeders (explained later in the current post), using the saveAs() method, and through the session API. More details on the session API can be found on the Gatling Session API page.

Manipulating the session through the API is kind of tricky, and the Gatling documentation is vague about it. Below is code where a session variable is first extracted as a String and then converted to Int with: var numberOfProducts = session("numberOfProducts").as[String].toInt. In the next step some manipulation is done with this variable – in the current case a random product id from 1 to numberOfProducts is picked. At last, a new variable is saved in the session with session.set("productId", productId). It is important that this is the last line of the session manipulation code block done in the first exec() – this is the return statement of the code block. In other words, a new Session object with “productId” saved in it is returned. If the last line is just “session”, as stated in the docs, then the old, unmodified session object is returned without the variable being added.

Sessions, like most of the objects in Gatling and in Scala, are immutable. This is designed for thread safety. So adding a variable to the session actually creates a new object. This is why a newly added session variable cannot be used in the same exec() block, but has to be used in the next one – in the same block the variable is not yet accessible. In the code below, in the second exec() “productId” is already available and can be used in get().

private val reqOpenProduct = exec(session => {
	var numberOfProducts = session("numberOfProducts").as[String].toInt
	var productId = Random.nextInt(numberOfProducts) + 1
	session.set("productId", productId)
}).exec(http("Open Product")
	.get("/products?action=details&id=${productId}")
	.check(regex("This is 'Product ${productId} name' details page."))
)

The same logic explained above is implemented in the next code fragment. The example below shows the usage of a session variable saved in a previous exec() fragment. The count of persons and the List with ids are saved by the saveAs() method. The List is extracted from the session and a random index of it is accessed, so a random person is selected. This is again saved into the session as “person_id”. In the third exec() statement the session object is just printed to the output for debugging purposes.

private val reqGetAll = exec(http("Get All Persons")
	.get("/person/all")
	.check(regex("\"firstName\":\"(.*?)\"").count.greaterThan(1).saveAs("count"))
	.check(regex("\\[").count.is(1))
	.check(regex("\"id\":([\\d]{1,6})").findAll.saveAs("person_ids"))
).exec(session => {
	val count = session("count").as[Int]
	val personIds = session("person_ids").as[List[Int]]
	val personId = personIds(Random.nextInt(count)).toString.toInt
	session.set("person_id", personId)
}).exec(session => {
	println(session)
	session
})

Standard CSV feeder

A feeder is a way to generate unique data for each virtual user. This is how tests are made realistic. Below is a way to read data from a CSV file. The first line of the CSV file is the header, which is saved in the session as the variable name. In the current example the CSV has only one column, but it is possible to have a CSV file with several columns. circular means that when the end of the file is reached the feeder starts from the beginning; random means elements are taken in random order. More about feeders can be found on the Gatling Feeders page.

private val csvFeeder = csv("search_terms.csv").circular.random
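
For clarity, a hypothetical search_terms.csv could look like the example below – the first line is the header, which becomes the session variable name used as ${search_term} in the requests:

search_term
cucumber
mercury
max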

Create custom feeder

A feeder actually is an Iterator[Map[String, T]], so you can create your own feeders. Below is code where some unique ids are read from a file and converted to a List[String] with Source.fromInputStream(getClass.getResourceAsStream("/account_ids.txt")).getLines().toList. This list is used in the buildFeeder() method to access a random element from it. Finally, Iterator.continually(buildFeeder(uniqueIds)) creates an infinite length iterator.

private val uniqueIds: List[String] = Source
	.fromInputStream(getClass.getResourceAsStream("/account_ids.txt"))
	.getLines().toList

private val feedSearchTerms = Iterator.continually(buildFeeder(uniqueIds))

private def buildFeeder(dataList: List[String]): Map[String, Any] = {
	Map(
		"id" -> (Random.nextInt(100) + 1),
		"first_name" -> Random.alphanumeric.take(5).mkString,
		"last_name" -> Random.alphanumeric.take(5).mkString,
		"email" -> Random.alphanumeric.take(5).mkString.concat("@na.na"),
		"unique_id" -> dataList(Random.nextInt(dataList.size))
	)
}

For the current business case a custom feeder with values from a file doesn’t make much sense – just a Map() generator would be enough. But let’s imagine a case where you search for a hotel by a unique id and some date in the future. Hardcoding the date in a CSV file is not a wise solution, as you will want it to always be in the future. Also, maintaining the different combinations of hotelId, start and end dates in a file is not feasible. The best solution is to have a file with hotel ids and to generate the dates dynamically, in the same manner as shown in the buildFeeder() method.

Create unified scenarios

A scenario is created from HTTP requests. This is why it is good to have each HTTP request as a separate object, so they can be reused in different scenarios. In order to unify scenario creation there is a special method. It takes a scenario name, a feeder and a list of requests, and returns a scenario object. The method checks whether the scenario is supposed to be repeated a given number of times, in which case it uses the repeat() method; otherwise the scenario is repeated forever(). In both cases a constant pause is introduced between the requests with pause().

def createScenario(name: String, 
					feed: FeederBuilder[_],
					chains: ChainBuilder*): ScenarioBuilder = {
	if (Constants.repeatTimes > 0) {
		scenario(name).feed(feed).repeat(Constants.repeatTimes) {
			exec(chains).pause(Constants.pause)
		}
	} else {
		scenario(name).feed(feed).forever() {
			exec(chains).pause(Constants.pause)
		}
	}
}

With this approach the method can be reused from many places, avoiding duplication of code.

val scnSearch = Constants.createScenario("Search", csvFeeder,
		reqGoToHome, reqSearchProduct, reqGoToHome)

Conditional scenario execution

It is possible for one scenario to have different execution paths based on a condition. This condition is generally the value of a session variable. Branching is done with the doIf, doIfElse, doIfEqualsOrElse, etc. methods. In the current example, if this is a Save request, then an additional reqGetPersonAferSave HTTP request is executed; otherwise an additional reqGetPersonAferUpdate HTTP request is executed. In the end there is only one scenario, scnSaveAndGet, but it can have different execution paths based on the “action” session variable.

val scnSaveAndGet = Constants
	.createScenario("Save and get", feedSearchTerms, reqSavePerson)
	.doIfEqualsOrElse("${action}", added) {
		reqGetPersonAferSave
	} {
		reqGetPersonAferUpdate
	}

Only one HTTP protocol

In the general case several performance testing simulations can be done for one and the same application. During the simulation setUp an HTTP protocol object is needed. Since the application is the same, the HTTP protocol can be one and the same object, so it is possible to define it once and reuse it. If changes are needed, a new HTTP protocol object can be defined, or a copy of the current one can be created and modified.

val httpProtocol = http
	.baseURL(url)
	.check(status.is(successStatus))
	.extraInfoExtractor { extraInfo => List(getExtraInfo(extraInfo)) }

Extract data from HTTP request and response

In order to ease the debugging of failures, it is possible to extract information from the HTTP request and response. Extraction is configured at the HTTP protocol level with extraInfoExtractor { extraInfo => List(getExtraInfo(extraInfo)) }, as shown above. In order to simplify the code, processing of the extra info object is done in a separate method. If debug is enabled, or the response code is not 200, or the Gatling status is KO, then the request URL, request data and response body are dumped into the simulation.log file that resides in the results folder. Note that the response body is extracted only if there is a check on it, otherwise there is NoResponseBody in the output. This is done to improve performance.

private def getExtraInfo(extraInfo: ExtraInfo): String = {
	if (isDebug
		|| extraInfo.response.statusCode.get != successStatus
		|| extraInfo.status.eq(Status.apply("KO"))) {
		",URL:" + extraInfo.request.getUrl +
			" Request: " + extraInfo.request.getStringData +
			" Response: " + extraInfo.response.body.string
	} else {
		""
	}
}

Advanced simulation setUp

It is a good idea to keep the simulation class clean by defining all objects in external classes or singleton objects. A simulation must have a setUp() method. It receives a comma separated list of scenarios. In order for a scenario to be valid, it should have users injected with the inject() method. There are different strategies to inject users. A protocol should also be defined per scenario setup. In this particular example the default protocol is used, changed to fetch all HTTP resources on a page (JS, CSS, images, etc.) with inferHtmlResources(). Since the objects are immutable, this creates a copy of the default HTTP protocol and does not modify the original one. Assertions are a way to verify certain performance KPIs; they are defined with the assertions() method. In this example the maximum response time should be less than 500 ms and more than 99% of the requests should be successful.

private val rampUpTime: FiniteDuration = 10.seconds

setUp(
	Product.scnSearch.inject(rampUsers(Constants.numberOfUsers) over rampUpTime),
	Product.scnSearchAndOpen.inject(atOnceUsers(Constants.numberOfUsers))
)
	.protocols(Constants.httpProtocol.inferHtmlResources())
	.pauses(constantPauses)
	.maxDuration(Constants.duration)
	.assertions(
		global.responseTime.max.lessThan(Constants.responseTimeMs),
		global.successfulRequests.percent
			.greaterThan(Constants.responseSuccessPercentage)
	)
	.throttle(reachRps(100) in rampUpTime, holdFor(Constants.duration))

Cookies management

Cookie support is enabled by default and Gatling handles cookies transparently, just like a browser would. It is also possible to add or delete cookies during the simulation run. See more details how this is done on the Gatling Cookie management page.

Virtual users vs requests per second

Since users is a vague metric, while requests per second is a metric that most server monitoring tools support, it is possible to use this approach instead. Gatling supports so-called throttling: throttle(reachRps(100) in 10.seconds, holdFor(5.minutes)). It is important to put the holdFor() method, otherwise Gatling goes to unlimited requests per second and can crash the server. More details on simulation setup can be found on the Gatling Simulation setup page.

Conclusion

Keeping Gatling code maintainable and reusable is a good practice for creating complex performance scenarios. The Gatling API provides a wide range of functionalities to support this task. In the current post I have shown cases, and solutions to them, which I have encountered in real life projects.
