Performance testing in the browser

Last Updated on by

Post summary: Approaches for performance testing in the browser using Puppeteer, Lighthouse, and PerformanceTiming API.

In the current post, I will give some examples how performance testing can be done in the browser using different metrics. Puppeteer is used as a tool for browser manipulation because it integrates easily with Lighthouse and DevTools Protocol. I have described all tools before giving any examples. The code can be found in GitHub sample-performance-testing-in-browser repository.

Puppeteer

Puppeteer is a tool by Google which allows you to control Chrome or Chromium browsers. It works over DevTools Protocol, which I will describe later. Puppeteer allows you to automate your functional tests. In this regards, it is very similar to Selenium but it offers many more features in terms of control, debugging, and information within the browser. Over the DevTools Protocol, you have programmatically access to all features available in DevTools (the tool that is shown in Chrome when you hit F12). You can check Puppeteer API documentation or check advanced Puppeteer examples such as JS and CSS code coverage, site crawler, Google search features checker.

Lighthouse

Lighthouse is again tool by Google which is designed to analyze web apps and pages, making a detailed report about performance, SEO, accessibility, and best practices. The tool can be used inside Chrome’s DevTools, standalone from CLI (command line interface), or programmatically from Puppeteer project. Google had developed user-centric performance metrics which Lighthouse uses. Here is a Lighthouse report example run on my blog.

PerformanceTimings API

W3C have Navigation Timing recommendation which is supported by major browsers. The interesting part is the PerformanceTiming interface, where various timings are exposed.

DevTools Protocol

DevTools Protocol comes by Google and is a way to communicate programmatically with DevTools within Chrome and Chromium, hence you can instrument, inspect, debug, and profile those browsers.

Examples

Now comes the fun part. I have prepared several examples. All the code is in GitHub sample-performance-testing-in-browser repository.

  • Puppeteer and Lighthouse – Puppeteer is used to login and then Lighthouse checks pages for logged in user.
  • Puppeteer and PerformanceTiming API – Puppeteer navigates the site and gathers PerformanceTiming metrics from the browser.
  • Lighthouse and PerformanceTiming API – comparison between both metrics in Lighthouse and NavigationTiming.
  • Puppeteer and DevTools Protocol – simulate low bandwidth network conditions with DevTools Protocol.

Before proceeding with the examples I will outline helper functions used to gather metrics. In the examples, I use Node.js 8 which supports async/await functionality. With it, you can use an asynchronous code in a synchronous manner.

Gather single PerformanceTiming metric

async function gatherPerformanceTimingMetric(page, metricName) {
  const metric = await page.evaluate(metric => 
     window.performance.timing[metric], metricName);
  return metric;
}

I will not go into details about Puppeteer API. I will describe functions I have used. Function page.evaluate() executes JavaScript in the browser and can return a result if needed. window.performance.timing returns all metrics from the browser and only needed by metricName one is returned by current function.

Gather all PerformaceTiming metrics

async function gatherPerformanceTimingMetrics(page) {
  // The values returned from evaluate() function should be JSON serializable.
  const rawMetrics = await page.evaluate(() => 
    JSON.stringify(window.performance.timing));
  const metrics = JSON.parse(rawMetrics);
  return metrics;
}

This one is very similar to previous. Instead of just one metric, all are returned. The tricky part is the call to JSON.stringify(). The values returned from page.evaluate() function should be JSON serializable. With JSON.parse() they are converted to object again.

Extract data from PerformanceTiming metrics

async function processPerformanceTimingMetrics(metrics) {
  return {
    dnsLookup: metrics.domainLookupEnd - metrics.domainLookupStart,
    tcpConnect: metrics.connectEnd - metrics.connectStart,
    request: metrics.responseStart - metrics.requestStart,
    response: metrics.responseEnd - metrics.responseStart,
    domLoaded: metrics.domComplete - metrics.domLoading,
    domInteractive: metrics.domInteractive - metrics.navigationStart,
    pageLoad: metrics.loadEventEnd - metrics.loadEventStart,
    fullTime: metrics.loadEventEnd - metrics.navigationStart
  }
}

Time data for certain events are compile from raw metrics. For e.g., if DNS lookup or TCP connection times are slow, then this could be some network specific thing and may not need to be acted. If response time is very high, then this is indicator backend might not be performing well and needs to be further performance tested. See How to do proper performance testing post for more details.

Gather Lighthouse metrics

const lighthouse = require('lighthouse');

async function gatherLighthouseMetrics(page, config) {
  // ws://127.0.0.1:52046/devtools/browser/675a2fad-4ccf-412b-81bb-170fdb2cc39c
  const port = await page.browser().wsEndpoint().split(':')[2].split('/')[0];
  return await lighthouse(page.url(), { port: port }, config).then(results => {
    delete results.artifacts;
    return results;
  });
}

The example above shows how to use Lighthouse programmatically. Lighthouse needs to connect to a browser on a specific port. This port is taken from page.browser().wsEndpoint() which is in format ws://127.0.0.1:52046/devtools/browser/{GUID}. It is good to delete results.artifacts; because they might get very big in size and are not needed. The result is one huge object. I will talk about this is more details. Before using Lighthouse is should be installed in a Node.js project with npm install lighthouse –save-dev.

Puppeteer and Lighthouse

In this example, Puppeteer is used to navigating through the site and authenticate the user, so Lighthouse can be run for a page behind a login. Lighthouse can be run through CLI as well but in this case, you just pass and URL and Lighthouse will check it.

puppeteer-lighthouse.js

const puppeteer = require('puppeteer');
const perfConfig = require('./config.performance.js');
const fs = require('fs');
const resultsDir = 'results';
const { gatherLighthouseMetrics } = require('./helpers');

(async () => {
  const browser = await puppeteer.launch({
    headless: true,
    // slowMo: 250
  });
  const page = await browser.newPage();

  await page.goto('https://automationrhapsody.com/examples/sample-login/');
  await verify(page, 'page_home');

  await page.click('a');
  await page.waitForSelector('form');
  await page.type('input[name="username"]', 'admin');
  await page.type('input[name="password"]', 'admin');
  await page.click('input[type="submit"]');
  await page.waitForSelector('h2');
  await verify(page, 'page_loggedin');

  await browser.close();
})();

verify()

const perfConfig = require('./config.performance.js');
const fs = require('fs');
const resultsDir = 'results';
const { gatherLighthouseMetrics } = require('./helpers');

async function verify(page, pageName) {
  await createDir(resultsDir);
  await page.screenshot({
    path: `./${resultsDir}/${pageName}.png`,
    fullPage: true
  });
  const metrics = await gatherLighthouseMetrics(page, perfConfig);
  fs.writeFileSync(`./${resultsDir}/${pageName}.json`,
    JSON.stringify(metrics, null, 2));
  return metrics;
}

createDir()

const fs = require('fs');

async function createDir(dirName) {
  if (!fs.existsSync(dirName)) {
    fs.mkdirSync(dirName, '0766');
  }
}

A new browser is launched with puppeteer.launch(), arguments { headless: true, //slowMo: 250 } are put for debugging purposes. If you want to view what is happening then set headless to false and slow the motions with slowMo: 250, where time is in milliseconds. Start a new page with browser.newPage() and navigate to some URL with page.goto(‘URL’). Then verify() function is invoked. It is shown on the second tab and will be described in a while. Next functionality is used to log in the user. With page.click(‘SELECTOR’), where CSS selector is specified, you can click an element on the page. With page.waitForSelector(‘SELECTOR’) Puppeteer should wait for the element with the given CSS selector to be shown. With page.type(‘SELECTOR’, ‘TEXT’) Puppeteer types the TEXT in the element located by given CSS selector. Finally browser.close() closes the browser.

So far only Puppeteer navigation is described. Lighthouse is invoked in verify() function. Results directory is created initially with createDir() function. Then a screenshot is taken on the full page with page.screenshot() function. Lighthouse is called with gatherLighthouseMetrics(page, perfConfig). This function was described above. Basically, it gets the port on which DevTools Protocol is currently running and passes it to lighthouse() function. Another approach could be to start the browser with hardcoded debug port of 9222 with puppeteer.launch({ args: [ ‘–remote-debugging-port=9222’ ] }) and pass nothing to Lighthouse, it will try to connect to this port by default. Function lighthouse() accepts also an optional config parameter. If not specified then all Lighthouse checks are done. In the current example, only performance is important, thus specific config file is created and used. This is config.performance.js file.

Puppeteer and PerformanceTiming API

In this example, Puppeteer is used to navigating the site and extract PerformanceTiming metrics from the browser.

const puppeteer = require('puppeteer');
const { gatherPerformanceTimingMetric,
  gatherPerformanceTimingMetrics,
  processPerformanceTimingMetrics } = require('./helpers');

(async () => {
  const browser = await puppeteer.launch({
    headless: true
  });
  const page = await browser.newPage();
  await page.goto('https://automationrhapsody.com/');

  const rawMetrics = await gatherPerformanceTimingMetrics(page);
  const metrics = await processPerformanceTimingMetrics(rawMetrics);
  console.log(`DNS: ${metrics.dnsLookup}`);
  console.log(`TCP: ${metrics.tcpConnect}`);
  console.log(`Req: ${metrics.request}`);
  console.log(`Res: ${metrics.response}`);
  console.log(`DOM load: ${metrics.domLoaded}`);
  console.log(`DOM interactive: ${metrics.domInteractive}`);
  console.log(`Document load: ${metrics.pageLoad}`);
  console.log(`Full load time: ${metrics.fullTime}`);

  const loadEventEnd = await gatherPerformanceTimingMetric(page, 'loadEventEnd');
  const date = new Date(loadEventEnd);
  console.log(`Page load ended on: ${date}`);

  await browser.close();
})();

Metrics are extracted with gatherPerformanceTimingMetrics() function described above and then data is collected from the metrics with processPerformanceTimingMetrics(). In the end, there is an example how to extract one metric such as loadEventEnd and display it as a date object.

Lighthouse and PerformanceTiming API

const puppeteer = require('puppeteer');
const perfConfig = require('./config.performance.js');
const { gatherPerformanceTimingMetrics,
  gatherLighthouseMetrics } = require('./helpers');

(async () => {
  const browser = await puppeteer.launch({
    headless: true
  });
  const page = await browser.newPage();
  const urls = ['https://automationrhapsody.com/',
    'https://automationrhapsody.com/examples/sample-login/'];

  for (const url of urls) {
    await page.goto(url);

    const lighthouseMetrics = await gatherLighthouseMetrics(page, perfConfig);
    const firstPaint = parseInt(lighthouseMetrics.audits['first-meaningful-paint']['rawValue'], 10);
    const firstInteractive = parseInt(lighthouseMetrics.audits['first-interactive']['rawValue'], 10);
    const navigationMetrics = await gatherPerformanceTimingMetrics(page);
    const domInteractive = navigationMetrics.domInteractive - navigationMetrics.navigationStart;
    const fullLoad = navigationMetrics.loadEventEnd - navigationMetrics.navigationStart;
    console.log(`FirstPaint: ${firstPaint}, FirstInterractive: ${firstInteractive}, 
      DOMInteractive: ${domInteractive}, FullLoad: ${fullLoad}`);
  }

  await browser.close();
})();

This example shows a comparison between Lighthouse metrics and PerformanceTiming API metrics. If you run the example and compare all the timings you will notice how much slower the site looks according to Lighthouse. This is because it uses 3G (1.6Mbit/s download speed) settings by default.

Puppeteer and DevTools Protocol

const puppeteer = require('puppeteer');
const throughputKBs = process.env.throughput || 200;

(async () => {
  const browser = await puppeteer.launch({
    executablePath: 
      'C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe',
    headless: false
  });
  const page = await browser.newPage();
  const client = await page.target().createCDPSession();

  await client.send('Network.emulateNetworkConditions', {
    offline: false,
    latency: 200,
    downloadThroughput: throughputKBs * 1024,
    uploadThroughput: throughputKBs * 1024
  });

  const start = (new Date()).getTime();
  await client.send('Page.navigate', {
    'url': 'https://automationrhapsody.com'
  });
  await page.waitForNavigation({
    timeout: 240000,
    waitUntil: 'load'
  });
  const end = (new Date()).getTime();
  const totalTimeSeconds = (end - start) / 1000;

  console.log(`Page loaded for ${totalTimeSeconds} seconds 
    when connection is ${throughputKBs}kbit/s`);

  await browser.close();
})();

In the current example, network conditions with restricted bandwidth are emulated in order to test page load time and perception. With executablePath Puppeteer launches an instance of Chrome browser. The path given in the example is for Windows machine. Then a client is made to communicate with DevTools Protocol with page.target().createCDPSession(). Configurations are send to browser with client.send(‘Network.emulateNetworkConditions’, { }). Then URL is opened into the page with client.send(‘Page.navigate’, { URL}). The script can be run with different values for throughput passed as environment variable. Example waits 240 seconds for the page to fully load with page.waitForNavigation().

Conclusion

In the current post, I have described several ways to measure the performance of your web application. The main tool used to control the browser is Puppeteer because it integrated very easily with Lighthouse and DevTools Protocol. All examples can be executed through the CLI, so they can be easily plugged into CI/CD process. Among the various approaches, you can compile your preferred scenario which can be run on every commit to measure if the performance of your applicaition has been affected by certain code changes.

Related Posts

Read more...

What is The Test Mushroom and how to improve your testing

Last Updated on by

Post summary: In contrast to Test Pyramid, Test Mushroom shows a test portfolio which is restricted to costly and slow UI tests only. In the current post, I will describe approaches to act on your Test Mushroom hence improve your testing.

Test Pyramid

Test pyramid illustrates how your test portfolio is good to look like. The important thing about the test pyramid is the higher in the test pyramid the more brittle and expensive to maintain the tests are. In the bottom of the pyramid are the unit tests. They are fast and test most of the functionality in your code. By integration tests is meant the following thing. Defacto the standard now for web applications is to use some JavaScript UI framework that manipulates data from different APIs. With integration testing, we want to be in control of this data. We want for e.g. to test how web application behaves when API returns an error. By stubbing the data web application works with it is possible to fully test the web application. UI/E2E tests are the ones executed against deployed, configured, integrated and working web application. They are slow and flaky, thus they should be limited in number. In other versions of test pyramid, there is a layer called service tests, which is below UI and above integration. Those are API tests performed against deployed and working backend. I will not go into details about them, because API tests are by definition very stable.

Test Mushroom

I came up with this term when giving a presentation for Cypress, you can see more in Cypress vs. Selenium, is this the end of an era? post. This term was meant to be a funny and ironic description of a test portfolio that makes testing big pain because the quality of releases totally depends on UI tests. The mushroom leg represents unit and integration tests. It is shown with a dotted line as such test are totally missing. UI tests are slow and flaky, every release sign off takes a lot of time for debugging failed tests. Every release has a high risk of failure. This is very similar to the Test Ice-Cream Cone, with the difference that in case of Test Mushroom integration and unit tests are missing.

Need for an action

Whether it is a test mushroom or test ice-cream cone, it is not important. The important is both represent a situation where product quality depends on brittle and flaky UI tests. This should be acted upon in order to reduce the release related risk. In the current post, I will suggest some approaches how to act in this situation and put yourself in a better position. Below is a high-level list of what you can do. Each item from the list will be described with greater details later in the current post.

  • Refactor and optimize UI tests
  • Provision dedicated test environment
  • Increase integration testing
  • Unit testing and shift left

Refactor and optimize UI tests

The first thing you can do is act on UI tests because you have full control over them. Go over current tests and do a full review on them. In most of the cases, there are duplicated tests. With time being people add new tests and they keep piling up. This happens because it is easier to add new test rather than inspecting already existing ones and fit your tests scenario inside. You need to optimize your current tests. It might not happen immediately, it can be done a single step at a time, but you should do it. If several test scenarios can be fit into one automated test, then definitely do it. There are theories that say one test should test one thing only. I totally agree with this statement, but it is only relevant for the unit tests. For UI/E2E/functional tests this statement is more likely a good wish. UI tests are expensive, so we should optimize them as much as possible.

Classification tree test method

You should look for more details for classification tree method. I will give a short example. Imagine you have an e-commerce website. In this site, there are 3 main types of products: single product, a product with variations, e.g. several colors, and product set. This site offers 3 different deliveries, one international and two domestic shipping methods. Users can pay with Paypal, Visa, MasterCard, and Amex. With classification tree method you will have 3 different classifications: product type, shipping method, and payment method. The full test cases that can be done is a cartesian product from values of all classifications. In current example this 3 (product types) x 3 (shipping methods) x 4 (payment methods) = 36 full combinations. Minimum test cases that can be done though are 4, or the classification with most values. The 4 test cases we definitely must do are:

  • Single product with international delivery and PayPal
  • Blue variation product with domestic delivery #1 and Visa
  • Red variation product with international delivery and MasterCard
  • A product set with domestic delivery #2 and Amex

Soft assertions

Once you optimize tests number and workflow you would like to make as many assertions as possible while you are on each page. In this regards, so-called soft assertions can be used. Opposing to traditional unit testing asserts, where the test fails immediately when an error is found, a soft assertion is one that does not fail in case of a not critical problem. This provides the ability to execute all the steps in the test and then investigate the issues. Soft assertions that do not fail JUnit test and Soft assertions for C# unit testing frameworks (MSTest, NUnit, xUnit.net) posts can give you more details how to do it in Java and C#.

Rename test methods

It is good to rename your test methods to be as much descriptive as possible. It does not matter that method name will contradict with best practices for method naming because those are test methods. If your tests fail you can identify them very easily just by the name of the failed method.

Use smarter waits

I get really upset when I see in test something like Thread.Sleep(5000). You should never ever use such waits. Not only they are slowing you tests down but they will make the test fail if for some reason website is taking 6 seconds to render. Selenium offers explicit and implicit waits, you should be very familiar with them. Another approach is to use even smarter mechanisms. Like, check for jQuery opened connections or for the existence of some kind of loader on your web application. See Efficient waiting for Ajax call data loading with Selenium WebDriver post for more details. Cypress, on the other hand, eliminates waiting at all, as it knows what happens in the browser and it gives you the element once it is shown.

Retry failed tests

If you do not already have, you definitely need a retry mechanism for your tests. In Retry JUnit failed tests immediately post, I have described how this can be done for JUnit.

Screenshot failed tests

In order to ease yourself debugging a screenshot is a must. Along with the screenshot, it is good to have page source and URL at which screenshot was captured.

Provision dedicated test environment

In case of a shared environment, it is always possible that someone is doing something while tests are running. It is very good if you can provision a dedicated test environment. You should at any point know which version of the software under test are deployed on it and no one should mess with the test environment. If you have an application that consists of a database and API that is consumed by a UI then you can relatively easy use Docker to get a running test environment. If you are testing some application which is part of a big microservice ecosystem, then it might not be that easy, because you have to have dedicated environment for each dependent microservice, and they can be a big number.

Control test data in the database

Ideally, you want to have full control over the data in the database. In this way you can very easily assert and check for data you know is there. Ono option is in case of an application with own database, it is very easy to have a Docker image with already prefilled test data in the database and use it. If not using preloaded data you can still seed the data with API calls prior to the tests.

Dedicated test environment not only can make your tests more stable but can make them faster. Check Emanuil Slavov’s Need for Speed presentation, this talk is also available in GTAC 2016 video.

Increase integration testing

So far you have optimized your UI tests. If you are satisfied with the results, then maybe no further steps are needed. Remember, we do certain things, not because everybody is doing it but because we need it. If you need more improvements then you can look into integration testing. With term integration testing I mean testing of your application or parts of it by stubbing or mocking external dependencies. Below are several suggestions how you can do this.

JavaScript rich web application

In case of a web application built with some JavaScript framework that consumes the data from external APIs and renders the UI based on the data then there are two approaches to do integration testing. One is to use Cypress, which has a very good feature set for decent integration testing in the browser. See Cypress vs. Selenium, is this the end of an era? post for more details. The other approach is to use external stubs and have your application under test configured to work with the stubs. You can even make your testing framework to start and manage the stubs. See WireMock and Own stubbing sections below.

API backend application

In case of backend application that exposes different APIs for external consumption then the approach for integration testing is to stub or mock its dependencies. Dependency can be a database or external API that is being called. Database stubbing depends on the type of application and database used. For .NET application using Entity Framework, it is possible to mock the framework itself. The good thing about .NET is that it provides so-called TestHost, which can run your application in memory and you can also mock some of your dependencies if you have built your application properly to use inversion of control container. See more in .NET Core integration testing and mock dependencies post. When speaking with colleagues they say Spring framework for Java provides similar functionalities, but I do not have experience with it. In terms of database, it depends which database has been used. If it is MS SQL Server, then one option, besides totally mocking the DB calls, is to use SQL Express (localdb). It runs on Windows machine and is extremely fast. It is very easy to create a new database and then run your application with this database. For MySQL I’ve seen in presentations that it is possible to run it in memory but I haven’t tried this. Mocking dependencies to external applications again can be done either with WireMock or with Own Stubbing. You can have an instance of your application installed in separate integration environment and configured to use stubs instead of real dependency APIs.

Server-side HTML rendering

Integration testing of web application which HTML is rendered on the server and just given to the browser is very similar to the previous section API backend application.

WireMock

WireMock is a simulator for HTTP-based APIs. Some might consider it a service virtualization tool or a mock server. It enables you to stay productive when an API you depend on doesn’t exist or isn’t complete. It supports testing of edge cases and failure modes that the real API won’t reliably produce. And because it’s fast it can reduce your build time from hours down to minutes. I have shown how WireMock can be used in unit tests in Mock/Stub REST API with WireMock for better unit testing post. It can be run as standalone Java application with different endpoints and responses configured. So you can make WireMock reply differently based on the request it receives. This can be synchronized with your tests and you can automate whatever scenarios you need. Challenge with this approach is to keep both tests and mocked data in sync.

Own stubbing

It is possible to build own stub and configure it with whatever scenarios you want. I have described how you can do this in Java, .NET and Node.js in following posts: Build a RESTful stub server with Dropwizard, Build a REST API with .NET Core 2 and run it on Docker Linux container, and Build a REST API with Express on Node.js and run it on Docker.

Unit testing and shift left

The base of test pyramid are the unit tests. I have many posts on my blog regarding unit testing so in the current post, I will not go into further details about them because this more in development expertise. From my experience, if you have good UI and integration tests, you can have a very good level of quality without unit testing. I will refer to Emanuil Slavov’s Integration Tests Are Awesome post. He had spent a significant amount of time investigating bugs in their bug tracking system and linking them to a layer of testing. He had discovered that only 13% of the bugs they could have caught with unit testing. Another 57% of the bugs they would have caught with API and UI testing. Latter 30% they have discovered could not have been caught with any kind of testing. I guess integration testing can cover some of the 30% uncatchable bugs because you can stub and mock the dependencies and this gives you better flexibility. So this is a good proof that you can go without unit testing. The real benefit of the unit testing though is the Shift Left paradigm. It involves developers in the process of building quality in the application. If developers should write unit tests to their code they catch and fix bugs almost immediately. With the time being developers learn to imply quality in their code. Your UI and integration tests will also catch most of the bugs. The more important is that the process of reporting and fixing bugs caught during UI and integration testing includes more time and effort. This is why writing unit tests are mandatory for any organization that wants to deliver quality products.

Conclusion

This post had started with the funny term of the test mushroom. It later continues with important guidelines how you can improve your testing first by optimizing your UI tests, then develop integration tests, and finally describing why unit testing is important to an organization, because of the Shift Left paradigm which involves developers into building quality to the application.

Related Posts

Read more...

Cypress vs. Selenium, is this the end of an era?

Last Updated on by

Post summary: Blog post about a Cypress talk I did recently on a local conference. The presentation compares Cypress with well knows Selenium.

Background

This weekend I did a small talk about Cypress, named “Cypress vs. Selenium, the end of an era?” on QA Challenge Accepted, a local testing conference. This is my second talk on this conference. In 2016 I spoke about Gatling. I haven’t blogged about my Galing talks because my blog covers the tool very extensively. In Performance testing with Gatling post, there is complete Gatling tutorial. In the current post, I will show most of the slides of my presentation and will describe what I have spoken about. The full Cypress presentation can be found on SlideShare: QA Challenge Accepted 4.0 – Cypress vs. Selenium.

Presentation

Selenium Overview

Selenium is a very well known tool, so I will not get into details about it. I will emphasize on its architecture which will be important for the rest of the presentation. Selenium consists of two components. One is so-called bindings, libraries for different programming languages that we use to write our tests with. The other component is the WebDriver. WebDriver is a program that can manage and fully control a specific browser, for which it is designated. The important bit here is that those two components communicate over HTTP by exchanging JSON payload. This is well defined by WebDriver Protocol, which is W3C Candidate Recommendation. Every command used in tests results to a JSON sent through the network. This network communication happens even if tests are run locally. In this case, requests are sent to localhost behind which there is loopback network interface. Even on localhost request travels to Layer 3 of the OSI Model. Request travels through 5 layers, only layers 1 (physical) and 2 (data link) are skipped.

After the conference, I spoke to Anton Angelov, the founder of Automate The Planet and he pointed out that on some WebDrivers on .NET Core, the resolution from localhost in the request to 127.0.0.1, which is the IPv4 address of loopback interface, can take up to a second. This is, of course, .NET Core specific thing. Anyway, in general, resolving localhost to 127.0.0.1 also needs some time which is added to total execution time.

The bottom line of the slide: By architecture Selenium works through the network and this brings delay, which can sometimes be significant.

Cypress Overview

Cypress is used for UI testing but is not based on Selenium. There are many tools out there which bring a lot of abstractions over the WebDriver, but they are limited to the WebDriver as a technology for browser manipulation. All those tools inherit WebDriver limitations. Cypress has its own mechanism for manipulation DOM in the browser. Cypress runs directly in the browser, no network communication involved. By running directly in the browser Cypress has access to everything in the browser, including your application under test. I do not know a valid reason for this, but in my observations, developers strongly do not like Selenium. Cypress is designed with developers in mind, so it is very developer-friendly. Debugging tests with Cypress is easy, there is so-called travel back in time. I will speak for it later.

The bottom line of the slide: Cypress is made from scratch with its own unique DOM manipulation technology and is made with developers in mind.

How to do it

The bottom line of the slide: It is easy to install Cypress. It is easy to write tests with it. It is very easy to debug tests. It is easy to include it in continuous integration or continuous delivery pipelines.

Debug tests in Cypress Test Runner

Cypress Test Runner is a browser instance in which you see all your tests’ steps on the left-hand side. You can click on any step and in the right-hand side window, the application under test is visualized. Cypress makes DOM snapshot before each test steps, so you can easily inspect them.

The bottom line of the slide: Cypress provides DOM snapshots at each test step for easy test debugging.

Library or Framework

Comparison between both tools now begins. Selenium is a library. If you want to make real UI automation you have to combine it with a unit testing framework or make your own runner; you may want to add assertions library or reporting one. This is handy and gives you great flexibility because if you know what you do you can make miracles. You become a creator! If you don’t know what you do you can very easily shoot yourself in the foot. I think this is mostly because developers hate Selenium. Its usage is not straightforward. In order to start writing tests, you have to do a lot of preparational work. This is something developers do not want to invest in, they invest enough in learning all the frameworks related to their work. They do not want to spend time on several more. Cypress, on the other hand, is a complete framework. You install it and start writing tests. It includes Mocha, very famous JavaScript unit testing framework; Chai is assertions library; Chai-jQuery adds jQuery chainer methods to Chai; Sinon is famous JavaScript mocking library that provides mocks, stubs, and spies; Sinon-Chai brings Chai assertions on stubs and spies.

The bottom line of the slide: Selenium is a library allowing you great flexibility. It requires a lot of preparational work before you can start writing tests. Cypress is a complete framework. You install it and start writing tests.

Test Pyramid

Test pyramid illustrates how your test portfolio should look like. Currently is show a test pyramid for a modern web application. In the bottom are the unit tests. They are fast and test most of the functions in your code. By integration tests is meant the following thing. Defacto the standard now for web applications is to use some JavaScript framework and work with data from APIs. Modern web applications process data from APIs. With integration testing, we want to be in control of this data. We want for e.g. to test how web application behaves when API returns an error. If we have stable and well-tested API this scenario we won’t be able to test in reality. By stubbing the data web application works with it is possible to fully test the web application. UI tests are the one executed against deployed, configured and working web application. They are slow and flaky, thus they are limited in number. Selenium works in the UI part of the pyramid. Cypress is there as well, but Cypress is also very good in Integration tests. The dotted line is mostly a wishful thinking – we have unit testing framework included, why not create some unit tests.

The bottom line of the slide: Selenium works only in the UI part of the test pyramid, while Cypress is involved in UI tests and most important in the integration tests.

Programming languages

There are bindings in almost any programming language existing nowadays. If there is not such, by following WebDriver protocol you can create your own binding. Cypress on the other hand only uses JavaScript and will continue to only use JavaScript. There are two reasons for this. FIrst one is not significant, but it is good your tests code to be as you application under test’s code. The most important reason though is that as I said modern web applications are written in JavaScript frameworks. Developers do know JavaScript, so they can very easily write their own Cypress tests.

The bottom line of the slide: Selenium is available in all programming languages. Developers of modern web application know JavaScript. With Cypress, they can write their own tests.

Selectors

Selenium supports 8 different locators. CSS and XPath are the most powerful ones. Cypress supports jQuery selectors. What you can use as a CSS selector in Selenium, you can directly use as a selector in Cypress. The benefit is that jQuery provides more selectors on hand.

The bottom line of the slide: jQuery selectors give more capabilities than CSS selectors.

Supported Browsers

Selenium supports all significant browsers. You can even create your own browser, make WebDriver for it following WebDriver protocol and your current tests will work exactly the same on this new browser. Cypress at this point supports only Chrome. This is maybe the biggest weakness of the tool. Good thing though is that more than 60% of the web uses Chome. Another good thing is that Firefox support is on its way. IE 11 and Edge support is also on the roadmap but with no clear dates.

The bottom line of the slide: Cypress is weak at cross-browser testing. Cypress team is working through to get better in this area.

Cypress vs. Selenium (1)

Comparison of different characteristics:

  • Speed – Selenium tests are generally slow. WebDriver starting is slow, WebDriver is working slowly. Network operating nature of WebDriver also brings some delay. Cypress is super fast. There is no noticeable delay because of the tool itself.
  • Wait for element – in order to do some effective automation with Selenium, waiting for an element is an important part of your framework. You have to have good error catching mechanism as well as retry logic. It takes significant effort to make tests non-flaky. With Cypress, you do not wait. Cypress runs in the browser and knows what is happening behind the scenes, whether the application under test is still busy. In Cypress, you request the element and you get it when the element is ready, no extra code or logic needed.
  • Remote execution – this is what Selenium is made for. You can use Selenium Grid with different browsers and different browser versions. Cypress does not support remote execution.
  • Parallel execution – Selenium is a library that can manipulate the browser. If you make your code thread safe and use a unit testing framework that supports parallel runs then you will have parallel execution. Selenium does not really care about this. Cypress currently does not support parallel execution. It is possible to do it on your own with Docker images, but this involves additional effort. Currently, Cypress team is working on developing parallel execution, so this will happen soon.
  • Headless – both tools support headless Chrome.

Cypress vs. Selenium (2)

Comparison of different characteristics:

  • Screenshot – both perform equally bad because both make screenshot only of the visible part of the page. In order to get the full page, you need to use external JavaScript libraries to capture page and save it as a screenshot. Selenium is a little bit better on screenshot though, because it gives you screenshot object in your tests and you can save it wherever you want. Cypress makes an automatic screenshot with a fixed name. In order to make second screenshot for one test, you need to do some file manipulations for renaming the previous file.
  • Video – Selenium does not record video. Cypress records video by default when tests are run from command line.
  • Documentation – Selenium documentation for me is ugly and not complete. If one has to get acknowledged with Selenium by reading its documentation only that would be very difficult. Cypress team had invested a lot in the documentation. They have their API well described, they have examples and FAQ page.
  • Community – Selenium is an institution. Everybody is using Selenium. For every problem, you may encounter there are already tens of solutions. Cypress does not have such a community yet, not many people are using it. They have chat though in which Cypress developers answer your queries. This chat gets flooded with information, so you can easily get lost.

Cypress vs. Selenium (3)

Comparison of different characteristics:

  • Execute JS – Selenium allows JavaScript execution and this is fast. I’ve seen frameworks where Selenium is used only for JavaScript execution. Cypress is designed to work with JavaScript. It has full access to everything in the browser, including application under test.
  • Switch tabs – Selenium can switch between two tabs of the same browser, Cypress cannot.
  • Several browsers – Selenium can work with several browser windows, even from different browsers. Cypress can work with only one browser instance.
  • Load extensions – both tools allow you to run your tests with some Chrome extension.
  • Manage cookies – both tools manage cookies equally well.

Test Mushroom

This is some funny, ironic but mostly tragic term I’ve made up. It is made up with an analogy to the test pyramid. This mushroom represents a very common test portfolio where the quality of releases totally depends on UI tests. The mushroom leg represents Unit and Integration tests. It is shown with a dotted line as such test are missing. UI tests are slow and flaky, every release sign off takes a lot of time for debugging failed tests. Every release has a risk of failure.

The bottom line of the slide: There are many real-life scenarios where web application quality depends only on UI tests, which are brittle.Cypress gives you instrumentation to act on integrations test, thus reducing the number of UI ones.

The end of an era?

Now is the time to give an answer to the most interesting part of presentation name. Is this an end of an era? My presentation is not really about the competition between those two tools. Everyone has its strengths and weaknesses. They work perfectly combined together. My main point is and I truly believe it is the end of Developers don’t test era. Selenium may not be their favorite tool, but Cypress is made from developers for developers. It is easy to work with and provides features to speed up test writing. If you try the tool and do not like it, at least try to introduce it to developers in your company.

The bottom line of the slide: Cypress is a tool created by developers for developers. Try to introduce it to developers in your organization.

Cypress Sugar (1)

Those are features that make Cypress interesting tool and that make integration testing easy. Cypress runs directly in the browser and has access to everything in the browser, including application under test. Sometimes Selenium tests go through several pages just to bring the application in some desired state. With Cypress, you can programmatically bring the application to this desired state. Cypress provides spies, stubs, and clocks. With spies, you can verify if given JavaScript function has been called, with what arguments or how many times. Stubs allow you to change the default behavior of JavaScript functions and feed to the application under test the data that you need. For e.g. window.fetch is the new way of getting data from API. This can be very easily stubbed. Cypress provides control over the clock in the browser. If you have some animation, instead of waiting for it, you can move the clock forcing animation to show. Cypress allows you to have full control over the network traffic within the browser. You can assert on XMLHttpRequest to an API, verifying that API is called with proper arguments. You can intercept, change, delay, or block response from the API. This allows you to cover various integration scenarios. With Cypress, you can develop even though there is no backend ready yet. You can do TDD (test driven development), create tests and stub the data from missing API in them, develop the UI and then run the tests until they get green.

The bottom line of the slide: Cypress provides great functionalities for stubbing JavaScript functions and control on network traffic within the browser. Those can be used for creating various integration tests.

Cypress Sugar (2)

The essential functionality of most applications is hidden behind a login. Login through the UI slows up the tests. Cypress allows you to send a login request to the backend and it extracts cookies from the response, injects them in the browser and from now on the user is logged in tests. This request takes browser user agent and existing cookies but skips some security limitations, such as CORS (cross-origin resource sharing). Same can be done with Selenium. You can use some HTTP client, send login request, get the response and inject the cookies in the browser. With Selenium, this requires additional effort though, with Cypress it comes out of the box. Last but not least, with Cypress you can test Electron applications. Electron is a framework which enables you to write desktop applications in HTML, CSS, and JavaScript. Those applications are run within Electron browser, which is based on Chromium and Node.js. A good example of an Electron application is Postman (check Introduction to Postman with examples post). Cypress Test Runner is also an Electron application. And Cypress uses Cypress to test Cypress (Test Runner). Testing of Electron applications is not really a straight-forward task. It requires some amount of functions stubbing.

Conclusion

Cypress is a really great tool. It provides very good features to enable you to create integration tests. I have used Selenium way too much in order to dislike it. Tests with it are slow and flaky. I really hope there is something better out there. On the other hand, I have used Cypress way too little to like it very much and think this is the tool. In any way do try Cypress. If you do not like it, then definitely introduce it to your developers.

Related Posts

Read more...

.NET Core code coverage on Linux with MiniCover

Last Updated on by

Post summary: How to run code coverage of your unit tests as part of your build on Linux build agents.

Code below can be found in GitHub SampleDotNetCore2RestStub repository. In Code coverage of .NET Core unit tests with OpenCover post, I have shown how to do code coverage with OpenCover. Commands shown in that post can be made part of your CI or CD build. There is a but though, this works only for windows. If you are having build machines on Linux you need another alternative. In this post, I’m going to show this alternative.

MiniCover

MiniCover is a lightweight code coverage tool for .NET Core on Linux. It is in an early stage yet and there is no big community, but I really hope this is going to change soon as it looks a very promising tool.

Include in project

In order to use MiniCover it has to be installed as .NET CLI Tool. This is done with following code:

<ItemGroup>
	<DotNetCliToolReference Include="MiniCover" Version="2.0.0-ci-*" />
</ItemGroup>

In order to keep your original projects intact, the best approach is to create tools project and add it to its tools.csproj file, which will look:

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <TargetFramework>netcoreapp2.0</TargetFramework>
  </PropertyGroup>

  <ItemGroup>
    <DotNetCliToolReference Include="MiniCover" Version="2.0.0-ci-*" />
  </ItemGroup>

</Project>

Commands

At this stage following command line options are available:

  • instrument – Instrument assemblies
  • uninstrument – Uninstrument assemblies
  • reset – Reset hits count
  • report – Outputs coverage report
  • htmlreport -Write HTML report to a folder
  • xmlreport – Write an NCover-formatted XML report to folder

Run coverage

In case of a project structure where you have your code in src folder and your tests in test folder following bash script can be used directly. It accepts as parameter threshold coverage percentage, if not provided it uses 80% by default. Script restores NuGet packages and builds the projects. It navigates to tools folder and restores NuGet packages again. This is very important as it is the only way to get MiniCover NuGet package. Inside tools folder, it instruments assemblies and resets previous statistics. Script navigates to root folder and runs all tests inside every project in test folder. Afterward, script navigates again to tools folder and uninstruments all assemblies so far. No matter this operation is safe I would recommend to run one more build or publish before assemblies go into production. In the end, the script generates reports.

if [ ! -z $1 ]; then
  if [ $1 -lt 0 ] || [ $1 -gt 100 ]; then
    echo "Threshold should be between 0 and 100"
    threshold=80
  fi
  threshold=$1
else
  threshold=80
fi

dotnet restore
dotnet build

cd tools
dotnet restore

# Instrument assemblies inside 'test' folder to detect hits for source files inside 'src' folder
dotnet minicover instrument --workdir ../ --assemblies test/**/bin/**/*.dll --sources src/**/*.cs 

# Reset hits count in case minicover was run for this project
dotnet minicover reset

cd ..

for project in test/**/*.csproj; do dotnet test --no-build $project; done

cd tools

# Uninstrument assemblies, it's important if you're going to publish or deploy build outputs
dotnet minicover uninstrument --workdir ../

# Create HTML reports inside folder coverage-html
# This command returns failure if the coverage is lower than the threshold
dotnet minicover htmlreport --workdir ../ --threshold $threshold

# Print console report
# This command returns failure if the coverage is lower than the threshold
dotnet minicover report --workdir ../ --threshold $threshold

# Create NCover report
dotnet minicover xmlreport --workdir ../ --threshold $threshold

cd ..

Reports

There are 3 types of reports: Console, HTML and NCover XML.

Console report

Console report is dumping results to the console and returns 1 if given threshold is not met, which basically fails the CI/CD build. In the example below, codeCoverage.sh was called with argument 40, which means threshold is 40%.

HTML report

HTML report is also failing the build and gives similar summary information as console report, but also give detailed information for each class coverage. An example report can be found in MiniCover HTML report. I have to praise myself, as the summary file that is shown below was something I contributed, because I like the project very much.

NCover report

NCover report creates XML file in its format. The beauty of it is that you can additionally use ReportGenerator on Windows machine and convert XML to nice HTML report. Assuming ReportGenerator is extracted on your C:\ then the command is shown below. The report can be found in MiniCover ReportGenerator report.

C:\ReportGenerator\ReportGenerator.exe
	-reports:coverage.xml
	-targetdir:coverage

Compare with OpenCover

If you check both OpenCover .Net Core report and MiniCover ReportGenerator report you can notice some difference in metrics. First is that MiniCover does not support branch coverage. This is not that bad after all if you have your code nicely indented, line coverage is sufficient. For e.g., if your ternary operator is not on one line, but on three and you have missed testing one of the conditions, then line coverage will state that there is a not tested line. If the ternary operator is on one line though then line coverage will miss this test problem. Another difference is Coverable lines and Covered lines. OpenCover counts opening and closing brackets as such, so its numbers are bigger. Because of this conceptual difference Line coverage percentage has a small difference. MiniCover (35%) is more generous and give more percentage than OpenCover (33.6%).

Conclusion

MiniCover is very nice and compact tool that can be put in place during your continuous integration or continuous delivery to measure code coverage on each build. The most important advantage is that it is designed and works on Linux.

Related Posts

Read more...

Build a REST API with Express on Node.js and run it on Docker

Last Updated on by

Post summary: Code examples how to create RESTful API with Node.js using Express web framework and then run it on Docker.

Code below can be found in GitHub sample-nodejs-rest-stub repository. This is my first JavaScript post, so bare with me if something is not as perfect as it should be.

Node.js

Node.js is a JavaScript runtime built on Chrome’s V8 JavaScript engine. Node.js uses an event-driven, non-blocking I/O model that makes it lightweight and efficient. Node.js’ package ecosystem, npm, is the largest ecosystem of open source libraries in the world.

Create Node.js project

Node.js project is created with npm init, which guides you through a wizard with several questions.

In the end package.json file is created. This is the file with all your project’s configuration.

{
  "name": "sample-nodejs-rest-stub",
  "version": "1.0.0",
  "description": "Sample Node.js REST API",
  "main": "app.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "Lyudmil Latinov",
  "license": "ISC"
}

Once created it is good to add some JavaScript code to test that project is working. I called the file app.js and it will be later extended. It does nothing, but writing Hello world! to the console.

'use strict';

console.log('Hello world!');

File with JavaScript code can be run with node app.js command. You can change package.json file by adding start script and then run the application with npm start:

"scripts": {
  "start": "node app.js",
  "test": "echo \"Error: no test specified\" && exit 1"
}

Express

Express is a web framework for Node.js. It is the most used one. In order to use express it has to be added as a dependency and saved to package.json file.

npm install express --save

Change app.js in order to verify Express is working correctly. The ‘use strict’ literal is used for enabling ECMAScript 5 strict mode, which has several restrictions, like warnings are thrown as errors, usage of undeclared variables is prohibited, etc. I would prefer using constants declared with const whenever possible. Express module is assigned to express variable with require(‘express’) directive. Then new express object is created and assigned to app variable. HTTP GET endpoint that listens to ‘/’ is configured with app.get(path, callback) function. Callback is a function that is called inside another function, in our case inside get() function. In the current example, callback has arguments req and res which gives you access to Express’ Request and Response objects. What is done below is that send([body]) function on the response is called, which returns the result. Socket that listens for incoming connections is started with app.listen(path, [callback]) function. More details can be found in Express API reference documentation.

'use strict';

const express = require('express');
const app = new express();

app.get('/', (req, res) => {
    res.send('Hello World!');
});

app.listen(3000, () => {
    console.log('Server up!');
});

If you run with npm start you should see Server is up! text. Firing GET request to http://localhost:3000 should return Hello World! response.

Add REST API

Functionality is a sample Person service that is used also in Build a RESTful stub server with Dropwizard and Build a REST API with .NET Core 2 and run it on Docker Linux container posts.

The first step is to include body-parser, an Express middleware which parses request body and makes it available as an object in req.body property.

npm install body-parser --save

Express middleware is series of function calls that have access to req and res objects. Middleware is used in our application. I will explain as much as possible, if you are interested in more details you can read in Express using middleware documentation.

Person class

A standard model class or POJO is needed in order to transfer and process JSON data. It is standard ECMAScript 6 Person class with constructor which is then exported as a module with module.exports = Person.

Person repository

Again there will be no real database layer, but a functionality that acts as such. In the constructor, a Map with several Person objects is created. There are getByIdgetAllremove and save functions which simulate different CRUD operations on data. Inside them, various Map functions are used. I’m not going to explain those in details, you can read more about maps in JavaScript Map object documentation. In the end, PersonRepository is instantiated to a personRepository variable which is exported as a module. Later, when require is used, this instance will be accessible only, not the PersonRepository class itself.

Person routes

In initial example routes and their handling was done with app.get(), here express.Router is used. It is complete middleware routing system. See more in Express routing documentation. Router class is imported const Router = require(‘express’) and new instance is created const router = new Router(). Registering path handlers is same as in application. There are get(), post(), etc., functions resp. for GET and POST requests. Specific when using the router is that it should be registered as application middleware with app.use(‘/person’, router). This makes router handle all defined in it paths which are now under /person base path. Current route configuration is defined as a function with name getPersonRoutes which takes app as an argument. This function is exported as a module.

Application

Important bit here is require(‘./routes/personRoutes’)(app) which uses getPersonRoutes function and registers person routes.

person.js

'use strict';

class Person {
    constructor(id, firstName, lastName, email) {
        this.id = id;
        this.firstName = firstName;
        this.lastName = lastName;
        this.email = email;
    }
}

module.exports = Person;

personRepository.js

'use strict';

const Person = require('../json/person');

class PersonRepository {
    constructor() {
        this.persons = new Map([
            [1, new Person(1, 'FN1', 'LN1', 'email1@email.na')],
            [2, new Person(2, 'FN2', 'LN2', 'email2@email.na')],
            [3, new Person(3, 'FN3', 'LN3', 'email3@email.na')],
            [4, new Person(4, 'FN4', 'LN4', 'email4@email.na')]
        ]);
    }

    getById(id) {
        return this.persons.get(id);
    }

    getAll() {
        return Array.from(this.persons.values());
    }

    remove() {
        const keys = Array.from(this.persons.keys());
        this.persons.delete(keys[keys.length - 1]);
    }

    save(person) {
        if (this.getById(person.id) !== undefined) {
            this.persons[person.id] = person;
            return "Updated Person with id=" + person.id;
        }
        else {
            this.persons.set(person.id, person);
            return "Added Person with id=" + person.id;
        }
    }
}

const personRepository = new PersonRepository();

module.exports = personRepository;

personRoutes.js

'use strict';

const Router = require('express');
const personRepo = require('../repo/personRepository');

const getPersonRoutes = (app) => {
    const router = new Router();

    router
        .get('/get/:id', (req, res) => {
            const id = parseInt(req.params.id);
            const result = personRepo.getById(id);
            res.send(result);
        })
        .get('/all', (req, res) => {
            const result = personRepo.getAll();
            res.send(result);
        })
        .get('/remove', (req, res) => {
            personRepo.remove();
            const result = 'Last person remove. Total count: '
                + personRepo.persons.size;
            res.send(result);
        })
        .post('/save', (req, res) => {
            const person = req.body;
            const result = personRepo.save(person);
            res.send(result);
        });

    app.use('/person', router);
};

module.exports = getPersonRoutes;

app.js

'use strict';

const express = require('express');
const app = new express();
const bodyParser = require('body-parser');

// register JSON parser middlewear
app.use(bodyParser.json());

require('./routes/personRoutes')(app);

app.listen(3000, () => {
    console.log("Server is up!");
});

Debug with Visual Studio Code

I started to like Visual Studio Code – an open source multi-platform editor maintained by Microsoft. Once project folder is imported, hitting F5 starts to debug on the project.

External configuration

External configuration from a file is a must for every serious application, so this also has to be handled. A separate config.js file is keeping the configuration and exposing it as a module. There is versionRoutes.js file added which is reading configuration value and exposing it to the API. It follows the same pattern as personRoutes.js, but it has config as function argument as well. Also, app.js has to be changed, import config and pass it to getVersionRoutes function.

config.js

'use strict';

const config = {
    version: '1.0'
};

module.exports = config;

versionRoutes.js

'use strict';

const getVersionRoutes = (app, config) => {
    app.get('/api/version', (req, res) => {
        res.send(config.version);
    });
};

module.exports = getVersionRoutes;

app.js

'use strict';

const express = require('express');
const bodyParser = require('body-parser');
const config = require('./config/config');
const app = new express();

// register JSON parser middlewear
app.use(bodyParser.json());

require('./routes/personRoutes')(app);
require('./routes/versionRoutes')(app, config);

app.listen(3000, () => {
    console.log("Server is up!");
});

Code style checker

It is very good practice to have consistency over projects code. The more important benefit of using it is that it can catch bugs that otherwise will be caught later in when the application is run. This is why using a code style checker is recommended. Most popular for JavaScript is ESLint. In order to have it has to be added as a dependency to the project:

npm install eslint --save-dev

Notice the –save-dev option, this creates a new devDependencies node in package.json. This means that project needs this packages, but just for development. Those dependencies will not be available if someone is importing your project. Entry in scripts node in package.json file can be added: “lint”: “eslint .”. This will allow you to run ESLint with npm run lint. ESLint configuration is present in .eslintrc file. In .eslintignore are listed folders to be skipped during the check.

.eslintrc

{
  "extends": "eslint:recommended",
  "parserOptions": {
    "ecmaVersion": 6
  },
  "env": {
    "es6": true,
    "node": true
  },
  "globals": {
  },
  "rules": {
    "quotes": [2, "single"]
  }
}

.eslintignore

node_modules

package.json

"scripts": {
  "start": "node app.js",
  "test": "echo \"Error: no test specified\" && exit 1",
  "lint": "eslint ."
},
...
"devDependencies": {
  "eslint": "^4.15.0"
}

app.js

'use strict';

const express = require('express');
const bodyParser = require('body-parser');
const config = require('./config/config');
const app = new express();

// register JSON parser middlewear
app.use(bodyParser.json());

require('./routes/personRoutes')(app);
require('./routes/versionRoutes')(app, config);

app.listen(3000, () => {
    /* eslint-disable */
    console.log('Server is up!');
});

During the check, some issues were found. One of the issues is that console.log() is not allowed. This is a pretty good rule as all logging should be done to some specific logger, but in our case, we need app.js to have text showing that server is up. In order to ignore this error /* eslint-disable */ comment can be used, see app.js above.

Docker file

Docker file that packs application is shown below:

FROM node:8.6-alpine

RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app

COPY package.json ./
RUN npm install

COPY . .

EXPOSE 3000
CMD ["npm", "start"]

Docker container that is used is node:8.6-alpine. Folder /usr/src/app and is made current working directory. Then package.json file is copied into container and npm install is run, this will download all dependencies. All files from the current folder are copied on docker image with: COPY . .. Port 3000 is exposed so it is, later on, available from the container. With CMD is configured the command that is run when the container is started.

Build and run Docker container

Docker container is packaged with tag nodejs-rest with the following command:

docker build . -t nodejs-rest

Docker container is run with exposing port 3000 from the container to port 9000 on the host with the following command:

docker run -e VERSION=1.1 -p 9000:3000 nodejs-rest

Notice the -e VERSION=1.1 which sets an environment variable to be used inside the container. The intention is to use this variable in the application. This can be enabled with modifying config.js file by changing to: version: process.env.VERSION || ‘1.0’. If environment variable VERSION is available then save it in version, if not use 1.0.

'use strict';

const config = {
    version: process.env.VERSION || '1.0'
};

module.exports = config;

If invoked now /api/version returns 1.1.

Conclusion

In the current post, I have shown how to make very basic REST API with Node.js and Express. It can be very easy run on Docker container.

Related Posts

Read more...

Code coverage of .NET Core unit tests with OpenCover

Last Updated on by

Post summary: Examples how to measure code coverage of .NET Core unit tests with OpenCover.

Examples below are based on GitHub SampleDotNetCore2RestStub repository. Examples use code from .NET Core integration testing and mock dependencies post. Those are integration tests because they test more than one application module at a time, but they are run with a unit testing framework, this is why current post title is such.

Code coverage

This topic is how to do the code coverage on .NET Core unit tests with OpenCover. Theory on what is code coverage, why it is needed can be found in What about code coverage post.

OpenCover

OpenCover is open source tool for code coverage for .NET 2.0 and above applications for Windows only. You can read more details about OpenCover in Code coverage of manual or automated tests with OpenCover for .NET applications post or you can visit their OpenCover Wiki page.

Run OpenCover

In order to make this examples work you need to check out SampleDotNetCore2RestStub repository to C:\ and run all commands from project root folder C:\SampleDotNetCore2RestStub. OpenCover and ReportGenerator should be installed in C:\ as well. If you have different paths, just adjust them in commands shown below.

C:\OpenCover\OpenCover.Console.exe
	-target:"c:\Program Files\dotnet\dotnet.exe"
	-targetargs:"test"
	-output:coverage.xml
	-oldStyle
	-filter:"+[SampleDotNetCore2RestStub*]* -[SampleDotNetCore2RestStub*Test*]*"
	-register:user

Enable .NET Core to debug output

If you run the command above you will get the following message:

Committing…
No results, this could be for a number of reasons. The most common reasons are:
1) missing PDBs for the assemblies that match the filter please review the
output file and refer to the Usage guide (Usage.rtf) about filters.
2) the profiler may not be registered correctly, please refer to the Usage
guide and the -register switch.

Note: error with red text shown on image above is because with -targetargs:”test” dotnet.exe tries to run tests inside all projects, but src\SampleDotNetCore2RestStub simply does not have tests. You can refine which test project to get run by changing to: -targetargs:”test test\SampleDotNetCore2RestStub.Integration.Test\SampleDotNetCore2RestStub.Integration.Test.csproj”.

Message for no results is because debug output is not enabled on .NET Core project and OpenCover does not have needed data to work on. Change src\SampleDotNetCore2RestStub\SampleDotNetCore2RestStub.csproj file by adding <DebugType>full</DebugType>:

<PropertyGroup>
	<OutputType>Exe</OutputType>
	<TargetFramework>netcoreapp2.0</TargetFramework>
	<DebugType>full</DebugType>
</PropertyGroup>

Now running the command gives proper output:

Committing…
Visited Classes 5 of 12 (41.67)
Visited Methods 17 of 36 (47.22)
Visited Points 43 of 123 (34.96)
Visited Branches 18 of 44 (40.91)

==== Alternative Results (includes all methods including those without corresponding source) ====
Alternative Visited Classes 5 of 12 (41.67)
Alternative Visited Methods 20 of 43 (46.51)

Generate report

ReportGenerator is used to convert XML reports generated by OpenCoverPartCoverVisual Studio or NCover into human-readable reports in various formats. To generate report use following command:

C:\ReportGenerator\ReportGenerator.exe
	-reports:coverage.xml
	-targetdir:coverage

Inspect report

The report can be found in my examples: OpenCover .Net Core report. You can see what code is being covered during testing and what not.

Conclusion

In this post, I have shown how to run code coverage with OpenCover on .NET Core unit tests.

Related Posts

Read more...

Useful .NET Core SDK CLI commands

Last Updated on by

Post summary: Some useful .NET Core SDK CLI commands.

Commands in the current post are extraction from Build a REST API with .NET Core 2 and run it on Docker Linux container and .NET Core integration testing and mock dependencies posts where I have used them in a real project.

All commands

All available commands are accessed with: dotnet –help.

Initialise .NET projects

Creating new projects is done with: dotnet new [options]. I have used following commands:

  • Create new console application project: dotnet new console -o <ProjectName>
  • Create new MS Test project: dotnet new mstest -o <ProjectName>
  • Create new solution file: dotnet new sln –name <SolutionName>

To list all available project types use: dotnet new –help.

Custom templates

.NET SDK allows you to create custom project template and then install it to SDK with the command: dotnet new -i <TEMPLATE_FOLDER>. Once installed can create a new project out of your custom template. This is valuable in big organizations where cohesion is needed between similar project types. See more for templates in Custom templates for dotnet new article.

Manage dependencies

Two types of dependencies are available, in a NuGet package or to another .NET project.

  • Add reference to a NuGet package: dotnet add package <NuGetPackageName>
  • Add reference to another .NET project: dotnet add reference <PathToProjectFile>

Similar commands are available for removing dependencies with: dotnet remove.

Project dependencies can be shown with: dotnet list reference <PathToProjectFile>.

Actions on a project

  • Build a project: dotnet build
  • Run a project: dotnet run
  • Run tests of a project: dotnet test
  • Publish project artefacts: dotnet publish

All commands have a bunch of configuration options to be provided. More details on each command can be obtained by adding –help at the end.

Manage solution file

  • Create new solution file: dotnet new sln –name <SolutionName>
  • Add project to solution: dotnet sln <SolutionFileName> add <PathToProjectFile>
  • Remove project from solution: dotnet sln <SolutionFileName> remove <PathToProjectFile>
  • List projects in a solution: dotnet sln <SolutionFileName> list

Conclusion

This post is showing some useful .NET SDK CLI commands that make management of .NET project easy without Visual Studio 2017. More practical examples can be found in post where those commands are actually used: Build a REST API with .NET Core 2 and run it on Docker Linux container and .NET Core integration testing and mock.

Related Posts

Read more...

.NET Core integration testing and mock dependencies

Last Updated on by

Post summary: How to do integration testing on .NET Core application and stub or mock some inconvenient dependencies.

Code below can be found in GitHub SampleDotNetCore2RestStub repository. In Build a REST API with .NET Core 2 and run it on Docker Linux container post I have shown how to create .NET Core application. In the current post, I will show how to do integration testing on the same application. The post is for REST API, but principles here apply for web UI as well, the difference is that the response will be HTML, which is slightly harder to process compared to JSON.

Refactor project structure

Currently, there is only one project created which contains .NET Core application. Since this is going to grow it has to be refactored and structured properly.

  • SampleDotNetCore2RestStub folder which contains the API is moved to src folder.
  • Solution file is created with dotnet new sln –name SampleDotNetCore2RestStub. Note that .sln extension is omitted as it is added automatically. Although everything in the example is done with open source tools, it is good to have solution file to keep compatibility with Visual Studio 2017.
  • API project file is added to solution file with:
    dotnet sln SampleDotNetCore2RestStub.sln add src/SampleDotNetCore2RestStub/SampleDotNetCore2RestStub.csproj.
  • In order to test that moving of files did not affected the functionality, API can be run with: dotnet run –project src/SampleDotNetCore2RestStub/SampleDotNetCore2RestStub.csproj.

Add test project

It is time to create integration tests project. We speak for integration tests, but they will be run with unit testing framework MSTest. I do not have some particular favor of it, it comes by default with .NET Core, along with xUnit, and I do not want to change it.

  • Create test folder: mkdir test.
  • Navigate to it: cd test.
  • MSTest project is created with: dotnet new mstest -o SampleDotNetCore2RestStub.Integration.Test.
  • Navigate to test project: cd SampleDotNetCore2RestStub.Integration.Test.
  • Run the unit tests: dotnet test. By default, there is one dummy test that passes.
  • Go to root folder: cd .. and cd ..
  • Add test project to solution file: dotnet sln SampleDotNetCore2RestStub.sln add test/SampleDotNetCore2RestStub.Integration.Test/SampleDotNetCore2RestStub.Integration.Test.csproj.

Open with Visual Studio Code

Once refactored and opened in Visual Studio Code project has following structure:

Unit vs Integration testing

I would not like to focus on theory and terminology as this post is not intended to, but I have to do some theoretical setup before proceeding with the code. Generally speaking, term integration testing is used in two cases. One is when different systems are interconnected together and tested, other is when different components of one system are grouped together and tested. In the current post with term integration testing, I will refer the latter. In unit testing, each separate class is tested in isolation. In order to do so all external dependencies, like a database, file system, web requests, and response, etc., are mocked. This makes tests run very fast but has a very high risk of false positives because of mocking. When mocking a dependency there is always an assumption how it works and is being used. The mocked behavior might be significantly different than actual one, then the unit test is compromised. On the other hand, integration testing verifies that different parts of the application work correctly when grouped together. It is much slower than unit testing because more and real resources are being used. Some parts of the application still can be mocked which can increase execution time. In the current post, I will show how to run a full application with only the database being mocked.

The Test Host

One way to run the fully assembled application is by building and deploying it. Then, the application will use real resources to work. Functional testing should also be done during testing but is not part of the current post. A more interesting scenario is to run fully assembled or partially mocked application in memory, without deployment and run tests against it. This approach has benefits, e.g. since the application is run locally its response time is very low, which speeds up tests; some parts, like database connection, can be mocked and thus speed up tests. .NET Core Test Host is a tool that can host web or API .NET Core applications serving requests and responses. It eliminates the need for having a testing environment.

Add dependencies

In order to use test host dependency to its NuGet package should be added. Navigate to test/SampleDotNetCore2RestStub.Integration.Test and add a dependency:

dotnet add package Microsoft.AspNetCore.TestHost

SampleDotNetCore2RestStub.Integration.Test project should depend on SampleDotNetCore2RestStub in order to use its code. This is done with:

dotnet add reference ../../src/SampleDotNetCore2RestStub/SampleDotNetCore2RestStub.csproj

Create the first test

Existing UnitTest1 class will be changed to start application inside test host and make a request.

using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Hosting;
using Microsoft.AspNetCore.TestHost;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using Newtonsoft.Json;
using SampleDotNetCore2RestStub.Models;

namespace SampleDotNetCore2RestStub.Integration.Test
{
	[TestClass]
	public class PersonsTest
	{
		private TestServer _server;
		private HttpClient _client;

		[TestInitialize]
		public void TestInitialize()
		{
			_server = new TestServer(new WebHostBuilder()
				.UseStartup<Startup>());
			_client = _server.CreateClient();
		}

		[TestMethod]
		public async Task GetPerson()
		{
			var response = await _client.GetAsync("/person/get/1");
			response.EnsureSuccessStatusCode();

			var result = await response.Content.ReadAsStringAsync();
			var person = JsonConvert.DeserializeObject<Person>(result);

			Assert.AreEqual("LN1", person.LastName);
		}
	}
}

TestServer uses an instance of IWebHostBuilder. Startup from UseStartup<Startup> is same class that is used to run the application, but here it is run inside TestServer instance. CreateClient() method returns instance of standard HttpClient, with which request to /person/get/1 endpoint is made. EnsureSuccessStatusCode() throws exception if response code is not inside 200-299 range. The response is then taken as a string and deserialized to Person object with Newtonsoft.Json, which is now part of .NET Core.

Test can be run from test\SampleDotNetCore2RestStub.Integration.Test folder with the command: dotnet test. If you type dotnet test from root folder it will search for tests inside all projects.

Debug tests in Visual Studio Code

Before proceeding any further with the code it should be possible to debug unit tests inside VS Code. It is not as easy as with VS 2017, but still manageable. First, you need to run your test from the command prompt in debug mode:

set VSTEST_HOST_DEBUG=1
dotnet test

Once this is done there is a message with specific process ID:

Starting test execution, please wait...
Host debugging is enabled. Please attach debugger to testhost process to continue.
Process Id: 16032, Name: dotnet

Now from Visual Studio Code, you have to attach to given process, 16032 in the current example. This is done from Debug View, then select .Net Core Attach launch configuration. If such is not existing, add it. Running this configuration shows a list of all processes with name dotnet. Select the proper one, 16032 in the current example.

Create PersonServiceClient and BaseTest

Tests should be easy to write, read and maintain, thus PersonServiceClient class is created. It exposes methods that hit the endpoints and return the result. Since testing is not only happy path, it should be possible to have some negative scenarios. You may want to hit the API with invalid data and verify it returns BadRequest (400) HTTP response code, or Unauthorized (401) HTTP response code, etc. In order to fulfill this test requirement, a separate class ApiResponse<T> is created. It stores response code along with response content as a string. In case that response string can be deserialized to an object of given generic type T it is also stored in ApiResponse object.

Client is instantiated as a protected variable in BaseTest constructor. PersonsTest extends BaseTest and has access to PersonServiceClient.

ApiResponse

using System.Net;

namespace SampleDotNetCore2RestStub.Integration.Test.Client
{
	public class ApiResponse<T>
	{
		public HttpStatusCode StatusCode { get; set; }
		public T Result { get; set; }
		public string ResultAsString { get; set; }
	}
}

PersonServiceClient

using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;
using Newtonsoft.Json;
using SampleDotNetCore2RestStub.Models;

namespace SampleDotNetCore2RestStub.Integration.Test.Client
{
	public class PersonServiceClient
	{
		private readonly HttpClient _httpClient;

		public PersonServiceClient(HttpClient httpClient)
		{
			_httpClient = httpClient;
		}

		public async Task<ApiResponse<Person>> GetPerson(string id)
		{
			var person = await GetAsync<Person>($"/person/get/{id}");
			return person;
		}

		public async Task<ApiResponse<List<Person>>> GetPersons()
		{
			var persons = await GetAsync<List<Person>>("/person/all");
			return persons;
		}

		public async Task<ApiResponse<string>> Version()
		{
			var version = await GetAsync<string>("api/version");
			return version;
		}

		private async Task<ApiResponse<T>> GetAsync<T>(string path)
		{
			var response = await _httpClient.GetAsync(path);
			var value = await response.Content.ReadAsStringAsync();
			var result = new ApiResponse<T>
			{
				StatusCode = response.StatusCode,
				ResultAsString = value
			};

			try
			{
				result.Result = JsonConvert.DeserializeObject<T>(value);
			}
			catch (Exception)
			{
				// Nothing to do
			}

			return result;
		}
	}
}

BaseTest

using System.Net.Http;
using Microsoft.AspNetCore.Hosting;
using Microsoft.AspNetCore.TestHost;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using SampleDotNetCore2RestStub.Integration.Test.Client;

namespace SampleDotNetCore2RestStub.Integration.Test
{
	public abstract class BaseTest
	{
		protected PersonServiceClient PersonServiceClient;

		public BaseTest()
		{
			var server = new TestServer(new WebHostBuilder()
				.UseStartup<Startup>());
			var httpClient = server.CreateClient();
			PersonServiceClient = new PersonServiceClient(httpClient);
		}
	}
}

PersonsTest

using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Hosting;
using Microsoft.AspNetCore.TestHost;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using Newtonsoft.Json;
using SampleDotNetCore2RestStub.Models;

namespace SampleDotNetCore2RestStub.Integration.Test
{
	[TestClass]
	public class PersonsTest : BaseTest
	{
		[TestMethod]
		public async Task GetPerson()
		{
			var response = await PersonServiceClient.GetPerson("1");

			Assert.AreEqual(HttpStatusCode.OK, response.StatusCode);
			Assert.AreEqual("LN1", response.Result.LastName);
		}

		[TestMethod]
		public async Task GetPersons()
		{
			var response = await PersonServiceClient.GetPersons();

			Assert.AreEqual(HttpStatusCode.OK, response.StatusCode);
			Assert.AreEqual(4, response.Result.Count);
			Assert.AreEqual("LN1", response.Result[0].LastName);
		}
    }
}

Stub the database

So far there is integration test that starts the application with its actual external dependencies and makes requests against it. Current API service does not connect to a real database, because this will make running the API harder. Instead, there is a fake PersonRepository which stores data in memory. In reality, the repository will connect to a database with a given connection string in appsettings.json, and will perform CRUD operations on it. Database operations might slow down the application response time, or test might not have full control over data in the database, which makes testing harder. In order to solve those two issues database can be stubbed to serve test data. Actually, anything that is not convenient can be stubbed with the examples given below.

In order to make stubbing possible and to keep application structure intact Startup has to be changed. Registering PersonRepository to .NET Core IoC container is extracted to separate virtual method that can be overridden later. All dependencies that are to be stubbed or mocked can be extracted to such methods. Then StartupStub overrides this method and registers stubbed repository PersonRepositoryStub. In it all database operations are substituted with an in-memory equivalence, hence skipping database calls. It might not be a full and accurate substitution, as long as it serves your testing purpose. After all this PersonRepositoryStub will be used only for testing. BaseTest should be changed to start the application with StartupStub instead of Statup. Finally, PersonsTest should be changed to assert on new data that is configured in PersonRepositoryStub.

Startup

public void ConfigureServices(IServiceCollection services)
{
	services.AddMvc();
	services.Configure<AppConfig>(Configuration);
	services.AddScoped<AuthenticationFilterAttribute>();

	ConfigureRepositories(services);
}

public virtual void ConfigureRepositories(IServiceCollection services)
{
	services.AddSingleton<IPersonRepository, PersonRepository>();
}

StartupStub

using Microsoft.Extensions.DependencyInjection;
using SampleDotNetCore2RestStub.Repositories;

namespace SampleDotNetCore2RestStub.Integration.Test.Mocks
{
	public class StartupStub : Startup
	{
		public override void ConfigureRepositories(IServiceCollection services)
		{
			services.AddSingleton<IPersonRepository, PersonRepositoryStub>();
		}
	}
}

PersonRepositoryStub

using System.Collections.Generic;
using System.Linq;
using SampleDotNetCore2RestStub.Models;
using SampleDotNetCore2RestStub.Repositories;

namespace SampleDotNetCore2RestStub.Integration.Test.Mocks
{
	public class PersonRepositoryStub : IPersonRepository
	{
		private Dictionary<int, Person> _persons 
					= new Dictionary<int, Person>();

		public PersonRepositoryStub()
		{
			_persons.Add(1, new Person
			{
				Id = 1,
				FirstName = "Stubed FN1",
				LastName = "Stubed LN1",
				Email = "stubed.email1@email.na"
			});
		}

		public Person GetById(int id)
		{
			return _persons[id];
		}

		public List<Person> GetAll()
		{
			return _persons.Values.ToList();
		}

		public int GetCount()
		{
			return _persons.Count();
		}

		public void Remove()
		{
			if (_persons.Keys.Any())
			{
				_persons.Remove(_persons.Keys.Last());
			}
		}

		public string Save(Person person)
		{
			if (_persons.ContainsKey(person.Id))
			{
				_persons[person.Id] = person;
				return "Updated Person with id=" + person.Id;
			}
			else
			{
				_persons.Add(person.Id, person);
				return "Added Person with id=" + person.Id;
			}
		}
	}
}

BaseTest

using System.Net.Http;
using Microsoft.AspNetCore.Hosting;
using Microsoft.AspNetCore.TestHost;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using SampleDotNetCore2RestStub.Integration.Test.Client;
using SampleDotNetCore2RestStub.Integration.Test.Mocks;

namespace SampleDotNetCore2RestStub.Integration.Test
{
	public abstract class BaseTest
	{
		protected PersonServiceClient PersonServiceClient;

		public BaseTest()
		{
			var server = new TestServer(new WebHostBuilder()
				.UseStartup<StartupStub>());
			var httpClient = server.CreateClient();
			PersonServiceClient = new PersonServiceClient(httpClient);
		}
	}
}

PersonsTest

using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Hosting;
using Microsoft.AspNetCore.TestHost;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using Newtonsoft.Json;
using SampleDotNetCore2RestStub.Models;

namespace SampleDotNetCore2RestStub.Integration.Test
{
	[TestClass]
	public class PersonsTest : BaseTest
	{
		[TestMethod]
		public async Task GetPerson()
		{
			var response = await PersonServiceClient.GetPerson("1");

			Assert.AreEqual(HttpStatusCode.OK, response.StatusCode);
			Assert.AreEqual("Stubed LN1", response.Result.LastName);
		}

		[TestMethod]
		public async Task GetPersons()
		{
			var response = await PersonServiceClient.GetPersons();

			Assert.AreEqual(HttpStatusCode.OK, response.StatusCode);
			Assert.AreEqual(1, response.Result.Count);
			Assert.AreEqual("Stubed LN1", response.Result[0].LastName);
		}
	}
}

Mock the database

Stubbing is an option, but mocking is much better as you have direct control over the mock itself. The most famous .NET mocking framework is Moq. It is added to the project with the command:

dotnet add package Moq

StartupMock extends Starup and overrides its ConfigureRepositories. It registers an instance of IPersonRepository which is injected by its constructor. BaseTest is changed to use StartupMock in UseStartup method. Repository mock is instantiated with PersonRepositoryMock = new Mock<IPersonRepository>(). It is injected into StartupMock constructor with ConfigureServices(services => services.AddSingleton(PersonRepositoryMock.Object)). This is how mock instance is registered into IoC container of .NET Core application that is being tested. Once the mock instance is registered it can be controlled. In BaseTest it is reset to defaults after each test with BaseTearDown method. It is run after each test because of [TestCleanup] MSTest attribute. Inside, the PersonRepositoryMock.Reset() resets mock state.

Test specific setup can be done for each test. For e.g. GetPerson_ReturnsCorrectResult has following setup: PersonRepositoryMock.Setup(x => x.GetById(It.IsAny<int>())).Returns(_person); That means when mock’s GetById method is called with whatever int value the _person object is returned. Another example is GetPerson_ThrowsException test. When mock’s GetById is called then InvalidOperationException is thrown. In this way, you can test exception handling, which in current demo application is missing. The exception is not that easy to reproduce if you are using repository stubbing.

StartupMock

using Microsoft.Extensions.DependencyInjection;
using SampleDotNetCore2RestStub.Repositories;

namespace SampleDotNetCore2RestStub.Integration.Test.Mocks
{
	public class StartupMock : Startup
	{
		private IPersonRepository _personRepository;
		
		public StartupMock(IPersonRepository personRepository)
		{
			_personRepository = personRepository;
		}

		public override void ConfigureRepositories(IServiceCollection services)
		{
			services.AddSingleton(_personRepository);
		}
	}
}

BaseTest

using System.Net.Http;
using Microsoft.AspNetCore.Hosting;
using Microsoft.AspNetCore.TestHost;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using Microsoft.Extensions.DependencyInjection;
using Moq;
using SampleDotNetCore2RestStub.Integration.Test.Client;
using SampleDotNetCore2RestStub.Integration.Test.Mocks;
using SampleDotNetCore2RestStub.Repositories;

namespace SampleDotNetCore2RestStub.Integration.Test
{
	public abstract class BaseTest
	{
		protected PersonServiceClient PersonServiceClient;
		protected Mock<IPersonRepository> PersonRepositoryMock;

		public BaseTest()
		{
			PersonRepositoryMock = new Mock<IPersonRepository>();

			var server = new TestServer(new WebHostBuilder()
				.UseStartup<StartupMock>()
				.ConfigureServices(services =>
				{
					services.AddSingleton(PersonRepositoryMock.Object);
				}));

			var httpClient = server.CreateClient();
			PersonServiceClient = new PersonServiceClient(httpClient);
		}

		[TestCleanup]
		public void BaseTearDown()
		{
			PersonRepositoryMock.Reset();
		}
	}
}

PersonsTest

using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Hosting;
using Microsoft.AspNetCore.TestHost;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using Moq;
using Newtonsoft.Json;
using SampleDotNetCore2RestStub.Models;

namespace SampleDotNetCore2RestStub.Integration.Test
{
	[TestClass]
	public class PersonsTest : BaseTest
	{
		private readonly Person _person = new Person
		{
			Id = 1,
			FirstName = "Mocked FN1",
			LastName = "Mocked LN1",
			Email = "mocked.email1@email.na"
		};

		[TestMethod]
		public async Task GetPerson_ReturnsCorrectResult()
		{
			PersonRepositoryMock.Setup(x => x.GetById(It.IsAny<int>()))
				.Returns(_person);

			var response = await PersonServiceClient.GetPerson("1");

			Assert.AreEqual(HttpStatusCode.OK, response.StatusCode);
			Assert.AreEqual("Mocked LN1", response.Result.LastName);
		}

		[TestMethod]
		[ExpectedException(typeof(InvalidOperationException))]
		public async Task GetPerson_ThrowsException()
		{
			PersonRepositoryMock.Setup(x => x.GetById(It.IsAny<int>()))
				.Throws(new InvalidOperationException());

			var result = await PersonServiceClient.GetPerson("1");
		}

		[TestMethod]
		public async Task GetPersons()
		{
			PersonRepositoryMock.Setup(x => x.GetAll())
				.Returns(new List<Person> { _person });

			var response = await PersonServiceClient.GetPersons();

			Assert.AreEqual(HttpStatusCode.OK, response.StatusCode);
			Assert.AreEqual(1, response.Result.Count);
			Assert.AreEqual("Mocked LN1", response.Result[0].LastName);
		}
	}
}

Nicer database mock

As of version 2.1.0 of Microsoft.AspNetCore.TestHost, which is currently in pre-release, there is a method called ConfigureTestServices, which saves us from having separate StartupMock class. You can directly inject your mocks with following code:

var server = new TestServer(new WebHostBuilder()
	.UseStartup<Startup>()
	.ConfigureTestServices(services =>
	{
		services.AddSingleton(PersonRepositoryMock.Object);
	}));

Conclusion

In the current post, I have shown how to do integration testing on .NET Core applications. This is a very convenient approach which eliminates some of the disadvantages of stubbing or mocking all dependencies in unit testing. Because of using all dependencies, integration testing can be much slower. This can be improved by mocking some of them. Integration testing is not a substitute for unit testing, nor for functional testing, but it is a good approach in you testing portfolio that should be considered.

Related Posts

Read more...

Build a REST API with .NET Core 2 and run it on Docker Linux container

Last Updated on by

Post summary: Code examples how to create RESTful API with .NET Core 2.0 and then run it on Docker Linux container.

Code below can be found in GitHub SampleDotNetCore2RestStub repository. In the current post is shown a sample application that can be a very good foundation for a real production application. This project can be easily used as a template for real API service.

Microsoft and open source

I was doing Java for about 2 years and got back to .NET six months ago. Recently we had to do a project in .NET Core 2.0, a technology I haven’t heard of before. I was truly amazed how much open source Microsoft had begun. .NET now can be developed and even run on Linux. This definitely makes it really competitive to Java which advantage was multi-platform ability. Another benefit is that documentation is very extensive and there is a huge community out there that makes solving issues really fast and easy.

.NET Core

In short .NET Core is a cross-platform development platform supporting Windows, macOS, and Linux, and can be used in device, cloud, and embedded/IoT scenarios. It is maintained by Microsoft and the .NET community on GitHub. More can be read on .NET Core Guide.

.NET Core 2.0

The special thing about .NET Core 2.0 is the implementation of .NET Standard 2.0. This makes it possible to use almost 70% of already existing NuGet packages, which is a big step forward and eases development of .NET applications because of reusability.

Create simple .NET Core project

Making default .NET Core console application is really simple:

  1. Download and install .NET Core SDK. For Windows and MacOS there are installers available. For Linux it depends on distribution used, see more at .NET Core Linux installation guide.
  2. Create an application with following command: dotnet new console -o ProjectName. Option -o specifies the output folder to be created which also becomes the project name. If -o is omitted then the project will be created in the current folder with current folder’s name.
  3. Run the newly created application with: dotnet run.

Using Visual Studio Code

Once the project is created it can be developed in any text editor. Most convenient is Visual Studio 2017 because it provides lots of tools that make development very fast and efficient. In this tutorial, I will be using Visual Studio Code – open-source multi-platform editor maintained by Microsoft. I admit it is much harder that Visual Studio 2017 but is free and multi-platform. Once project folder is imported, hitting Ctrl+F5 runs the project.

ASP.NET Core MVC

ASP.NET Core MVC provides features to build web APIs or web UIs. It has to be used in order to continue with the current example. Dependency to its NuGet package is added with the following command:

dotnet add package Microsoft.AspNetCore
dotnet add package Microsoft.AspNetCore.All

Create REST API

After project structure is done it is time to add classes needed to make the REST API. Functionality is very similar to one described in Build a RESTful stub server with Dropwizard post. There is a Person API which can retrieve, save or delete persons. They are kept in an in-memory data structure which mimics DB layer. Following classes are needed:

  • PersonController – a controller that exposes the API endpoints. By extending Controller class the runtime makes all endpoints available as long as they have proper routing. In current example routing is done inside action attributes [HttpGet(“person/get/{id}”)]. There are different routing options described in this extensive documentation Routing to Controller Actions. Adding of person is done with POST: [HttpPost(“person/save”)]. The important bit here is [FromBody] attribute which takes HTTP body and deserializes it to a Person object.
  • Person – this is data model class with properties.
  • PersonRepository – in-memory DB abstraction that keeps the data in a Dictionary. In reality, there will be DB layer responsible for managing data.
  • Startup – class with services configuration. Both ConfigureServices and Configure methods are called behind the scenes from the runtime. Any configurations needed goes to those two methods. Current configuration adds MVC to services and instructs the application to use it. This is not really Model View Controller pattern, but this is what is needed to enable controllers and get API running.
  • Program – main program entry point where web host is built and started. It uses Startup.cs to run the configurations. More details on WebHost can be found in Hosting in ASP.NET Core. This article also shows how the external configuration is managed, something that will be presented later in the current post.

PersonController

using System.Collections.Generic;
using System.Linq;
using Microsoft.AspNetCore.Mvc;
using SampleDotNetCore2RestStub.Models;
using SampleDotNetCore2RestStub.Repositories;

namespace SampleDotNetCore2RestStub.Controllers
{
	public class PersonController : Controller
	{
		[HttpGet("person/get/{id}")]
		public Person GetPerson(int id)
		{
			return PersonRepository.GetById(id);
		}

		[HttpGet("person/remove")]
		public string RemovePerson()
		{
			PersonRepository.Remove();
			return "Last person remove. Total count: " 
						+ PersonRepository.GetCount();
		}

		[HttpGet("person/all")]
		public List<Person> GetPersons()
		{
			return PersonRepository.GetAll();
		}

		[HttpPost("person/save")]
		public string AddPerson([FromBody]Person person)
		{
			return PersonRepository.Save(person);
		}
	}
}

Person

namespace SampleDotNetCore2RestStub.Models
{
	public class Person
	{
		public int Id { get; set; }
		public string FirstName { get; set; }
		public string LastName { get; set; }
		public string Email { get; set; }
	}
}

PersonRepository

using System.Collections.Generic;
using System.Linq;
using SampleDotNetCore2RestStub.Models;

namespace SampleDotNetCore2RestStub.Repositories
{
	public class PersonRepository
	{
		private static Dictionary<int, Person> PERSONS 
								= new Dictionary<int, Person>();

		static PersonRepository()
		{
			PERSONS.Add(1, new Person
			{
				Id = 1,
				FirstName = "FN1",
				LastName = "LN1",
				Email = "email1@email.na"
			});
			PERSONS.Add(2, new Person
			{
				Id = 2,
				FirstName = "FN2",
				LastName = "LN2",
				Email = "email2@email.na"
			});
		}

		public static Person GetById(int id)
		{
			return PERSONS[id];
		}

		public static List<Person> GetAll()
		{
			return PERSONS.Values.ToList();
		}

		public static int GetCount()
		{
			return PERSONS.Count();
		}

		public static void Remove()
		{
			if (PERSONS.Keys.Any())
			{
				PERSONS.Remove(PERSONS.Keys.Last());
			}
		}

		public static string Save(Person person)
		{
			var result = "";
			if (PERSONS.ContainsKey(person.Id))
			{
				result = "Updated Person with id=" + person.Id;
			}
			else
			{
				result = "Added Person with id=" + person.Id;
			}
			PERSONS.Add(person.Id, person);
			return result;
		}
	}
}

Startup

using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.AspNetCore.Hosting;
using Microsoft.AspNetCore.Builder;

namespace SampleDotNetCore2RestStub
{
	public class Startup
	{
		public Startup(IConfiguration configuration)
		{
			Configuration = configuration;
		}

		public IConfiguration Configuration { get; }

		public void ConfigureServices(IServiceCollection services)
		{
			services.AddMvc();
		}

		public void Configure(IApplicationBuilder app,
					IHostingEnvironment env)
		{
			app.UseMvc();
		}
	}
}

Program

using System;
using Microsoft.AspNetCore;
using Microsoft.AspNetCore.Hosting;

namespace SampleDotNetCore2RestStub
{
	public class Program
	{
		public static void Main(string[] args)
		{
			BuildWebHost(args).Run();
		}

		public static IWebHost BuildWebHost(string[] args) =>
			WebHost.CreateDefaultBuilder(args)
				.UseStartup<Startup>()
				.Build();
	}
}

External configuration

Service so far is pretty much useless as it does not give an opportunity for external configurations. Adding external configuration consist of adding and changing following files:

    • VersionController – controller to actually show full working configuration. Routing in this controller is handled by [Route(“api/[controller]”)]. This exposes /api/version endpoint because [controller] is a template that stands for controller name. Controller constructor takes IOptions object and extracts Value out of it. Actual object value is injected in Startup.cs.
    • appsettings.json – JSON file with application configurations.
    • AppConfig – data model class that represents JSON configuration as an object.
    • Startup – change is needed to read file appsettings.json and bind it to AppConfig object. Configuration is read with: var configurationBuilder = new ConfigurationBuilder().AddJsonFile(“appsettings.json”, false, true) then it is saved internally with Configuration = configurationBuilder.Build(). JSON configuration is bound to a AppConfig object with following line: services.Configure<AppConfig>(Configuration).
    • SampleDotNetCore2RestStub.csproj – change is needed in the project file to instruct build process to copy appsettings.json to the output folder. This is where VS 2017 makes it much easier as it exposes property config to change, with VS Code you have to edit the csproj XML.

VersionController

using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.Options;

namespace SampleDotNetCore2RestStub.Controllers
{
	[Route("api/[controller]")]
	public class VersionController : Controller
	{
		private readonly AppConfig _config;

		public VersionController(IOptions<AppConfig> options)
		{
			_config = options.Value;
		}

		[HttpGet]
		public string Version()
		{
			return _config.Version;
		}
	}
}

appsettings.json

{
	"Version": "1.0"
}

AppConfig

namespace SampleDotNetCore2RestStub
{
	public class AppConfig
	{
		public string Version { get; set; }
	}
}

Startup

using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.AspNetCore.Hosting;
using Microsoft.AspNetCore.Builder;

namespace SampleDotNetCore2RestStub
{
	public class Startup
	{
		public Startup()
		{
			var configurationBuilder = new ConfigurationBuilder()
				.AddJsonFile("appsettings.json", false, true);

			Configuration = configurationBuilder.Build();
		}

		public IConfiguration Configuration { get; }

		public void ConfigureServices(IServiceCollection services)
		{
			services.AddMvc();
			services.Configure<AppConfig>(Configuration);
		}

		public void Configure(IApplicationBuilder app, 
					IHostingEnvironment env)
		{
			app.UseMvc();
		}
	}
}

csproj

<ItemGroup>
	<None Include="appsettings.json" CopyToOutputDirectory="Always" />
</ItemGroup>

Request filtering

An almost mandatory feature is to have some kind of filtering on the request. The current example will provide a very basic implementation of authentication filter achieved with an attribute. Following files are needed:

  • SecurePersonController – controller that demonstrates filtering. The controller is no more different than other discussed above. Important bit is [ServiceFilter(typeof(AuthenticationFilterAttribute))] which assigns AuthenticationFilterAttribute to current controller.
  • AuthenticationFilterAttribute – very basic implementation to illustrate how it works. Request headers are extracted from HttpContext and are checked for the existence of Authorization. If not found Exception is thrown. In next section, I will show how to handle this exception more gracefully.
  • StartupAuthenticationFilterAttribute is registered to runtime with: services.AddScoped<AuthenticationFilterAttribute>(). .NET Core dependency injection mechanism is used here, which I have described it in more details in separate section below.

SecurePersonController

using System.Collections.Generic;
using Microsoft.AspNetCore.Mvc;
using SampleDotNetCore2RestStub.Attributes;
using SampleDotNetCore2RestStub.Models;
using SampleDotNetCore2RestStub.Repositories;

namespace SampleDotNetCore2RestStub.Controllers
{
	[ServiceFilter(typeof(AuthenticationFilterAttribute))]
	public class SecurePersonController : Controller
	{
		[HttpGet("secure/person/all")]
		public List<Person> GetPersons()
		{
			return PersonRepository.GetAll();
		}
	}
}

AuthenticationFilterAttribute

using System;
using System.Linq;
using Microsoft.AspNetCore.Mvc.Filters;

namespace SampleDotNetCore2RestStub.Attributes
{
	public class AuthenticationFilterAttribute : ActionFilterAttribute
	{
		public override void OnActionExecuting(ActionExecutingContext ctx)
		{
			string authKey = ctx.HttpContext.Request
					.Headers["Authorization"].SingleOrDefault();

			if (string.IsNullOrWhiteSpace(authKey))
				throw new Exception();
		}
	}
}

Startup

public void ConfigureServices(IServiceCollection services)
{
	services.AddMvc();
	services.Configure<AppConfig>(Configuration);
	services.AddScoped<AuthenticationFilterAttribute>();
}

If endpoint /secure/person/all is queried without Authorization header there is 500 Internal Server Error response from the application. If Authorization header is present with any value all persons are retrieved.

Middleware

Middleware is a software that is assembled into an application pipeline to handle requests and responses. Each component chooses whether to pass the request to the next component in the pipeline or perform work before that. More on middleware can be found in ASP.NET Core Middleware Fundamentals. In current example middleware is used to handle better exceptions. In the previous point, AuthenticationFilterAttribute was throwing an exception which was transformed to 500 Internal Server Error which is not pretty. In case of not authorized application should return 401 Unauthorized. In order to do this following files are needed:

  • HttpException – a custom exception which then will be caught and processed in HttpExceptionMiddleware.
  • HttpExceptionMiddleware – this is where handling happens. Code checks for custom HttpException and if such is thrown pipeline changes HttpContext.Response object with proper values.
  • AuthenticationFilterAttribute – instead of Exception filter attribute throws new
    HttpException(HttpStatusCode.Unauthorized). This way middleware will get invoked.
  • Startup – middleware get registered here with app.UseMiddleware<HttpExceptionMiddleware>(). It is extremely important that this stands before app.UseMvc() otherwise it will not work.

HttpException

using System;
using System.Net;

namespace SampleDotNetCore2RestStub.Exceptions
{
	public class HttpException : Exception
	{
		public int StatusCode { get; }

		public HttpException(HttpStatusCode httpStatusCode)
			: base(httpStatusCode.ToString())
		{
			this.StatusCode = (int)httpStatusCode;
		}
	}
}

HttpExceptionMiddleware

using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Http.Features;
using SampleDotNetCore2RestStub.Exceptions;

namespace SampleDotNetCore2RestStub.Middleware
{
	public class HttpExceptionMiddleware
	{
		private readonly RequestDelegate _next;

		public HttpExceptionMiddleware(RequestDelegate next)
		{
			_next = next;
		}

		public async Task Invoke(HttpContext context)
		{
			try
			{
				await _next.Invoke(context);
			}
			catch (HttpException httpException)
			{
				context.Response.StatusCode = httpException.StatusCode;
				var feature = context.Features.Get<IHttpResponseFeature>();
				feature.ReasonPhrase = httpException.Message;
			}
		}
	}
}

AuthenticationFilterAttribute

public override void OnActionExecuting(ActionExecutingContext context)
{
	string authKey = context.HttpContext.Request
			.Headers["Authorization"].SingleOrDefault();

	if (string.IsNullOrWhiteSpace(authKey))
		throw new HttpException(HttpStatusCode.Unauthorized);
}

Startup

public void Configure(IApplicationBuilder app, IHostingEnvironment env)
{
	app.UseMiddleware<HttpExceptionMiddleware>();
	app.UseMvc();
}

Dependency Injection

So far there is running service with basic functionality. It is missing very important bit though, something that should have been considered and added earlier. Actually, it was added but only when registering AuthenticationFilterAttribute, but here I will go into more details. Dependency injection (DI) is a technique for achieving loose coupling between objects and their dependencies. Rather than directly instantiating an object or using static references, the objects a class needs are provided to the class in some fashion. ASP.NET Core provides its own dependency injection mechanisms, read more on Introduction to Dependency Injection in ASP.NET Core. The code will now get refactored to match this pattern.

  • IPersonRepository – all database operations are declared in this interface.
  • PersonRepository – implements all methods of IPersonRepository interface. It still does not have real interaction with the database, data is kept in a dictionary. Refactor is that all static methods are removed. In order to use this class, you need an instance of it. Sample data is populated on object creation in its constructor.
  • SecurePersonController – an instance of an implementation of IPersonRepository is passed through the constructor and is used internally. By using interfaces a level of abstraction is achieved, where multiple implementations may be used for the same interface.
  • PersonController – same as SecurePersonController.
  • Startup – this is where DI is used to register that PersonRepository is the implementation of IPersonRepositoryservices.AddSingleton<IPersonRepository, PersonRepository>().

Three different object life scopes are available in .NET Core DI. It is important to know the difference in order to use them properly. If object creation is expensive operation misuse of proper DI lifetime scope might be crucial for performance:

  • AddSingleton – only one instance is created for the whole application. In the example above PersonRepository needed to have one instance because sample data is initialized in the constructor.
  • AddScoped – one instance is created per HTTP request scope. 
  • AddTransient – instance is created every time it is needed. Let us say there are 3 places where an object is needed and an HTTP request is coming to the application. AddTransient will create 3 different objects, while AddScoped will create just one that will be used for current HTTP request scope.

IPersonRepository

using System.Collections.Generic;
using SampleDotNetCore2RestStub.Models;

namespace SampleDotNetCore2RestStub.Repositories
{
	public interface IPersonRepository
	{
		Person GetById(int id);
		List<Person> GetAll();
		int GetCount();
		void Remove();
		string Save(Person person);
	}
}

PersonRepository

using System.Collections.Generic;
using System.Linq;
using SampleDotNetCore2RestStub.Models;

namespace SampleDotNetCore2RestStub.Repositories
{
	public class PersonRepository : IPersonRepository
	{
		private Dictionary<int, Person> _persons 
						= new Dictionary<int, Person>();

		public PersonRepository()
		{
			_persons .Add(1, new Person
			{
				Id = 1,
				FirstName = "FN1",
				LastName = "LN1",
				Email = "email1@email.na"
			});
			_persons .Add(2, new Person
			{
				Id = 2,
				FirstName = "FN2",
				LastName = "LN2",
				Email = "email2@email.na"
			});
		}

		public Person GetById(int id)
		{
			return _persons[id];
		}

		public List<Person> GetAll()
		{
			return _persons.Values.ToList();
		}

		public int GetCount()
		{
			return _persons.Count();
		}

		public void Remove()
		{
			if (_persons.Keys.Any())
			{
				_persons.Remove(_persons.Keys.Last());
			}
		}

		public string Save(Person person)
		{
			if (_persons.ContainsKey(person.Id))
			{
				_persons[person.Id] = person;
				return "Updated Person with id=" + person.Id;
			}
			else
			{
				_persons.Add(person.Id, person);
				return "Added Person with id=" + person.Id;
			}
		}
	}
}

SecurePersonController

using System.Collections.Generic;
using Microsoft.AspNetCore.Mvc;
using SampleDotNetCore2RestStub.Attributes;
using SampleDotNetCore2RestStub.Models;
using SampleDotNetCore2RestStub.Repositories;

namespace SampleDotNetCore2RestStub.Controllers
{
	[ServiceFilter(typeof(AuthenticationFilterAttribute))]
	public class SecurePersonController : Controller
	{
		private readonly IPersonRepository _personRepository;

		public SecurePersonController(IPersonRepository personRepository)
		{
			_personRepository = personRepository;
		}

		[HttpGet("secure/person/all")]
		public List<Person> GetPersons()
		{
			return _personRepository.GetAll();
		}
	}
}

Startup

public void ConfigureServices(IServiceCollection services)
{
	services.AddMvc();
	services.Configure<AppConfig>(Configuration);
	services.AddScoped<AuthenticationFilterAttribute>();
	services.AddSingleton<IPersonRepository, PersonRepository>();
}

Docker file

Docker file that packs application is shown below:

FROM microsoft/dotnet:2.0-sdk
COPY pub/ /root/
WORKDIR /root/
ENV ASPNETCORE_URLS="http://*:80"
EXPOSE 80/tcp
ENTRYPOINT ["dotnet", "SampleDotNetCore2RestStub.dll"]

Docker container that is used is microsoft/dotnet:2.0-sdk. Everything from pub folder is copied to container root folder. ASPNETCORE_URLS is used to set the URLs that the server listens on by default. Current config runs and exposes application at port 80 in the container. With ENTRYPOINT is configured the command that is run when the container is started.

Build, package and run Docker

The application is built and published in Release mode into pub folder with the following command:

dotnet publish --configuration=Release -o pub

Docker container is packaged with tag netcore-rest with the following command:

docker build . -t netcore-rest

Docker container is run with exposing port 80 from the container to port 9000 on the host with the following command:

docker run -e Version=1.1 -p 9000:80 netcore-rest

Notice the -e Version=1.1 which sets an environment variable to be used inside the container. The intention is to use this variable in application. This can be enabled by modifying Startup.cs file by adding AddEnvironmentVariables():

public Startup()
{
	var configurationBuilder = new ConfigurationBuilder()
		.AddJsonFile("appsettings.json", optional: false, reloadOnChange: true)
		.AddEnvironmentVariables();

	Configuration = configurationBuilder.Build();
}

If invoked now /api/version returns 1.1.

Docker optimisation

When the container with microsoft/dotnet:2.0-sdk is packed it gets to a size of 1.7GB which is quite a lot. There is much leaner container image: microsoft/dotnet:2.0-runtime, but it requires all runtime assemblies to be present in pub folder. This can be done by changing the csproj file by adding PublishWithAspNetCoreTargetManifest = false:

<PropertyGroup>
	<OutputType>Exe</OutputType>
	<TargetFramework>netcoreapp2.0</TargetFramework>
	<PublishWithAspNetCoreTargetManifest>false</PublishWithAspNetCoreTargetManifest> 
</PropertyGroup>

This makes pub folder about 37MB, but container size is 258MB. Problem with this proposal is that it might not be very reliable as some assemblies might not be copied or might not be the correct version.

Since Docker is keeping layers in the repository, proposed optimization might turn out not to be actual optimization. It will consume much more space in the repository since layer that changes and is always saved is 258MB. Layers with OS might not change often if it changes at all.

Testing

How to given application can be integration tested is described in .NET Core integration testing and mock dependencies post.

Conclusion

In the current tutorial, I have shown how to create API from scratch with .NET Core 2.0 SDK on any platform. It is very easy to run .NET Core app and even run it Docker with Linux container.

Related Posts

Read more...

How to run Linux on Windows 10

Last Updated on by

Post summary: Details how to install Ubuntu Linux on Windows 10 and some reasons why to do it.

Why

I will first start with some examples why would you need Linux. So far in my career, I’ve been writing code only on Windows and did not have issues with that, except two cases where Linux was really needed.

Git keeps file permissions

I had a Linux continuous integration agent (GoCD which I do not like very much, but this is another topic) that runs some build commands from a Bash script located inside project’s Git repository. By default Windows creates those scripts with read-only rights, so GoCD was not able to execute them. While Git Bash is in great help to run and test those scripts on Windows platform it cannot help to manage their permissions. The only solution was to clone project on Linux, modify file permissions and commit them back.

Developing Java applications to be run on Linux

Another reason to have Linux is if you develop Java applications that are going to be hosted on Linux. Java has different implementations for Path interface: WindowsPath and UnixPath. While Windows is smart enough to work with ‘/’, WindowsPath is not. So it is a little nightmare when you develop on Windows application that is doing manipulations on files and will be hosted on Linux. Having a fast and reliable build and deployment infrastructure can help overcome this problem with trial and error approach, but having local Linux might speed up development.

How to install Linux on Windows 10

Whatever the reason is to get Linux running on your Windows 10 here are the steps to do it.

Install Windows Subsystem for Linux (WSL)

Windows Subsystem for Linux (WSL) is a compatibility layer for running Linux binary executables (in ELF format) natively on Windows 10. In order to install it start PowerShell as administrator and run following command that will require a restart afterward:

Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Windows-Subsystem-Linux

Install Linux from Windows store

I have not used that option, but in Windows Store app you can find several Linux distributions which you can download and install. Visit Windows 10 Installation Guide for more details.

Install using lxrun

I find this more easy to follow. First, you have to enable developer mode: Settings -> Update and Security -> For developers. Then run following command and follow the steps:

lxrun /install

This is it!

Using Linux on Windows 10

In order to access Linux installation, you need to run Command Prompt and type: bash. Now you are in Linux. Note that by default you are accessing /mnt/c/Users/{USERNAME} which is a link to the Windows file system. Changing file permission from the scenario above will not work here as well, you have to go to some other folder.

Good thing is that you have direct access to Linux file from your Windows 10. They are located in: %localappdata%\lxss\rootfs or in my case: C:\Users\llatinov\AppData\Local\lxss\rootfs

Conslusion

Having Linux access directly on your Windows 10 workstation is a really nice feature you can very much benefit from.

Related Posts

Read more...

Best practices in software delivery process

Last Updated on by

Post summary: Short overview of a software delivery process which I consider very good and worth the “best practice” label that is being practiced in a very successful software company.

Recently I finished an assignment in a company which I rate as the best I’ve worked so far in terms of software delivery process, individuals professionalism and company culture. Most of the things I’ve blogged about last 2.5 years I’ve heard, seen, learned and mastered working for that company. I decided to describe the process because for me this is a very successful practice.

Background

The company provides B2B services by exposing a lot of APIs to its clients which then compose different functionality to their end customers. Business functionality is broken down into numerous amount of micro-services. Every micro-service is a separate project and is deployed on a separate machine. Those micro-services interconnect with each other and depend on each other. Micro-services are discovered through Netflix’s Eureka, no endpoint is ever hard-coded, except Eureka’s.

Technologies

There are different tools and frameworks used in order to deliver quality software on time. List of tools consists of following: Jira for a project and issue tracking, Confluence for documents collaboration, Bamboo for continuous integration and deployment, Bitbucket (former Stash) for code reviews, HipChat or Slack for communication, SonarQube for static code analysis, Fortify for security static code analysis. Software code is stored in Git, written in Java, build with Gradle and deployed on Linux servers with Chef or Ansible.

Planning

In order to plan the work Agile methodologies are followed – Scrum or Kanban. There is an external team of scrum masters which facilitate Scrum ceremonies and Scrum is being very dogmatic followed.

Development

Every story from the Jira board is developed in a separate branch. On every commit, there is a Bamboo plan that builds the branch, runs the unit tests and runs SonarQube static code analysis. In order to pass the build different code style rules should be met, also it is mandatory to have 80% code coverage of the unit tests. On each commit, Jira number is put into commit comments. This provides traceability between Jira and tools like Bamboo and Bitbucket. Built artifacts are uploaded into Amazon S3 bucket where later they are used by Chef deployment. Each branch build can be deployed to Dev test node and tested by a developer in a real environment. Branch can be merged to master only if there are two code reviews done by other team members. Code reviews are done with Bitbucket.

Testing

The main pillar of quality is the unit testing. Although JUnit is the main framework some teams are using Spock and are very successful with it. Code coverage threshold is above 80%. Between 75% and 80% SonarQube reports warning, below 75% build fails and you cannot release. Some teams practice mutation testing with PITest to improve their unit testing. This definitely eliminates a lot of the bugs, but just unit testing is not enough. We have reached up to 97% code coverage (JUnit, Spock, and PITest) by the unit testing and still have seen small bugs in production. Although there are no strict rules about it every team is required to have automated functional testing. It could be very basic, it could be very advanced, but in order to release functional tests should be green.

Deployment

Deployment is fully automatic using Chef. It is development team responsibility to prepare the cookbooks and provision test environments. Deployment is triggered by Bamboo deployment plan which calls Chef on the specified node. This makes the traceability between what Jira is being implemented, when it was built and when it was deployed, to which environment and in which build number.

Test environments

Apart from production, there are three other test environments: Dev, QA, and Staging. Each test environment can have one ore mode nodes. Each different micro-service provides at least one node in order to make complete and working B2B solution. Test nodes are in the cloud and their management is done with Scalr as well as a custom framework that uses Amazon EC2 API and spins up nodes. Spinning up a new node is as simple as a button click. Before spinning a node test environment should be properly configured, this includes network, Chef cookbook, hardware capabilities, software setup, every detail needed to have a ready to test environment. Each test environment has a different purpose:

  • Dev – used by developers, the main idea is to have some code committed into a branch, build it and deploy that branch to Dev environment in order to test with real dependencies given feature. Most micro-services have their test nodes working. Since there is a lot of development ongoing, sometimes happens that some micro-service is with the incorrect version of is down.
  • QA – this is used mainly by QAs to verify build that is a candidate to go for a release. This environment is stable. All micro-services have test nodes and downtime is something exceptional. Data in this environment is a dummy and incomplete one.
  • Staging – this is pre-production environment. It is mandatory each micro-service have a working node there. Data is in very mature state and more reliable than other environments.

Release process

Once the feature is implemented, code reviewed and tested its branch can be merged to master. Once merged team can decide to release it to production right away or wait for more features to pile up and then release. In order to release there is separate Bamboo build plan that is run manually. It builds the master branch, runs SonarQube analysis, runs Fortify security scan, deploys to QA test environment and runs the functional tests. Then build is deployed to Staging and functional tests are run again. If everything is green at this point there is a stable release candidate. In order to release to production, there is a manual step that has to be done. Release slot is negotiated with DevOps engineer. For every production deployment, there should be DevOps standby if something goes wrong. Once DevOps time is provisioned then release request with proposed release time is made with information which is the Bamboo build plan that is released. This request is managed by a separate team. They check what Jiras are being implemented, if all builds are green and if Staging deployment is green. If everything is green then release is approved. In release window deployment to production is made by a team member with a single button click in Bamboo. In most of the cases everything is good, but in case of issues DevOps engineer has access to production nodes and can fix any issue. Important thing is that deployment is done firstly on one node, then this node is verified. In case there is an issue with the new code, the latest version can be reverted back to this node and release is aborted. If the new code is OK then deployment can continue on other nodes with a rate of 2-3-4 nodes at a time. The idea is not to have too many nodes down at a time.

Canary releases

Some features are way too big, way too risky or way to unpredictable how they will behave in production. In such cases, there is a practice of canary releases. Real production node is detached from load balancer and does not receive live traffic anymore. New functionality is deployed there, it is evaluated by product owners, monitored by DevOps for issues. If functionality is OK then the node can be attached to load balancer again and be left for some time to see how production traffic influences it.

Introducing a brand new micro-service

If new micro-service has to be introduced in then it should go through an architectural review. It is being evaluated what technologies are there how it operates and most importantly how it fits the micro-service landscape. There is a team of architects that are responsible to keep landscape tidy and focused. There is extensive operational requirements checklist, such as: is HTTPS used, is logging following company standards, are passwords encrypted in DB, are sensitive configuration data encrypted on file system. There are many requirements that service should cover in order to go live. Even if it goes live the first stage is a beta release where this service is exposed to a selected number of partners which evaluate it first. Then it can be revealed to the mass public.

Conclusion

I really enjoyed working for this company. It was a great learning opportunity because they keep up to date with new technologies and good practices. Processes and tools are constantly evolving keeping a good quality of the code and the products. I definitely encourage to take a deep look, understand the process and eventually apply something to your software delivery process. Most important is the traceability that makes very transparent what feature is implemented, in which build deployed, etc. And traceability is something ISO auditors care very much about.

Read more...

Partial JSON deserialize by JsonPath with Json.NET

Last Updated on by

Post summary: Code examples how to deserialize only part of a big JSON file by JsonPath when using NewtonSoft Json.NET.

The code shown in examples below is available in GitHub DotNetSamples/JsonPathConverter repository.

Use case description

Imagine you have a big JSON which you want to deserialize into a C# object.

{
  "node1": {
    "node1node1": "node1node1value",
    "node1node2": [ "value1", "value2" ],
    "node1node3": {
      "node1node3node1": "node1node3node1value"
    }
  },
  "node2": true,
  "node3": {
    "node3node1": "node3SubNode1Value",
    "node3node2": {
      "node3node2node1": {
        "node3node2node1node1": [ 1, 2, 3 ]
      },
      "node3node2node2": "node3node2node1value"
    }
  },
  "node4": "{\"node4node1\": \"n4n1value\", \"node4node2\": \"n4n1value\"}"
}

The file above is actually pretty small and used for demo purposes. In practice, you can stumble upon terrifyingly big JSON files. NewtonSoft.Json or Json.NET is defacto the JSON standard for .NET, so it is being used to parse the JSON file. In order to deserialize this JSON to a C# object, you need a model class that represents the JSON nodes. Although immense effort you can create such, why bother if you are going to use just a fraction of all JSON data. This is where JsonPath comes into play. Json.NET allows you to query JSON by JsonPath, so one option is to manually query the JSON, find data you need and assign it to your C# object. This is not an elegant solution. Since query by JsonPath is possible this can be used in a JsonConverter that will automatically do the job. What is needed is a custom JsonPathConverter and a model class that will be deserialized to, both are described below.

JSON model class

It is easier to describe the JSON model first. Below is a code for JSON model class that will collect only data we need.

using System.Collections.Generic;
using Newtonsoft.Json;

namespace JsonPathConverter
{
	[JsonConverter(typeof(JsonPathConverter))]
	public class JsonModel
	{
		[JsonProperty("node1.node1node2")]
		public IList<string> Node1Array { get; set; }

		[JsonProperty("node2")]
		public bool Node2 { get; set; }

		[JsonProperty("node3.node3node2.node3node2node1.node3node2node1node1")]
		public IList<int> Node3Array { get; set; }

		[JsonConverter(typeof(JsonPathConverter))]
		[JsonProperty("node4")]
		public NestedJsonModel Node4 { get; set; }
	}

	public class NestedJsonModel
	{
		[JsonProperty("node4node2")]
		public string NestedNode2 { get; set; }
	}
}

JSON model class is annotated with [JsonConverter(typeof(JsonPathConverter))] which tells Json.NET to use JsonPathConverter class to do the conversion. JsonPathConverter is implemented in such a way that JsonProperty is a mandatory for each property in order to be parsed: [JsonProperty(“node1.node1node2”)].

JSON as a string

You may have noticed already the weird case where node4 in JSON file has actually a string value which is escaped JSON string. This is something unusual and may not be pretty good programming practice, but I’ve encountered it in a production code, so examples given here cover this weirdo as well. There is a special NestedJsonModel class which this JSON string is being deserialized to.

JsonPathConverter

Code below implements JsonConverter abstract class and implements needed methods.

public class JsonPathConverter : JsonConverter
{
	public override bool CanWrite => false;

	public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
	{
		var jObject = JObject.Load(reader);
		var targetObj = Activator.CreateInstance(objectType);

		foreach (var prop in objectType.GetProperties().Where(p => p.CanRead && p.CanWrite))
		{
			var jsonPropertyAttr = prop.GetCustomAttributes(true).OfType<JsonPropertyAttribute>().FirstOrDefault();
			if (jsonPropertyAttr == null)
			{
				throw new JsonReaderException($"{nameof(JsonPropertyAttribute)} is mandatory when using {nameof(JsonPathConverter)}");
			}

			var jsonPath = jsonPropertyAttr.PropertyName;
			var token = jObject.SelectToken(jsonPath);

			if (token != null && token.Type != JTokenType.Null)
			{
				var jsonConverterAttr = prop.GetCustomAttributes(true).OfType<JsonConverterAttribute>().FirstOrDefault();
				object value;
				if (jsonConverterAttr == null)
				{
					serializer.Converters.Clear();
					value = token.ToObject(prop.PropertyType, serializer);
				}
				else
				{
					value = JsonConvert.DeserializeObject(token.ToString(), prop.PropertyType,
						(JsonConverter)Activator.CreateInstance(jsonConverterAttr.ConverterType));
				}
				prop.SetValue(targetObj, value, null);
			}
		}

		return targetObj;
	}

	public override bool CanConvert(Type objectType)
	{
		return true;
	}

	public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
	{
		throw new NotImplementedException();
	}
}

Deserialization work is done in public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer) method. JSON is loaded to a NewtonSoft JObject and instance of result object is created. All properties of this result object are iterated in a foreach loop. It is important to note that properties should have both get and set in order to be considered in deserialization: objectType.GetProperties().Where(p => p.CanRead && p.CanWrite). If you have properties with just get or just set they will be ignored. JsonPropertyAttribute for each property is taken. If there is no such then an exception is thrown. This part can be changed. JsonPath can be considered to be the property name: var jsonPath = jsonPropertyAttr == null ? prop.Name : jsonPropertyAttr.PropertyName. This is tricky though as C# is case sensitive and it might not work as property could start with capital letter, but JSON itself to be with lower case. Once there is JsonPath defined JObject is queried with jObject.SelectToken(jsonPath). This should return a valid token. In case of a valid token result, object property is checked for JsonConverterAttribute. If such exists then JSON is deserialized with this newly found JsonConverter instance. If there is no converter attached to this property then all existing converters are cleared and the token is converted into an object. Clearing part is important as in case of recursive call it will throw an exception.

Usage

Once job above is done usage is pretty easy:

var fileContent = File.ReadAllText("jsonFile.json");
var result = JsonConvert.DeserializeObject<JsonModel>(fileContent);

result.Node1Array.Should().BeEquivalentTo(new List<string> {"value1", "value2"});
result.Node2.Should().Be(true);
result.Node3Array.Should().BeEquivalentTo(new List<int> { 1, 2, 3 });
result.Node4.NestedNode2.Should().Be("n4n1value");

Conclusion

In this post, I have shown how to partially deserialize JSON by JsonPath picking only data that you need.

Read more...

Soft assertions for C# unit testing frameworks (MSTest, NUnit, xUnit.net)

Last Updated on by

Post summary: Code example of very easy and useful custom implementation of soft assertions in C# unit testing frameworks such as MSTest, NUnit or xUnit.net.

The code shown in examples below is available in GitHub DotNetSamples/SoftAssertions repository.

Unit vs Functional testing

Unit testing paradigm states that each test exercises particular code behavior. So in a perfect world, one unit test would have one assertion which defines unit test result – either passed or failed. This is why unit testing frameworks provide only asserts which stop further execution of current test method. In functional testing usually, one test verifies several conditions. Not debating if this is good or bad. Assume you are doing GUI testing, once you have opened particular page you’d better do as much verification as possible to reduce the risk of bugs. Having this page opened over and over for every single check is not the most efficient way of testing. This is why when you run functional tests you need some kind of assert that indicates whether passed or failed but to let the test continue in no critical issue is present. Those are generally called “soft” asserts.

Soft assertions code

Following code is an implementation of soft assertions:

using System.Collections.Generic;
using System.Linq;
using FluentAssertions;

public class SoftAssertions
{
	private readonly List<SingleAssert> 
		_verifications = new List<SingleAssert>();

	public void Add(string message, string expected, string actual)
	{
		_verifications.Add(new SingleAssert(message, expected, actual));
	}

	public void Add(string message, bool expected, bool actual)
	{
		Add(message, expected.ToString(), actual.ToString());
	}

	public void Add(string message, int expected, int actual)
	{
		Add(message, expected.ToString(), actual.ToString());
	}

	public void AddTrue(string message, bool actual)
	{
		_verifications
			.Add(new SingleAssert(message, true.ToString(), actual.ToString()));
	}

	public void AssertAll()
	{
		var failed = _verifications.Where(v => v.Failed).ToList();
		failed.Should().BeEmpty();
	}

	private class SingleAssert
	{
		private readonly string _message;
		private readonly string _expected;
		private readonly string _actual;

		public bool Failed { get; }

		public SingleAssert(string message, string expected, string actual)
		{
			_message = message;
			_expected = expected;
			_actual = actual;

			Failed = _expected != _actual;
			if (Failed)
			{
				// TODO Act in case of failure, e.g. take screenshot
				var screenshot = "MethodToSaveScreenshotAndReturnFilename";
				_message += $". Screenshot captured at: {screenshot}";
			}
		}

		public override string ToString()
		{
			return $"'{_message}' assert was expected to be " +
					$"'{_expected}' but was '{_actual}'";
		}
	}
}

Soft assertions details

The actual assertion is handled by SingleAssert class. It contains a message to be displayed to the user in case of failing test as well as expected and actual values. It is possible to extend the SingleAssert class so in case of failure you can do some specific actions, such as taking a screenshot. They are stored as strings. All asserts during testing are stored in a List<SingleAssert>. There are several methods that add assert. There are such that accept bool, string, and int. You can extend and add as many as you want. It is mandatory to call AssertAll() method so asserts can be evaluated. The evaluation consists of filtering out passed asserts leaving only failed: var failed = _verifications.Where(v => v.Failed).ToList(). Then list with failed is checked for empty failed.Should().BeEmpty(). In this case, FluentAssertions framework is used, but the code can be changed to such that suits your particular needs.

Soft assertions usage

Usage is pretty straightforward. SoftAssertions object should be created before each test and asserted after each test:


[TestClass]
public class UnitTest
{
	private SoftAssertions _softAssertions;

	[TestInitialize]
	public void SetUp()
	{
		_softAssertions = new SoftAssertions();
	}

	[TestCleanup]
	public void TearDown()
	{
		_softAssertions.AssertAll();
	}

	[TestMethod]
	public void TestMixedSoftAssertions()
	{
		_softAssertions.Add("Passing bool Add assertion", true, true);
		_softAssertions.Add("Failing bool Add assertion", true, false);
		_softAssertions
			.Add("Passing string Add assertion", "SameString", "SameString");
		_softAssertions
			.Add("Failing string Add assertion", "SameString", "OtherString");
		_softAssertions.Add("Passing int Add assertion", 1, 1);
		_softAssertions.Add("Failing int Add assertion", 1, 2);
		_softAssertions.AddTrue("Passing AddTrue assertion", true);
		_softAssertions.AddTrue("Failing AddTrue assertion", false);
	}
}

Soft assertions result

Result of test shown above is: Result Message: Expected collection to be empty, but found {‘Failing bool Add assertion’ assert was expected to be ‘True’ but was ‘False’, ‘Failing string Add assertion’ assert was expected to be ‘SameString’ but was ‘DifferentString’, ‘Failing int Add assertion’ assert was expected to be ‘1’ but was ‘2’, ‘Failing AddTrue assertion’ assert was expected to be ‘True’ but was ‘False’}.

This comes out of the box because FluentAssertions is used. Otherwise, you have to do some other output and assertions.

Other soft assertions

Some custom implementation of soft assertions is as well available in NTestRunner framework, but it is more complex and demanding special approach for writing tests.

Conclusion

Soft assertions are very useful in functional testing. With this simple class, you can directly have them in your functional tests.

Related Posts

Read more...

Convert NUnit 3 to NUnit 2 results XML file

Last Updated on by

Post summary: Examples how to convert NUnit 3 result XML file into NUnit 2 result XML file.

Although NUnit 3 was officially released in November 2015 still there are CI tools that do not provide support for parsing NUnit 3 result XML files. In this post, I will show how to convert between formats so CI tools can read NUnit 2 format.

NUnit 3 console runner

The easiest way is if you are using NUnit 3 console runner. It can be provided with an option: –result=TestResult.xml;format=nunit2.

Nota bene: Mandatory for this to work is to have nunit-v2-result-writer in NuGet packages directory otherwise an error will be shown: Unknown result format: nunit2.

Convert NUnit 3 to NUnit 2

If tests are being run in some other way other than NUnit 3 console runner then solution below is needed. There is no program or tool that can do this conversion, so custom one is needed. This is a Powershell script that uses nunit-v2-result-writer assemblies and with their functionality converts the XML files:

$assemblyNunitEngine = 'nunit.engine.api.dll';
$assemblyNunitWriter = 'nunit-v2-result-writer.dll';
$inputV3Xml = 'TestResult.xml';
$outputV2Xml = 'TestResultV2.xml';

Add-Type -Path $assemblyNunitEngine;
Add-Type -Path $assemblyNunitWriter;
$xmldoc = New-Object -TypeName System.Xml.XmlDataDocument;
$fs = New-Object -TypeName System.IO.FileStream -ArgumentList $inputV3Xml,'Open','Read';
$xmldoc.Load($fs);
$xmlnode = $xmldoc.GetElementsByTagName('test-run').Item(0);
$writer = New-Object -TypeName NUnit.Engine.Addins.NUnit2XmlResultWriter;
$writer.WriteResultFile($xmlnode, $outputV2Xml);

Important here is to give a proper path to nunit.engine.api.dll, nunit-v2-result-writer.dll and NUnit 3 TestResult.xml files. Powershell script above is equivalent to following C# code:

using System.IO;
using System.Xml;
using NUnit.Engine.Addins;

public class NUnit3ToNUnit2Converter
{
	public static void Main(string[] args)
	{
		var xmldoc = new XmlDataDocument();
		var fileStream 
			= new FileStream("TestResult.xml", FileMode.Open, FileAccess.Read);
		xmldoc.Load(fileStream);
		var xmlnode = xmldoc.GetElementsByTagName("test-run").Item(0);

		var writer = new NUnit2XmlResultWriter();
		writer.WriteResultFile(xmlnode, "TestResultV2.xml");
	}
}

File samples

Here NUnitFileSamples.zip is a collection of several NUnit result files. there with V3 are NUnit 3 format, those with V2_NUnit are generated with –result=TestResult.xml;format=nunit2 option and those with V2_Converted are converted with the code above.

Conclusion

Although little inconvenient it is possible to convert NUnit 3 to NUnit 2 result XML files using Powershell scripts and nunit-v2-result-writer assemblies.

Read more...

Java 8 features – Stream API advanced examples

Last Updated on by

Post summary: This post explains Java 8 Stream API with very basic code examples.

In Java 8 features – Lambda expressions, Interface changes, Stream API, DateTime API post I have briefly described most interesting Java 8 features. In the current post, I will give special attention to Stream API. This post is with more advanced code examples to elaborate on basic examples described in Java 8 features – Stream API basic examples post. Code examples here can be found in GitHub java-samples/java8 repository.

Memory consumption and better design

Stream API has operations that are short-circuiting, such as limit(). Once their goal is achieved they stop processing the stream. Most of the operators are not such. Here I have prepared an example for possible pitfall when using not short-circuiting operators. For testing purposes, I have created PeekObject which outputs a message to the console once its constructor is called.

public class PeekObject {
	private String message;

	public PeekObject(String message) {
		this.message = message;
		System.out.println("Constructor called for: " + message);
	}

	public String getMessage() {
		return message;
	}
}

Assume a situation where there is a stream of many instances of PeekObject, but only several elements of the stream are needed, thus they have to be limited. Only 2 constructors are called in this case.

limit the stream

public static List<PeekObject> limit_shortCircuiting(List<String> stringList,
							int limit) {
	return stringList.stream()
		.map(PeekObject::new)
		.limit(limit)
		.collect(Collectors.toList());
}

unit test

@Test
public void test_limit_shortCircuiting() {
	System.out.println("limit_shortCircuiting");

	List<String> stringList = Arrays.asList("a", "b", "a", "c", "d", "a");

	List<PeekObject> result = AdvancedStreamExamples
		.limit_shortCircuiting(stringList, 2);

	assertThat(result.size(), is(2));
}

console output

limit_shortCircuiting
Constructor called for: a
Constructor called for: b

Now stream has to be sorted before the limit is applied.

code

public static List<PeekObject> sorted_notShortCircuiting(
					List<String> stringList, int limit) {
	return stringList.stream()
		.map(PeekObject::new)
		.sorted((left, right) -> 
			left.getMessage().compareTo(right.getMessage()))
		.limit(limit)
		.collect(Collectors.toList());
}

unit test

@Test
public void test_sorted_notShortCircuiting() {
	System.out.println("sorted_notShortCircuiting");

	List<String> stringList = Arrays.asList("a", "b", "a", "c", "d", "a");

	List<PeekObject> result = AdvancedStreamExamples
		.sorted_notShortCircuiting(stringList, 2);

	assertThat(result.size(), is(2));
}

console output

sorted_notShortCircuiting
Constructor called for: a
Constructor called for: b
Constructor called for: a
Constructor called for: c
Constructor called for: d
Constructor called for: a

Notice that constructors for all objects in the stream are called. This will require Java to allocate enough memory for all the objects. There are 6 objects in this example, but what if there are 6 million. Also, current objects are very lightweight, but what if they are much bigger. The conclusion is that you have to know very well Stream API operations and apply them carefully when designing your stream pipeline.

Convert comma separated List to a Map with handling duplicates

There is a List of comma separated values which need to be converted to a Map. List value “11,21” should become Map entry with key 11 and value 21. Duplicated keys also should be considered: Arrays.asList(“11,21”, “12,21”, “13,23”, “13,24”).

code

public static Map<Long, Long> splitToMap(List<String> stringsList) {
	return stringsList.stream()
		.filter(StringUtils::isNotEmpty)
		.map(line -> line.split(","))
		.filter(array -> array.length == 2 
			&& NumberUtils.isNumber(array[0])
			&& NumberUtils.isNumber(array[1]))
		.collect(Collectors.toMap(array -> Long.valueOf(array[0]), 
			array -> Long.valueOf(array[1]), (first, second) -> first)));
}

unit test

@Test
public void test_splitToMap() {
	List<String> stringList = Arrays
			.asList("11,21", "12,21", "13,23", "13,24");

	Map<Long, Long> result = AdvancedStreamExamples.splitToMap(stringList);

	assertThat(result.size(), is(3));
	assertThat(result.get(11L), is(21L));
	assertThat(result.get(12L), is(21L));
	assertThat(result.get(13L), is(23L));
}

The important bit in this conversion is (first, second) -> first), if it is not present there will be error java.lang.IllegalStateException: Duplicate key 23 (slightly misleading error, as the duplicated key is 13, the value is 23). This is a merge function which resolves collisions between values associated with the same key. It evaluates two values found for the same key – first and second where current lambda returns the first. If overwrite is needed, hence keep the last entered value then lambda would be: (first, second) -> second).

Examples of custom object

Examples to follow use custom object Employee, where Position is an enumeration: public enum Position { DEV, DEV_OPS, QA }.

import java.util.List;

public class Employee {
	private String firstName;
	private String lastName;
	private Position position;
	private List<String> skills;
	private int salary;

	public Employee() {
	}

	public Employee(String firstName, String lastName,
				Position position, int salary) {
		this.firstName = firstName;
		this.lastName = lastName;
		this.position = position;
		this.salary = salary;
	}

	public void setSkills(String... skills) {
		this.skills = Arrays.stream(skills).collect(Collectors.toList());
	}

	public String getName() {
		return this.firstName + " " + this.lastName;
	}

	... Getters and Setters
}

A company has been created, it consists of 6 developers, 2 QAs and 2 DevOps..

private List<Employee> createCompany() {
	Employee dev1 = new Employee("John", "Doe", Position.DEV, 110);
	dev1.setSkills("C#", "ASP.NET", "React", "AngularJS");
	Employee dev2 = new Employee("Peter", "Doe", Position.DEV, 120);
	dev2.setSkills("Java", "MongoDB", "Dropwizard", "Chef");
	Employee dev3 = new Employee("John", "Smith", Position.DEV, 115);
	dev3.setSkills("Java", "JSP", "GlassFish", "MySql");
	Employee dev4 = new Employee("Brad", "Johston", Position.DEV, 100);
	dev4.setSkills("C#", "MSSQL", "Entity Framework");
	Employee dev5 = new Employee("Philip", "Branson", Position.DEV, 140);
	dev5.setSkills("JavaScript", "React", "AngularJS", "NodeJS");
	Employee dev6 = new Employee("Nathaniel", "Barth", Position.DEV, 99);
	dev6.setSkills("Java", "Dropwizard");
	Employee qa1 = new Employee("Ronald", "Wynn", Position.QA, 100);
	qa1.setSkills("Selenium", "C#", "Java");
	Employee qa2 = new Employee("Erich", "Kohn", Position.QA, 105);
	qa2.setSkills("Selenium", "JavaScript", "Protractor");
	Employee devOps1 = new Employee("Harold", "Jess", Position.DEV_OPS, 116);
	devOps1.setSkills("CentOS", "bash", "c", "puppet", "chef", "Ansible");
	Employee devOps2 = new Employee("Karl", "Madsen", Position.DEV_OPS, 123);
	devOps2.setSkills("Ubuntu", "bash", "Python", "chef");

	return Arrays.asList(dev1, dev2, dev3, dev4, dev5, dev6,
				qa1, qa2, devOps1, devOps2);
}

Company skill set

This method accepts none, one or many positions. If no positions are provided then information for all positions is printed. Positions array is transferred to List<String> because all objects used in lambda should be effectively final. Transferring array to stream is done with Arrays.stream() method. Employees are filtered based on the desired position. Each skills list is concatenated and flattened to a stream with flatMap(). After this operation, there is a stream of strings with all skills. Duplicates are removed with distinct(). Finally, stream is collected to a list.

code

public static List<String> gatherEmployeeSkills(
		List<Employee> employees, Position... positions) {
	positions = positions == null || positions.length == 0 
		? Position.values() : positions;
	List<Position> searchPositions = Arrays.stream(positions)
			.collect(Collectors.toList());
	return employees == null ? Collections.emptyList()
		: employees.stream()
			.filter(employee 
				-> searchPositions.contains(employee.getPosition()))
			.flatMap(employee -> employee.getSkills().stream())
			.distinct()
			.collect(Collectors.toList());
}

unit test

@Test
public void test_gatherEmployeeSkills() {
	List<Employee> company = createCompany();

	List<String> skills = AdvancedStreamExamples
			.gatherEmployeeSkills(company);

	assertThat(skills.size(), is(25));
}

Skillset per position

This method first received a list of all skills per position and converts it to a stream. The stream can be collected to a String with Collectors.joining() method. It accepts delimiter, prefix, and suffix.

code

public static String printEmployeeSkills(
		List<Employee> employees, Position position) {
	List<String> skills = gatherEmployeeSkills(employees, position);
	return skills.stream()
		.collect(Collectors.joining("; ",
			"Our " + position + "s have: ", " skills"));
}

unit test

@Test
public void test_printEmployeeSkills() {
	List<Employee> company = createCompany();

	String skills = AdvancedStreamExamples
			.printEmployeeSkills(company, Position.QA);

	assertThat(skills, is("Our employees have: "
		+ "Selenium; C#; Java; JavaScript; Protractor skills"));
}

Salary statistics

This method returns Map with Position as key and IntSummaryStatistics as value. Collectors.groupingBy() groups employees by position key and then using Collectors.summarizingInt() to get statistics of employee’s salary.

code

public static Map<Position, IntSummaryStatistics> salaryStatistics(
		List<Employee> employees) {
	return employees.stream()
		.collect(Collectors.groupingBy(Employee::getPosition,
			Collectors.summarizingInt(Employee::getSalary)));
}

unit test

@Test
public void test_salaryStatistics() {
	List<Employee> company = createCompany();

	Map<Position, IntSummaryStatistics> salaries = AdvancedStreamExamples
			.salaryStatistics(company);

	assertThat(salaries.get(Position.DEV).getAverage(), is(114D));
	assertThat(salaries.get(Position.QA).getAverage(), is(102.5D));
	assertThat(salaries.get(Position.DEV_OPS).getAverage(), is(119.5D));
}

Position with the lowest average salary

Map with position and salary summary is retrieved and then with entrySet().stream() map is converted to stream of Entry<Position, IntSummaryStatistics> objects. Entries are sorted by average value in ascending order by custom comparator Double.compare(). findFirst() returns Optional<Entry>. The entry itself is obtained with get() method. The key which is basically the position is obtained with getKey() method.

code

public static Position positionWithLowestAverageSalary(
		List<Employee> employees) {
	return salaryStatistics(employees)
		.entrySet().stream()
		.sorted((entry1, entry2) 
			-> Double.compare(entry1.getValue().getAverage(),
				entry2.getValue().getAverage()))
		.findFirst()
		.get()
		.getKey();
}

unit test

@Test
public void test_positionWithLowestAverageSalary() {
	List<Employee> company = createCompany();

	Position position = AdvancedStreamExamples
			.positionWithLowestAverageSalary(company);

	assertThat(position, is(Position.QA));
}

Employees per each position

Grouping is done per position and employees are aggregated to list with Collectors.toList() method.

code

public static Map<Position, List<Employee>> employeesPerPosition(
		List<Employee> employees) {
	return employees.stream()
		.collect(Collectors.groupingBy(Employee::getPosition,
				Collectors.toList()));
}

unit test

@Test
public void test_employeesPerPosition() {
	List<Employee> company = createCompany();

	Map<Position, List<Employee>> employees = AdvancedStreamExamples
			.employeesPerPosition(company);

	assertThat(employees.get(Position.QA).size(), is(2));
	assertThat(employees.get(Position.QA).get(0).getName(),
		is("Ronald Wynn"));
	assertThat(employees.get(Position.QA).get(1).getName(),
		is("Erich Kohn"));
}

Employee names per each position

Similar to the method above, but one more mapping is needed here. Employee name should be extracted and converted to List<String>. This is done with Collectors.mapping(Employee::getName, Collectors.toList()) method.

code

public static Map<Position, List<String>> employeeNamesPerPosition(
		List<Employee> employees) {
	return employees.stream()
		.collect(Collectors.groupingBy(Employee::getPosition,
			Collectors.mapping(Employee::getName,
						Collectors.toList())));
}

unit test

@Test
public void test_employeeNamesPerPosition() {
	List<Employee> company = createCompany();

	Map<Position, List<String>> employees = AdvancedStreamExamples
			.employeeNamesPerPosition(company);

	assertThat(employees.get(Position.QA).size(), is(2));
	assertThat(employees.get(Position.QA).get(0), is("Ronald Wynn"));
	assertThat(employees.get(Position.QA).get(1), is("Erich Kohn"));
}

Employee count per position

Getting the count is done by Collectors.counting() method. It returns Long by default. If Integer is needed then this can be changed to Collectors.reducing(0, e -> 1, Integer::sum).

code

public static Map<Position, Long> employeesCountPerPosition(
			List<Employee> employees) {
	return employees.stream()
		.collect(Collectors.groupingBy(Employee::getPosition,
						Collectors.counting()));
}

unit test

@Test
public void test_employeesCountPerPosition() {
	List<Employee> company = createCompany();

	Map<Position, Long> employees = AdvancedStreamExamples
				.employeesCountPerPosition(company);

	assertThat(employees.get(Position.DEV), is(6L));
	assertThat(employees.get(Position.QA), is(2L));
	assertThat(employees.get(Position.DEV_OPS), is(2L));
}

Employees with duplicated first name

Employees are grouped into a map with key first name and List<Employee> as value. This map is converted to stream and filtered for List<Employee> greater than 1 element. The list is flattened with flatMap() and collected to List<Employee>.

code

public static List<Employee> employeesWithDuplicateFirstName(
		List<Employee> employees) {
	return employees.stream()
		.collect(Collectors.groupingBy(Employee::getFirstName,
						Collectors.toList()))
		.entrySet().stream()
		.filter(entry -> entry.getValue().size() > 1)
		.flatMap(entry -> entry.getValue().stream())
		.collect(Collectors.toList());
}

unit test

@Test
public void test_employeesWithDuplicateFirstName() {
	List<Employee> company = createCompany();

	List<Employee> employees = AdvancedStreamExamples
			.employeesWithDuplicateFirstName(company);

	assertThat(employees.size(), is(2));
	assertThat(employees.get(0).getName(), is("John Doe"));
	assertThat(employees.get(1).getName(), is("John Smith"));
}

Conclusion

In this post, I have just scratched the Java 8 Stream API. It offers a vast amount of functionalities which can be very useful for data processing. Beware when generating stream pipeline because it might end up consuming too many resources.

Related Posts

Read more...

Java 8 features – Stream API basic examples

Last Updated on by

Post summary: This post explains Java 8 Stream API with very basic code examples.

In Java 8 features – Lambda expressions, Interface changes, Stream API, DateTime API post I have briefly described most interesting Java 8 features. In the current post, I will give special attention to Stream API. This post is with very basic code examples to explain the theory described in Java 8 features – Stream API explained post. Code examples here can be found in GitHub java-samples/java8 repository.

Example for filter, map, distinct, sorted, peek and collect

I will cover all those operations in one example. Code below takes a list of strings and converts it to stream by stream() method. For debug purposes peek() is used in the beginning and at the end of stream operations. It only prints to the console elements from the stream. Filtering of the elements is done by filter() method. Lambda expression is used as a predicate. This lambda expression is a method call to verify current element is a number: element-> NumberUtils.isNumber(element). Since it is a single method call it is substituted with method reference: NumberUtils::isNumber. All elements that are evaluated to false are removed from further processing. It is good practice to use filtering at the beginning of stream pipeline so stream elements are reduced. Next operation is converting String values in the stream to Long values. This is done with map() method again with method reference. Duplicated elements are removed by calling distinct(). Stream elements are sorted by element’s natural order, in the current example, they are Long values. In the end, the stream is materialized into a List by using collect(Collectors.toList()) method. If this code has to be written without streams it would have looked as shown in “no stream code” tab. Note that using stream code is much more readable. Actually, in the beginning, it is not that easy to think in a stream-oriented way, but once you get used to it, you will never want to see the non-streams code.

code

public static List<Long> toLongList(List<String> stringList) {
	return stringList.stream()
		.peek(element -> System.out.println("Before: " + element))
		.filter(NumberUtils::isNumber)
		.map(Long::valueOf)
		.distinct()
		.sorted()
		.peek(element -> System.out.println("After: " + element))
		.collect(Collectors.toList());
}

unit test

@Test
public void test_toLongList() {
	List<String> stringList = Arrays
		.asList(null, "", "aaa", "345", "123", "234", "123");

	List<Long> result = BasicStreamExamples.toLongList(stringList);

	assertEquals(3, result.size());
	assertEquals(123L, (long) result.get(0));
	assertEquals(234L, (long) result.get(1));
	assertEquals(345L, (long) result.get(2));
}

console output

Before: null
Before: 
Before: aaa
Before: 345
Before: 123
Before: 234
Before: 123
After: 123
After: 234
After: 345

no stream code

public static List<Long> toLongListWithoutStream(List<String> stringList) {
	List<Long> result = new ArrayList<>();
	for (String value : stringList) {
		System.out.println("Before: " + value);
		if (NumberUtils.isNumber(value)) {
			Long longValue = Long.valueOf(value);
			if (!result.contains(longValue)) {
				result.add(longValue);
				System.out.println("After: " + value);
			}
		}
	}
	Collections.sort(result);
	return result;
}

Example for toArray

This example is similar to the example above, instead of collecting as a list here stream elements are returned in the array.

toArray code

public static Long[] toLongArray(String[] stringArray) {
	return Arrays.stream(stringArray)
		.filter(NumberUtils::isNumber)
		.map(Long::valueOf)
		.toArray(Long[]::new);
}

unit test

@Test
public void test_toLongArray() {
	String[] stringArray = new String[] {null, "", "aaa", "123", "234"};

	Long[] result = BasicStreamExamples.toLongArray(stringArray);

	assertEquals(2, result.length);
	assertEquals(123L, (long) result[0]);
	assertEquals(234L, (long) result[1]);
}

Example for flatMap

This function is pretty complex and hard to understand. In the current example, there is a map with String for key and List for value. The example below merges all list values in one result list. Note that Map interface does not have stream() method. Instead, first entrySet() is invoked which returns Set and then invoke its stream() method. Once stream is created flatMap() is called and result of Function argument should be stream: map -> map.getValue().stream(). This resultant stream is a merge of all list values streams, which is then collected to a List.

flatMap code

public static List<String> flapMap(Map<String, List<String>> mapToProcess) {
	return mapToProcess.entrySet()
		.stream()
		.flatMap(map -> map.getValue().stream())
		.collect(Collectors.toList());
}

unit test

@Test
public void test_flapMap() {
	Map<String, List<String>> map = new HashMap<>();
	map.put("1", Arrays.asList("a", "b"));
	map.put("2", Arrays.asList("C", "D"));

	List<String> expectedResult = Arrays.asList("a", "b", "C", "D");

	List<String> result = BasicStreamExamples.flapMap(map);

	assertEquals(expectedResult, result);
}

Examples of limit and skip

limit code

public static List<String> limitValues(List<String> stringList, long limit) {
	return stringList.stream()
		.limit(limit)
		.collect(Collectors.toList());
}

limit unit test

@Test
public void test_limitValues() {
	List<String> stringList = Arrays.asList("a", "b", "c", "d");

	List<String> result = BasicStreamExamples.limitValues(stringList, 2);

	assertEquals(2, result.size());
	assertEquals("a", result.get(0));
	assertEquals("b", result.get(1));
}

skip code

public static List<String> skipValues(List<String> stringList, long skip) {
	return stringList.stream()
		.skip(skip)
		.collect(Collectors.toList());
}

skip unit test

@Test
public void test_skipValues() {
	List<String> stringList = Arrays.asList("a", "b", "c", "d");

	List<String> result = BasicStreamExamples.skipValues(stringList, 2);

	assertEquals(2, result.size());
	assertEquals("c", result.get(0));
	assertEquals("d", result.get(1));
}

Example for forEach

forEach code

public static void printEachElement(List<String> stringList) {
	stringList.stream()
		.forEach(element -> System.out.println("Element: " + element));
}

unit test

@Test
public void test_printEachElement() {
	List<String> stringList = Arrays.asList("a", "b", "c", "d");

	BasicStreamExamples.printEachElement(stringList);
}

console output

Element: a
Element: b
Element: c
Element: d

Examples of min and max

min code

public static Optional<Integer> getMin(List<Integer> stringList) {
	return stringList.stream()
		.min(Long::compare);
}

min unit test

@Test
public void test_getMin() {
	List<Integer> integerList = Arrays.asList(234, 123, 345);

	Optional<Integer> result = BasicStreamExamples.getMin(integerList);

	assertEquals(123, (int) result.get());
}

max code

public static Optional<Integer> getMax(List<Integer> integers) {
	return integers.stream()
		.max(Long::compare);
}

max unit test

@Test
public void test_getMax() {
	List<Integer> integerList = Arrays.asList(234, 123, 345);

	Optional<Integer> result = BasicStreamExamples.getMax(integerList);

	assertEquals(345, (int) result.get());
}

Example for reduce

This also is a bit complex method. The method that is given below sums all elements in the provided stream.

reduce code

public static Optional<Integer> sumByReduce(List<Integer> integers) {
	return integers.stream()
		.reduce((x, y) -> x + y);
}

unit test

@Test
public void test_sumByReduce() {
	List<Integer> integerList = Arrays.asList(100, 200, 300);

	Optional<Integer> result = BasicStreamExamples.sumByReduce(integerList);

	assertEquals(600, (int) result.get());
}

Example for count

count code

public static long count(List<Integer> integers) {
	return integers.stream()
		.count();
}

unit test

@Test
public void test_count() {
	List<Integer> integerList = Arrays.asList(234, 123, 345);

	long result = BasicStreamExamples.count(integerList);

	assertEquals(3, result);
}

Example for anyMatch, allMatch, and noneMatch

anyMatch code

public static boolean isOddElementPresent(List<Integer> integers) {
	return integers.stream()
		.anyMatch(element -> element % 2 != 0);
}

allMatch code

public static boolean areAllElementsOdd(List<Integer> integers) {
	return integers.stream()
		.allMatch(element -> element % 2 != 0);
}

noneMatch code

public static boolean areAllElementsEven(List<Integer> integers) {
	return integers.stream()
		.noneMatch(element -> element % 2 != 0);
}

unit test 1

@Test
public void test_anyMatch_allMatch_noneMatch_allEven() {
	List<Integer> integerList = Arrays.asList(234, 124, 346, 124);

	assertFalse(BasicStreamExamples.isOddElementPresent(integerList));
	assertFalse(BasicStreamExamples.areAllElementsOdd(integerList));
	assertTrue(BasicStreamExamples.areAllElementsEven(integerList));
}

unit test 2

@Test
public void test_anyMatch_allMatch_noneMatch_evenAndOdd() {
	List<Integer> integerList = Arrays.asList(234, 123, 345, 123);

	assertTrue(BasicStreamExamples.isOddElementPresent(integerList));
	assertFalse(BasicStreamExamples.areAllElementsOdd(integerList));
	assertFalse(BasicStreamExamples.areAllElementsEven(integerList));
}

unit test 3

@Test
public void test_anyMatch_allMatch_noneMatch_allOdd() {
	List<Integer> integerList = Arrays.asList(233, 123, 345, 123);

	assertTrue(BasicStreamExamples.isOddElementPresent(integerList));
	assertTrue(BasicStreamExamples.areAllElementsOdd(integerList));
	assertFalse(BasicStreamExamples.areAllElementsEven(integerList));
}

Examples for findFirst

In case of List stream has an order and it will return always 234 as result.

findFirst code for List

public static Optional<Integer> getFirstElementList(List<Integer> integers) {
	return integers.stream()
		.findFirst();
}

findFirst unit test for List

@Test
public void test_getFirstElementList() {
	List<Integer> integerList = Arrays.asList(234, 123, 345, 123);

	Optional<Integer> result = BasicStreamExamples
		.getFirstElementList(integerList);

	assertEquals(Integer.valueOf(234), result.get());
}

Since Set has no natural order then there is no guarantee which element is to be returned by findFirst(). On my machine, with my JVM it is 345, but on another machine, with other JVM it might be a different value, so this test most likely will fail for someone else.

findFirst code for Set

public static Optional<Integer> getFirstElementSet(Set<Integer> integers) {
	return integers.stream()
		.findFirst();
}

findFirst unit test for Set

@Test
public void test_getFirstElementSet() {
	Set<Integer> integerSet = new HashSet<>();
	integerSet.add(234);
	integerSet.add(123);
	integerSet.add(345);
	integerSet.add(123);

	Optional<Integer> result = BasicStreamExamples
		.getFirstElementSet(integerSet);

	assertEquals(Integer.valueOf(345), result.get());
}

Examples for findAny

There is no guarantee which element is to be returned by findAny(). On my machine, with my JVM it is 234, but on another machine, with other JVM it might be a different value, so this test most likely will fail for someone else.

findAny code

public static Optional<Integer> getAnyElement(List<Integer> integers) {
	return integers.stream()
		.findAny();
}

findAny unit test

@Test
public void test_getGetAnyElement() {
	List<Integer> integerList = Arrays.asList(234, 123, 345, 123);

	Optional<Integer> result = BasicStreamExamples
		.getAnyElement(integerList);

	assertEquals(Integer.valueOf(234), result.get());
}

Conclusion

These basic code examples give an idea how Java 8 Stream API operations work. More advanced examples are shown in Java 8 features – Stream API advanced examples post.

Related Posts

Read more...

Java 8 features – Stream API explained

Last Updated on by

Post summary: Code examples of Java 8 Stream API showing useful use cases.

In Java 8 features – Lambda expressions, Interface changes, Stream API, DateTime API post I have briefly described most interesting Java 8 features. In the current post, I will give special attention to Stream API. This post is more theoretical which lays the foundation for next posts: Java 8 features – Stream API basic examples and Java 8 features – Stream API advanced examples that gives code examples to explain the theory. Code examples here can be found in GitHub java-samples/java8 repository.

Functional interfaces

Before explaining Stream API it is needed to understand the idea of a functional interface as they are leveraged for use with lambda expressions. A functional interface is an interface that has only one abstract method that is to be implemented. A functional interface may or may not have default or static methods. Although the not mandatory good practice is to annotate a functional interface with @FunctionalInterface. Functional interfaces mostly used in Stream API operations are explained below. You can also use functional interfaces in a method signature, hence lambda expressions can be passed when calling a method. If one’s below are not suitable you can always create own functional interface.

Predicate

Method for implementation is: boolean test(T t). This interface is used in order to evaluate condition to an input object to a boolean expression.

Supplier

Method for implementation is: T get(). This interface is used in order to get output object as a result.

Function

Method for implementation is: R apply(T t). This interface is used in order to produce a result object based on a given input object.

Consumer

Method for implementation is: void accept(T t). This interface is used in order to do an operation on a single input object that does not produce any result.

BiConsumer

Method for implementation is: void accept(T t, U u). This interface is used in order to do an operation on two input objects that do not produce any result.

Method reference

Sometimes when using lambda expression all that is done is calling a single method by name. Method reference provides an easy way to call the method making the code more readable.

Stream API

Stream API is used for data processing which supports parallel operations. It enables data processing in a declarative way. Streams are sequences of elements that support different operations. Streams are lazily computed on demand when elements are needed. The stream is like a recipe that gets executed when actual result is needed.

Stream operations

Stream operations are divided into intermediate and terminal operations combined to form stream pipelines. Intermediate operations return a new stream. They are always lazy. Executing an intermediate operation such as filter() does not actually perform any filtering, but instead creates a new stream. Terminal operations on the other hand, such as collect() generates a result or final value. After the terminal operation is performed, the stream pipeline is considered consumed, and can no longer be used. Intermediate and terminal operators, such as limit() or findFirst() can be short-circuiting, once they achieve their goal they stop further stream processing. Intermediate operations are further divided into stateless and stateful operations. Stateless operations, such as filter() and map(), retain no state from the previously seen element when processing a new element, hence each element can be processed independently of operations on other elements. Stateful operations, such as distinct() and sorted(), may incorporate state from previously seen elements when processing new elements. For example, one cannot produce any results from sorting a stream until one has seen all elements of the stream. As a result, under parallel computation, some pipelines containing stateful intermediate operations may require multiple passes on the data or may need to buffer significant data. Stateful operations should be carefully considered when constructing stream pipeline because they might require significant resources.

Stream API methods

Below is a list of most of the methods available in Stream interface with a short description. Code examples with explanations are in the following post.

filter

Stream filter(Predicate<? super T> predicate) – a stateless intermediate operation that returns a stream consisting of the elements of this stream matching the given predicate.

map

Stream map(Function<? super T, ? extends R> mapper) – a stateless intermediate operation that converts a value of one type into another by applying a function that does the conversion. Result is one output value for one input value.

distinct

Stream distinct() – stateful intermediate operation that removes duplicated elements using equals() method.

sorted

Stream sorted() or Stream sorted(Comparator<? super T> comparator) – stateful intermediate operation that sorts stream elements according to given or default comparator.

peek

Stream peek(Consumer<? super T> action) – a stateless intermediate operation that performs an action on an element once the stream is consumed. It does not change the stream or alter stream elements. It is mainly used for debugging purposes.

collect

<R, A> R collect(Collector<? super T, A, R> collector) or R collect(Supplier supplier, BiConsumer<R, ? super T> accumulator, BiConsumer<R, R> combiner) – terminal operation that performs mutable reduction operation on the stream elements reducing the stream to a mutable result collector, such as an ArrayList. Stream elements are incorporated into the result by updating it instead of replacing.

toArray

Object[] toArray() – terminal operation that returns array containing elements of this stream.

flatMap

<R> Stream<R> flatMap(Function<? super T, ? extends Stream<? extends R>> mapper) – stateless intermediate operation that replaces value with a stream. A result is an arbitrary number of output values to a single input value.

limit

Stream<T> limit(long maxSize) – a short-circuiting stateful intermediate operation that truncates a stream to a given length.

skip

Stream<T> skip(long n) – a stateful intermediate operation that skips first elements from a stream.

forEach

void forEach(Consumer<? super T> action) – a terminal operation that performs an action for each element in the stream

reduce

T reduce(T identity, BinaryOperator<T> accumulator) or Optional<T> reduce(BinaryOperator<T> accumulator) or <U> U reduce(U identity, BiFunction<U, ? super T, U> accumulator, BinaryOperator<U> combiner) – terminal operation that performs reduction on the elements in the stream.

min

Optional<T> min(Comparator<? super T> comparator) – terminal operation that returns min element in stream based on given comparator. Special case of reduce operator.

max

Optional<T> max(Comparator<? super T> comparator) – terminal operation that returns max element in stream based on given comparator. Special case of reduce operator.

count

long count() – a terminal operation that counts elements in a stream.

anyMatch

boolean anyMatch(Predicate<? super T> predicate) – a short-circuiting terminal operation that returns a boolean result if an element in stream conforms to given predicate. Once the result is true operation is cancelled and the result is returned.

allMatch

boolean allMatch(Predicate<? super T> predicate) – a short-circuiting terminal operation that returns a boolean result if all elements in stream conforms to given predicate. Once the result is false operation is cancelled and the result is returned.

noneMatch

boolean noneMatch(Predicate<? super T> predicate) – a short-circuiting terminal operation that returns a boolean result if none elements in stream conform to given predicate. Once the result is false operation is cancelled and the result is returned.

findFirst

Optional<T> findFirst() – a short-circuiting terminal operation that returns an Optional with the first element of this stream or an empty Optional if the stream is empty. If the stream has no order, such as Map or Set, then any element may be returned.

findAny

Optional<T> findAny()  – a short-circuiting terminal operation that returns an Optional with some element of the stream or an empty Optional if the stream is empty.

Conclusion

Stream API is very powerful instrument provided in Java 8. They allow data processing in a declarative way and in parallel. Code looks very neat and easy to read.

Related Posts

Read more...

Java 8 features – Lambda expressions, Interface changes, Stream API, DateTime API

Last Updated on by

Post summary: Short overview of most interesting and useful Java 8 features.

More details and code examples are available for Stream API in a post to follow.

Java 8

Java 8 is released March 2014, more three years ago, so we should have already been familiar with its features, which are really nice and can significantly improve our code. Below are some of them I find most interesting and important.

Lambda expressions

In math, Lambda calculus is a way for expressing computation based on function abstraction and was first introduced in the 1930s. This is where the name of Lambda expressions in Java comes from. Functional interface is another concept that is closely related to lambda expressions. A functional interface is an interface with just one method that is to be implemented. Lambda expression is an inline code that implements this interface without creating a concrete or anonymous class. Lambda expression is basically an anonymous method. With lambda expression code is treated as data and lambda expression can be passed as an argument to another method allowing code itself to be invoked at a later stage. Sometimes when using lambda expression all you do is call a single method by name. Method reference is a shortcut for calling a method making the code more readable. Lambdas, functional interfaces, and method reference are very much used with Stream API and will be covered in details in a separate post.

Method implementation in an Interface

With this feature interfaces are not what they used to be. It is now possible to have method implementation inside an interface. There are two types of methods – default and static. Default methods have implementation and all classes implementing this interface inherit this implementation. It is possible to override existing default method. Static methods also have an implementation, but cannot be overridden. Static methods are accessible from interfaces only (InterfaceName.methodName()), they are not accessible from classes implementing those interfaces. Having said that it seems now that interface with static methods is a good candidate for utility class, instead of having a final class with private constructor as it is usually done. I will not give code examples for this feature, there are lots of resources online.

Stream API

This might be the most significant feature in Java 8 release. It is related to lambda expressions as Stream methods have functional interfaces in their signature, so it is nice and easy to pass lambda expression. Stream API was introduced because default methods in interfaces were allowed. Interface java.util.Collection was extended with stream() method and if default methods were not allowed this would have meant a lot of custom implementations broken, essentially an incompatible change. Stream API provides methods for building pipelines for data processing. Unlike collections streams are not physical objects, they are abstractions and become physical when they are needed. Huge benefits of streams are that they are designed to facilitate multi-core architectures without developers to worry about it. Everything happens behind the scenes. Stream API is explained in more details in following posts:

Date and Time API

Prior to Java 8 date-time classes were not thread-safe and calculations and date-time manipulations were very hard. Also, time zones management was hard. In Java 8 date-time classes are now immutable which makes then thread-safe. In most of the projects, I’ve seen prior to Java 8 instead of using default Java time classes Joda-Time library was used. It is an amazing library providing so much features to manipulate date and time. In Java 8 date and time classes follow principles from Joda-Time which makes Java 8 Date and Time API very efficient. Actually, the Joda-Time designer was the Java specification lead for JSR 310. In Java 8 there is local and zoned date-time classes. I’m not going to get into details here, there are many tutorials online for Java 8 Date and Time API usage. I just say – start using it! It is located in java.time.* package.

Conclusion

Java 8 has really great features. I anticipate you are already using it, if not – start right now!

Related Posts

Read more...

Run multiple machines in a single Vagrant file

Last Updated on by

Post summary: How to run multiple machines on Vagrant described in a single Vagrantfile.

The code below can be found in GitHub sample-dropwizard-rest-stub repository in Vagrantfile file. This post is part of Vagrant series. All of other Vagrant related posts, as well as more theoretical information what is Vagrant and why to use it, can be found in What is Vagrant and why to use it post.

Vagrantfile

As described in Vagrant introduction post all configurations are done in a single text file called Vagrantfile. Below is a Vagrant file which can be used to initialize two machines. One is same as described in Run Dropwizard Java application on Vagrant post, the other is the one described in Run Docker container on Vagrant post.

Vagrant.configure('2') do |config|

  config.vm.hostname = 'dropwizard'
  config.vm.box = 'opscode-centos-7.2'
  config.vm.box_url = 'http://opscode-vm-bento.s3.amazonaws.com/vagrant/virtualbox/opscode_centos-7.2_chef-provisionerless.box'

  config.vm.synced_folder './', '/vagrant'

  config.vm.define 'jar' do |jar|
    jar.vm.network :forwarded_port, guest: 9000, host: 9100
    jar.vm.network :forwarded_port, guest: 9001, host: 9101

    jar.vm.provider :virtualbox do |vb|
      vb.name = 'dropwizard-rest-stub-jar'
    end

    jar.vm.provision :shell do |shell|
      shell.inline = <<-SHELL
        sudo service dropwizard stop
        sudo yum -y install java
        sudo mkdir -p /var/dropwizard-rest-stub
        sudo mkdir -p /var/dropwizard-rest-stub/logs
        sudo cp /vagrant/target/sample-dropwizard-rest-stub-1.0-SNAPSHOT.jar /var/dropwizard-rest-stub/dropwizard-rest-stub.jar
        sudo cp /vagrant/config-vagrant.yml /var/dropwizard-rest-stub/config.yml
        sudo cp /vagrant/linux_service_file /etc/init.d/dropwizard
        # Replace CR+LF with LF because of Windows
        sudo sed -i -e 's/\r//g' /etc/init.d/dropwizard
        sudo service dropwizard start
      SHELL
    end
  end

  config.vm.define 'docker' do |docker|
    docker.vm.network :forwarded_port, guest: 9000, host: 9000
    docker.vm.network :forwarded_port, guest: 9001, host: 9001

    docker.vm.provider :virtualbox do |vb|
      vb.name = 'dropwizard-rest-stub-docker'
      vb.customize ['modifyvm', :id, '--memory', '768', '--cpus', '2']
    end
  
    docker.vm.provision :shell do |shell|
      shell.inline = <<-SHELL
        sudo yum -y install epel-release
        sudo yum -y install python-pip
        sudo pip install --upgrade pip
        sudo pip install six==1.4
        sudo pip install docker-py
      SHELL
    end
  
    docker.vm.provision :docker do |docker|
      docker.build_image '/vagrant/.', args: '-t dropwizard-rest-stub'
      docker.run 'dropwizard-rest-stub', args: '-it -p 9000:9000 -p 9001:9001 -e ENV_VARIABLE_VERSION=1.1.1'
    end
  end
  
end

Vagrantfile explanation

The file starts with a Vagrant.configure(‘2’) do |config| which states that version 2 of Vagrant API will be used and defines constant with name config to be used below. Guest operating system hostname is set to config.vm.hostname. If you use vagrant-hostsupdater plugin it will add it to your hosts file and you can access it from a browser in case you are developing web applications. With config.vm.box you define which would be the guest operating system. Vagrant maintains config.vm.box = “hashicorp/precise64” which is Ubuntu 12.04 (32 and 64-bit), they also recommend to use Bento’s boxes, but I found issues with Vagrant’s as well as Bento’s boxes so I’ve decided to use one I know is working. I specify where it is located with config.vm.box_url. It is It is CentOS 7.2. With config.vm.synced_folder command, you specify that Vagrantfile location folder is shared as /vagrant/ in the guest operating system. This makes it easy to transfer files between guest and host operating systems. Now comes the part where two different machines are defined. First one is defined with config.vm.define ‘jar’ do |jar|, which declares variable jar to be used later in configurations. All other configurations are well described in Run Dropwizard Java application on Vagrant post. The specific part here is port mapping. In order to avoid port collision port 9000 from the guest is mapped to port 9100 to host with jar.vm.network :forwarded_port, guest: 9000, host: 9100 line. This is because the second machine uses port 9000 from the host. The second machine is defined in config.vm.define ‘docker’ do |docker|, which declares variable docker to be used in further configurations. All other configurations are described in Run Docker container on Vagrant post.

Running Vagrant

Command to start Vagrant machine is: vagrant up. Then in order to invoke provisioning section with actual deployment, you have to call: vagrant provision. All can be done in one step: vagrant up –provision. To shut down the machine use vagrant halt. To delete machine: vagrant destroy.

Conclusion

It is very easy to create Vagrantfile that builds and runs several machines with different applications. It possible to make those machine communicate with each other, hence simulation real environment. Once created file can be reused by all team members. It is executed over and over again making provisioning extremely easy.

Related Posts

Read more...

Run Docker container on Vagrant

Last Updated on by

Post summary: How to run Docker container on Vagrant.

The code below can be found in GitHub sample-dropwizard-rest-stub repository in Vagrantfile-docker file. Since Vagrant requires to have only one Vagrantfile if you want to run this example you have to rename Vagrantfile-docker to Vagrantfile then run Vagrant commands described at the end of this post. This post is part of Vagrant series. All of other Vagrant related posts, as well as more theoretical information what is Vagrant and why to use it, can be found in What is Vagrant and why to use it post.

Vagrantfile

As described in Vagrant introduction post all configurations are done in a single text file called Vagrantfile. Below is a Vagrant file which can be used to deploy and start Docker container on Vagrant. The example here uses Dockerised application that is described in Run Dropwizard application in Docker with templated configuration using environment variables post.

Vagrant.configure('2') do |config|

  config.vm.hostname = 'dropwizard'
  config.vm.box = 'opscode-centos-7.2'
  config.vm.box_url = 'http://opscode-vm-bento.s3.amazonaws.com/vagrant/virtualbox/opscode_centos-7.2_chef-provisionerless.box'

  config.vm.synced_folder './', '/vagrant'

  config.vm.network :forwarded_port, guest: 9000, host: 9000
  config.vm.network :forwarded_port, guest: 9001, host: 9001

  config.vm.provider :virtualbox do |vb|
    vb.name = 'dropwizard-rest-stub-docker'
    vb.customize ['modifyvm', :id, '--memory', '768', '--cpus', '2']
  end

  config.vm.provision :shell do |shell|
    shell.inline = <<-SHELL
      sudo yum -y install epel-release
      sudo yum -y install python-pip
      sudo pip install --upgrade pip
      sudo pip install six==1.4
      sudo pip install docker-py
    SHELL
  end

  config.vm.provision :docker do |docker|
    docker.build_image '/vagrant/.', args: '-t dropwizard-rest-stub'
    docker.run 'dropwizard-rest-stub', args: '-it -p 9000:9000 -p 9001:9001 -e ENV_VARIABLE_VERSION=1.1.1'
  end

end

Vagrantfile explanation

The file starts with a Vagrant.configure(‘2’) do |config| which states that version 2 of Vagrant API will be used and defines constant with name config to be used below. Guest operating system hostname is set to config.vm.hostname. If you use vagrant-hostsupdater plugin it will add it to your hosts file and you can access it from a browser in case you are developing web applications. With config.vm.box you define which would be the guest operating system. Vagrant maintains config.vm.box = “hashicorp/precise64” which is Ubuntu 12.04 (32 and 64-bit), they also recommend to use Bento’s boxes. I have found issues with Vagrant’s as well as Bento’s boxes so I’ve decided to use one I know is working. I specify where it is located with config.vm.box_url. It is CentOS 7.2. With config.vm.synced_folder command, you specify that Vagrantfile location folder is shared as /vagrant/ in the guest operating system. This makes it easy to transfer files between guest and host operating systems. This mount is done by default, but it is good to explicitly state it for better readability. With config.vm.network :forwarded_port port from guest OS is forwarded to your hosting OS. Without exposing any port you will not have access to guest OS, only port open by default is 22 for SSH. With config.vm.provider :virtualbox do |vb| you access VirtualBox provider for more configurations, vb.name = ‘dropwizard-rest-stub-docker’ sets the name that you see in Oracle VirtualBox Manager. With vb.customize [‘modifyvm’, :id, ‘–memory’, ‘768’, ‘–cpus’, ‘2’] you modify default hardware settings for the machine, RAM is set to 768MB and 2 CPUs are configured. Finally, the provisioning part takes place which is done by shell commands inside config.vm.provision :shell do |shell| block. This block installs Python as well as docker-py. It is CentOS specific as it uses YUM which is CentOS package manager. Next provisioning part is to run docker provisioner that builds docker image and then runs it by mapping ports and setting an environment variable. For more details how to build and run Docker containers read Run Dropwizard application in Docker with templated configuration using environment variables post.

Running Vagrant

Command to start Vagrant machine is: vagrant up. Then in order to invoke provisioning section with actual deployment, you have to call: vagrant provision. All can be done in one step: vagrant up –provision. To shut down the machine use vagrant halt. To delete machine: vagrant destroy.

Conclusion

It is very easy to create Vagrantfile that builds and runs Docker container. Once created file can be reused by all team members. It is executed over and over again making provisioning extremely easy.

Related Posts

Read more...