Cypress vs. Selenium, is this the end of an era?

Post summary: Blog post about a Cypress talk I recently gave at a local conference. The presentation compares Cypress with the well-known Selenium.

Background

This weekend I gave a short talk about Cypress, named “Cypress vs. Selenium, the end of an era?”, at QA Challenge Accepted, a local testing conference. This was my second talk at this conference; in 2016 I spoke about Gatling. I haven’t blogged about my Gatling talks because my blog already covers the tool very extensively: the Performance testing with Gatling post contains a complete Gatling tutorial. In the current post, I will show most of the slides from my presentation and describe what I spoke about. The full Cypress presentation can be found on SlideShare: QA Challenge Accepted 4.0 – Cypress vs. Selenium.

Presentation

Selenium Overview

Selenium is a very well-known tool, so I will not go into detail about it. I will emphasize its architecture, which is important for the rest of the presentation. Selenium consists of two components. One is the so-called bindings, libraries for different programming languages that we use to write our tests. The other component is the WebDriver, a program that can manage and fully control the specific browser it is designed for. The important bit here is that those two components communicate over HTTP by exchanging JSON payloads. This is well defined by the WebDriver protocol, which is a W3C Candidate Recommendation. Every command used in tests results in a JSON payload sent through the network. This network communication happens even if tests are run locally. In this case, requests are sent to localhost, behind which there is the loopback network interface. Even on localhost, a request travels down to Layer 3 of the OSI model: it passes through 5 layers, and only layers 1 (physical) and 2 (data link) are skipped.
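To make the "JSON over HTTP" point concrete, here is a minimal sketch of what a binding might put on the wire for a single navigate command. It only builds the request and does not send it. The port (9515, chromedriver's default), the session id, and the class and method names are assumptions for illustration; real bindings also create the session first and escape the JSON properly.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Sketch: what a single WebDriver "navigate" command looks like on the wire. */
final class WebDriverCommandSketch {

    // Assumption: a driver listening locally on chromedriver's default port.
    static final String DRIVER_URL = "http://localhost:9515";

    /** Builds the JSON body of the "Navigate To" command (no JSON escaping, sketch only). */
    static String navigateBody(String url) {
        return "{\"url\": \"" + url + "\"}";
    }

    /** Builds, but does not send, the HTTP request a binding would issue. */
    static HttpRequest navigateRequest(String sessionId, String url) {
        return HttpRequest.newBuilder()
                .uri(URI.create(DRIVER_URL + "/session/" + sessionId + "/url"))
                .header("Content-Type", "application/json; charset=utf-8")
                .POST(HttpRequest.BodyPublishers.ofString(navigateBody(url)))
                .build();
    }

    public static void main(String[] args) {
        // "hypothetical-session-id" is made up; a real id comes from the new-session response.
        HttpRequest request = navigateRequest("hypothetical-session-id", "https://example.com");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(navigateBody("https://example.com"));
    }
}
```

Every click, lookup, or keystroke in a test turns into a request of this kind, which is where the per-command network overhead comes from.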

After the conference, I spoke to Anton Angelov, the founder of Automate The Planet, and he pointed out that with some WebDrivers on .NET Core, resolving localhost in the request to 127.0.0.1, the IPv4 address of the loopback interface, can take up to a second. This is, of course, a .NET Core specific thing. In general, though, resolving localhost to 127.0.0.1 also takes some time, which is added to the total execution time.

The bottom line of the slide: By architecture, Selenium works through the network, and this brings delay, which can sometimes be significant.

Cypress Overview

Cypress is used for UI testing but is not based on Selenium. There are many tools out there that add a lot of abstractions over the WebDriver, but they are all limited to the WebDriver as the technology for browser manipulation, so they inherit its limitations. Cypress has its own mechanism for manipulating the DOM in the browser. Cypress runs directly in the browser, with no network communication involved. By running directly in the browser, Cypress has access to everything in it, including your application under test. I do not know a valid reason for this, but in my observations, developers strongly dislike Selenium. Cypress is designed with developers in mind, so it is very developer-friendly. Debugging tests with Cypress is easy thanks to the so-called travel back in time. I will talk about it later.

The bottom line of the slide: Cypress is made from scratch with its own unique DOM manipulation technology and is made with developers in mind.

How to do it

The bottom line of the slide: It is easy to install Cypress. It is easy to write tests with it. It is very easy to debug tests. It is easy to include it in continuous integration or continuous delivery pipelines.

Debug tests in Cypress Test Runner

Cypress Test Runner is a browser instance in which you see all your tests’ steps on the left-hand side. You can click on any step, and the application under test is visualized in the right-hand side window. Cypress takes a DOM snapshot before each test step, so you can easily inspect them.

The bottom line of the slide: Cypress provides DOM snapshots at each test step for easy test debugging.

Library or Framework

The comparison between the two tools now begins. Selenium is a library. If you want to do real UI automation, you have to combine it with a unit testing framework or write your own runner; you may also want to add an assertion library or a reporting one. This is handy and gives you great flexibility, because if you know what you are doing, you can work miracles. You become a creator! If you don’t know what you are doing, you can very easily shoot yourself in the foot. I think this is mostly why developers hate Selenium. Its usage is not straightforward. In order to start writing tests, you have to do a lot of preparatory work. This is something developers do not want to invest in; they already invest enough in learning all the frameworks related to their work and do not want to spend time on several more. Cypress, on the other hand, is a complete framework. You install it and start writing tests. It includes Mocha, a very famous JavaScript unit testing framework; Chai, an assertion library; Chai-jQuery, which adds jQuery chainer methods to Chai; Sinon, a famous JavaScript mocking library that provides mocks, stubs, and spies; and Sinon-Chai, which brings Chai assertions on stubs and spies.

The bottom line of the slide: Selenium is a library that allows you great flexibility. It requires a lot of preparatory work before you can start writing tests. Cypress is a complete framework. You install it and start writing tests.

Test Pyramid

The test pyramid illustrates what your test portfolio should look like. The slide shows a test pyramid for a modern web application. At the bottom are the unit tests. They are fast and test most of the functions in your code. By integration tests, I mean the following. The de facto standard for web applications now is to use some JavaScript framework and work with data from APIs. With integration testing, we want to be in control of this data. We want, for example, to test how the web application behaves when the API returns an error. If we have a stable and well-tested API, we cannot reproduce this scenario against the real backend. By stubbing the data the web application works with, it is possible to fully test the web application. UI tests are the ones executed against a deployed, configured, and working web application. They are slow and flaky, thus they are limited in number. Selenium works in the UI part of the pyramid. Cypress is there as well, but Cypress is also very good at integration tests. The dotted line is mostly wishful thinking: we have a unit testing framework included, so why not create some unit tests.

The bottom line of the slide: Selenium works only in the UI part of the test pyramid, while Cypress is involved in UI tests and, most importantly, in the integration tests.

Programming languages

Selenium has bindings for almost every programming language in use nowadays. If there is none for your language, you can create your own binding by following the WebDriver protocol. Cypress, on the other hand, only uses JavaScript and will continue to only use JavaScript. There are two reasons for this. The first one is less significant: it is good for your test code to be in the same language as the code of your application under test. The most important reason, though, is that, as I said, modern web applications are written with JavaScript frameworks. Developers do know JavaScript, so they can very easily write their own Cypress tests.

The bottom line of the slide: Selenium is available in almost all programming languages. Developers of modern web applications know JavaScript. With Cypress, they can write their own tests.

Selectors

Selenium supports 8 different locators. CSS and XPath are the most powerful ones. Cypress supports jQuery selectors. What you can use as a CSS selector in Selenium, you can directly use as a selector in Cypress. The benefit is that jQuery provides additional selectors on top of CSS, such as :contains() or :visible.

The bottom line of the slide: jQuery selectors give more capabilities than CSS selectors.

Supported Browsers

Selenium supports all significant browsers. You can even create your own browser, write a WebDriver for it following the WebDriver protocol, and your current tests will work exactly the same on this new browser. Cypress at this point supports only Chrome. This is maybe the biggest weakness of the tool. The good thing, though, is that more than 60% of the web uses Chrome. Another good thing is that Firefox support is on its way. IE 11 and Edge support is also on the roadmap, but with no clear dates.

The bottom line of the slide: Cypress is weak at cross-browser testing. The Cypress team is working to get better in this area.

Cypress vs. Selenium (1)

Comparison of different characteristics:

  • Speed – Selenium tests are generally slow. Starting the WebDriver is slow, and the WebDriver itself works slowly. The network-based nature of WebDriver also brings some delay. Cypress is super fast. There is no noticeable delay because of the tool itself.
  • Wait for element – in order to do effective automation with Selenium, waiting for an element is an important part of your framework. You have to have a good error catching mechanism as well as retry logic. It takes significant effort to make tests non-flaky. With Cypress, you do not wait. Cypress runs in the browser and knows what is happening behind the scenes, including whether the application under test is still busy. In Cypress, you request the element and you get it when the element is ready, with no extra code or logic needed.
  • Remote execution – this is what Selenium is made for. You can use Selenium Grid with different browsers and different browser versions. Cypress does not support remote execution.
  • Parallel execution – Selenium is a library that can manipulate the browser. If you make your code thread safe and use a unit testing framework that supports parallel runs, then you will have parallel execution. Selenium does not really care about this. Cypress currently does not support parallel execution. It is possible to do it on your own with Docker images, but this involves additional effort. Currently, the Cypress team is working on parallel execution, so this will happen soon.
  • Headless – both tools support headless Chrome.

Cypress vs. Selenium (2)

Comparison of different characteristics:

  • Screenshot – both perform equally badly because both take a screenshot only of the visible part of the page. In order to get the full page, you need to use external JavaScript libraries to capture the page and save it as a screenshot. Selenium is a little bit better at screenshots, though, because it gives you a screenshot object in your tests and you can save it wherever you want. Cypress takes an automatic screenshot with a fixed name. In order to take a second screenshot in one test, you need to do some file manipulation to rename the previous file.
  • Video – Selenium does not record video. Cypress records video by default when tests are run from the command line.
  • Documentation – Selenium documentation, for me, is ugly and incomplete. Getting acquainted with Selenium by reading only its documentation would be very difficult. The Cypress team has invested a lot in the documentation. They have their API well described, and they have examples and an FAQ page.
  • Community – Selenium is an institution. Everybody is using Selenium. For every problem you may encounter, there are already tens of solutions. Cypress does not have such a community yet, as not many people are using it. They do have a chat, though, in which Cypress developers answer your queries. This chat gets flooded with information, so you can easily get lost.

Cypress vs. Selenium (3)

Comparison of different characteristics:

  • Execute JS – Selenium allows JavaScript execution and this is fast. I’ve seen frameworks where Selenium is used only for JavaScript execution. Cypress is designed to work with JavaScript. It has full access to everything in the browser, including application under test.
  • Switch tabs – Selenium can switch between two tabs of the same browser, Cypress cannot.
  • Several browsers – Selenium can work with several browser windows, even from different browsers. Cypress can work with only one browser instance.
  • Load extensions – both tools allow you to run your tests with some Chrome extension.
  • Manage cookies – both tools manage cookies equally well.

Test Mushroom

This is a funny, ironic, but mostly tragic term I’ve made up by analogy with the test pyramid. This mushroom represents a very common test portfolio where the quality of releases totally depends on UI tests. The mushroom leg represents unit and integration tests. It is shown with a dotted line, as such tests are missing. UI tests are slow and flaky, so every release sign-off takes a lot of time spent debugging failed tests. Every release has a risk of failure.

The bottom line of the slide: There are many real-life scenarios where web application quality depends only on UI tests, which are brittle. Cypress gives you the instruments to work on integration tests, thus reducing the number of UI ones.

The end of an era?

Now is the time to answer the most interesting part of the presentation’s name. Is this the end of an era? My presentation is not really about competition between the two tools. Each has its strengths and weaknesses, and they work perfectly when combined. My main point, and I truly believe it, is that this is the end of the “developers don’t test” era. Selenium may not be their favorite tool, but Cypress is made by developers for developers. It is easy to work with and provides features to speed up test writing. If you try the tool and do not like it, at least try to introduce it to the developers in your company.

The bottom line of the slide: Cypress is a tool created by developers for developers. Try to introduce it to developers in your organization.

Cypress Sugar (1)

Those are the features that make Cypress an interesting tool and that make integration testing easy. Cypress runs directly in the browser and has access to everything in the browser, including the application under test. Sometimes Selenium tests go through several pages just to bring the application into some desired state. With Cypress, you can bring the application to this desired state programmatically. Cypress provides spies, stubs, and clocks. With spies, you can verify whether a given JavaScript function has been called, with what arguments, and how many times. Stubs allow you to change the default behavior of JavaScript functions and feed the application under test the data that you need. For example, window.fetch is the new way of getting data from an API, and it can be stubbed very easily. Cypress provides control over the clock in the browser. If you have some animation, instead of waiting for it, you can move the clock forward, forcing the animation to show. Cypress allows you to have full control over the network traffic within the browser. You can assert on an XMLHttpRequest to an API, verifying that the API is called with proper arguments. You can intercept, change, delay, or block the response from the API. This allows you to cover various integration scenarios. With Cypress, you can develop even though there is no backend ready yet. You can do TDD (test-driven development): create tests and stub the data from the missing API in them, develop the UI, and then run the tests until they turn green.

The bottom line of the slide: Cypress provides great functionalities for stubbing JavaScript functions and control on network traffic within the browser. Those can be used for creating various integration tests.

Cypress Sugar (2)

The essential functionality of most applications is hidden behind a login. Logging in through the UI slows down the tests. Cypress allows you to send a login request to the backend; it extracts cookies from the response and injects them into the browser, and from then on the user is logged in for the tests. This request takes the browser user agent and existing cookies but skips some security limitations, such as CORS (cross-origin resource sharing). The same can be done with Selenium. You can use some HTTP client, send a login request, get the response, and inject the cookies into the browser. With Selenium, this requires additional effort, though, while with Cypress it comes out of the box. Last but not least, with Cypress you can test Electron applications. Electron is a framework that enables you to write desktop applications in HTML, CSS, and JavaScript. Those applications run within the Electron browser, which is based on Chromium and Node.js. A good example of an Electron application is Postman (check the Introduction to Postman with examples post). Cypress Test Runner is also an Electron application. And Cypress uses Cypress to test Cypress (Test Runner). Testing Electron applications is not really a straightforward task. It requires a certain amount of function stubbing.
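The Selenium variant of the login shortcut described above can be sketched as follows. This is only the cookie-extraction half, using the JDK's own java.net.HttpCookie; the header values and the class name are hypothetical, and the actual login request and the hand-over to the browser (shown as a comment) depend on your application and HTTP client.

```java
import java.net.HttpCookie;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch: turn Set-Cookie header values from a login response into name/value pairs. */
final class LoginCookieSketch {

    /** Parses raw Set-Cookie header values and keeps only cookie names and values. */
    static Map<String, String> cookiesFrom(List<String> setCookieHeaders) {
        Map<String, String> cookies = new LinkedHashMap<>();
        for (String header : setCookieHeaders) {
            for (HttpCookie cookie : HttpCookie.parse(header)) {
                cookies.put(cookie.getName(), cookie.getValue());
            }
        }
        return cookies;
    }

    public static void main(String[] args) {
        // Hypothetical header values, as a login endpoint might return them.
        List<String> headers = Arrays.asList(
                "session=abc123; Path=/; HttpOnly",
                "csrf=xyz789; Path=/");
        for (Map.Entry<String, String> entry : cookiesFrom(headers).entrySet()) {
            // With Selenium, after opening any page on the same domain, each pair would be injected with:
            // driver.manage().addCookie(new org.openqa.selenium.Cookie(entry.getKey(), entry.getValue()));
            System.out.println(entry.getKey() + "=" + entry.getValue());
        }
    }
}
```

The browser has to be on a page of the target domain before cookies are added, which is part of the additional effort compared to Cypress.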

Conclusion

Cypress is a really great tool. It provides very good features that enable you to create integration tests. I have used Selenium long enough to dislike it: tests with it are slow and flaky, and I really hope there is something better out there. On the other hand, I have used Cypress too little to like it very much and to claim that this is the tool. In any case, do try Cypress. If you do not like it, then definitely introduce it to your developers.

Manage and automatically select needed WebDriver in Java 8 Selenium project

Post summary: Example code showing how to efficiently manage and automatically select the needed local WebDriver using a Java 8 method reference in place of a lambda expression.

Code examples in the current post can be found in GitHub selenium-samples-java/design-patterns repository.

Java 8 features

This example uses two Java 8 features: lambda expressions and method references. More on Java 8 features can be found in the Java 8 features – Lambda expressions, Interface changes, Stream API, DateTime API post.

Functional interface

Before explaining lambdas, it is necessary to understand the idea of a functional interface, as lambda expressions are built on them. A functional interface is an interface that has exactly one abstract method to be implemented. A functional interface may or may not have default or static methods (again, a new Java 8 feature). Although not mandatory, it is good practice to annotate the functional interface with @FunctionalInterface.
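As an illustration of the definition above, here is a small sketch of a functional interface. The Greeter interface and its method names are made up for this example and are not part of the sample repository.

```java
/** Sketch of a functional interface: exactly one abstract method. */
@FunctionalInterface
interface Greeter {

    // The single abstract method; this is what a lambda will implement.
    String greet(String name);

    // Default methods are allowed and do not count as abstract methods.
    default String greetTwice(String name) {
        return greet(name) + " " + greet(name);
    }

    // Static methods are allowed as well.
    static Greeter polite() {
        return name -> "Hello, " + name + "!";
    }
}

final class FunctionalInterfaceSketch {
    public static void main(String[] args) {
        Greeter greeter = Greeter.polite();
        System.out.println(greeter.greet("Selenium"));
        System.out.println(greeter.greetTwice("Cypress"));
    }
}
```

If a second abstract method were added to Greeter, the @FunctionalInterface annotation would turn that mistake into a compile error.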

Lambda expressions

There is no such term in Java, but you can think of a lambda expression as an anonymous method. A lambda expression is a piece of code that provides an inline implementation of a functional interface, eliminating the need for anonymous classes. Lambda expressions facilitate functional programming and ease development by reducing the amount of code needed.
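A short side-by-side sketch may help here. It implements the standard java.util.function.Supplier once as an anonymous class and once as a lambda; the class name and the returned string are arbitrary choices for the illustration.

```java
import java.util.function.Supplier;

/** Sketch: the same Supplier written as an anonymous class and as a lambda. */
final class LambdaSketch {

    // Pre-Java 8 style: an anonymous class implementing the functional interface.
    static Supplier<String> anonymousClass() {
        return new Supplier<String>() {
            @Override
            public String get() {
                return "driver";
            }
        };
    }

    // Java 8 style: the lambda body becomes the implementation of get().
    static Supplier<String> lambda() {
        return () -> "driver";
    }

    public static void main(String[] args) {
        System.out.println(anonymousClass().get());
        System.out.println(lambda().get());
    }
}
```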

Method reference

Sometimes, when using a lambda expression, all you do is call a method by name. A method reference provides an easy way to refer to that method directly, making the code more readable.
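The following sketch shows two lambdas next to their method reference equivalents, one for an instance method and one for a constructor. It uses only JDK classes; the class and constant names are chosen just for this illustration.

```java
import java.util.function.Function;
import java.util.function.Supplier;

/** Sketch: lambdas that only call a method, and their method reference equivalents. */
final class MethodReferenceSketch {

    // Lambda that does nothing but call a method on its argument...
    static final Function<String, String> LOWER_LAMBDA = s -> s.toLowerCase();
    // ...and the equivalent method reference.
    static final Function<String, String> LOWER_REF = String::toLowerCase;

    // Lambda that does nothing but call a constructor...
    static final Supplier<StringBuilder> BUILDER_LAMBDA = () -> new StringBuilder();
    // ...and the equivalent constructor reference.
    static final Supplier<StringBuilder> BUILDER_REF = StringBuilder::new;

    public static void main(String[] args) {
        System.out.println(LOWER_LAMBDA.apply("FIREFOX"));
        System.out.println(LOWER_REF.apply("FIREFOX"));
        // Each call to get() runs the constructor again and yields a new object.
        System.out.println(BUILDER_REF.get() != BUILDER_REF.get());
    }
}
```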

Managing WebDriver

The proposed solution for managing WebDriver has an enumeration called Browser and a class called WebDriverFactory. Another important thing is that the web drivers should be placed in a folder named webdrivers and named following a special pattern.

Browser enum

The code is shown below:

package com.automationrhapsody.designpatterns;

import java.util.Arrays;
import java.util.function.Supplier;

import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.openqa.selenium.ie.InternetExplorerDriver;

public enum Browser {
	FIREFOX("gecko", FirefoxDriver::new),
	CHROME("chrome", ChromeDriver::new),
	IE("ie", InternetExplorerDriver::new);

	private String name;
	private Supplier<WebDriver> driverSupplier;

	Browser(String name, Supplier<WebDriver> driverSupplier) {
		this.name = name;
		this.driverSupplier = driverSupplier;
	}

	public String getName() {
		return name;
	}

	public WebDriver getDriver() {
		return driverSupplier.get();
	}

	public static Browser fromString(String value) {
		for (Browser browser : values()) {
			if (value != null && value.toLowerCase().equals(browser.getName())) {
				return browser;
			}
		}
		System.out.println("Invalid driver name passed as 'browser' property. "
			+ "One of: " + Arrays.toString(values()) + " is expected.");
		return FIREFOX;
	}
}

The enumeration’s constructor takes a Supplier functional interface as a parameter. When an enum constant is constructed, the method reference FirefoxDriver::new is passed as the Supplier; it is not executed at that point. Its purpose is to instantiate a new Firefox driver. Written as a lambda expression, it would be: () -> new FirefoxDriver(). Notice that the method reference is shorter and easier to read. The getDriver() method invokes the Supplier’s get() method, which runs the referenced constructor and thus instantiates a new web driver. With this approach, the Firefox web driver object is created only when the getDriver() method is called.
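The deferred creation can be observed with a small experiment. In the sketch below, FakeDriver is a hypothetical stand-in for a real WebDriver that simply counts how many instances were created, so the example runs without Selenium or a browser.

```java
import java.util.function.Supplier;

/** Sketch: a Supplier stored in an enum defers object creation until get() is called. */
final class LazySupplierSketch {

    /** Hypothetical stand-in for a WebDriver; counts created instances. */
    static final class FakeDriver {
        static int created = 0;

        FakeDriver() {
            created++;
        }
    }

    enum Browser {
        FAKE(FakeDriver::new);

        private final Supplier<FakeDriver> driverSupplier;

        Browser(Supplier<FakeDriver> driverSupplier) {
            this.driverSupplier = driverSupplier;
        }

        FakeDriver getDriver() {
            // Only here is the referenced constructor actually invoked.
            return driverSupplier.get();
        }
    }

    public static void main(String[] args) {
        Browser browser = Browser.FAKE;
        System.out.println("after enum init: " + FakeDriver.created);
        browser.getDriver();
        System.out.println("after getDriver(): " + FakeDriver.created);
    }
}
```

With real drivers this matters because constructing a driver starts a browser; storing instances instead of suppliers in the enum would launch every browser as soon as the enum is loaded.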

WebDriverFactory

Code is:

package com.automationrhapsody.designpatterns;

import java.io.File;

import org.openqa.selenium.WebDriver;

class WebDriverFactory {

	private static final String WEB_DRIVER_FOLDER = "webdrivers";

	public static WebDriver createWebDriver() {
		Browser browser = Browser.fromString(System.getProperty("browser"));
		String arch = System.getProperty("os.arch").contains("64") ? "64" : "32";
		String os = System.getProperty("os.name").toLowerCase().contains("win") 
				? "win.exe" : "linux";
		String driverFileName = browser.getName() + "driver-" + arch + "-" + os;
		String driverFilePath = driversFolder(new File("").getAbsolutePath());
		System.setProperty("webdriver." + browser.getName() + ".driver", 
				driverFilePath + driverFileName);
		return browser.getDriver();
	}

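	private static String driversFolder(String path) {
		// Fail with a clear message once the filesystem root has been passed
		if (path == null) {
			throw new IllegalStateException("Folder '" + WEB_DRIVER_FOLDER
				+ "' not found in any parent directory");
		}
		File file = new File(path);
		// list() returns null if the path is not a readable directory
		String[] items = file.list();
		if (items != null) {
			for (String item : items) {
				if (WEB_DRIVER_FOLDER.equals(item)) {
					return file.getAbsolutePath() + "/" + WEB_DRIVER_FOLDER + "/";
				}
			}
		}
		return driversFolder(file.getParent());
	}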
}

This code starts from the current working directory and recursively walks up through the parent folders until it finds one that contains a folder named webdrivers. This is done because in a multi-module project, running from the IDE and running from Maven use different root folders, so a single fixed relative path cannot work for both. Once the folder is found, the proper web driver is selected based on OS and architecture. The code reads the browser system property, which can be passed from outside, making the selection of the web driver easy to configure. The important part is to name the web drivers according to a special naming convention.

Web drivers naming convention

In order for the code above to work, the web drivers should be placed in the webdrivers folder in the project and their names should match the pattern: {DRIVER_NAME}-{ARCHITECTURE}-{OS}, e.g. geckodriver-64-win.exe for Windows 64 bit and geckodriver-64-linux for Linux 64 bit.
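The naming convention can be isolated into a small pure function, which makes it easy to check without touching the file system. The sketch below mirrors the logic of createWebDriver(); the class and method names are made up, and like the original it treats every non-Windows OS as "linux".

```java
import java.util.Locale;

/** Sketch: build the expected driver file name from browser, architecture, and OS. */
final class DriverFileNameSketch {

    /**
     * @param browserName short browser name as used in the Browser enum, e.g. "gecko"
     * @param osArch      value of the os.arch system property, e.g. "amd64"
     * @param osName      value of the os.name system property, e.g. "Windows 10"
     */
    static String driverFileName(String browserName, String osArch, String osName) {
        String arch = osArch.contains("64") ? "64" : "32";
        // Same simplification as the factory: anything that is not Windows is "linux".
        String os = osName.toLowerCase(Locale.ROOT).contains("win") ? "win.exe" : "linux";
        return browserName + "driver-" + arch + "-" + os;
    }

    public static void main(String[] args) {
        System.out.println(driverFileName("gecko", "amd64", "Windows 10"));
        System.out.println(driverFileName("chrome", "x86", "Linux"));
        // File name expected on the machine running this sketch:
        System.out.println(driverFileName("gecko",
                System.getProperty("os.arch"), System.getProperty("os.name")));
    }
}
```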

Conclusion

The proposed solution is a very elegant way to manage your web drivers and select the proper one just by passing the -Dbrowser={BROWSER} Java system property.

Complete guide how to use design patterns in automation

Post summary: Complete code example of how design patterns can be used in real life test automation.

In a series of posts, I’ve described 5 design patterns that every automation test engineer should know and use. I started with a brief description of the patterns. Then I explained the following patterns in detail with code snippets: Page objects, Facade, Factory, Singleton, Null object. Code examples are located in GitHub for C# and Java.

Overview

This post is intended to tie everything together into a complete guide on how to write better automation code. Generally, an automation project consists of two parts: the automation framework project and the tests project. The current guide describes how to build your automation testing framework. How to structure your tests is a different topic. Remember that once you have a correctly designed framework, your tests will be much cleaner, more maintainable, and easier to write. To keep the post shorter, some of the code that is not essential for presenting the idea has been removed. The whole code is on GitHub.

Page objects

Everything starts with defining proper page objects. There is no fixed recipe for this. It all depends on the structure of the application under test. The general rule is that repeating elements (header, footer, menu, widget, etc.) are extracted as separate objects. The whole idea is to have one element defined in only one place (stay DRY)! Below is our HomePage object. What you can generally do with it is search and clear the search terms. Note that clearing is done with jQuery code. This is because of a bug that I’ve described, along with a workaround, in the Selenium WebDriver cannot click UTF-8 icons post.

using OpenQA.Selenium;

namespace AutomationRhapsody.DesignPatterns
{
	class HomePageObject
	{
		private WebDriverFacade webDriver;
		public HomePageObject(WebDriverFacade webDriver)
		{
			this.webDriver = webDriver;
		}

		private IWebElement SearchField
		{
			get { return webDriver.FindElement(By.Id("search")); }
		}

		public void SearchFor(string text)
		{
			SearchField.SendKeys(text);
		}

		public void ClearSearch()
		{
			webDriver.ExecuteJavaScript("$('span.cancel').click()");
		}
	}
}

WebDriver factory

The WebDriver factory is responsible for instantiating the WebDriver depending on which browser we want to run our tests with.

using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Firefox;
using OpenQA.Selenium.IE;

namespace AutomationRhapsody.DesignPatterns
{
	public class WebDriverFactory
	{
		public IWebDriver CreateInstance(Browsers browser)
		{
			if (Browsers.Chrome == browser)
			{
				return new ChromeDriver();
			}
			else if (Browsers.IE == browser)
			{
				return new InternetExplorerDriver();
			}
			else
			{
				return new FirefoxDriver();
			}
		}
	}
}

The CreateInstance method takes the browser type as an argument. The browser type is defined as an enumeration. This is very important. Avoid passing strings back and forth. Always stick to enums or special purpose classes. This will save you time investigating bugs in your automation.

public enum Browsers
{
	Chrome, IE, Firefox
}

NullWebElement

This is the null object pattern applied to IWebElement. There is a NULL property that is used to check whether a given element was found or not.

using OpenQA.Selenium;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Drawing;

namespace AutomationRhapsody.DesignPatterns
{
	public class NullWebElement : IWebElement
	{
		private const string nullWebElement = "NullWebElement";
		public bool Displayed { get { return false; } }
		public bool Enabled { get { return false; } }
		public Point Location { get { return new Point(0, 0); } }
		public bool Selected { get { return false; } }
		public Size Size { get { return new Size(0, 0); } }
		public string TagName { get { return nullWebElement; } }
		public string Text { get { return nullWebElement; } }
		public void Clear() { }
		public void Click() { }
		public string GetAttribute(string attributeName) { return nullWebElement; }
		public string GetCssValue(string propertyName) { return nullWebElement; }
		public void SendKeys(string text) { }
		public void Submit() { }
		public IWebElement FindElement(By by) { return this; }
		public ReadOnlyCollection<IWebElement> FindElements(By by)
		{
			return new ReadOnlyCollection<IWebElement>(new List<IWebElement>());
		}

		private NullWebElement() { }

		private static NullWebElement instance;
		public static NullWebElement NULL
		{
			get
			{
				if (instance == null)
				{
					instance = new NullWebElement();
				}
				return instance;
			}
		}
	}
}

WebDriver facade

The WebDriver facade’s main responsibility is to define custom behavior for locating elements. This gives you centralized control over element location. The constructor takes the browser type and uses the factory to create the WebDriver instance that is used internally in the facade. The FindElement method defines an explicit wait. If the element is not found, then NullWebElement, the actual implementation of the Null object pattern, is returned. The idea is to safely locate elements with try/catch and then just use them, skipping checks for null.

using OpenQA.Selenium;
using OpenQA.Selenium.Support.UI;
using System;
using System.Collections.ObjectModel;

namespace AutomationRhapsody.DesignPatterns
{
	public class WebDriverFacade
	{
		private IWebDriver webDriver = null;
		private TimeSpan waitForElement = TimeSpan.FromSeconds(5);

		public WebDriverFacade(Browsers browser)
		{
			WebDriverFactory factory = new WebDriverFactory();
			webDriver = factory.CreateInstance(browser);
		}

		public void Start(string url)
		{
			// Setting Url already navigates; a bare Navigate() call does nothing
			webDriver.Url = url;
		}

		public void Stop()
		{
			webDriver.Quit();
		}

		public object ExecuteJavaScript(string script)
		{
			return ((IJavaScriptExecutor)webDriver).
				ExecuteScript("return " + script);
		}

		public IWebElement FindElement(By by)
		{
			try
			{
				WebDriverWait wait = new WebDriverWait(webDriver, waitForElement);
				return wait.Until(ExpectedConditions.ElementIsVisible(by));
			}
			catch
			{
				return NullWebElement.NULL;
			}
		}
	}
}

Tests

As I mentioned initially, this post is about efficiently using design patterns in your automation framework project. Test design is not discussed here. Once you have designed the framework, one simple test (without asserts) that performs a search will look like the code below.

Browsers browser = Browsers.Chrome;
WebDriverFacade webDriver = new WebDriverFacade(browser);
webDriver.Start("https://automationrhapsody.com/examples/utf8icons.html");

HomePageObject homePage = new HomePageObject(webDriver);
homePage.ClearSearch();
homePage.SearchFor("automation");

webDriver.Stop();

Conclusion

Design patterns enable you to write maintainable code. They increase the value of your code. As shown in this series of posts, they are not so complicated, and you can easily adopt them in your automation. Most importantly, design patterns increase your value as a professional. So make the time to learn them!

Selenium WebDriver cannot click UTF-8 icons

Post summary: Selenium WebDriver is not able to click UTF-8 icons. The solution is to use jQuery code.

In the given example, there is a simple search form. The search value is cleared by an “X” icon. This seems a pretty straightforward case to automate with Selenium WebDriver. It just seems so! The element is properly located, but when Selenium tries to click it, an ElementNotVisibleException is thrown.

UTF-8 icon

When you inspect the code, you can notice that the element is an empty SPAN. It has a UTF-8 icon displayed over it by the CSS “content” property. With this approach, it is very easy to display a vast number of interesting icons without being proficient in image editing.

Debug

I’ve set up a very simple Selenium project for Visual Studio 2013 in my GitHub repository. On debug, the element can be located, but its width is 0, so Selenium treats it as Displayed=false. This seems a good candidate for a Selenium bug.

// Locate element
IWebElement element = webDriver.FindElement(By.CssSelector("span.cancel"));
// Debug info
bool isDisplayed = element.Displayed; // false
int width = element.Size.Width; // 0
int height = element.Size.Height; // 17

Solve it

The problem is solved by executing jQuery code that does the actual click on the element:

// jQuery workaround of the click
((IJavaScriptExecutor)webDriver).
	ExecuteScript("$('span.cancel').click()");

If jQuery is not available, then try to do the click with a function of the JavaScript library used by the application you are automating. If none is used, then you can just use a plain DOM JavaScript call:

((IJavaScriptExecutor)webDriver).
	ExecuteScript("document.getElementsByClassName('cancel')[0].click();");

Bonus

There is one more interesting thing about jQuery. You can execute jQuery with some specific selector logic and it will return the same IWebElement as if you had located it with WebDriver.

// Locate element with jQuery
element = (IWebElement)((IJavaScriptExecutor)webDriver).
	ExecuteScript("return $('span.cancel')[0]");

One small detail: [0] is needed at the end of the jQuery code, otherwise a jQuery object is returned and Selenium is not able to cast it to IWebElement. And of course, the element is still not clickable. I’m just giving an alternative way of locating elements in case the Selenium way gets too complicated.
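The locating trick above can be wrapped in a small helper. This is a minimal sketch under assumptions: the JQueryLocator class and FindByJQuery method are hypothetical names, and jQuery is assumed to be loaded on the page as $. Passing the selector as a script argument avoids quoting problems, and the as cast yields null instead of an exception when nothing matches.

```csharp
using OpenQA.Selenium;

public static class JQueryLocator
{
	// Hypothetical helper: returns the first element matched by a jQuery selector,
	// or null when the selector matches nothing.
	public static IWebElement FindByJQuery(IWebDriver webDriver, string selector)
	{
		// [0] unwraps the jQuery object to a plain DOM node, which Selenium can convert.
		object result = ((IJavaScriptExecutor)webDriver)
			.ExecuteScript("return $(arguments[0])[0];", selector);
		return result as IWebElement;
	}
}
```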

Conclusion

jQuery can be used inside Selenium WebDriver for tasks that are otherwise impossible, or too hard to be worth the time. It can be quite helpful in your automation, so it is worth adding to your skill set. Remember, the essence of automation is saving your company time and money, not wasting them on insignificant tasks.

Efficient waiting for Ajax call data loading with Selenium WebDriver

Post summary: This post is about implementing an efficient mechanism for Selenium WebDriver to wait for elements by execution of jQuery code.

Automating a single-page application with Selenium WebDriver can sometimes be tricky. You can get trapped by timing issues. Even with explicit waits in place, you can still end up using an element that has not yet been loaded by the Ajax call. Remember, Thread.Sleep() is never an option for waiting until a process ends! A very tiny sleep (100-200ms) is acceptable to wait for a given process to start, but never use sleep to wait for it to finish.

Implement Selenium wrapper (Facade)

A good approach I like is to implement your own FindElement method, which is basically a wrapper around Selenium’s methods (the Facade design pattern). With this approach, you hide unneeded Selenium functionality and get centralized control over locating elements and explicit waits. The locating behaviour of your entire framework is controlled in just one method.

private static TimeSpan waitForElement = TimeSpan.FromSeconds(10);

public static IWebElement FindElement(By by)
{
	try
	{
		WaitForReady();
		WebDriverWait wait = new WebDriverWait(webDriver, waitForElement);
		return wait.Until(ExpectedConditions.ElementIsVisible(by));
	}
	catch
	{
		return null;
	}
}

The code above is C# and implements an explicit wait with the WebDriverWait class from OpenQA.Selenium.Support.UI. If the wait fails, the method returns null. It also calls a so far unknown method, WaitForReady(), which is described below. Note that ElementIsVisible is used instead of ElementExists because the element might be on the page but not yet ready to work with.

Wait for Ajax call to finish

Initially, WaitForReady() was only supposed to check that Ajax has finished loading by using the jQuery.active property. This works when jQuery is used in the application under test. If this property is 0, then there are no active Ajax requests to the server.

private static void WaitForReady()
{
	WebDriverWait wait = new WebDriverWait(webDriver, waitForElement);
	wait.Until(driver => (bool)((IJavaScriptExecutor)driver).
			ExecuteScript("return jQuery.active == 0"));
}

Wait for Ajax call to finish and data to load

Sometimes it is not enough to wait for Ajax to finish; you also need to wait for the data to be rendered. My application has a fancy loader, a DIV that is shown while some action is being performed. If your application has one, you’d better wait not only for Ajax to finish but also for the loader to hide.

private static void WaitForReady()
{
	WebDriverWait wait = new WebDriverWait(webDriver, waitForElement);
	wait.Until(driver =>
	{
		bool isAjaxFinished = (bool)((IJavaScriptExecutor)driver).
			ExecuteScript("return jQuery.active == 0");
		try
		{
			driver.FindElement(By.ClassName("spinner"));
			return false;
		}
		catch
		{
			return isAjaxFinished;
		}
	});
}

If locating the “spinner” throws an exception, then the loader is not present and we can stop waiting as soon as Ajax has finished. Good!

Improve the wait for data load

What about performance? Timing the code showed ~300ms for each Selenium search for the loader. Not so good… Is 300ms long? Certainly not, but since this is called every time an element is located, it can make a huge difference in test execution times.

Why not make the same check for a hidden loader, but this time with a JavaScript call to the browser? I’m familiar with jQuery, so why not?

private static void WaitForReady()
{
	WebDriverWait wait = new WebDriverWait(webDriver, waitForElement);
	wait.Until(driver =>
	{
		bool isAjaxFinished = (bool)((IJavaScriptExecutor)driver).
			ExecuteScript("return jQuery.active == 0");
		bool isLoaderHidden = (bool)((IJavaScriptExecutor)driver).
			ExecuteScript("return $('.spinner').is(':visible') == false");
		return isAjaxFinished && isLoaderHidden;
	});
}

Conclusion

This is the same logic, checking that the element with class=”spinner” is not visible on the page, but this time at a cost of ~30ms. I like it much better this way!
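Since every ExecuteScript call is a separate request to the browser, the two checks could also be merged into a single script. This is only a sketch under assumptions: the AjaxWaiter class name is hypothetical, jQuery is assumed to be loaded on the page (otherwise the script fails), and I have not measured whether this variant is noticeably faster.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Support.UI;

public static class AjaxWaiter
{
	// Both conditions are evaluated in the browser, so each poll costs one call.
	private const string ReadyScript =
		"return jQuery.active == 0 && !$('.spinner').is(':visible');";

	public static void WaitForReady(IWebDriver webDriver, TimeSpan timeout)
	{
		WebDriverWait wait = new WebDriverWait(webDriver, timeout);
		// The script returns a JavaScript boolean, which Selenium converts to a C# bool.
		wait.Until(driver => (bool)((IJavaScriptExecutor)driver).ExecuteScript(ReadyScript));
	}
}
```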


Automating SignalR applications with Selenium WebDriver

Post summary: This post is about a “No response from server for url” issue encountered while automating a web application that uses SignalR with Selenium WebDriver. The issue was resolved by configuring the application to connect to the server through the WebSocket protocol.

I had to automate, with Selenium WebDriver (version 2.43), a single-page web application built with SignalR. The first thing to do when you start an automation project is to automate a smoke test scenario. So did I. It ran fine on Internet Explorer (version 11) and Chrome (version 37). I was happy and confident I was going to finish this project on time. The happy face changed dramatically when I ran the suite on Firefox (version 33). The driver timed out with a “No response from server for url” issue. I searched the net for similar problems with no luck. I couldn’t find an end-to-end solution to issues with SignalR automation, so I prepared this post.

About SignalR

ASP.NET SignalR is a library for building “real-time” applications that enables the server to push events to the client instead of relying on the client to request data from the server. Once the application is started, the SignalR client tries to connect to the server with the transport protocols in the following order: WebSocket, Server-sent events, Forever Frame (IE only) and finally Ajax long polling.

My application was forced to connect to the server with the Long Polling transport (later on I discovered this was caused by a bug in the application’s connection logic). The client (browser) opens a connection to the server. The server keeps this connection open for a relatively long time (2 minutes in my case). It seems this open connection confuses Selenium, so it doesn’t actually know when the browser is ready. The “unstable” page loading strategy did not help:

FirefoxProfile profile = new FirefoxProfile();
profile.setPreference("webdriver.load.strategy", "unstable");
WebDriver driver = new FirefoxDriver(profile);

The only solution that made Selenium work with SignalR in all three browsers was to make the application under test use the WebSocket transport protocol.

Working with WebSockets is also beneficial for the application itself. SignalR has several prerequisites for using WebSockets: IIS 8 or IIS 8 Express with WebSockets enabled. To enable them, WebSockets have to be installed in addition to IIS as a Windows feature.
