Distributed system observability: Instrument Selenium tests with OpenTelemetry

Last Updated on by

Post summary: Instrument Selenium tests with OpenTelemetry and be able to custom trace the tests themselves.

This post is part of Distributed system observability: complete end-to-end example series. The code used for this series of blog posts is located in selenium-observability-java GitHub repository.

Selenium

Selenium is browser automation software. It’s been around for many years and is de-facto the tool for web automation testing. It has bindings in all popular programming languages, which means people can write web automation tests in those languages.

Selenium observability

Selenium 4 comes with a pack of features. One of those features is the Selenium observability feature. It uses OpenTracing to keep track of the request’s lifecycle. This feature was the main driving factor for me to start to research the current examples. I pictured in my head end-to-end observability, from the test action down to the database call. I have to come up with a custom tracing solution, that is described in this post.

Selenium WebDriver architecture

Selenium consist of a client, those are the bindings and server, these are the executables that control the given browser. Both communicate via HTTP calls with JSON payload. This is described in detail in the W3C Selenium specification. I attach a small diagram, I used in a presentation I did a long time ago.

Selenium client instrumentation

Enabling the default selenium observability is very easy. A Jar dependency has to be added in pom.xml, environment variables to be set, and of course running Jaeger instance to collect the traces. Note that this works only for the RemoteWebDriver. It is described in detail in Remote WebDriver.

<dependency>
    <groupId>io.opentelemetry</groupId>
    <artifactId>opentelemetry-exporter-jaeger</artifactId>
    <version>1.6.0</version>
</dependency>
<dependency>
    <groupId>io.grpc</groupId>
    <artifactId>grpc-netty</artifactId>
    <version>1.41.0</version>
</dependency>

This goes along with the WebDriver instantiation code.

System.setProperty("otel.traces.exporter", "jaeger");
System.setProperty("otel.exporter.jaeger.endpoint", "http://localhost:14250");
System.setProperty("otel.resource.attributes", "service.name=selenium-java-client");
System.setProperty("otel.metrics.exporter", "none");
WebDriver driver = new RemoteWebDriver(
                new URL("http://localhost:4444"),
                new ImmutableCapabilities("browserName", "chrome"));

Selenium server instrumentation

Server instrumentation examples are shown in manoj9788/tracing-selenium-grid. Both the standalone server and Selenium grid can be instrumented. In the current examples, I am working only with the standalone server. Unlike the examples, I used Docker to do the instrumentation. I take the default selenium/standalone-chrome:4.0.0 image and install Coursier, a dependency resolver tool, on top of it. Then I run the dependency fetch, so this build sted gets cached for a faster rebuild. Selenium provides –ext flag, which can be set after the standalone command option. I could not make this work only by changing the SE_OPTS environment variable, so I made this rewrite of the startup command in /opt/bin/start-selenium-standalone.sh file. What I did was to change from java -jar to java -cp command, as -cp flag is ignored in case -jar flag is used.

FROM selenium/standalone-chrome:4.0.0

# Install coursier in order to fetch the dependencies
RUN cd /tmp && curl -k -fLo cs https://git.io/coursier-cli-"$(uname | tr LD ld)" && chmod +x cs && ./cs install cs && rm cs

# Download dependencies, so they are availble during run
RUN /home/seluser/.local/share/coursier/bin/cs fetch -p io.opentelemetry:opentelemetry-exporter-jaeger:1.6.0 io.grpc:grpc-netty:1.41.0

# Modify the run command to include dependent JARs in it
RUN sudo sed -i 's~-jar /opt/selenium/selenium-server.jar~-cp "/opt/selenium/selenium-server.jar:$(/home/seluser/.local/share/coursier/bin/cs fetch -p io.opentelemetry:opentelemetry-exporter-jaeger:1.6.0 io.grpc:grpc-netty:1.41.0)" org.openqa.selenium.grid.Main~g' /opt/bin/start-selenium-standalone.sh

# Enable OpenTelemetry
ENV JAVA_OPTS "$JAVA_OPTS \
  -Dotel.traces.exporter=jaeger \
  -Dotel.exporter.jaeger.endpoint=http://jaeger:14250 \
  -Dotel.resource.attributes=service.name=selenium-java-server"

Selenium default traces in Jaeger

RemoteWebDriver client is passing down the traceparent header when making the request to the server, this is why both client and server traces are connected.

Selenium tests custom observability

As stated before, in the case of HTTP calls, the OpenTelemetry binding between both parties is the traceparent header. I want to bind the Selenium tests with the frontend, so it comes naturally to mind – open the URL in the browser and provide this HTTP header. After research, I could not find a way to achieve this. I implemented a custom solution, which is WebDriver independent and can be customized as needed. Moreover, it is a web automation framework independent, this approach can be used with any web automation tool. An example of how tracing can be done with Cypress is shown in Distributed system observability: Instrument Cypress tests with OpenTelemetry post.

Instrument the frontend

In order to achieve linking, a JavaScript function is exposed in the frontend, which creates a parent Span. Then this JS function is called from the tests when needed. This function is named startBindingSpan() and is registered with the window global object. It creates a binding span with the same attributes (traceId, spanId, traceFlags) as the span used in the Selenium tests. This span never ends, so is not recorded in the traces. In order to enable this span, the traceSpan() function has to be manually used in the frontend code, because it links the current frontend context with the binding span. I have added another function, called flushTraces(). It forces the OpenTelemetry library to report the traces to Jaeger. Reporting is done with an HTTP call and the browser should not exit before all reporting requests are sent.

Note: some people consider exposing such a window-bound function in the frontend to modify React state as an anti-pattern. Frontend code is in src/helpers/tracing/index.ts:

declare const window: any
var bindingSpan: Span | undefined

window.startBindingSpan = (traceId: string, spanId: string, traceFlags: number) => {
  bindingSpan = webTracerWithZone.startSpan('')
  bindingSpan.spanContext().traceId = traceId
  bindingSpan.spanContext().spanId = spanId
  bindingSpan.spanContext().traceFlags = traceFlags
}

window.flushTraces = () => {
  provider.activeSpanProcessor.forceFlush().then(() => console.log('flushed'))
}

export function traceSpan<F extends (...args: any)
    => ReturnType<F>>(name: string, func: F): ReturnType<F> {
  var singleSpan: Span
  if (bindingSpan) {
    const ctx = trace.setSpan(context.active(), bindingSpan)
    singleSpan = webTracerWithZone.startSpan(name, undefined, ctx)
    bindingSpan = undefined
  } else {
    singleSpan = webTracerWithZone.startSpan(name)
  }
  return context.with(trace.setSpan(context.active(), singleSpan), () => {
    try {
      const result = func()
      singleSpan.end()
      return result
    } catch (error) {
      singleSpan.setStatus({ code: SpanStatusCode.ERROR })
      singleSpan.end()
      throw error
    }
  })
}

Instrument the Selenium tests

I have created a custom TracingWebDriver wrapper over the WebDriver. It instantiates the OpenTracing client with initializeTracer() method. It has a built-in custom logic when to generate a tracking span, which is the parent of this span, and when to link the tests’ span with the frontend span. Finding an element is done with the custom findElement() method. It creates a child span, linking it to the previously defined currentSpan. Then the window.startBindingSpan() function is being called in the browser in order to create the binding span in the frontend. This is the way to link tests and the frontend. In case of error, Span is recorded as an error and this can be tracked in Jaeger. On driver quit, or on URL change, or maybe on page change via a button, or whenever needed, window.flushTraces() function can be called by invoking forceFlushTraces() method in the tests. This has 1 second of Thread.sleep(), which waits for the tracing request to be fired from the frontend to the Jaeger. Sleeping like this is an anti-pattern for test automation, but I could not find a better way to wait for the traces. If the browser is prematurely closed or the page is navigated, then tracing is incorrect.

package com.automationrhapsody.observability;

import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.StatusCode;
import io.opentelemetry.api.trace.Tracer;
import io.opentelemetry.context.Context;
import io.opentelemetry.exporter.jaeger.JaegerGrpcSpanExporter;
import io.opentelemetry.sdk.OpenTelemetrySdk;
import io.opentelemetry.sdk.resources.Resource;
import io.opentelemetry.sdk.trace.SdkTracerProvider;
import io.opentelemetry.sdk.trace.export.SimpleSpanProcessor;
import org.openqa.selenium.*;
import org.openqa.selenium.remote.tracing.opentelemetry.OpenTelemetryTracer;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

import java.io.File;
import java.time.Duration;
import java.util.List;

public class TracingWebDriver {

    private static final Duration WAIT_SECONDS = Duration.ofSeconds(5);
    private static final String JAEGER_GRPC_URL = "http://localhost:14250";

    private WebDriver driver;
    private Tracer tracer;
    private Span mainSpan;
    private Span currentSpan;

    public TracingWebDriver(boolean isRemote, String className, String methodName) {
        System.setProperty("otel.traces.exporter", "jaeger");
        System.setProperty("otel.exporter.jaeger.endpoint", JAEGER_GRPC_URL);
        System.setProperty("otel.resource.attributes", "service.name=selenium-java-client");
        System.setProperty("otel.metrics.exporter", "none");

        initializeTracer();

        mainSpan = tracer.spanBuilder("webdriver-create").startSpan();
        mainSpan.setAttribute("test.class.name", className);
        mainSpan.setAttribute("test.method.name", methodName);
        setCurrentSpan(mainSpan);
        driver = WebDriverFactory.createDriver(isRemote);
        mainSpan.end();
    }

    public void get(String url) {
        waitToLoad();
        setCurrentSpan(mainSpan);
        Span span = createChildSpan("get: " + url);
        try {
            forceFlushTraces();
            driver.get(url);
            createBrowserBindingSpan(span);
        } catch (Exception ex) {
            span.setStatus(StatusCode.ERROR, ex.getMessage());
            captureScreenshot();
        } finally {
            span.end();
        }
    }

    public WebElement findElement(By by) {
        waitToLoad();
        Span span = createChildSpan("findElement: " + by.toString());
        try {
            createBrowserBindingSpan(span);
            WebDriverWait wait = new WebDriverWait(driver, WAIT_SECONDS);
            return wait.until(ExpectedConditions.visibilityOfElementLocated(by));
        } catch (Exception ex) {
            span.setStatus(StatusCode.ERROR, ex.getMessage());
            captureScreenshot();
            return null;
        } finally {
            span.end();
        }
    }

    public void quit() {
        waitToLoad();
        forceFlushTraces();
        setCurrentSpan(mainSpan);
        Span span = createChildSpan("quit");
        driver.quit();
        span.end();
    }

    public Object executeJavaScript(String script) {
        return ((JavascriptExecutor) driver).executeScript(script);
    }

    public String captureScreenshot() {
        File screenshotFile = ((TakesScreenshot) driver).getScreenshotAs(OutputType.FILE);
        String output = screenshotFile.getAbsolutePath();
        System.out.println(output);
        return output;
    }

    private void waitToLoad() {
        WebDriverWait wait = new WebDriverWait(driver, WAIT_SECONDS);
        wait.until(ExpectedConditions.invisibilityOfElementLocated(By.id("test-progress-indicator")));
    }

    private void initializeTracer() {
        JaegerGrpcSpanExporter exporter = JaegerGrpcSpanExporter.builder().setEndpoint(JAEGER_GRPC_URL).build();
        Resource resource = Resource.builder()
                .put("service.name", "selenium-tests")
                .build();
        SdkTracerProvider provider = SdkTracerProvider.builder()
                .addSpanProcessor(SimpleSpanProcessor.create(exporter))
                .setResource(resource)
                .build();
        OpenTelemetrySdk openTelemetrySdk = OpenTelemetrySdk.builder()
                .setTracerProvider(provider)
                .build();
        tracer = openTelemetrySdk.getTracer("io.opentelemetry.jaeger.exporter");
    }

    private Span createChildSpan(String name) {
        Span span = tracer.spanBuilder(name)
                .setParent(Context.current().with(currentSpan))
                .startSpan();
        setCurrentSpan(span);
        return span;
    }

    private void setCurrentSpan(Span span) {
        currentSpan = span;
        OpenTelemetryTracer.getInstance().setOpenTelemetryContext(Context.current().with(span));
    }

    private void createBrowserBindingSpan(Span span) {
        executeJavaScript("window.startBindingSpan('"
                + span.getSpanContext().getTraceId() + "', '"
                + span.getSpanContext().getSpanId() + "', '"
                + span.getSpanContext().getTraceFlags().asHex() + "')");
    }

    private void forceFlushTraces() {
        executeJavaScript("if (window.flushTraces) window.flushTraces()");
        try {
            Thread.sleep(1000);
        } catch (InterruptedException e) {
            // Do nothing
        }
    }
}

End-to-end traces in Jaeger

In case of error, this is also recorded.

Linking default and custom traces

In an ideal world, I would like to make my custom Span parent of the default Selenium tracing spans, so I can attach the debug information to the custom tracing information. I was not able to do this. I have raised an issue with Selenium, OpenTelementry tracing: be able to link the default tracing with a custom tracing. I contributed to Selenium by adding this possibility to link Selenium traces to the custom traces. This is done with the following code OpenTelemetryTracer.getInstance().setOpenTelemetryContext(Context.current().with(span)). Finally, the full tracing looks as shown on the image:

Conclusion

The out-of-the-box Selenium observability is useful to trace what is happening in a complex grid. It does not give the possibility to trace tests performance and how test steps are affecting the application itself. In the current post, I have described a way to create a custom tracing, which provides end-to-end traceability from the tests down to the database calls. This approach gives the flexibility to be customized for different needs. It involves changes in the application’s frontend code though, which involves the application architecture topic in the discussions.

Related Posts

Read more...

Distributed system observability: complete end-to-end example with OpenTracing, Jaeger, Prometheus, Grafana, Spring Boot, React and Selenium

Last Updated on by

Post summary: Code examples and explanations on an end-to-end example showcasing a distributed system observability from the Selenium tests through React front end, all the way to the database calls of a Spring Boot application. Examples are implemented with the OpenTracing toolset and traces are saved in Jaeger. This example also shows a complete observability setup including tools like Grafana, Prometheus, Loki, and Promtail.

This post is part of Distributed system observability: complete end-to-end example series. The code used for this series of blog posts is located in selenium-observability-java GitHub repository.

Introduction

Nowadays, the MIcroservices architecture is very popular. It certainly has its benefits, allowing the companies to deliver faster products to the market. It is much easier to manage several small applications, each one of them with isolated responsibilities, rather than one big fat monolithic application. Microservices architecture has its challenges as well. One of those challenges is traceability. What happens in case of error, where did it occur, what microservices were involved, what were the requests flow through the system, where is the stack trace? In a monolithic application, the stack trace is shown into the logs, giving the exact location of the error. In a microservices landscape, errors are in many cases meaningless, unless there is full traceability of the request flow.

Observability and distributed tracing

Distributed tracing, also called distributed request tracing, is a method used to profile and monitor applications, especially those built using a microservices architecture. Distributed tracing helps pinpoint where failures occur and what causes poor performance. Logs, metrics, and traces are often known as the three pillars of observability. Further reading on observability can be done in The Three Pillars of Observability article.

OpenTracing

OpenTracing is an API specification and libraries, that enables the instrumentation of distributed applications. It is not locked to any particular vendors and allows flexibility just by changing the configuration of already instrumented applications. More details can be found in Instrumenting your application and What is Distributed Tracing?. Current examples are based on OpenTracing libraries and tools.

End-to-end traceability and observability

In the current examples, I am going to give an end-to-end solution, how observability can be achieved in a distributed system. I have used mnadeem/boot-opentelemetry-tempo project as a basis and have extended it with React Frontend and Selenium tests, to provide a complete end-to-end example. Below is a diagram of the full setup. All applications involved will be explained on a higher level.

PostgreSQL and pgAdmin

The basic examples used PostgreSQL, I thought of changing it to MySQL, but when I did short research, I found that PostgreSQL has some advantages. PostgreSQL is an object-relational database, while MySQL is a purely relational database. This means that Postgres includes features like table inheritance and function overloading, which can be important to certain applications. Postgres also adheres more closely to SQL standards. See more in MySQL vs PostgreSQL — Choose the Right Database for Your Project.

pgAdmin is the default user interface to manage a PostgreSQL database, so it is present in the architecture as well.

Spring Boot backend

Spring Boot is used as a backend. I did want to get some exposure to the technology, so I created a very basic application in Spring Boot. It uses the PostgreSQL database for reading and writing data. Spring Boot application is instrumented with OpenTelemetry Java library and exports the traces in Jaeger format directly to the Jaeger backend. It also writes application log files on a file system. Backend exposes APIs, which are consumed by the frontend. More details on the backend can be found in Distributed system observability: Instrument Spring Boot application with OpenTelemetry post.

React frontend

I am very experienced with React, so this was the natural choice for the frontend technology. The frontend uses fetch() to consume the backend APIs. It is instrumented with OpenTelementry JavaScript libraries to trace all communication happening through fetch() and to exports the traces in OpenTelemetry format to the OpenTelemetry collector. The frontend also has manual instrumentation which traces the actions done by end-users on it. More details on the frontend can be found in Distributed system observability: Instrument React application with OpenTelemetry post.

OpenTelemetry collector

OpenTelemetry collector converts the data received from the frontend in OpenTelemetry format into Jaeger format and exports it to the Jaeger backend. The collector is also extracting the span metrics, which are read by Prometheus, read more in Distributed system observability: extract and visualize metrics from OpenTelemetry spans post. Configurations are described in the collector configuration. Local configurations are in otel-config.yaml.

Selenium tests

Selenium was chosen for the web testing framework because of its observability feature. Actually, this was the reason for which I created the current examples. After getting to know the tracing features of Selenium better, I find them not much useful. Selenium does not provide traceability of the tests, but rather on its internal operations and performance. Having started with the tracing and the whole project, I could not ditch it in the middle, so I have to create a custom way to make Selenium trace the tests. Selenium tests export tracing information in Jaeger format directly into the Jaeger backend. More details on the tests can be found in Distributed system observability: Instrument Selenium tests with OpenTelemetry post.

Cypress tests

Cypress is a front-end testing tool built for the modern web. It is most often compared to Selenium. The initial driver of the current post series was Selenium observability. After I got a better understanding of the observability topic, I’ve decided to add examples on Cypress tests observability for more completeness of the examples. Cypress interacts with the Frontend and exports its traces to OpenTelemetry Collector, which then forwards the traces into Jaeger. More details on the tests can be found in Distributed system observability: Instrument Cypress tests with OpenTelemetry post.

Jaeger

Jaeger, inspired by Dapper and OpenZipkin, is an open-source distributed tracing system. It is used for monitoring and troubleshooting microservices-based distributed systems. Jaeger collects all the traces and provides a search and visualization of the traces. In the original examples, Grafana Tempo was used as a backend and Jaeger UI via the jaeger-query module to open the traces. I initially started with it, but Tempo does not provide a possibility to search the traces. I find this rather inconvenient, so I switched completely to Jaeger.

Promtail

Promtail is an agent which ships the contents of the Spring Boot backend logs to a Loki instance. It is usually deployed to every machine that has applications needed to be monitored. Local configurations are in promtail-local.yaml.

Loki

Grafana Loki is a log aggregation system inspired by Prometheus. It does not index the contents of the logs, but rather a set of labels for each log stream. Log data itself is then compressed and stored in chunks. In the current example, logs are being pushed to Loki by Promtrail. Local configurations are in loki-local.yaml.

Prometheus

Prometheus is an open-source monitoring and alerting toolkit. Prometheus collects and stores its metrics as time-series data, i.e. metrics information is stored with the timestamp at which it was recorded, alongside optional key-value pairs called labels. In the current example, Prometheus is monitoring the Sprint Boot backend, Loki, Jaeger, and OpenTelemetry Collector. It pulls the metrics data from those applications at a regular interval and stores them in its database. Alerts can be configured based on the metrics. Local configurations are in prometheus.yaml.

Grafana

Grafana is an open-source solution for running data analytics, pulling up metrics from different data sources, and monitoring applications with the help of customizable dashboards. The tool helps to study, analyze and monitor data over a period of time, technically called time-series analytics. In the current example, Grafana pulls data from Prometheus, Jaeger, and Loki. Local configurations are in grafana-dashboards.yaml and grafana-datasource.yml.

Explore the example

Running the example is very easy. What is needed is Docker compose and IDE that can run JUnit tests, I prefer IntelliJ IDEA. Run the examples:

  1. Check out the source code from https://github.com/llatinov/selenium-observability-java
  2. Run: docker-compose build
  3. Run: docker-compose up
  4. Open selenium-tests Maven project and run all the unit tests

Explore the example artifacts:

pgAdmin

pgAdmin is accessible at http://localhost:8005/. In order to log in, use the following credentials: pgadmin4@pgadmin.org / admin. This is needed only if the database records have to be read or modified.

Jaeger

Jaeger is accessible at http://localhost:16686. The home page shows rich search functionality. There is a dropdown with all available services, then operations performed by the selected service can be also filtered.

A trace can be opened from the search results. It shows all the actions for this trace that have been recorded.

Grafana

Grafana is accessible at http://localhost:3001. Different data sources can be accessed from the left-hand side menu, there is a small compass, the Explore menu. From the top, there is a dropdown with the available data sources.

Grafana -> Loki

From Grafana select Loki as datasource. Search for {job=”person-service”}, this shows all logs for the Spring Boot backend.

Grafana -> Jaeger

Jaeger data source can open a trace by its id. This data source can be used in conjunction with Loki. Search logs in Loki, then open a log, this exposes a Jaeger button.

Jaeger data source can be opened directly from the dropdown, then type the TraceID.

Grafana -> Prometheus

From Grafana select Prometheus as a data source. Search for {job=”person-service”}, this shows all metrics for the Spring Boot backend.

Prometheus

Prometheus is accessible at http://localhost:9090/. Search for {job=”person-service”}, this shows all metrics for the Spring Boot backend.

Furter posts with details

This is an introductory post, more details, explanations, and code examples on actual implementation can be found in the following posts:

Conclusion

Microservices architecture is used more often. Alongside its advantages, it comes with specific challenges. Observability is one of those challenges and is a very important topic in a distributed software system. In the current example, I have shown end-to-end observability achieved with popular open-source tools. The main objective of my experiments was to be able to trace Selenium test execution through all the systems involved in the distributed architecture.

Related Posts

Read more...

How to gather code coverage with Istanbul and Selenium and pitfalls to avoid

Last Updated on by

Post summary: Istanbul does not seem to be recoding code coverage correctly, it turned out that the tests do navigation by changing the URL, which resets the code coverage.

How to use Istanbul for code coverage of Cypress automated tests was explained in detail in Testing with Cypress – Custom logging of errors and JUnit results post.

Code coverage with Istanbul and Selenium

Recently I had to do it again, this time with Selenium. There are several approaches, which can be taken to measure code coverage with Selenium. Whichever approach is taken, the first step is to instrument the frontend. How to do it with React and create-react-app is described in Testing with Cypress – Code coverage with Istanbul post. Coverage is present in __coverage__ JS frontend variable.

Once the frontend is instrumented, it is important to collect the code coverage after the tests are run. This is where approaches differ. One option is to use istanbul-middleware. In this case, a Node.js backend has to be created and the tests should post the coverage results, taken from __coverage__ to the backend. I find this approach not convenient, so I took the easier one.

Once the test is finished, the code coverage data is collected and saved as a JSON file in a test results folder, then all the results are used to generate the report. I use C# and the code to do so is as simple as:

public void CollectCodeCoverage()
	{
		var data = ((IJavaScriptExecutor)_webDriver)
			.ExecuteScript("return window.__coverage__");
		if (data != null)
		{
			var jsonString = JsonConvert.SerializeObject(data);
			var fileName = $"{_testResultsFolder}/coverage_{DateTime.Now.Ticks}.json";
			File.WriteAllText(fileName, jsonString);
		}
	}

Generating code coverage report

The report is generated with the nyc cli tool. Once all the JSON files are copied into a folder with the name .nyc_output, the command to run the report is nyc report –reporter=html. Nyc can be installed as a global NPM package or can be added to the frontend project inside package.json.

The issues measuring the code coverage

The setup described above is clear and easy to achieve. Although when tests were run, they did not record coverage, which was supposed to be there. I have spent several days trying to figure out what the issue was. And finally, I was able to understand. In my tests, I use _webDriver.Navigate().GoToUrl(). This actually visits a new URL, basically invalidating all the coverage results gathered so far. Once the problem was identified, the solution was pretty simple – save the cove coverage every time before a new URL is about to get opened.

Conclusion

Istanbul is a very good tool to measure the code coverage for web automation tests. In the current post, I have described a pitfall, which should be avoided when using it.

Related Posts

Read more...

Cypress vs. Selenium, is this the end of an era?

Last Updated on by

Post summary: Blog post about a Cypress talk I did recently on a local conference. The presentation compares Cypress with well knows Selenium.

Background

This weekend I did a small talk about Cypress, named “Cypress vs. Selenium, the end of an era?” on QA Challenge Accepted, a local testing conference. This is my second talk on this conference. In 2016 I spoke about Gatling. I haven’t blogged about my Galing talks because my blog covers the tool very extensively. In Performance testing with Gatling post, there is complete Gatling tutorial. In the current post, I will show most of the slides of my presentation and will describe what I have spoken about. The full Cypress presentation can be found on SlideShare: QA Challenge Accepted 4.0 – Cypress vs. Selenium.

Presentation

Selenium Overview

Selenium is a very well known tool, so I will not get into details about it. I will emphasize on its architecture which will be important for the rest of the presentation. Selenium consists of two components. One is so-called bindings, libraries for different programming languages that we use to write our tests with. The other component is the WebDriver. WebDriver is a program that can manage and fully control a specific browser, for which it is designated. The important bit here is that those two components communicate over HTTP by exchanging JSON payload. This is well defined by WebDriver Protocol, which is W3C Candidate Recommendation. Every command used in tests results to a JSON sent through the network. This network communication happens even if tests are run locally. In this case, requests are sent to localhost behind which there is loopback network interface. Even on localhost request travels to Layer 3 of the OSI Model. Request travels through 5 layers, only layers 1 (physical) and 2 (data link) are skipped.

After the conference, I spoke to Anton Angelov, the founder of Automate The Planet and he pointed out that on some WebDrivers on .NET Core, the resolution from localhost in the request to 127.0.0.1, which is the IPv4 address of loopback interface, can take up to a second. This is, of course, .NET Core specific thing. Anyway, in general, resolving localhost to 127.0.0.1 also needs some time which is added to total execution time.

The bottom line of the slide: By architecture Selenium works through the network and this brings delay, which can sometimes be significant.

Cypress Overview

Cypress is used for UI testing but is not based on Selenium. There are many tools out there which bring a lot of abstractions over the WebDriver, but they are limited to the WebDriver as a technology for browser manipulation. All those tools inherit WebDriver limitations. Cypress has its own mechanism for manipulation DOM in the browser. Cypress runs directly in the browser, no network communication involved. By running directly in the browser Cypress has access to everything in the browser, including your application under test. I do not know a valid reason for this, but in my observations, developers strongly do not like Selenium. Cypress is designed with developers in mind, so it is very developer-friendly. Debugging tests with Cypress is easy, there is so-called travel back in time. I will speak for it later.

The bottom line of the slide: Cypress is made from scratch with its own unique DOM manipulation technology and is made with developers in mind.

How to do it

The bottom line of the slide: It is easy to install Cypress. It is easy to write tests with it. It is very easy to debug tests. It is easy to include it in continuous integration or continuous delivery pipelines.

Debug tests in Cypress Test Runner

Cypress Test Runner is a browser instance in which you see all your tests’ steps on the left-hand side. You can click on any step and in the right-hand side window, the application under test is visualized. Cypress makes DOM snapshot before each test steps, so you can easily inspect them.

The bottom line of the slide: Cypress provides DOM snapshots at each test step for easy test debugging.

Library or Framework

Comparison between both tools now begins. Selenium is a library. If you want to make real UI automation you have to combine it with a unit testing framework or make your own runner; you may want to add assertions library or reporting one. This is handy and gives you great flexibility because if you know what you do you can make miracles. You become a creator! If you don’t know what you do you can very easily shoot yourself in the foot. I think this is mostly because developers hate Selenium. Its usage is not straightforward. In order to start writing tests, you have to do a lot of preparational work. This is something developers do not want to invest in, they invest enough in learning all the frameworks related to their work. They do not want to spend time on several more. Cypress, on the other hand, is a complete framework. You install it and start writing tests. It includes Mocha, very famous JavaScript unit testing framework; Chai is assertions library; Chai-jQuery adds jQuery chainer methods to Chai; Sinon is famous JavaScript mocking library that provides mocks, stubs, and spies; Sinon-Chai brings Chai assertions on stubs and spies.

The bottom line of the slide: Selenium is a library allowing you great flexibility. It requires a lot of preparational work before you can start writing tests. Cypress is a complete framework. You install it and start writing tests.

Test Pyramid

Test pyramid illustrates how your test portfolio should look like. Currently is show a test pyramid for a modern web application. In the bottom are the unit tests. They are fast and test most of the functions in your code. By integration tests is meant the following thing. Defacto the standard now for web applications is to use some JavaScript framework and work with data from APIs. Modern web applications process data from APIs. With integration testing, we want to be in control of this data. We want for e.g. to test how web application behaves when API returns an error. If we have stable and well-tested API this scenario we won’t be able to test in reality. By stubbing the data web application works with it is possible to fully test the web application. UI tests are the one executed against deployed, configured and working web application. They are slow and flaky, thus they are limited in number. Selenium works in the UI part of the pyramid. Cypress is there as well, but Cypress is also very good in Integration tests. The dotted line is mostly a wishful thinking – we have unit testing framework included, why not create some unit tests.

The bottom line of the slide: Selenium works only in the UI part of the test pyramid, while Cypress is involved in UI tests and most important in the integration tests.

Programming languages

There are bindings in almost any programming language existing nowadays. If there is not such, by following WebDriver protocol you can create your own binding. Cypress on the other hand only uses JavaScript and will continue to only use JavaScript. There are two reasons for this. FIrst one is not significant, but it is good your tests code to be as you application under test’s code. The most important reason though is that as I said modern web applications are written in JavaScript frameworks. Developers do know JavaScript, so they can very easily write their own Cypress tests.

The bottom line of the slide: Selenium is available in all programming languages. Developers of modern web application know JavaScript. With Cypress, they can write their own tests.

Selectors

Selenium supports 8 different locators. CSS and XPath are the most powerful ones. Cypress supports jQuery selectors. What you can use as a CSS selector in Selenium, you can directly use as a selector in Cypress. The benefit is that jQuery provides more selectors on hand.

The bottom line of the slide: jQuery selectors give more capabilities than CSS selectors.

Supported Browsers

Selenium supports all significant browsers. You can even create your own browser, make WebDriver for it following WebDriver protocol and your current tests will work exactly the same on this new browser. Cypress at this point supports only Chrome. This is maybe the biggest weakness of the tool. Good thing though is that more than 60% of the web uses Chome. Another good thing is that Firefox support is on its way. IE 11 and Edge support is also on the roadmap but with no clear dates.

The bottom line of the slide: Cypress is weak at cross-browser testing. Cypress team is working through to get better in this area.

Cypress vs. Selenium (1)

Comparison of different characteristics:

  • Speed – Selenium tests are generally slow. WebDriver starting is slow, WebDriver is working slowly. Network operating nature of WebDriver also brings some delay. Cypress is super fast. There is no noticeable delay because of the tool itself.
  • Wait for element – in order to do some effective automation with Selenium, waiting for an element is an important part of your framework. You have to have good error catching mechanism as well as retry logic. It takes significant effort to make tests non-flaky. With Cypress, you do not wait. Cypress runs in the browser and knows what is happening behind the scenes, whether the application under test is still busy. In Cypress, you request the element and you get it when the element is ready, no extra code or logic needed.
  • Remote execution – this is what Selenium is made for. You can use Selenium Grid with different browsers and different browser versions. Cypress does not support remote execution.
  • Parallel execution – Selenium is a library that can manipulate the browser. If you make your code thread safe and use a unit testing framework that supports parallel runs then you will have parallel execution. Selenium does not really care about this. Cypress currently does not support parallel execution. It is possible to do it on your own with Docker images, but this involves additional effort. Currently, Cypress team is working on developing parallel execution, so this will happen soon.
  • Headless – both tools support headless Chrome.

Cypress vs. Selenium (2)

Comparison of different characteristics:

  • Screenshot – both perform equally bad because both make screenshot only of the visible part of the page. In order to get the full page, you need to use external JavaScript libraries to capture page and save it as a screenshot. Selenium is a little bit better on screenshot though, because it gives you screenshot object in your tests and you can save it wherever you want. Cypress makes an automatic screenshot with a fixed name. In order to make second screenshot for one test, you need to do some file manipulations for renaming the previous file.
  • Video – Selenium does not record video. Cypress records video by default when tests are run from command line.
  • Documentation – Selenium documentation for me is ugly and not complete. If one has to get acknowledged with Selenium by reading its documentation only that would be very difficult. Cypress team had invested a lot in the documentation. They have their API well described, they have examples and FAQ page.
  • Community – Selenium is an institution. Everybody is using Selenium. For every problem, you may encounter there are already tens of solutions. Cypress does not have such a community yet, not many people are using it. They have chat though in which Cypress developers answer your queries. This chat gets flooded with information, so you can easily get lost.

Cypress vs. Selenium (3)

Comparison of different characteristics:

  • Execute JS – Selenium allows JavaScript execution and this is fast. I’ve seen frameworks where Selenium is used only for JavaScript execution. Cypress is designed to work with JavaScript. It has full access to everything in the browser, including application under test.
  • Switch tabs – Selenium can switch between two tabs of the same browser, Cypress cannot.
  • Several browsers – Selenium can work with several browser windows, even from different browsers. Cypress can work with only one browser instance.
  • Load extensions – both tools allow you to run your tests with some Chrome extension.
  • Manage cookies – both tools manage cookies equally well.

Test Mushroom

This is some funny, ironic but mostly tragic term I’ve made up. It is made up with an analogy to the test pyramid. This mushroom represents a very common test portfolio where the quality of releases totally depends on UI tests. The mushroom leg represents Unit and Integration tests. It is shown with a dotted line as such test are missing. UI tests are slow and flaky, every release sign off takes a lot of time for debugging failed tests. Every release has a risk of failure.

The bottom line of the slide: There are many real-life scenarios where web application quality depends only on UI tests, which are brittle.Cypress gives you instrumentation to act on integrations test, thus reducing the number of UI ones.

The end of an era?

Now is the time to give an answer to the most interesting part of presentation name. Is this an end of an era? My presentation is not really about the competition between those two tools. Everyone has its strengths and weaknesses. They work perfectly combined together. My main point is and I truly believe it is the end of Developers don’t test era. Selenium may not be their favorite tool, but Cypress is made from developers for developers. It is easy to work with and provides features to speed up test writing. If you try the tool and do not like it, at least try to introduce it to developers in your company.

The bottom line of the slide: Cypress is a tool created by developers for developers. Try to introduce it to developers in your organization.

Cypress Sugar (1)

Those are features that make Cypress interesting tool and that make integration testing easy. Cypress runs directly in the browser and has access to everything in the browser, including application under test. Sometimes Selenium tests go through several pages just to bring the application in some desired state. With Cypress, you can programmatically bring the application to this desired state. Cypress provides spies, stubs, and clocks. With spies, you can verify if given JavaScript function has been called, with what arguments or how many times. Stubs allow you to change the default behavior of JavaScript functions and feed to the application under test the data that you need. For e.g. window.fetch is the new way of getting data from API. This can be very easily stubbed. Cypress provides control over the clock in the browser. If you have some animation, instead of waiting for it, you can move the clock forcing animation to show. Cypress allows you to have full control over the network traffic within the browser. You can assert on XMLHttpRequest to an API, verifying that API is called with proper arguments. You can intercept, change, delay, or block response from the API. This allows you to cover various integration scenarios. With Cypress, you can develop even though there is no backend ready yet. You can do TDD (test driven development), create tests and stub the data from missing API in them, develop the UI and then run the tests until they get green.

The bottom line of the slide: Cypress provides great functionalities for stubbing JavaScript functions and control on network traffic within the browser. Those can be used for creating various integration tests.

Cypress Sugar (2)

The essential functionality of most applications is hidden behind a login. Login through the UI slows up the tests. Cypress allows you to send a login request to the backend and it extracts cookies from the response, injects them in the browser and from now on the user is logged in tests. This request takes browser user agent and existing cookies but skips some security limitations, such as CORS (cross-origin resource sharing). Same can be done with Selenium. You can use some HTTP client, send login request, get the response and inject the cookies in the browser. With Selenium, this requires additional effort though, with Cypress it comes out of the box. Last but not least, with Cypress you can test Electron applications. Electron is a framework which enables you to write desktop applications in HTML, CSS, and JavaScript. Those applications are run within Electron browser, which is based on Chromium and Node.js. A good example of an Electron application is Postman (check Introduction to Postman with examples post). Cypress Test Runner is also an Electron application. And Cypress uses Cypress to test Cypress (Test Runner). Testing of Electron applications is not really a straight-forward task. It requires some amount of functions stubbing.

Conclusion

Cypress is a really great tool. It provides very good features to enable you to create integration tests. I have used Selenium way too much in order to dislike it. Tests with it are slow and flaky. I really hope there is something better out there. On the other hand, I have used Cypress way too little to like it very much and think this is the tool. In any way do try Cypress. If you do not like it, then definitely introduce it to your developers.

Related Posts

Read more...

Manage and automatically select needed WebDriver in Java 8 Selenium project

Last Updated on by

Post summary: Example code how to efficiently manage and automatically select needed local WebDriver using Java 8 method reference used as lambda expression.

Code examples in the current post can be found in GitHub selenium-samples-java/design-patterns repository.

Java 8 features

In this example lambda expression and method reference, Java 8 features are used. More in Java 8 features can be found in Java 8 features – Lambda expressions, Interface changes, Stream API, DateTime API post.

Functional interface

Before explaining lambda it is needed to understand the idea of a functional interface as they are leveraged for use with lambda expressions. A functional interface is an interface that has only one abstract method that is to be implemented. A functional interface may or may not have default or static methods (again new Java 8 feature). Although not mandatory, a good practice is to annotate the functional interface with @FunctionalInterface.

Lambda expressions

There is no such term in Java, but you can think of lambda expression as an anonymous method. Lambda expression is a piece of code that provides an inline implementation of a functional interface, eliminating the need for using anonymous classes. Lambda expressions facilitate functional programming and ease development by reducing the amount of code needed.

Method reference

Sometimes when using lambda expression all you do is call a method by name. Method reference provides an easy way to call the method making the code more readable.

Managing WebDriver

The proposed solution of managing WebDriver has enumeration called Browser and class called WebDriverFactory. Another important thing is web drivers should be placed in a folder with name webdrivers and named with a special pattern.

Browser enum

The code is shown below:

package com.automationrhapsody.designpatterns;

import java.util.Arrays;
import java.util.function.Supplier;

import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.openqa.selenium.ie.InternetExplorerDriver;

public enum Browser {
	FIREFOX("gecko", FirefoxDriver::new),
	CHROME("chrome", ChromeDriver::new),
	IE("ie", InternetExplorerDriver::new);

	private String name;
	private Supplier<WebDriver> driverSupplier;

	Browser(String name, Supplier<WebDriver> driverSupplier) {
		this.name = name;
		this.driverSupplier = driverSupplier;
	}

	public String getName() {
		return name;
	}

	public WebDriver getDriver() {
		return driverSupplier.get();
	}

	public static Browser fromString(String value) {
		for (Browser browser : values()) {
			if (value != null && value.toLowerCase().equals(browser.getName())) {
				return browser;
			}
		}
		System.out.println("Invalid driver name passed as 'browser' property. "
			+ "One of: " + Arrays.toString(values()) + " is expected.");
		return FIREFOX;
	}
}

Enumeration’s constructor has Supplier functional interface as a parameter. When the constructor is called method reference FirefoxDriver::new is called as a lambda expression which purpose is to instantiate new Firefox driver. If only lambda expression is used is would be: () -> new FirefoxDriver(). Notice that method reference is much shorter and easy to read. getDriver() method invokes Supplier’s get() method which is implemented by the lambda expression, so lambda expression is executed hence instantiating new web driver. With this approach Firefox web driver object is created only when getDriver() method is called.

WebDriverFactory

Code is:

package com.automationrhapsody.designpatterns;

import java.io.File;

import org.openqa.selenium.WebDriver;

class WebDriverFactory {

	private static final String WEB_DRIVER_FOLDER = "webdrivers";

		public static WebDriver createWebDriver() {
		Browser browser = Browser.fromString(System.getProperty("browser"));
		String arch = System.getProperty("os.arch").contains("64") ? "64" : "32";
		String os = System.getProperty("os.name").toLowerCase().contains("win") 
				? "win.exe" : "linux";
		String driverFileName = browser.getName() + "driver-" + arch + "-" + os;
		String driverFilePath = driversFolder(new File("").getAbsolutePath());
		System.setProperty("webdriver." + browser.getName() + ".driver", 
				driverFilePath + driverFileName);
		return browser.getDriver();
	}

	private static String driversFolder(String path) {
		File file = new File(path);
		for (String item : file.list()) {
			if (WEB_DRIVER_FOLDER.equals(item)) {
				return file.getAbsolutePath() + "/" + WEB_DRIVER_FOLDER + "/";
			}
		}
		return driversFolder(file.getParent());
	}
}

This code recursively searches for a folder named webdrivers in the project. This is done because when you have a multi-module project running from IDE and from Maven has different root folder and finding web drivers is not possible from both simultaneously. Once the folder is found then proper web driver is selected based on OS and architecture. The code reads browser system property which can be passed from outside hence making the selection of web driver easy to configure. The important part is to have web drivers with special naming convention.

Web drivers naming convention

In order code above to work the web drivers should be placed in webdrivers folder in the project and their names should match the pattern: {DIVER_NAME}-{ARCHITECTURE}-{OS}, e.g. geckodriver-64-win.exe for Windows 64 bit and geckodriver-64-linux for Linux 64 bit.

Conclusion

The proposed solution is a very elegant way to manage your web drivers and select proper one just by passing -Dbrowser={BROWSER} Java system property.

Related Posts

Read more...

Complete guide how to use design patterns in automation

Last Updated on by

Post summary: Complete code example of how design patterns can be used in real life test automation.

With series of posts, I’ve described 5 design patterns that each automation test engineer should know and use. I’ve started with a brief description of the patterns. Then I’ve explained in details with code snippets following patterns: Page objects, Facade, Factory, Singleton, Null object. Code examples are located in GitHub for C# and Java.

Overview

This post is intended to bond all together in a complete guide how to do write better automation code. Generally, automation project consists of two parts. Automation framework project and tests project. Current guide is intended to describe how to build your automation testing framework. How to structure your tests is a different topic. Remember once having correctly designed framework then tests will be much more clean, maintainable and easy to write. To keep post shorter some of the code that is not essential for representing the idea is removed. The whole code is on GitHub.

Page objects

Everything starts by defining proper page objects. There is no fixed recipe for this. It all depends on the structure of application under test. The general rule is that repeating elements (header, footer, menu, widget, etc) are extracted as separate objects. The whole idea is to have one element defined in only one place (stay DRY)! Below is our HomePage object. What you can do generally is make search and clear search terms. Note that clearing is done with jQuery code. This is because of a bug I’ve described with a workaround in Selenium WebDriver cannot click UTF-8 icons post.

using OpenQA.Selenium;

namespace AutomationRhapsody.DesignPatterns
{
	class HomePageObject
	{
		private WebDriverFacade webDriver;
		public HomePageObject(WebDriverFacade webDriver)
		{
			this.webDriver = webDriver;
		}

		private IWebElement SearchField
		{
			get { return webDriver.FindElement(By.Id("search")); }
		}

		public void SearchFor(string text)
		{
			SearchField.SendKeys(text);
		}

		public void ClearSearch()
		{
			webDriver.ExecuteJavaScript("$('span.cancel').click()");
		}
	}
}

WebDriver factory

WebDriver factory will be responsible for instantiating the WebDriver based on a condition which browser we want to run our tests with.

using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Firefox;
using OpenQA.Selenium.IE;

namespace AutomationRhapsody.DesignPatterns
{
	public class WebDriverFactory
	{
		public IWebDriver CreateInstance(Browsers browser)
		{
			if (Browsers.Chrome == browser)
			{
				return new ChromeDriver();
			}
			else if (Browsers.IE == browser)
			{
				return new InternetExplorerDriver();
			}
			else
			{
				return new FirefoxDriver();
			}
		}
	}
}

The constructor takes an argument browser type. Browser type is defined as an enumeration. This is very important. Avoid passing back and forth strings. Always stick to enums or special purpose classes. This will save you time investigating bugs in your automation.

public enum Browsers
{
	Chrome, IE, Firefox
}

NullWebElement

This is null object pattern and implements IWebElement. There is a NULL property that is used to compare is given element is not found or no.

using OpenQA.Selenium;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Drawing;

namespace AutomationRhapsody.DesignPatterns
{
	public class NullWebElement : IWebElement
	{
		private const string nullWebElement = "NullWebElement";
		public bool Displayed { get { return false; } }
		public bool Enabled { get { return false; } }
		public Point Location { get { return new Point(0, 0); } }
		public bool Selected { get { return false; } }
		public Size Size { get { return new Size(0, 0); } }
		public string TagName { get { return nullWebElement; } }
		public string Text { get { return nullWebElement; } }
		public void Clear() { }
		public void Click() { }
		public string GetAttribute(string attributeName) { return nullWebElement; }
		public string GetCssValue(string propertyName) { return nullWebElement; }
		public void SendKeys(string text) { }
		public void Submit() { }
		public IWebElement FindElement(By by) { return this; }
		public ReadOnlyCollection<IWebElement> FindElements(By by)
		{
			return new ReadOnlyCollection<IWebElement>(new List<IWebElement>());
		}

		private NullWebElement() { }

		private static NullWebElement instance;
		public static NullWebElement NULL
		{
			get
			{
				if (instance == null)
				{
					instance = new NullWebElement();
				}
				return instance;
			}
		}
	}
}

WebDriver facade

WebDriver facade main responsibility is to define custom behavior on elements location. This gives you centralized control over elements location. The constructor takes browser type and uses the factory to create WebDriver instance which is used internally in the facade. FindElement method defines explicit wait. If the element is not found then NullWebElement which is actual implementation of Null object pattern. The idea is to safely locate elements with try/catch and then just use them skipping checks for null.

using OpenQA.Selenium;
using OpenQA.Selenium.Support.UI;
using System;
using System.Collections.ObjectModel;

namespace AutomationRhapsody.DesignPatterns
{
	public class WebDriverFacade
	{
		private IWebDriver webDriver = null;
		private TimeSpan waitForElement = TimeSpan.FromSeconds(5);

		public WebDriverFacade(Browsers browser)
		{
			WebDriverFactory factory = new WebDriverFactory();
			webDriver = factory.CreateInstance(browser);
		}

		public void Start(string url)
		{
			webDriver.Url = url;
			webDriver.Navigate();
		}

		public void Stop()
		{
			webDriver.Quit();
		}

		public object ExecuteJavaScript(string script)
		{
			return ((IJavaScriptExecutor)webDriver).
				ExecuteScript("return " + script);
		}

		public IWebElement FindElement(By by)
		{
			try
			{
				WebDriverWait wait = new WebDriverWait(webDriver, waitForElement);
				return wait.Until(ExpectedConditions.ElementIsVisible(by));
			}
			catch
			{
				return NullWebElement.NULL;
			}
		}
	}
}

Tests

As I mentioned initially this post is about using efficiently design patterns in your framework automation project. Tests design are not being discussed here. Once you have designed the framework one simple test (without asserts) that makes search will look like the code below.

Browsers browser = Browsers.Chrome;
WebDriverFacade webDriver = new WebDriverFacade(browser);
webDriver.Start("https://automationrhapsody.com/examples/utf8icons.html");

HomePageObject homePage = new HomePageObject(webDriver);
homePage.ClearSearch();
homePage.SearchFor("automation");

webDriver.Stop();

Conclusion

Design patterns are enabling you to write maintainable code. They increase the value of your code. As shown in this series of posts they are not so complicated and you can easily adopt them in your automation. Most important is design patterns increase your value as a professional. So make the time to learn them!

Related Posts

Read more...

Selenium WebDriver cannot click UTF-8 icons

Last Updated on by

Post summary: Selenium WebDriver is not able to click UTF-8 icons. The solution is to use jQuery code.

In given example, there is a simple search form. Search value is cleared by “X” icon. This seems pretty forward case to be automated with Selenium WebDriver. It just seems! The element is properly located but when Selenium tries to click it an ElementNotVisibleException is thrown.

UTF-8 icon

When you inspect the code you can notice element is an empty SPAN. It has a UTF-8 icon displayed over it by CSS “content” property. With this approach, it is very easy to visualize vast amount of interesting icons without being proficient in image editing.

Debug

I’ve set up very simple Selenium project for Visual Studio 2013 in my GitHub repository. On debug, the element can be located, but its width=0 so Selenium treats it as Displayed=false. This seems a good candidate for Selenium bug.

// Locate element
IWebElement element = webDriver.FindElement(By.CssSelector("span.cancel"));
// Debug info
bool isDisplayed = element.Displayed; // false
int width = element.Size.Width; // 0
int heigth = element.Size.Height; // 17

Solve it

Problem solved with the execution of jQuery that will do the actual click of the element:

// jQuery workaround of the click
((IJavaScriptExecutor)webDriver).
	ExecuteScript("$('span.cancel').click()");

If jQuery is not available then try to do the click with a function of JavaScript library used by the application you are automating. If none is used they you can just use DOM JavaScript call:

((IJavaScriptExecutor)webDriver).
	ExecuteScript("document.getElementsByClassName('cancel')[0].click();");

Bonus

There is one more interesting thing about jQuery. You can execute jQuery with some specific selector logic and this will return the same IWebElement if you have located it with WebDriver.

// Locate element with jQuery
element = (IWebElement)((IJavaScriptExecutor)webDriver).
	ExecuteScript("return $('span.cancel')[0]");

One small detail – [0] is needed at the end of jQuery code otherwise a jQuery object is returned and Selenium is not able to cast it to IWebElement. And of course, the element is yet not clickable. I’m just giving an alternative way of locating elements if Selenium way gets too complicated.

Conclusion

jQuery can be used inside Selenium WebDriver for tasks which are impossible to be done otherwise or are too hard and not worth wasting time on it. jQuery can be quite helpful in your automation. So it is worth improving your skill set with it. Remember, the essence of automation is about saving your company time and money, not wasting them on insignificant tasks.

Read more...

Efficient waiting for Ajax call data loading with Selenium WebDriver

Last Updated on by

Post summary: This post is about implementing an efficient mechanism for Selenium WebDriver to wait for elements by execution of jQuery code.

Automating single page application with Selenium WebDriver could be sometimes a tricky task. You can get into the trap of timing issues. Although you set explicit waits you still can try to use an element that is not yet loaded by the Ajax call. Remember Thread.Sleep() is never an option! You can use very tiny sleep (100-200ms) in order to wait for initiation of given process, but never use sleep to wait for the end of the process.

Implement Selenium wrapper (Facade)

I good approach I like is to implement your own FindElement method which is basically a wrapper for Selenium’s methods (Facade design pattern). With this approach, you are hiding unneeded Selenium functionality and have centralized control over locating of elements and explicit waits. Locate behaviour of your entire framework is controlled in just one method.

private static TimeSpan waitForElement = TimeSpan.FromSeconds(10);

public static IWebElement FindElement(By by)
{
	try
	{
		WaitForReady();
		WebDriverWait wait = new WebDriverWait(webDriver, waitForElement);
		return wait.Until(ExpectedConditions.ElementIsVisible(by));
	}
	catch
	{
		return null;
	}
}

Code above is C# one and is implementation of explicit wait with WebDriverWait class from OpenQA.Selenium.Support.UI. You can see an unknown (so far) method WaitForReady(). Note that ElementIsVisible is used instead of ElementExists because element might be on the page but yet not ready to work with.

Wait for Ajax call to finish

Initially, WaitForReady() was supposed to check that Ajax has finished loading by using jQuery.active property. This is in case jQuery is used in the application under test. If this property is 0 then there are no active Ajax request to the server.

private static void WaitForReady()
{
	WebDriverWait wait = new WebDriverWait(webDriver, waitForElement);
	wait.Until(driver => (bool)((IJavaScriptExecutor)driver).
			ExecuteScript("return jQuery.active == 0"));
}

Wait for Ajax call to finish and data to load

You can realize that sometimes it is not enough to wait for Ajax to finish rather than to wait for data to be rendered. There is fancy loader in my application under which is a DIV shown when some action is being performed. If there is one on your application then you’d better wait not only for Ajax to finish but the loader to hide.

private static void WaitForReady()
{
	WebDriverWait wait = new WebDriverWait(webDriver, waitForElement);
	wait.Until(driver =>
	{
		bool isAjaxFinished = (bool)((IJavaScriptExecutor)driver).
			ExecuteScript("return jQuery.active == 0");
		try
		{
			driver.FindElement(By.ClassName("spinner"));
			return false;
		}
		catch
		{
			return isAjaxFinished;
		}
	});
}

If “spinner” location gives exception then loader is not present and we can stop waiting. Good!

Improve the wait for data load

What about performance? When putting a timer the result was ~300ms for each Selenium search for the loader. Not so good… Is 300ms long? Sure not, but taking into consideration this is called every time an element is located then this could make a huge difference in test execution times.

Why not make the same check for a hidden loader, but this time with a JavaScript call to the browser? I’m familiar with jQuery, then why not.

private static void WaitForReady()
{
	WebDriverWait wait = new WebDriverWait(webDriver, waitForElement);
	wait.Until(driver =>
	{
		bool isAjaxFinished = (bool)((IJavaScriptExecutor)driver).
			ExecuteScript("return jQuery.active == 0");
		bool isLoaderHidden = (bool)((IJavaScriptExecutor)driver).
			ExecuteScript("return $('.spinner').is(':visible') == false");
		return isAjaxFinished & isLoaderHidden;
	});
}

Conclusion

Same logic to check that element with class=”spinner” is not visible on the page but this time at a cost of ~30ms. I like it much better this way!

Related Posts

Read more...

Automating SignalR applications with Selenium WebDriver

Last Updated on by

Post summary: This post is about a “No response from server for url” issue during automation of web application using SignalR with Selenium WebDriver. The issue was resolved by configuring the application to connect to the server through WebSocket protocol.

I had to automate with Selenium WebDriver (version 2.43) a single page web application which was built with SignalR. The first thing to do when you start an automation is to automate a smoke test scenario. So did I. It run fine on Internet Explorer (version 11) and Chrome (version 37). I was happy and confident I’m going to finish this project on time. The happy face was dramatically changed when I run the suite on Firefox (version 33). The driver timed out with “No response from server for url” issue. I searched the net for similar problems with no luck. I couldn’t find an end-to-end solution to issues with SignalR automation so I prepared this post.

About SignalR

ASP.NET SignalR is a library for building “real-time” applications that enable the server to push events to the client instead of relying client to request data from the server. Once the application is started SignalR client tries to connect to the server with the transport protocols in the order shown in the list: WebSocket, Server-sent events, Forever Frame (IE only) and finally Ajax long polling.

My application was forced to connect to the server with Long Polling transport (later on I discovered this was caused by a bug in application’s connection logic). Client (browser) opens a connection to the server. The server keeps this connection open for a relatively long time (2 minutes in my case). Seems like this open connection is confusing Selenium and it doesn’t actually know when the browser is ready. “Unstable” page loading strategy did not work out:

FirefoxProfile profile = new FirefoxProfile();
profile.setPreference("webdriver.load.strategy", "unstable");
WebDriver driver = new FirefoxDriver(profile);

The only solution to make Selenium working with SignalR in all three browsers was to make the application under test working with WebSocket transport protocol.

It is also beneficial for the application itself to work with WebSockets. There are several SignalR prerequisites in order to use it with WebSockets: IIS 8 or IIS 8 Express with enabled WebSockets. In order to get enabled Web Sockets are installed additionally to IIS as a Windows feature.

Read more...