Distributed system observability: Instrument Spring Boot application with OpenTelemetry

Last Updated on by

Post summary: Instrumenting a Spring Boot Java application with OpenTelemetry.

This post is part of Distributed system observability: complete end-to-end example series. The code used for this series of blog posts is located in selenium-observability-java GitHub repository.

Spring Boot

Spring Boot makes it easy to create stand-alone, production-grade Spring-based Applications that you can “just run”. Most Spring Boot applications need minimal Spring configuration. Creating a basic Spring Boot application takes a few steps.

Application

The application class is the entry point. It should have @SpringBootApplication annotation. I have added additional @ServletComponentScan annotation in order to register a custom response filter. I want to output the TraceId as a response header.

Application

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.web.servlet.ServletComponentScan;

@ServletComponentScan
@SpringBootApplication
public class PersonServiceApplication {

    public static void main(String[] args) {
        SpringApplication.run(PersonServiceApplication.class, args);
    }
}

Filter

import io.opentelemetry.api.trace.Span;

import javax.servlet.*;
import javax.servlet.annotation.WebFilter;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;

@WebFilter("*")
public class AddResponseHeaderFilter implements Filter {

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException {
        HttpServletResponse httpServletResponse = (HttpServletResponse) response;
        httpServletResponse.setHeader("x-trace-id", Span.current().getSpanContext().getTraceId());
        chain.doFilter(request, response);
    }
}

Controller

In the current example, there is only one controller with two APIs, a POST and GET endpoints with the same path. The class has to be annotated with @RestController. Each API endpoint is annotated with @RequestMapping, with more details about the path and the method. I have used Spring’s constructor-based dependency injection. I have omitted the @Autowired annotation because there is just one constructor.

import com.automationrhapsody.observability.services.PersonService;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.RestController;

import java.util.List;

@RestController
public class PersonController {

    private static final Logger LOGGER 
            = LoggerFactory.getLogger(PersonController.class);

    private PersonService personService;

    public PersonController(PersonService personService) {
        this.personService = personService;
    }

    @RequestMapping(value = "/persons", method = RequestMethod.GET)
    public List<PersonDto> getPersons() {
        LOGGER.info("Processing GET /persons request.");

        List<PersonDto> persons = personService.getPersons();
        return persons;
    }

    @RequestMapping(value = "/persons", method = RequestMethod.POST)
    public Long savePersons(@RequestBody PersonDto person) {
        LOGGER.info("Processing POST /persons request with {}", person);

        Long resultId = personService.savePerson(person);
        return resultId;
    }
}

Repository

Defining a very basic repository can be really easy with Spring, all needed is to extend the CrudRepository interface. If fine-tuning and custom methods are needed then it is needed to create an interface that extends the Spring’s Repository interface and defines custom repository methods inside. Read more in Working with Spring Data Repositories.

import org.springframework.data.repository.CrudRepository;
import org.springframework.stereotype.Repository;

@Repository
public interface FlightRepository extends CrudRepository<PersonEntity, Long> {
}

Service

A service layer is a good idea to handle the business logic between the controller and the repository. In this case, dependency is injected with @Autowired annotation. findAll() is a method that comes from CrudRepository interface. It gets the data from the database. @WithSpan annotation creates a new span in the OpenTelemetry traces. This will be explained in more detail.

import com.automationrhapsody.observability.controllers.PersonDto;
import com.automationrhapsody.observability.repositories.person.FlightRepository;
import com.automationrhapsody.observability.repositories.person.PersonEntity;
import io.opentelemetry.api.common.AttributeKey;
import io.opentelemetry.api.common.Attributes;
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.extension.annotations.WithSpan;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

@Service
public class PersonService {

    private static final Logger LOGGER =
            LoggerFactory.getLogger(PersonService.class);

    @Autowired
    private FlightRepository flightRepository;

    public List<PersonDto> getPersons() {
        doSomeWorkNewChildSpan();
        Iterable<PersonEntity> persons = flightRepository.findAll();
        return StreamSupport.stream(persons.spliterator(), false)
                .map(this::toPersonDto)
                .collect(Collectors.toList());
    }

    @WithSpan
    public void doSomeWorkNewChildSpan() {
        LOGGER.info("Doing some work In New child span");
        Span span = Span.current();
        span.setAttribute("template.a2", "some value");
        span.addEvent("template.processing2.start", attributes("321"));
        span.addEvent("template.processing2.end", attributes("321"));
    }

    private Attributes attributes(String id) {
        return Attributes.of(AttributeKey.stringKey("app.id"), id);
    }

    private PersonDto toPersonDto(PersonEntity person) {
        PersonDto personDto = new PersonDto();
        personDto.setFirstName(person.getFirstName());
        personDto.setLastName(person.getLastName());
        personDto.setEmail(person.getEmail());
        return personDto;
    }
}

Instrumentation

OpenTelemetry provides a way for manual instrumentation, which will be covered in the subsequent Selenium-based post. OpenTelemetry also provides a Java agent JAR, that can be attached to any Java 8+ application and dynamically injects bytecode to capture telemetry from a number of popular libraries and frameworks. This JAR agent is attached to the Spring Boot application described above. This is done in the Docker file. Jaeger exporter and Jaeger backend endpoint are configured with otel.traces.exporter and otel.exporter.jaeger.endpoint environment variables.


# ========= BUILD =========
FROM maven:3-openjdk-11 as builder
WORKDIR /build

COPY pom.xml pom.xml
RUN mvn dependency:resolve

COPY . .
RUN mvn install

# ========= RUN =========
FROM openjdk:11
ENV APP_NAME person-service
# https://github.com/open-telemetry/opentelemetry-java-instrumentation
ENV JAVA_OPTS "$JAVA_OPTS \
  -Dotel.traces.exporter=jaeger \
  -Dotel.exporter.jaeger.endpoint=http://jaeger:14250 \
  -Dotel.metrics.exporter=none \
  -Dotel.resource.attributes="service.name=${APP_NAME}" \
  -Dotel.javaagent.debug=false \
  -javaagent:/app/opentelemetry-javaagent-all.jar"

ADD https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/download/v1.6.2/opentelemetry-javaagent-all.jar /app/opentelemetry-javaagent-all.jar
COPY --from=builder /build/target/$APP_NAME-*.jar /app/$APP_NAME.jar

CMD java $JAVA_OPTS -jar /app/$APP_NAME.jar

This is it. Just by attaching the Java agent application is instrumented and OpenTelemetry is ready to report traces. OpenTelemetry supports a large number of libraries and frameworks, the full list can be found in Supported libraries, frameworks, application servers, and JVMs.

Custom tracking

In many cases, apart from the standard tracing, the application has to report additional traces. This can be very easily done with the @WithSpan annotation. This annotation marks that the execution of this method or constructor should result in a new Span. It can be used to signal OpenTelemetry auto-instrumentation that a new span should be created whenever a marked method is executed. In the example above, a custom attribute has been added to the new Span. This is not mandatory though, just adding the annotation will record the method invocation as a new span in the trace output.

Traces output

In order to monitor a trace, run the examples as described in Distributed system observability: complete end-to-end example with OpenTracing, Jaeger, Prometheus, Grafana, Spring Boot, React and Selenium. Accessing http://localhost:8090/persons generates a trace in Jaeger:

Conclusion

With the steps described above, create a basic Spring Boot application is fairly easy. It is even easier instrumenting the application with OpenTelemetry tracing, just by adding the Java agent JAR. Custom tracing is also made easy with @WithSpan annotation. Traces give very valuable information about the performance of the different parts of the application.

Related Posts

Read more...

Distributed system observability: complete end-to-end example with OpenTracing, Jaeger, Prometheus, Grafana, Spring Boot, React and Selenium

Last Updated on by

Post summary: Code examples and explanations on an end-to-end example showcasing a distributed system observability from the Selenium tests through React front end, all the way to the database calls of a Spring Boot application. Examples are implemented with the OpenTracing toolset and traces are saved in Jaeger. This example also shows a complete observability setup including tools like Grafana, Prometheus, Loki, and Promtail.

This post is part of Distributed system observability: complete end-to-end example series. The code used for this series of blog posts is located in selenium-observability-java GitHub repository.

Introduction

Nowadays, the MIcroservices architecture is very popular. It certainly has its benefits, allowing the companies to deliver faster products to the market. It is much easier to manage several small applications, each one of them with isolated responsibilities, rather than one big fat monolithic application. Microservices architecture has its challenges as well. One of those challenges is traceability. What happens in case of error, where did it occur, what microservices were involved, what were the requests flow through the system, where is the stack trace? In a monolithic application, the stack trace is shown into the logs, giving the exact location of the error. In a microservices landscape, errors are in many cases meaningless, unless there is full traceability of the request flow.

Observability and distributed tracing

Distributed tracing, also called distributed request tracing, is a method used to profile and monitor applications, especially those built using a microservices architecture. Distributed tracing helps pinpoint where failures occur and what causes poor performance. Logs, metrics, and traces are often known as the three pillars of observability. Further reading on observability can be done in The Three Pillars of Observability article.

OpenTracing

OpenTracing is an API specification and libraries, that enables the instrumentation of distributed applications. It is not locked to any particular vendors and allows flexibility just by changing the configuration of already instrumented applications. More details can be found in Instrumenting your application and What is Distributed Tracing?. Current examples are based on OpenTracing libraries and tools.

End-to-end traceability and observability

In the current examples, I am going to give an end-to-end solution, how observability can be achieved in a distributed system. I have used mnadeem/boot-opentelemetry-tempo project as a basis and have extended it with React Frontend and Selenium tests, to provide a complete end-to-end example. Below is a diagram of the full setup. All applications involved will be explained on a higher level.

PostgreSQL and pgAdmin

The basic examples used PostgreSQL, I thought of changing it to MySQL, but when I did short research, I found that PostgreSQL has some advantages. PostgreSQL is an object-relational database, while MySQL is a purely relational database. This means that Postgres includes features like table inheritance and function overloading, which can be important to certain applications. Postgres also adheres more closely to SQL standards. See more in MySQL vs PostgreSQL — Choose the Right Database for Your Project.

pgAdmin is the default user interface to manage a PostgreSQL database, so it is present in the architecture as well.

Spring Boot backend

Spring Boot is used as a backend. I did want to get some exposure to the technology, so I created a very basic application in Spring Boot. It uses the PostgreSQL database for reading and writing data. Spring Boot application is instrumented with OpenTelemetry Java library and exports the traces in Jaeger format directly to the Jaeger backend. It also writes application log files on a file system. Backend exposes APIs, which are consumed by the frontend. More details on the backend can be found in Distributed system observability: Instrument Spring Boot application with OpenTelemetry post.

React frontend

I am very experienced with React, so this was the natural choice for the frontend technology. The frontend uses fetch() to consume the backend APIs. It is instrumented with OpenTelementry JavaScript libraries to trace all communication happening through fetch() and to exports the traces in OpenTelemetry format to the OpenTelemetry collector. The frontend also has manual instrumentation which traces the actions done by end-users on it. More details on the frontend can be found in Distributed system observability: Instrument React application with OpenTelemetry post.

OpenTelemetry collector

OpenTelemetry collector converts the data received from the frontend in OpenTelemetry format into Jaeger format and exports it to the Jaeger backend. The collector is also extracting the span metrics, which are read by Prometheus, read more in Distributed system observability: extract and visualize metrics from OpenTelemetry spans post. Configurations are described in the collector configuration. Local configurations are in otel-config.yaml.

Selenium tests

Selenium was chosen for the web testing framework because of its observability feature. Actually, this was the reason for which I created the current examples. After getting to know the tracing features of Selenium better, I find them not much useful. Selenium does not provide traceability of the tests, but rather on its internal operations and performance. Having started with the tracing and the whole project, I could not ditch it in the middle, so I have to create a custom way to make Selenium trace the tests. Selenium tests export tracing information in Jaeger format directly into the Jaeger backend. More details on the tests can be found in Distributed system observability: Instrument Selenium tests with OpenTelemetry post.

Cypress tests

Cypress is a front-end testing tool built for the modern web. It is most often compared to Selenium. The initial driver of the current post series was Selenium observability. After I got a better understanding of the observability topic, I’ve decided to add examples on Cypress tests observability for more completeness of the examples. Cypress interacts with the Frontend and exports its traces to OpenTelemetry Collector, which then forwards the traces into Jaeger. More details on the tests can be found in Distributed system observability: Instrument Cypress tests with OpenTelemetry post.

Jaeger

Jaeger, inspired by Dapper and OpenZipkin, is an open-source distributed tracing system. It is used for monitoring and troubleshooting microservices-based distributed systems. Jaeger collects all the traces and provides a search and visualization of the traces. In the original examples, Grafana Tempo was used as a backend and Jaeger UI via the jaeger-query module to open the traces. I initially started with it, but Tempo does not provide a possibility to search the traces. I find this rather inconvenient, so I switched completely to Jaeger.

Promtail

Promtail is an agent which ships the contents of the Spring Boot backend logs to a Loki instance. It is usually deployed to every machine that has applications needed to be monitored. Local configurations are in promtail-local.yaml.

Loki

Grafana Loki is a log aggregation system inspired by Prometheus. It does not index the contents of the logs, but rather a set of labels for each log stream. Log data itself is then compressed and stored in chunks. In the current example, logs are being pushed to Loki by Promtrail. Local configurations are in loki-local.yaml.

Prometheus

Prometheus is an open-source monitoring and alerting toolkit. Prometheus collects and stores its metrics as time-series data, i.e. metrics information is stored with the timestamp at which it was recorded, alongside optional key-value pairs called labels. In the current example, Prometheus is monitoring the Sprint Boot backend, Loki, Jaeger, and OpenTelemetry Collector. It pulls the metrics data from those applications at a regular interval and stores them in its database. Alerts can be configured based on the metrics. Local configurations are in prometheus.yaml.

Grafana

Grafana is an open-source solution for running data analytics, pulling up metrics from different data sources, and monitoring applications with the help of customizable dashboards. The tool helps to study, analyze and monitor data over a period of time, technically called time-series analytics. In the current example, Grafana pulls data from Prometheus, Jaeger, and Loki. Local configurations are in grafana-dashboards.yaml and grafana-datasource.yml.

Explore the example

Running the example is very easy. What is needed is Docker compose and IDE that can run JUnit tests, I prefer IntelliJ IDEA. Run the examples:

  1. Check out the source code from https://github.com/llatinov/selenium-observability-java
  2. Run: docker-compose build
  3. Run: docker-compose up
  4. Open selenium-tests Maven project and run all the unit tests

Explore the example artifacts:

pgAdmin

pgAdmin is accessible at http://localhost:8005/. In order to log in, use the following credentials: pgadmin4@pgadmin.org / admin. This is needed only if the database records have to be read or modified.

Jaeger

Jaeger is accessible at http://localhost:16686. The home page shows rich search functionality. There is a dropdown with all available services, then operations performed by the selected service can be also filtered.

A trace can be opened from the search results. It shows all the actions for this trace that have been recorded.

Grafana

Grafana is accessible at http://localhost:3001. Different data sources can be accessed from the left-hand side menu, there is a small compass, the Explore menu. From the top, there is a dropdown with the available data sources.

Grafana -> Loki

From Grafana select Loki as datasource. Search for {job=”person-service”}, this shows all logs for the Spring Boot backend.

Grafana -> Jaeger

Jaeger data source can open a trace by its id. This data source can be used in conjunction with Loki. Search logs in Loki, then open a log, this exposes a Jaeger button.

Jaeger data source can be opened directly from the dropdown, then type the TraceID.

Grafana -> Prometheus

From Grafana select Prometheus as a data source. Search for {job=”person-service”}, this shows all metrics for the Spring Boot backend.

Prometheus

Prometheus is accessible at http://localhost:9090/. Search for {job=”person-service”}, this shows all metrics for the Spring Boot backend.

Furter posts with details

This is an introductory post, more details, explanations, and code examples on actual implementation can be found in the following posts:

Conclusion

Microservices architecture is used more often. Alongside its advantages, it comes with specific challenges. Observability is one of those challenges and is a very important topic in a distributed software system. In the current example, I have shown end-to-end observability achieved with popular open-source tools. The main objective of my experiments was to be able to trace Selenium test execution through all the systems involved in the distributed architecture.

Related Posts

Read more...