
REST performance problems with Dropwizard and Jersey JAXB provider


Post summary: Dropwizard’s performance degrades significantly when using REST with XML, caused by Jersey’s abstract JAXB provider. The solution is to inject your own JAXB context provider.

Dropwizard is a Java-based framework for building a RESTful web server in a very short time. I have created a short tutorial on how to do so in the Build a RESTful stub server with Dropwizard post.

Short overview

The application in question is Dropwizard-based and serves as a hub between several systems. Running on Java 7, it receives XML over REST and sends XML over REST to other services. JAXB is a framework for converting XML documents to Java objects and vice versa. In order to do so, JAXB needs to instantiate a context for each and every Java class it binds. Context creation is an expensive operation.
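
Because of this cost, a JAXB context should be built once per class (or set of classes) and reused, while marshallers and unmarshallers are cheap and can be created per call. Below is a minimal sketch of that pattern; the Notification class and its XML payload are hypothetical:

import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Unmarshaller;
import javax.xml.bind.annotation.XmlRootElement;
import java.io.StringReader;

public class JaxbContextReuseExample {

	// Hypothetical payload class, only for illustration
	@XmlRootElement(name = "notification")
	public static class Notification {
		public String message;
	}

	// Expensive: building the context scans the classes and creates
	// binding metadata, so it is done only once and then reused
	private static final JAXBContext CONTEXT = createContext();

	private static JAXBContext createContext() {
		try {
			return JAXBContext.newInstance(Notification.class);
		} catch (JAXBException e) {
			throw new IllegalStateException("Cannot create JAXB context", e);
		}
	}

	public static Notification parse(String xml) throws JAXBException {
		// Cheap: unmarshallers are lightweight and can be created per call
		Unmarshaller unmarshaller = CONTEXT.createUnmarshaller();
		return (Notification) unmarshaller.unmarshal(new StringReader(xml));
	}
}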

Problem

Jersey’s abstract JAXB provider holds weak references to JAXB contexts by using a WeakHashMap. This causes the contexts in that map to be garbage collected very often and new contexts to be created and added to the map again. Both garbage collection and context creation are expensive operations, causing 100% CPU load and very poor performance.

Solution

The solution is to create your own JAXB context provider which keeps contexts for the lifetime of the application. One approach is a map (a thread-safe ConcurrentHashMap in the sketch below) with contexts created on the fly on first access of a specific Java class:

import javax.ws.rs.ext.ContextResolver;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CustomJAXBContextProvider implements ContextResolver<JAXBContext> {

	// Strong references keep the contexts alive for the life of the application;
	// ConcurrentHashMap makes the cache safe for concurrent requests
	private static final Map<Class<?>, JAXBContext> JAXB_CONTEXTS
			= new ConcurrentHashMap<Class<?>, JAXBContext>();

	@Override
	public JAXBContext getContext(Class<?> type) {
		try {
			JAXBContext context = JAXB_CONTEXTS.get(type);
			if (context == null) {
				// Expensive operation, done only once per class
				context = JAXBContext.newInstance(type);
				JAXB_CONTEXTS.put(type, context);
			}
			return context;
		} catch (JAXBException e) {
			// Log the error; returning null lets Jersey fall back to its default provider
			return null;
		}
	}
}

Another approach is one big context created for all the Java objects from specific packages, with the package names separated by a colon. Creating it eagerly at startup keeps the first request fast:

import javax.ws.rs.ext.ContextResolver;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;

public class CustomJAXBContextProvider implements ContextResolver<JAXBContext> {

	// Single context for all classes in the listed packages,
	// created once when the provider class is loaded
	private static final JAXBContext JAXB_CONTEXT;

	static {
		try {
			JAXB_CONTEXT = JAXBContext
					.newInstance("com.acme.foo:com.acme.bar");
		} catch (JAXBException e) {
			throw new ExceptionInInitializerError(e);
		}
	}

	@Override
	public JAXBContext getContext(Class<?> type) {
		// The same pre-built context serves all types
		return JAXB_CONTEXT;
	}
}

Both approaches have pros and cons. The first approach has a fast startup time, but the first request for each class will be slower. The second approach has a fast first request but a slower server startup time. Once the JAXB context provider is in place, a Jersey client should be created with it in the Dropwizard Application class and used for the REST requests:

Client client = new JerseyClientBuilder(environment)
		.using(configuration.getJerseyClientConfiguration())
		.withProvider(CustomJAXBContextProvider.class).build(getName());
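
Since the hub also receives XML over REST, the same provider can be registered on the server side so incoming requests reuse the cached contexts too. A rough sketch of the Application run method is below; HubConfiguration and the resource wiring are placeholders, and the exact imports depend on the Dropwizard version:

import io.dropwizard.Application;
import io.dropwizard.client.JerseyClientBuilder;
import io.dropwizard.setup.Environment;
import javax.ws.rs.client.Client;

public class HubApplication extends Application<HubConfiguration> {

	@Override
	public void run(HubConfiguration configuration, Environment environment) {
		// Use the cached JAXB contexts for incoming XML as well
		environment.jersey().register(CustomJAXBContextProvider.class);

		// Jersey client for outgoing REST calls, sharing the same provider
		Client client = new JerseyClientBuilder(environment)
				.using(configuration.getJerseyClientConfiguration())
				.withProvider(CustomJAXBContextProvider.class)
				.build(getName());

		// ... register resources that use the client ...
	}
}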

Conclusion

There is no practical need to garbage collect JAXB contexts, so they should stay alive as long as the application lives. This is why a custom JAXB provider is a good solution even when there are no actual performance issues.


How to do proper performance testing


Post summary: Describes the actions needed in order to do successful performance testing.

Functional testing that the system works as per user requirements is a must for every application. But if the application is expected to handle a large number of users, then performance testing is also an important task. Performance testing has different aspects like load, stress, and soak. More about them can be found in the Performance, Load, Stress and Soak testing post. All of those are incorporated into the term “performance testing” in the current article. In short, the steps to achieve successful performance testing are:

  1. Set proper goals
  2. Choose tools
  3. Try the tools
  4. Implement scenarios
  5. Prepare environments
  6. Run and measure

Why?

Performance testing should not be something we do for fun or because other people do it. It should be business justified. It is up to the business to decide whether performance testing will have some ROI or not. In this article, I will give a recipe for how to do performance testing.

Setting the goal

This is one of the most important steps before starting any performance initiative. Doing performance testing just for the sake of it is worthless and a waste of effort. Before starting any activity it should be clear how many users are expected, what the peak load is, what users are doing on the site, and more. This information is usually obtained from business and product owners, but it can also be derived from statistical data. Once you have rough numbers, define what answers the performance test should give. Questions could be:

  • Can the system handle 100 simultaneous users with response time less than 1 second and no errors?
  • Can the system handle 50 requests/second with response time less than 1.5 seconds for 1 hour and no more than 2% errors?
  • Can the system work for 2 days with 50 simultaneous users with response time less than 2 seconds?
  • How does the system behave with 1000 users? With 5000 users?
  • When will the system crash?
  • What is the slowest module of the system?

Choosing the tools

Choosing the tool must be done after the estimated load has been defined. There are many commercial and non-commercial tools out there. Some can produce huge traffic and cost lots of money, some can produce mediocre traffic and are free. Important criteria for choosing a tool are how many virtual users it can support and whether it can fulfill the performance goal. Another important thing is whether the QAs will be able to work with it and create scenarios. In the current post, I will mention two open-source tools: JMeter and Gatling.

JMeter

It is a well known and proven tool. It is very easy to work with and no programming skills are needed. No need to spend many words on its benefits, they are many. The problem though is that it has certain limitations on the load it can produce from a single instance. Virtual users are represented as Java threads, and the JVM is not good at handling too many threads. The good thing is it provides a mechanism for adding more hosts that participate in the run, which can produce a huge load, but management of those machines is needed. Also, there are services in the cloud that offer running JMeter test plans, so you can scale up there.

Gatling

A very powerful tool. Built on top of Akka, it enables thousands of virtual users on a single machine. Akka has a message-driven architecture and this overcomes the JVM limitation of handling many threads: virtual users are not threads but messages. The disadvantage is that tests are written in Scala, which makes scenario creation and maintenance more complex.

Try the tools

Do not just rely on the marketing data provided on the website of a given tool. An absolute must is to record user scenarios and play them back with a significant number of users. Try to make it as realistic as possible. Even if this evaluation costs more time, just spend it; it will save a lot of time and money in the long run. This evaluation will give you confidence that the tool can do the job and can be used by the QAs responsible for the performance testing project.

Implement the scenarios

Some of the work should have already been done during the evaluation. Scenarios now must be polished to match the real user experience as much as possible. It is also a good idea to implement a mechanism for changing scenario parameters just by configuration, as sketched below.
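
For example, basic run parameters can be externalized into a properties file so a scenario can be re-tuned without touching the code. A minimal sketch, assuming a hypothetical scenario.properties file with users, rampUpSeconds and thinkTimeMillis keys:

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

public class ScenarioConfig {
	private final int virtualUsers;
	private final int rampUpSeconds;
	private final long thinkTimeMillis;

	public ScenarioConfig(String path) throws IOException {
		Properties props = new Properties();
		try (InputStream in = new FileInputStream(path)) {
			props.load(in);
		}
		// Defaults are used when a property is missing
		virtualUsers = Integer.parseInt(props.getProperty("users", "100"));
		rampUpSeconds = Integer.parseInt(props.getProperty("rampUpSeconds", "60"));
		thinkTimeMillis = Long.parseLong(props.getProperty("thinkTimeMillis", "1000"));
	}

	public int getVirtualUsers() { return virtualUsers; }
	public int getRampUpSeconds() { return rampUpSeconds; }
	public long getThinkTimeMillis() { return thinkTimeMillis; }
}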

The essence of performance testing

In terms of Web or API (REST, SOAP) performance testing, every tool, no matter how fancy it is, in the end does one and the same thing: it sends HTTP requests to the server, then collects and measures the responses. That is it, not rocket science.
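
Stripped of all the tooling, the core of a performance test is just timing an HTTP request, roughly as in the sketch below (plain HttpURLConnection against a hypothetical endpoint):

import java.net.HttpURLConnection;
import java.net.URL;

public class SimplestLoadStep {

	// Sends a single GET request and returns the elapsed time in milliseconds
	public static long timeRequest(String address) throws Exception {
		long start = System.nanoTime();
		HttpURLConnection connection =
				(HttpURLConnection) new URL(address).openConnection();
		int status = connection.getResponseCode(); // waits for the response
		connection.disconnect();
		long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
		System.out.println("Status " + status + " in " + elapsedMillis + " ms");
		return elapsedMillis;
	}

	public static void main(String[] args) throws Exception {
		// Hypothetical endpoint, replace with the system under test
		timeRequest("http://localhost:8080/health");
	}
}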

Include static resources or not?

This is an important question in the case of Web performance testing, and there is no fixed recipe. Successful web applications use a content delivery network (CDN) to serve static content such as images, CSS, JavaScript, and media. If the CDN is a third party and they provide service level agreements (SLAs) for response time, then static data should be skipped in the performance test. If it is our own CDN, then it may be a good idea to make a separate performance testing project just for the CDN itself. This could double the effort but will make each project focused and coherent. If static data is hosted on the same server as the dynamic content, then it may be a good idea to include images as well. It very much depends on the situation. Browsers do have a cache, but it is controlled by correct HTTP response header values. In case of incorrect headers or too frequently changing static content, this can put a significant load on the server.

Virtual users vs. requests/second

Tools for performance testing use virtual users as the main metric. This is a representation of a real-life user. With sleep times between requests, they mimic real user behavior on the application and this gives a simulation very close to reality. The metric, though, is more business oriented. A more technical metric is requests per second, which is what most traffic monitoring tools report. Converting between the two is a tricky task, as it really depends on how the application is performing. I will try to illustrate this with some examples. Let us consider 100 users with a sleep time of 1 second between requests. This theoretically should give a load of 100 requests per second. But if the application responds more slowly than 1 second, it will produce fewer req/s, as each user has to wait for the response before sending the next request. Now let us consider 10 users with no sleep time. If the application responds in 100 ms, then each user will make 10 req/s, which sums to a total of 100 req/s. If the application responds in 1 second, the load drops to 10 req/s. If it responds in 2 seconds, the load drops to 5 req/s. In reality, it takes several attempts to match the user count with the expected requests per second, and all of this depends on the application’s response time.
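
The examples above boil down to a simple approximation: requests per second ≈ virtual users / (response time + think time). A small sketch to play with the numbers:

public class ThroughputEstimate {

	// Approximate requests per second produced by a number of virtual users:
	// each user completes one request every (responseTime + thinkTime) seconds
	public static double requestsPerSecond(int users,
			double responseTimeSec, double thinkTimeSec) {
		return users / (responseTimeSec + thinkTimeSec);
	}

	public static void main(String[] args) {
		// 100 users, 1 s think time: ~100 req/s only if responses are instant
		System.out.println(requestsPerSecond(100, 0.0, 1.0)); // 100.0
		// 10 users, no think time, 100 ms responses -> ~100 req/s
		System.out.println(requestsPerSecond(10, 0.1, 0.0));  // 100.0
		// 10 users, no think time, 2 s responses -> ~5 req/s
		System.out.println(requestsPerSecond(10, 2.0, 0.0));  // 5.0
	}
}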

Environments

At the start of the project, tests can be run on test servers or local QA/Dev machines. Sometimes problems are caught even at this level. Especially when performance testing is a big event in the company, I recommend doing it locally first; this could save some embarrassment. It also helps polish the scenarios even further. Once everything is working perfectly locally, we can start with the actual performance testing. Environments used for performance testing should be production-like; the closer they are, the better. Once everything is good on the production-like server, the cherry on top will be if tests can be run on production in times of no usage. Beware when running tests that try to find at what number of users the system will fail, as your test/production machine could be a VM and this may affect other important VMs.

Measure

Each performance testing tool gives some reporting on response times, the total number of requests, requests per second, and responses with errors. This is good, but do not trust these reports blindly. Performance testing tools, like any software, have bugs. You should definitely have some server monitoring software or application performance measurement tool installed on the machine under test. Those tools will give you the most adequate information, such as memory usage statistics, and even hints where problems may occur.

Conclusion

Performance testing is an important part of an application’s life cycle. It should be done correctly to get good results. If you have optimized the backend and the end results are still not satisfactory, it is time to do some measurements in the frontend as well. I have given some tips in the Performance testing in the browser post.
