
REST performance problems with Dropwizard and Jersey JAXB provider


Post summary: Dropwizard’s performance degrades significantly when using REST with XML, caused by Jersey’s abstract JAXB provider. The solution is to inject your own JAXB context provider.

Dropwizard is a Java-based framework for building a RESTful web server in a very short time. I have created a short tutorial on how to do so in the Build a RESTful stub server with Dropwizard post.

Short overview

The current application is Dropwizard-based and serves as a hub between several systems. Running on Java 7, it receives XML over REST and sends XML over REST to other services. JAXB is a framework for converting XML documents to Java objects and vice versa. In order to do so, JAXB needs to instantiate a context for each and every Java class it handles. Context creation is an expensive operation.
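
For illustration, here is a minimal standalone sketch of the kind of JAXB round trip involved; the Notification class is a hypothetical example, and only the context creation is the expensive part:

import javax.xml.bind.JAXBContext;
import javax.xml.bind.Marshaller;
import javax.xml.bind.annotation.XmlRootElement;
import java.io.StringWriter;

@XmlRootElement
class Notification {
	public String message;
}

public class JaxbRoundTrip {
	public static void main(String[] args) throws Exception {
		// The expensive part: building a context for the class
		JAXBContext context = JAXBContext.newInstance(Notification.class);

		Notification notification = new Notification();
		notification.message = "hello";

		// Marshalling itself is cheap once the context exists
		Marshaller marshaller = context.createMarshaller();
		StringWriter xml = new StringWriter();
		marshaller.marshal(notification, xml);
		System.out.println(xml);
	}
}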

Problem

Jersey’s abstract JAXB provider holds weak references to its JAXB contexts by using a WeakHashMap. This causes the context map entries to be garbage collected very often, and new contexts to be created and put into the map again. Both garbage collection and context creation are expensive operations, causing 100% CPU load and very poor performance.
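
To see the mechanism in isolation, here is a minimal standalone sketch (not Jersey’s actual code) of how a WeakHashMap drops an entry once its key is no longer strongly referenced:

import java.util.Map;
import java.util.WeakHashMap;

public class WeakMapDemo {
	public static void main(String[] args) throws Exception {
		Map<Object, String> cache = new WeakHashMap<Object, String>();
		Object key = new Object();
		cache.put(key, "expensive value");
		System.out.println(cache.size()); // 1

		key = null;   // drop the only strong reference to the key
		System.gc();  // just a hint, but usually enough for this demo
		Thread.sleep(100);
		System.out.println(cache.size()); // typically 0 - the entry is gone
	}
}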

Solution

The solution is to create your own JAXB context provider which keeps the contexts for the application’s whole lifetime. One approach is a map with each context created on the fly on first access of a specific Java class (a concurrent map, since the provider is called from multiple request threads):

import javax.ws.rs.ext.ContextResolver;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CustomJAXBContextProvider implements ContextResolver<JAXBContext> {
	// Concurrent map: the provider is called from multiple request threads
	private static final Map<Class<?>, JAXBContext> JAXB_CONTEXTS
			= new ConcurrentHashMap<Class<?>, JAXBContext>();

	public JAXBContext getContext(Class<?> type) {
		try {
			JAXBContext context = JAXB_CONTEXTS.get(type);
			if (context == null) {
				// Two threads may race here; both contexts are valid,
				// putIfAbsent keeps the first one
				context = JAXBContext.newInstance(type);
				JAXBContext existing = JAXB_CONTEXTS.putIfAbsent(type, context);
				if (existing != null) {
					context = existing;
				}
			}
			return context;
		} catch (JAXBException e) {
			// Log the error; returning null lets Jersey fall back to its default provider
			return null;
		}
	}
}

The other approach is one big context created for all the Java classes from specific packages, separated with a colon:

import javax.ws.rs.ext.ContextResolver;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;

public class CustomJAXBContextProvider implements ContextResolver<JAXBContext> {
	private static final JAXBContext JAXB_CONTEXT;

	static {
		try {
			// One context for all classes in the listed packages, created once
			// at class load; each package needs an ObjectFactory or jaxb.index
			JAXB_CONTEXT = JAXBContext
					.newInstance("com.acme.foo:com.acme.bar");
		} catch (JAXBException e) {
			throw new ExceptionInInitializerError(e);
		}
	}

	public JAXBContext getContext(Class<?> type) {
		return JAXB_CONTEXT;
	}
}

Both approaches have pros and cons. The first approach gives a fast start-up, but the first request for each class will be slow. The second approach gives a fast first request, but slower server start-up. Once the custom JAXB context provider is in place, a Jersey client should be created in the Dropwizard Application class with this provider registered and used for the REST requests:

Client client = new JerseyClientBuilder(environment)
		.using(configuration.getJerseyClientConfiguration())
		.withProvider(CustomJAXBContextProvider.class).build(getName());
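
Since the application also receives XML, the same provider can be registered for the server-side resources as well. A sketch of how this could look in the Application’s run method (the configuration class name is illustrative, in the style of the snippet above):

@Override
public void run(HubConfiguration configuration, Environment environment) {
	// Same provider for XML arriving at the server resources
	environment.jersey().register(new CustomJAXBContextProvider());

	Client client = new JerseyClientBuilder(environment)
			.using(configuration.getJerseyClientConfiguration())
			.withProvider(CustomJAXBContextProvider.class).build(getName());
}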

Conclusion

There is no practical need to garbage collect JAXB contexts, so they should stay alive as long as the application lives. This is why a custom JAXB context provider is a good solution, even when there are no actual performance issues yet.


How to do proper performance testing


Post summary: Describes the actions needed in order to do successful performance testing.

Functional testing that the system works as per user requirements is a must for every application. But if the application is expected to handle a large number of users, then performance testing is also an important task. Performance testing has different aspects like load, stress, and soak. More about them can be found in the Performance, Load, Stress and Soak testing post. Those are all incorporated into the term “performance testing” in the current article. The steps to achieve successful performance testing, in short, are:

  1. Set proper goals
  2. Choose tools
  3. Try the tools
  4. Implement scenarios
  5. Prepare environments
  6. Run and measure

Setting the goal

This is one of the most important steps before starting any performance initiative. Doing performance testing just for the sake of it is worthless and a waste of effort. Before starting any activity it should be clear how many users are expected, what the peak load is, what users are doing on the site, and much more. This information is usually obtained from business and product owners, but it can also be derived from statistical data. Once you have rough numbers, define what answers the performance test should give. Questions could be:

  • Can the system handle 100 simultaneous users with a response time of less than 1 second and no errors?
  • Can the system handle 50 requests/second with a response time of less than 1.5 seconds for 1 hour and no more than 2% errors?
  • Can the system work for 2 days with 50 simultaneous users with a response time of less than 2 seconds?
  • How does the system behave with 1000 users? With 5000 users?
  • When will the system crash?
  • What is the slowest module of the system?

Choosing the tools

The tool must be chosen after the expected load has been defined. There are many commercial and non-commercial tools out there. Some can produce huge traffic and cost lots of money; some can produce mediocre traffic and are free. An important criterion when choosing a tool is how many virtual users it can support and whether it can fulfil the performance goals. Another important thing is whether the QAs will be able to work with it and create scenarios. In the current post I will mention two open-source tools: JMeter and Gatling.

JMeter

It is a well-known and proven tool. It is very easy to work with and no programming skills are needed. No need to spend many words on its benefits; they are many. The problem, though, is that it has certain limitations on the load it can produce from a single instance. Each virtual user is represented as a Java thread, and the JVM is not good at handling too many threads. The good thing is that it provides a mechanism for adding more hosts that participate in the run, which can produce a huge load, though management of those machines is needed. There are also cloud services that offer running JMeter test plans, so you can scale up there.

Gatling

A very powerful tool. Built on top of Akka, it enables thousands of virtual users on a single machine. Akka has a message-driven architecture, which overcomes the JVM limitation of handling many threads: virtual users are not threads but messages. The disadvantage is that tests are written in Scala, which makes scenario creation and maintenance more complex.

Try the tools

Do not just rely on the marketing data provided on a given tool’s web site. An absolute must is to record user scenarios and play them back with a significant number of users. Try to make it as realistic as possible. Even if this evaluation costs more time, just spend it; it will save a lot of time and money in the long run. This evaluation will give you confidence that the tool can do the job and can be used by the QAs responsible for the performance testing project.

Implement the scenarios

Some of the work should already have been done during the evaluation. The scenarios must now be polished to match the real user experience as closely as possible. It is also a good idea to implement a mechanism for changing the scenarios just by configuration.
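
As a sketch of what such a configuration mechanism could look like (the file format and property names are made up for illustration), the scenario knobs can live in a properties file loaded at start-up:

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

public class ScenarioConfig {
	public final int users;
	public final int rampUpSeconds;
	public final String baseUrl;

	public ScenarioConfig(String fileName) throws IOException {
		Properties props = new Properties();
		try (InputStream input = new FileInputStream(fileName)) {
			props.load(input);
		}
		// Default values are illustrative; each scenario overrides them
		users = Integer.parseInt(props.getProperty("users", "10"));
		rampUpSeconds = Integer.parseInt(props.getProperty("rampUpSeconds", "60"));
		baseUrl = props.getProperty("baseUrl", "http://localhost:8080");
	}
}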

Essence of performance testing

In terms of Web or API (REST, SOAP) performance testing, every tool, no matter how fancy, in the end does one and the same thing: it sends HTTP requests to the server and collects and measures the responses. That is it; it is not rocket science.
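
To make that concrete, here is a minimal sketch of that essence, a single timed HTTP request using plain HttpURLConnection (the URL is illustrative):

import java.net.HttpURLConnection;
import java.net.URL;

public class SingleRequestTimer {
	public static void main(String[] args) throws Exception {
		URL url = new URL("http://localhost:8080/health"); // illustrative endpoint
		long start = System.nanoTime();
		HttpURLConnection connection = (HttpURLConnection) url.openConnection();
		int status = connection.getResponseCode(); // sends the request and waits
		long elapsedMs = (System.nanoTime() - start) / 1000000;
		System.out.println("HTTP " + status + " in " + elapsedMs + " ms");
		connection.disconnect();
	}
}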

Include static resources or not?

This is an important question for Web performance testing, and there is no fixed recipe. Successful web applications use a content delivery network (CDN) to serve static content such as images, CSS, JavaScript, and media. If the CDN is a third party and they provide service level agreements (SLAs) for response time, then static data should be skipped in the performance test. If it is our own CDN, it may be a good idea to make a separate performance testing project just for the CDN itself. This could double the effort, but will make each project focused and coherent. If static data is hosted on the same server as the dynamic content, it may be a good idea to include the images as well. It very much depends on the situation. Browsers do have a cache, but it is controlled by correct HTTP response header values; if those are incorrect, or the supposedly static content is too dynamic, this can put a significant load on the server.
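
For reference, such caching headers are set on the server side; a minimal servlet filter sketch (the one-day max-age is an arbitrary example):

import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletResponse;

public class StaticCacheFilter implements Filter {
	public void init(FilterConfig filterConfig) {
	}

	public void doFilter(ServletRequest request, ServletResponse response,
			FilterChain chain) throws IOException, ServletException {
		// Let browsers and proxies cache static content for one day
		((HttpServletResponse) response)
				.setHeader("Cache-Control", "public, max-age=86400");
		chain.doFilter(request, response);
	}

	public void destroy() {
	}
}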

Virtual users vs. requests/second

Performance testing tools use the virtual user as their main metric. This is a representation of a real-life user. With sleep times between requests, virtual users mimic real user behaviour on the application, which gives a simulation very close to reality. This metric, though, is more business orientated. The more technical metric is requests per second, which is what most traffic monitoring tools report. Converting between the two is a tricky task, because it really depends on how the application is performing. Let me illustrate with some examples. Consider 100 users with a sleep time of 1 second between requests. Theoretically this should give a load of 100 requests per second. But if the application responds more slowly than in 1 second, it will produce fewer req/s, as each user has to wait for the response before sending the next request. Now consider 10 users with no sleep time. If the application responds in 100 ms, each user will make 10 req/s, which sums to a total of 100 req/s. If the application responds in 1 second, the load will drop to 10 req/s; if it responds in 2 seconds, the load will drop to 5 req/s. In reality it takes several attempts to match the user count to the expected requests per second, and all of this depends on the application’s response time.
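
The relationship boils down to simple arithmetic: each virtual user completes one request every (response time + sleep time) seconds. A small sketch reproducing the numbers from the examples above:

public class ThroughputEstimate {
	// Each user completes one request every (responseTime + sleepTime) seconds
	static double requestsPerSecond(int users, double responseTimeSec, double sleepTimeSec) {
		return users / (responseTimeSec + sleepTimeSec);
	}

	public static void main(String[] args) {
		System.out.println(requestsPerSecond(100, 0.0, 1.0)); // 100 req/s with instant responses
		System.out.println(requestsPerSecond(10, 0.1, 0.0));  // 100 req/s
		System.out.println(requestsPerSecond(10, 1.0, 0.0));  // 10 req/s
		System.out.println(requestsPerSecond(10, 2.0, 0.0));  // 5 req/s
	}
}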

Environments

At the start of the project, tests can be run on test servers or on local QA/Dev machines. Sometimes problems are caught even at this level. Especially when performance testing is a big event in the company, I recommend doing it locally first; this could save some embarrassment, and it also helps polish the scenarios even better. Once everything works perfectly locally, the actual performance testing can start. The environments used for performance testing should be production-like; the closer they are, the better. Once everything is good on a production-like server, the cherry on top is running the tests on production itself in times of no usage. Beware when running the tests to see at what number of users the system will fail: your test/production machine could be a VM, and this may affect other important VMs.

Measure

Each performance testing tool gives some reporting of response times, total number of requests, requests per second, and responses with errors. This is good, but do not blindly trust these reports; performance testing tools, like any software, have bugs. You should definitely have some server monitoring software or application performance measurement tool installed on the machine under test. Those tools will give you the most adequate information, such as memory usage statistics, and even hints about where problems may have occurred.

Conclusion

Performance testing is an important part of an application’s life-cycle. Done with clear goals, realistic scenarios, production-like environments, and proper measurement, it gives results you can actually trust.
