AWS examples in C# – create a service working with SQS

Post summary: To give a basic overview of AWS SQS, how to write a message to it and how to make a consumer that constantly polls the queue for new messages.

This post is part of AWS examples in C# – working with SQS, DynamoDB, Lambda, ECS series. The code used for this series of blog posts is located in aws.examples.csharp GitHub repository.

Event-driven architecture

I would like to briefly touch on the topic of event-driven architecture, since message services such as SQS or RabbitMQ are the basis of its implementation. This is a software architecture paradigm promoting the production, detection, consumption of, and reaction to events. An event is a significant change in the state of an object that someone might be interested in. All communication happens asynchronously and systems are loosely coupled. An event-driven system typically consists of event emitters, event consumers, and event channels. Emitters have the responsibility to detect, gather, and transfer events. Emitters do not know the consumers of the events; they do not even know if a consumer exists. Consumers have the responsibility of applying a reaction as soon as an event is presented in a dedicated channel. This leads to the pattern commonly known as eventual consistency, which pushes the complexity of consistency to the application tier, and that is the biggest challenge to solve in an event-driven architecture.

Apart from SQS, there is an even more sophisticated service from AWS called EventBridge, which makes it easy to build event-driven applications because it takes care of event ingestion and delivery, security, authorization, and error handling. It is basically a serverless event bus that makes it easy to connect applications together using data from your own applications, integrated Software-as-a-Service (SaaS) applications, and AWS services.

AWS SQS

SQS stands for Simple Queue Service. It is a fully managed message queuing service that enables you to decouple and scale microservices, distributed systems, and serverless applications. SQS eliminates the complexity and overhead associated with managing and operating message-oriented middleware and empowers developers to focus on differentiating work.

Types of queues

SQS offers two types of message queues:

  • Standard queues – they offer maximum throughput, best-effort ordering, and at-least-once delivery. This means there is no guaranteed order and messages can be duplicated.
  • FIFO queues – they are designed to guarantee that messages are processed exactly once, in the exact order that they are sent.

Dead-letter queue

In addition to those, there is a special type of queue, called a dead-letter queue. It is used mainly for debugging and failure-proofing applications. If a message cannot be successfully processed after several retries from one of the source queues above, it ends up in the dead-letter queue, from which it can be analyzed and returned to the source queue for reprocessing.

Message processing

It is important to know how SQS operates in order to make good architectural decisions. When a message is published to the queue, it becomes visible. When some consumer reads the message, the message becomes invisible but is still present in the queue; its status is now in-flight. There is a visibility timeout, which is 30 seconds by default and 12 hours at most. After the visibility timeout passes, the message becomes visible again and can be read by consumers. In case there is no dead-letter queue, this process repeats over and over until the message retention period is reached, after which the message gets automatically deleted. The retention period is 4 days by default and 14 days at most. In case of a dead-letter queue, once the message has failed processing more times than the maximum receive count allows, it goes to the dead-letter queue and stays there for that queue's message retention period. See more info on SQS on the How Amazon SQS Works page.
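
To illustrate these settings, below is a minimal sketch of how the visibility timeout and the message retention period could be set as queue attributes with the AWS SDK for .NET. The method name and the queueUrl parameter are assumptions for the sketch, not code from the repository.

// Sketch only: set visibility timeout and message retention period on an existing queue
public async Task ConfigureQueueTimeoutsAsync(IAmazonSQS sqsClient, string queueUrl)
{
	await sqsClient.SetQueueAttributesAsync(new SetQueueAttributesRequest
	{
		QueueUrl = queueUrl,
		Attributes = new Dictionary<string, string>
		{
			{ "VisibilityTimeout", "30" },          // seconds a received message stays in-flight
			{ "MessageRetentionPeriod", "345600" }  // 4 days, expressed in seconds
		}
	});
}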

Architectural approaches

One queue or many queues

Since many event emitters can write messages to the queue, it gets tricky to process the messages properly. One option is to have a separate queue for each type of message; another option is to put some metadata into the messages. I have decided to go for the solution with one queue, because I have just one consumer which knows which message processor to call, and this simplifies the code. In the case of many SQS queues, there should be many consumers defined in the code, which is better split into separate microservices, one for each SQS queue.

Dead-letter

I would say a dead-letter queue with the maximum retention period of 14 days is a good idea. In this case, messages can be quarantined, which does not slow down the normal queue operations. With no dead-letter queue and default timeouts, if a message cannot be processed, it will reappear every 30 seconds for a period of 4 days; this makes 2880 times a day, 11520 times in total. Now imagine there are thousands of messages like this one. I have decided to go for a dead-letter queue with the default retention period.

Long polling

Long polling is another aspect that has to be considered. It can be enabled in two ways. One is at the queue level, by setting ReceiveMessageWaitTimeSeconds when creating the queue; it can be from 1 to 20 seconds. The other way to enable it is when messages are read from the queue: there is a WaitTimeSeconds setting in the request, which can be from 1 to 20 seconds. In case both options are combined, WaitTimeSeconds takes precedence.

Unknown messages

Another architectural decision, in case there is only one queue, is what to do with unknown messages. If there is no dead-letter queue, it is best to delete such messages, otherwise they will keep showing up for the queue's retention period. I throw an error in the logs and after 3 unsuccessful attempts, which is the maximum receive count I have configured, the message goes to the dead-letter queue.

SQS queue operations at a glance

The operations briefed below are described in more detail in the AWS examples in C# – basic SQS queue operations post:

  • Create queue with dead-letter queue
  • Read messages from the queue
  • Write a message to the queue (comes in two flavors)
  • Delete messages from the queue
  • Move messages from dead-letter to source queue

Creating SQS message consumer

In order to read the messages, there should be a consumer that constantly polls the queue and processes the messages. ProcessMessageAsync uses the strategy design pattern to get the proper message processor based on the MessageType attribute. Processors are stored in _messageProcessors, which is IEnumerable<IMessageProcessor> and is injected by .NET Core dependency injection. If a processor is found, it is invoked; if not, an error is written to the logs. This logic can be subject to change if unknown messages are tolerated in the queue. In the ProcessAsync method there is a while loop, which constantly reads messages via _sqsClient, the SqsClient class described in the AWS examples in C# – basic SQS queue operations post. SQS returns a response either when there are messages or when the long polling wait time (the WaitTimeSeconds set on the request, or the queue's ReceiveMessageWaitTimeSeconds configured by the AwsQueueLongPollTimeSeconds environment variable) has expired. This while loop is a little tricky to unit test though, as it consumes the main thread, and the mocked object should be instructed to wait. Everything is controlled by a CancellationTokenSource; when this is canceled, consumption is stopped.

ProcessMessageAsync

private async Task ProcessMessageAsync(Message message)
{
	try
	{
		var messageType = message.MessageAttributes.GetMessageTypeAttributeValue();
		if (messageType == null)
		{
			throw new Exception($"No 'MessageType' attribute present in message {JsonConvert.SerializeObject(message)}");
		}

		var processor = _messageProcessors.SingleOrDefault(x => x.CanProcess(messageType));
		if (processor == null)
		{
			throw new Exception($"No processor found for message type '{messageType}'");
		}

		await processor.ProcessAsync(message);
		await _sqsClient.DeleteMessageAsync(message.ReceiptHandle);
	}
	catch (Exception ex)
	{
		_logger.LogError(ex, $"Cannot process message [id: {message.MessageId}, receiptHandle: {message.ReceiptHandle}, body: {message.Body}] from queue {_sqsClient.GetQueueName()}");
	}
}

ProcessAsync

private async void ProcessAsync()
{
	try
	{
		while (!_tokenSource.Token.IsCancellationRequested)
		{
			var messages = await _sqsClient.GetMessagesAsync(_tokenSource.Token);
			// Note: List.ForEach does not await the async lambda, so messages are processed fire-and-forget
			messages.ForEach(async x => await ProcessMessageAsync(x));
		}
	}
	catch (OperationCanceledException)
	{
		//operation has been canceled but it shouldn't be propagated
	}
}

StartConsuming

public void StartConsuming()
{
	if (!IsConsuming())
	{
		_tokenSource = new CancellationTokenSource();
		ProcessAsync();
	}
}

private bool IsConsuming()
{
	return _tokenSource != null && !_tokenSource.Token.IsCancellationRequested;
}
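
A StopConsuming counterpart is not shown above, but based on the description it could look like the sketch below; it is an assumption, not code taken from the repository. Cancelling the token ends the while loop in ProcessAsync.

public void StopConsuming()
{
	if (IsConsuming())
	{
		// Cancelling the token makes GetMessagesAsync throw OperationCanceledException,
		// which ends the while loop in ProcessAsync
		_tokenSource.Cancel();
	}
}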

Message processors

In the current example, I have taken the architectural design decision to have one queue and different messages in it. For each different type of message, there is a relevant processor. With the strategy design pattern, the appropriate message processor is picked based on the MessageType attribute. Processors implement a very simple interface, IMessageProcessor. In the current example, they take the message body as a string, deserialize it to an object and save this object to DynamoDB. A sample implementation is shown below:

IMessageProcessor

public interface IMessageProcessor
{
	bool CanProcess(string messageType);
	Task ProcessAsync(Message message);
}

ActorMessageProcessor

public bool CanProcess(string messageType)
{
	return messageType == typeof(Actor).Name;
}

public async Task ProcessAsync(Message message)
{
	var actor = JsonConvert.DeserializeObject<Actor>(message.Body);
	await _actorsRepository.SaveActorAsync(actor);
	_logger.LogInformation($"ActorMessageProcessor invoked with: {message.Body}");
}
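
The processors end up in the injected IEnumerable<IMessageProcessor> because each of them is registered against the common interface in the dependency injection container. A minimal registration sketch, assuming it is done in Startup.ConfigureServices, could be the following; the exact registration in the repository may differ.

public void ConfigureServices(IServiceCollection services)
{
	// All registrations against IMessageProcessor are resolved together
	// as IEnumerable<IMessageProcessor> in the consumer
	services.AddSingleton<IMessageProcessor, ActorMessageProcessor>();
	services.AddSingleton<IMessageProcessor, LogEntryMessageProcessor>();
}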

AWS ECS and AWS ECR

ECS stands for Elastic Container Service. It is a fully managed container orchestration service. Containers can be run in clusters using AWS Fargate, which is a serverless compute engine for containers. Fargate removes the need to provision and manage servers, lets you specify and pay for resources per application, and improves security through application isolation by design.

ECR stands for Elastic Container Registry. It is a fully managed Docker container registry that makes it easy for developers to store, manage, and deploy Docker container images. ECR is integrated with ECS, eliminating the need to operate your own container repositories or worry about scaling the underlying infrastructure.

SqsWriter and SqsReader

SqsWriter is a .NET Core 3.0 application, that is dockerized and run in AWS ECS with Fargate, and its container images are stored in ECR. It exposes an API that can be used to publish Actor or Movie objects as messages with separate MessageType attributes in the SQS queue.

SqsReader is a .NET Core 3.0 application, that is dockerized and run in AWS ECS with Fargate, and its container images are stored in ECR. It has a consumer that listens to the SQS queue and processes the messages by writing them into the appropriate AWS DynamoDB tables. It also exposes an API to stop or start processing, as well as to reprocess the dead-letter queue or simply get the queue status.
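
The start/stop API could be exposed with a small controller along the lines of the sketch below. The controller and consumer interface names here are illustrative assumptions rather than the exact classes from the repository; only the routes match the cURL examples shown later in the series.

[ApiController]
[Route("api/consumer")]
public class ConsumerController : ControllerBase
{
	private readonly ISqsConsumerService _consumer;

	public ConsumerController(ISqsConsumerService consumer)
	{
		_consumer = consumer;
	}

	// POST api/consumer/start
	[HttpPost("start")]
	public IActionResult Start()
	{
		_consumer.StartConsuming();
		return Ok();
	}

	// POST api/consumer/stop
	[HttpPost("stop")]
	public IActionResult Stop()
	{
		_consumer.StopConsuming();
		return Ok();
	}
}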

More information on how to run the solution can be found in AWS examples in C# – run the solution post.

Conclusion

In the current post, I have given some concepts of event-driven architecture and how SQS fits into it. I have also described some architectural considerations when using SQS queues, such as dead-letter queues, one queue with different message types versus several queues, etc. In the end, I have given practical code on how to make a consumer for the SQS queue.

AWS examples in C# – basic SQS queue operations

Post summary: Code examples of how to perform basic SQS queue operations like reading, writing, deleting, creating a queue, etc.

This post is part of the AWS examples in C# – working with SQS, DynamoDB, Lambda, ECS series. The code used for this series of blog posts is located in the aws.examples.csharp GitHub repository. In the current post, I show practical examples of basic SQS operations; a more detailed description of their usage is available in the AWS examples in C# – create a service working with SQS post.

Instantiate Amazon SQS client

In the current examples, I use a configuration class called AppConfig. Its values are injected from the environment variables by the .NET Core framework in the Startup class. In order to work with SQS, a client is needed. The SQS client interface is called IAmazonSQS and comes from the AWS C# SDK. The NuGet package is called AWSSDK.SQS, which in the current example comes as a sub-reference from the Automationrhapsody.Aws.Examples.Models NuGet package. The concrete AWS client implementation is AmazonSQSClient, and a singleton object is instantiated in the SqsClientFactory class, where it is possible to use either RegionEndpoint or ServiceURL to instantiate AmazonSQSConfig. This dual configuration is there to support the Localstack experiments I did; more info is available in the AWS examples in C# – run in Localstack post. I use the AwsCredentials class, which extends AWS' abstract AWSCredentials, in order to manage the credentials.

SqsClientFactory.cs

public static AmazonSQSClient CreateClient(AppConfig appConfig)
{
	var sqsConfig = new AmazonSQSConfig();
	if (!string.IsNullOrEmpty(appConfig.LocalstackHostname))
	{
		sqsConfig.ServiceURL = $"http://{appConfig.LocalstackHostname}:4576";
		var credentials = new BasicAWSCredentials("xxx", "xxx");
		return new AmazonSQSClient(credentials, sqsConfig);
	}

	sqsConfig.RegionEndpoint = RegionEndpoint.GetBySystemName(appConfig.AwsRegion);
	var awsCredentials = new AwsCredentials(appConfig);
	return new AmazonSQSClient(awsCredentials, sqsConfig);
}

AwsCredentials.cs

public class AwsCredentials : AWSCredentials
{
	private readonly AppConfig _appConfig;

	public AwsCredentials(AppConfig appConfig)
	{
		_appConfig = appConfig;
	}

	public override ImmutableCredentials GetCredentials()
	{
		return new ImmutableCredentials(_appConfig.AwsAccessKey,
						_appConfig.AwsSecretKey, null);
	}
}

AppConfig.cs

public class AppConfig
{
	private const string FifoSuffix = ".fifo";
	private string _queueName;

	public string AwsRegion { get; set; }
	public string AwsAccessKey { get; set; }
	public string AwsSecretKey { get; set; }
	public string AwsQueueName
	{
		get => AwsQueueIsFifo ? _queueName + FifoSuffix : _queueName;
		set => _queueName = value;
	}
	public string AwsDeadLetterQueueName
	{
		get
		{
			var deadLetter = _queueName + "-exceptions";
			return AwsQueueIsFifo ? deadLetter + FifoSuffix : deadLetter;
		}
	}

	public bool AwsQueueAutomaticallyCreate { get; set; }
	public bool AwsQueueIsFifo { get; set; }
	public int AwsQueueLongPollTimeSeconds { get; set; }
	public string LocalstackHostname { get; set; }
}

Startup.cs

public Startup()
{
	var configurationBuilder = new ConfigurationBuilder()
		.AddEnvironmentVariables();
	Configuration = configurationBuilder.Build();
}

public IConfiguration Configuration { get; }
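
A minimal sketch of how the built configuration might be bound to AppConfig in ConfigureServices, so that IOptions<AppConfig> can be injected as in the SqsClient constructor below, is shown here; the exact registration in the repository may differ.

public void ConfigureServices(IServiceCollection services)
{
	// Binds environment variables such as AwsRegion and AwsQueueName to AppConfig,
	// making IOptions<AppConfig> available for constructor injection
	services.Configure<AppConfig>(Configuration);
}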

Local SqsClient dependencies

This sample code shows what external dependencies the SqsClient class needs. They are injected into the constructor by .NET Core dependency injection.


private readonly AppConfig _appConfig;
private readonly IAmazonSQS _sqsClient;
private readonly ILogger<SqsClient> _logger;
private readonly ConcurrentDictionary<string, string> _queueUrlCache;

public SqsClient(IOptions<AppConfig> awsConfig, 
	IAmazonSQS sqsClient, ILogger<SqsClient> logger)
{
	_appConfig = awsConfig.Value;
	_sqsClient = sqsClient;
	_logger = logger;
	_queueUrlCache = new ConcurrentDictionary<string, string>();
}

Create SQS queue and dead-letter queue

Queues can be created programmatically, which is what will be described in the current post. Another option is to create them with the AWS CLI; see more information in the AWS examples in C# – deploy with AWS CLI commands post.

Once the client is in place, the queue and the dead-letter queue are created with the code below. The code snippet also enables long polling for the queue, which reduces costs while still allowing consumers to receive messages as soon as they arrive in the queue. Basically, SQS waits until a message is available in the queue before sending a response.

public async Task CreateQueueAsync()
{
	const string arnAttribute = "QueueArn";

	try
	{
		var createQueueRequest = new CreateQueueRequest();
		if (_appConfig.AwsQueueIsFifo)
		{
			createQueueRequest.Attributes.Add("FifoQueue", "true");
		}

		createQueueRequest.QueueName = _appConfig.AwsQueueName;
		var createQueueResponse = await _sqsClient.CreateQueueAsync(createQueueRequest);
		createQueueRequest.QueueName = _appConfig.AwsDeadLetterQueueName;
		var createDeadLetterQueueResponse = await _sqsClient.CreateQueueAsync(createQueueRequest);

		// Get the ARN of the dead-letter queue and configure the main queue to deliver messages to it
		var attributes = await _sqsClient.GetQueueAttributesAsync(new GetQueueAttributesRequest
		{
			QueueUrl = createDeadLetterQueueResponse.QueueUrl,
			AttributeNames = new List<string> { arnAttribute }
		});
		var deadLetterQueueArn = attributes.Attributes[arnAttribute];

		// RedrivePolicy on the main queue to deliver messages to the dead-letter queue after 3 failed processing attempts
		var redrivePolicy = new
		{
			maxReceiveCount = "3",
			deadLetterTargetArn = deadLetterQueueArn
		};
		await _sqsClient.SetQueueAttributesAsync(new SetQueueAttributesRequest
		{
			QueueUrl = createQueueResponse.QueueUrl,
			Attributes = new Dictionary<string, string>
			{
				{"RedrivePolicy", JsonConvert.SerializeObject(redrivePolicy)},
				// Enable Long polling
				{"ReceiveMessageWaitTimeSeconds", _appConfig.AwsQueueLongPollTimeSeconds.ToString()}
			}
		});
	}
	catch (Exception ex)
	{
		_logger.LogError(ex, $"Error when creating SQS queue {_appConfig.AwsQueueName} and {_appConfig.AwsDeadLetterQueueName}");
	}
}

Read messages from the SQS queue

Reading is done with the code below, where _queueUrlCache is a ConcurrentDictionary<string, string>. The queue URL is cached in the GetQueueUrl method for better performance.

GetMessagesAsync

public async Task<List<Message>> GetMessagesAsync(string queueName, CancellationToken cancellationToken = default)
{
	var queueUrl = await GetQueueUrl(queueName);

	try
	{
		var response = await _sqsClient.ReceiveMessageAsync(new ReceiveMessageRequest
		{
			QueueUrl = queueUrl,
			WaitTimeSeconds = _appConfig.AwsQueueLongPollTimeSeconds,
			AttributeNames = new List<string> { "ApproximateReceiveCount" },
			MessageAttributeNames = new List<string> { "All" }
		}, cancellationToken);

		if (response.HttpStatusCode != HttpStatusCode.OK)
		{
			throw new AmazonSQSException($"Failed to GetMessagesAsync for queue {queueName}. Response: {response.HttpStatusCode}");
		}

		return response.Messages;
	}
	catch (TaskCanceledException)
	{
		_logger.LogWarning($"Failed to GetMessagesAsync for queue {queueName} because the task was canceled");
		return new List<Message>();
	}
	catch (Exception)
	{
		_logger.LogError($"Failed to GetMessagesAsync for queue {queueName}");
		throw;
	}
}

GetQueueUrl

private async Task<string> GetQueueUrl(string queueName)
{
	if (string.IsNullOrEmpty(queueName))
	{
		throw new ArgumentException("Queue name should not be blank.");
	}

	if (_queueUrlCache.TryGetValue(queueName, out var result))
	{
		return result;
	}

	try
	{
		var response = await _sqsClient.GetQueueUrlAsync(queueName);
		return _queueUrlCache.AddOrUpdate(queueName, response.QueueUrl, (q, url) => url);
	}
	catch (QueueDoesNotExistException ex)
	{
		throw new InvalidOperationException($"Could not retrieve the URL for the queue '{queueName}' as it does not exist or you do not have access to it.", ex);
	}
}

Write a message to the SQS queue

The current example writes a single message to the queue. The AWS SDK offers a method called SendMessageBatchAsync, which can send a group of messages. Because of the nature of the example application, SendMessageBatchAsync is not needed. Writing comes in two flavors: a generic method accepting an object instance, and a method accepting the message text and message type.
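
For reference, a batch variant could look like the hedged sketch below; it is not part of the example application, and the method name is an assumption. It reuses GetQueueUrl and _sqsClient from the SqsClient class shown in this post.

// Sketch only: send several messages in one request; SQS allows up to 10 entries per batch
public async Task PostMessageBatchAsync<T>(string queueName, IEnumerable<T> messages)
{
	var queueUrl = await GetQueueUrl(queueName);

	var entries = messages
		.Select((message, index) => new SendMessageBatchRequestEntry(
			index.ToString(), JsonConvert.SerializeObject(message)))
		.ToList();

	await _sqsClient.SendMessageBatchAsync(new SendMessageBatchRequest
	{
		QueueUrl = queueUrl,
		Entries = entries
	});
}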

In the case of a FIFO queue, there are two more values to be set. One is MessageGroupId, so messages from the same group are always processed one by one. In the current example, messages are grouped by type. The other mandatory thing is MessageDeduplicationId, which is used by SQS for deduplication of sent messages. If a message with a particular message deduplication ID is sent successfully, any messages sent with the same message deduplication ID are accepted successfully but aren't delivered during the 5-minute deduplication interval.

PostMessageAsync<T>

public async Task PostMessageAsync<T>(string queueName, T message)
{
	var queueUrl = await GetQueueUrl(queueName);

	try
	{
		var sendMessageRequest = new SendMessageRequest
		{
			QueueUrl = queueUrl,
			MessageBody = JsonConvert.SerializeObject(message),
			MessageAttributes = SqsMessageTypeAttribute.CreateAttributes<T>()
		};
		if (_appConfig.AwsQueueIsFifo)
		{
			sendMessageRequest.MessageGroupId = typeof(T).Name;
			sendMessageRequest.MessageDeduplicationId = Guid.NewGuid().ToString();
		}

		await _sqsClient.SendMessageAsync(sendMessageRequest);
	}
	catch (Exception ex)
	{
		_logger.LogError(ex, $"Failed to PostMessagesAsync to queue '{queueName}'. Exception: {ex.Message}");
		throw;
	}
}

PostMessageAsync

public async Task PostMessageAsync(string queueName, string messageBody, string messageType)
{
	var queueUrl = await GetQueueUrl(queueName);

	try
	{
		var sendMessageRequest = new SendMessageRequest
		{
			QueueUrl = queueUrl,
			MessageBody = messageBody,
			MessageAttributes = SqsMessageTypeAttribute.CreateAttributes(messageType)
		};
		if (_appConfig.AwsQueueIsFifo)
		{
			sendMessageRequest.MessageGroupId = messageType;
			sendMessageRequest.MessageDeduplicationId = Guid.NewGuid().ToString();
		}

		await _sqsClient.SendMessageAsync(sendMessageRequest);
	}
	catch (Exception ex)
	{
		_logger.LogError(ex, $"Failed to PostMessagesAsync to queue '{queueName}'. Exception: {ex.Message}");
		throw;
	}
}

Distinguishing messages in the queue

Since many event emitters can write messages to the queue, it gets tricky to process the messages properly. One option is to have a separate queue for each type of message; another option is to put some metadata into the messages. I have decided to go for the solution with one queue, because I have just one consumer which knows which message processor to call. In the case of many consumers, it is recommended to have several SQS queues, so a consumer does not need to read and disregard messages that are not meant for it, which is not optimal.

Additional MessageAttributes are added to every message. In the example above this is done with the SqsMessageTypeAttribute.CreateAttributes(messageType) extension method, available in the Automationrhapsody.Aws.Examples.Models NuGet package, which is also part of the examples code and is located in the Models project. What this method does is add a MessageType string attribute, where the value is typeof(T).Name.

public static class SqsMessageTypeAttribute
{
	private const string AttributeName = "MessageType";

	public static string GetMessageTypeAttributeValue(this Dictionary<string, MessageAttributeValue> attributes)
	{
		return attributes.SingleOrDefault(x => x.Key == AttributeName).Value?.StringValue;
	}

	public static Dictionary<string, MessageAttributeValue> CreateAttributes<T>()
	{
		return CreateAttributes(typeof(T).Name);
	}

	public static Dictionary<string, MessageAttributeValue> CreateAttributes(string messageType)
	{
		return new Dictionary<string, MessageAttributeValue>
		{
			{
				AttributeName, new MessageAttributeValue
				{
					DataType = nameof(String),
					StringValue = messageType
				}
			}
		};
	}
}

Delete message from the queue

Once the message is processed, it should be removed from the queue. This is done with the following method:

public async Task DeleteMessageAsync(string queueName, string receiptHandle)
{
	var queueUrl = await GetQueueUrl(queueName);

	try
	{
		var response = await _sqsClient.DeleteMessageAsync(queueUrl, receiptHandle);

		if (response.HttpStatusCode != HttpStatusCode.OK)
		{
			throw new AmazonSQSException($"Failed to DeleteMessageAsync with for [{receiptHandle}] from queue '{queueName}'. Response: {response.HttpStatusCode}");
		}
	}
	catch (Exception)
	{
		_logger.LogError($"Failed to DeleteMessageAsync from queue {queueName}");
		throw;
	}
}

Reprocess messages from dead-letter queue

If there is a problem with message processing, messages are moved to the dead-letter queue. There might be a specific bug in the consumer application for this particular type of message. Once the bug is fixed and a new version is deployed, all those messages should be reprocessed. Moving messages from the dead-letter queue to the source queue is done with the following code:

public async Task RestoreFromDeadLetterQueueAsync(CancellationToken cancellationToken = default)
{
	var deadLetterQueueName = _appConfig.AwsDeadLetterQueueName;

	try
	{
		var token = new CancellationTokenSource();
		while (!token.Token.IsCancellationRequested)
		{
			var messages = await GetMessagesAsync(deadLetterQueueName, cancellationToken);
			if (!messages.Any())
			{
				token.Cancel();
				continue;
			}

			messages.ForEach(async message =>
			{
				var messageType = message.MessageAttributes.GetMessageTypeAttributeValue();
				if (messageType != null)
				{
					await PostMessageAsync(_appConfig.AwsQueueName, message.Body, messageType);
					await DeleteMessageAsync(deadLetterQueueName, message.ReceiptHandle);
				}
			});
		}
	}
	catch (Exception)
	{
		_logger.LogError($"Failed to ReprocessMessages from queue {deadLetterQueueName}");
		throw;
	}
}

SQS queue operations at a glance

All operations described above can be seen in SqsReader SqsClient class and SqsWriter SqsClient class.

Conclusion

In the current post, I have given code examples of how to perform basic SQS queue operations.

AWS examples in C# – run the solution

Post summary: Explanation of how to install and use the solution in AWS examples in C# blog post series.

This post is part of AWS examples in C# – working with SQS, DynamoDB, Lambda, ECS series. The code used for this series of blog posts is located in aws.examples.csharp GitHub repository. In the current post, I give information on how to install and run the project.

Disclaimer

Although the current examples can be run on Localstack, incurring no costs, they are originally designed to run in the AWS Cloud. The solution has commands to deploy to the cloud as well as to clean up resources. Note that not all resources are cleaned up, read more in the Cleanup section. In order to run in AWS, a valid account is needed. I am not going to describe how to create an account. If an account is present, then there is also knowledge and awareness of how to use it.

Important: current examples generate costs on AWS account. Use cautiously at your own risk!

Restrictions

The project was tested to be working on Linux and Windows. For Windows, it is working only with Git Bash. The project does not require a valid AWS account, it can be run on Localstack, see more in AWS examples in C# – run in Localstack post.

Required installations

In order to fully run and enjoy the project, the following needs to be installed:

Configurations

The AWS CLI has to be configured in order to run properly. This happens with aws configure. If there is no AWS account, this is not an issue; put some values for the access and secret key and put a correct region, like us-east-1.

Import the Postman collection, in order to be able to try the examples. The Postman collection is in the aws.examples.csharp.postman_collection.json file in the code. This is an optional step, as there are cURL examples below as well.

Run the project

Running locally on Localstack is done with ./solution-deploy-localstack.sh script. Note that the output of the deployment command gives the API Gateway URL. See the screenshot below.

Running on AWS requires the setting of environment variables:

export AwsAccessKey=KIA57FV4.....
export AwsSecretKey=mSgsxOWVh...
export AwsRegion=us-east-1

Then the solution is deployed to AWS with ./solution-deploy.sh script. Note that the output of the command gives the API Gateway URL and API key, as well as the SqsWriter and SqsReader endpoints. See image below:

Usage

There is a Postman collection which allows easy firing of the requests. Another option is to use cURL, examples of all requests with their Postman names are available below.

SqsWriter

SqsWriter is a .NET Core 3.0 application, that is dockerized and run in AWS ECS (Elastic Container Service). It exposes an API that can be used to publish Actor or Movie objects. There is also a health check request. After AWS deployment, the proper endpoint is needed. The endpoint can be found in the output of the deployment scripts. See the image above.

PublishActor

curl --location --request POST 'http://localhost:5100/api/publish/actor' \
--header 'Content-Type: application/json' \
--data-raw '{
	"FirstName": "Bruce",
	"LastName": "Willis"
}'

PublishMovie

curl --location --request POST 'http://localhost:5100/api/publish/movie' \
--header 'Content-Type: application/json' \
--data-raw '{
	"Title": "Die Hard",
	"Genre": "Action Movie"
}'

When an Actor or Movie is published, it goes to the SQS queue; SqsReader picks it up from there and processes it. What is visible in the logs is that both LogEntryMessageProcessor and ActorMessageProcessor are invoked. See the screenshot:

SqsWriterHealthCheck

curl --location --request GET 'http://localhost:5100/health'

SqsReader

SqsReader is a .NET Core 3.0 application, that is dockerized and run in AWS ECS. It has a consumer that listens to the SQS queue and processes the messages by writing them into the appropriate AWS DynamoDB tables. It also exposes an API to stop or start processing, as well as to reprocess the dead-letter queue or simply get the queue status. After AWS deployment, the proper endpoint is needed. The endpoint can be found in the output of the deployment scripts. See the image above.

ConsumerStart

curl --location --request POST 'http://localhost:5200/api/consumer/start' \
--header 'Content-Type: application/json' \
--data-raw ''

ConsumerStop

curl --location --request POST 'http://localhost:5200/api/consumer/stop' \
--header 'Content-Type: application/json' \
--data-raw ''

ConsumerStatus

curl --location --request GET 'http://localhost:5200/api/consumer/status'

ConsumerReprocess

If this endpoint is invoked with no messages in the dead-letter queue, it takes 20 seconds to finish, because it actually waits for the long polling timeout.

curl --location --request POST 'http://localhost:5200/api/consumer/reprocess' \
--header 'Content-Type: application/json' \
--data-raw ''

SqsReaderHealthCheck

curl --location --request GET 'http://localhost:5200/health'

ActorsServerlessLambda

This lambda is managed by the Serverless framework. It is exposed as REST API via AWS API Gateway. It also has a custom authorizer as well as API Key attached. Those are described in a further post.

ServerlessActors

In the case of AWS, the API Key and URL are needed; those can be obtained from the deployment command logs. See the screenshot above. Put the correct values in the CHANGE_ME and REGION placeholders. The request is:

curl --location --request POST 'https://CHANGE_ME.execute-api.REGION.amazonaws.com/dev/actors/search' \
--header 'Content-Type: application/json' \
--header 'x-api-key: CHANGE_ME' \
--header 'Authorization: Bearer validToken' \
--data-raw '{
    "FirstName": "Bruce",
    "LastName": "Willis"
}'

ServerlessActorsLocal

In the case of Localstack deployment, only the URL is needed. Put the correct value in the CHANGE_ME placeholder.

curl --location --request POST 'http://localhost:4567/restapis/CHANGE_ME/local/_user_request_/actors/search' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer validToken' \
--data-raw '{
    "FirstName": "Bruce",
    "LastName": "Willis"
}'

MoviesServerlessLambda

ServerlessMovies

curl --location --request GET 'https://CHANGE_ME.execute-api.eu-central-1.amazonaws.com/dev/movies/title/Die Hard'

ServerlessMoviesLocal

curl --location --request GET 'http://localhost:4567/restapis/CHANGE_ME/local/_user_request_/movies/title/Die Hard'

Cleanup

Nota bene: This is a very important step, as leaving the solution running in AWS will accumulate costs.

In order to stop the Localstack version, run ./solution-delete-localstack.sh script.

In order to stop and clean up all AWS resources run ./solution-delete.sh script.

Nota bene: There is a resource that is not automatically deleted by the scripts. This is a Route53 resource created by AWS Cloud Map. It has to be deleted with the following commands. Note that the id in the delete command comes from the result of the list-namespaces command.

aws servicediscovery list-namespaces
aws servicediscovery delete-namespace --id ns-kneie4niu6pwwela

Verify cleanup

In order to be sure there are no leftovers from the examples, the following AWS services have to be checked:

  • SQS
  • DynamoDB
  • IAM -> Roles
  • EC2 -> Security Groups
  • ECS -> Clusters
  • ECS -> Task Definitions
  • ECR -> Repositories
  • Lambda -> Functions
  • Lambda -> Applications
  • CloudFormation -> Stacks
  • S3
  • CloudWatch -> Log Groups
  • Route 53
  • AWS Cloud Map

On top of that, Billing should be regularly monitored to ensure no costs are being incurred.

Conclusion

This post describes how to run and clean up the solution described in the AWS examples in C# – working with SQS, DynamoDB, Lambda, ECS series.

AWS examples in C# – working with SQS, DynamoDB, Lambda, ECS

Post summary: Overview of the AWS examples in C# series.

In several blog posts, I give some practical examples of how to use AWS SQS, DynamoDB, Lambda with C# code. The code used for this series of blog posts is located in aws.examples.csharp GitHub repository.

Introduction

AWS stands for Amazon Web Services, it is a subsidiary of Amazon that provides on-demand cloud computing platforms and APIs to individuals, companies, and governments, on a metered pay-as-you-go basis. AWS is one of the big cloud service providers. The others are Microsoft Azure and Google Cloud. All three cloud service providers have functions that are semantically common but differ in practical implementation. Also, every one of them has its own flavors. I have chosen to use AWS for these examples as it is something I have used before and I am most comfortable with it.

Architectural overview

In order to get a full understanding of the architecture, I have prepared this very basic diagram. It illustrates what services are there and how they communicate.

SqsReader and SqsWriter

Both are .NET Core 3.0 microservices running in Docker containers. The images are uploaded to AWS ECR (Elastic Container Registry) and the containers are run in AWS ECS (Elastic Container Service). SqsWriter has a REST endpoint, by which an Actor or Movie can be posted. Both are pushed as messages to AWS SQS (Simple Queue Service). SqsReader is listening to the SQS queue and, in case of a message, it processes it. If the message is of type Actor or Movie, then SqsReader saves it to the respective AWS DynamoDB tables. If the message is a LogEntry, then the message is only output into the SqsReader logs.

ActorsLambdaFunction and MoviesLambdaFunction

Both are .NET Core 2.1 lambda functions run in AWS Lambda. They listen for changes to the Actors and Movies DynamoDB tables and, in case of new entries, they write to the LogEntries DynamoDB table. They also write SQS messages of type LogEntry, which are then read by SqsReader.

ActorsServerlessLambda and MoviesServerlessLambda

Those are again lambda functions, but they are fully managed by the Serverless framework. They have a lambda application defined as well as CloudFormation templates. They expose a REST API through AWS API Gateway, by which the Actors table can be queried or a movie can be retrieved from the Movies table.

Post in the series

This is a long series of posts describing in detail all the pieces of the architectural diagram above. Also, every aspect of the code in the repository is explained in detail in subsequent blog posts. It was a very interesting learning opportunity for me, which I would like to share. Here are the posts in the series:

Future plans

There are several topics I would like to go into as well, but there is no code for them yet in the GitHub repository. Those are:

  • AWS examples in C# – manage with Terraform
  • AWS examples in C# – use AWS Cognito for API Gateway authorizer
  • AWS examples in C# – structured logging

Conclusion

This series of posts is intended to give a basic overview of important AWS services and how to use them in C# code.

Git clone with predefined user email and user name

Post summary: Small bash script to clone a Git repository and set user.email and user.name.

Use case

There are cases when committing with a different user to a different Git repository is needed. Git offers a very easy command to change user.email and user.name, as long as you remember to do so.

git config user.name "Firstname Lastname"
git config user.email "Firstname.LastnameDoe@somemailhost.com"

I always forget to do it, so I made up a small script that I use to clone a repository and it does it for me.

Git also offers a command to globally change user.name and user.email and this is valid for each and every repository that is cloned. If the use case is to work with one name and email only, then maybe this is the best option.

git config --global user.name "Firstname Lastname"
git config --global user.email "Firstname.LastnameDoe@somemailhost.com"

Script

#!/bin/bash

if [ -z "$1" ]
then
  echo "Please provide the Git repo as argument"
  exit 1
fi

if [ -z "$2" ]
then
  echo "Please provide the user.name repo as argument"
  exit 1
fi

if [ -z "$3" ]
then
  echo "Please provide the user.email repo as argument"
  exit 1
fi

IFS='/' read -r -a urlParts <<< "$1"
urlPartsLast=${urlParts[${#urlParts[@]}-1]}

IFS="." read -r -a repoParts <<< "$urlPartsLast"
repoPartsLast=${repoParts[${#repoParts[@]}-1]}
if [ "$repoPartsLast" == "git" ]
then
  unset 'repoParts[${#repoParts[@]}-1]'
fi
repoName=$(printf ".%s" "${repoParts[@]}")
repoName=${repoName:1}

git clone "$1"

cd $repoName
git config user.name "$2"
git config user.email "$3"

The script file should be made executable with chmod +x git-clone.sh and then the script can be invoked with the following command:

./git-clone.sh https://github.com/llatinov/aws.examples.csharp.git "Firstname Lastname" Firstname.LastnameDoe@somemailhost.com

Script insights

The script checks for empty arguments and returns an error if any is missing. Note that user.name and user.email can be hardcoded into the script itself, which makes it easier to invoke. Then the script splits the Git URL into parts by slash (/). It takes the last part, which is supposed to be the repository name. The last part is additionally split by dot (.) and the git suffix is dropped. The script clones the repository and navigates to the folder, where it sets user.name and user.email.

Conclusion

This script helps me not to forget to set the correct user.name and user.email when cloning a Git repository.

Serialize and deserialize enum values to custom string in C# with Json.NET

Post summary: How to serialize and deserialize C# enum values to customs strings.

The code used for this blog post is located in dotnet.core.templates GitHub repository.

Use case description

Enums provide an efficient way to define a set of named integral constants that may be assigned to a variable. They are strongly typed values and are the preferred choice over string constants. In the given example, there is an API that provides functionality to save or get movies with a title and a genre. Although not the best example, as the genre can have lots of values, it still can be represented as an enum. In the same setup, there is a user interface that consumes the get API. In the UI it is more convenient to directly display the genre as text, so control over the naming convention happens in the backend and there is no mapping in the frontend; localization is ignored in the current example.

Serialize and deserialize

Serialization and deserialization to a custom string can be done with two steps. The first is to add an attribute to all enum values which gives the preferred string mapping.

using System.Runtime.Serialization;

public enum MovieGenre
{
	[EnumMember(Value = "Action Movie")]
	Action,
	[EnumMember(Value = "Drama Movie")]
	Drama
}

Then Json.NET is instructed to serialize/deserialize the enum value as a string with [JsonConverter(typeof(StringEnumConverter))] attribute.

using Newtonsoft.Json;
using Newtonsoft.Json.Converters;

public class Movie
{
	public string Title { get; set; }

	[JsonConverter(typeof(StringEnumConverter))]
	public MovieGenre Genre { get; set; }
}

This results in the following JSON:

{
	"Title": "Die Hard",
	"Genre": "Action Movie"
}

In the example above, if [EnumMember(Value = "Action Movie")] is not provided in the enum declaration, then the string representation of the enum value is taken:

{
	"Title": "Die Hard",
	"Genre": "Action"
}
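
A minimal usage sketch of the round trip with Json.NET, using the Movie class defined above:

var movie = new Movie { Title = "Die Hard", Genre = MovieGenre.Action };

// Serialized as "Action Movie" because of the EnumMember attribute
var json = JsonConvert.SerializeObject(movie);

// The custom string is mapped back to MovieGenre.Action on deserialization
var deserialized = JsonConvert.DeserializeObject<Movie>(json);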

Conclusion

Although the current example is not ideal from a use-case point of view, it shows technically how Newtonsoft Json.NET can be instructed to serialize/deserialize enum values to custom strings or just to the string representation of the enum value.

Dockerize React application with a Docker multi-staged build

Post summary: How to build React application inside a Docker container, with a multi-staged build and then run it with NGINX or Caddy.

In the current post, I am not going to compare NGINX vs. Caddy. I will show how to build a React application and package it into a Docker container with both of them. The example code is located in the cypress-testing-framework GitHub repository.

NGINX

NGINX is open-source software for web serving, reverse proxying, caching, load balancing, media streaming, and more. It started out as a web server designed for maximum performance and stability. In addition to its HTTP server capabilities, NGINX can also function as a proxy server for email (IMAP, POP3, and SMTP) and a reverse proxy and load balancer for HTTP, TCP, and UDP servers.

Caddy

Caddy is an open-source, HTTP/2-enabled web server written in Go. It uses the Go standard library for its HTTP functionality. One of Caddy’s most notable features is enabling HTTPS by default.

Building

Docker multi-staged building is going to be used in the current post. I have touched on the topic in the Optimize the size of Docker images post. The main idea is to optimize the Docker images, so they become smaller. In the current post, I will show two flavors of builds. One is with the standard NPM package manager and is described in the Build and run with NGINX section.

The other is with Yarn package manager and is described in Build and run with Caddy section. Current examples are configured to use Yarn. I personally prefer Yarn as for local development it has very effective caching and also it has a reliable dependency locking mechanism.

Build and run with NGINX

The following Dockerfile describes building the React application with the NPM package manager and packaging it into an NGINX image.

# ========= BUILD =========
FROM node:8.16.0-alpine as builder

WORKDIR /app

COPY package.json .
COPY package-lock.json .
RUN npm install --production

COPY . .

RUN npm run build

# ========= RUN =========
FROM nginx:1.17

COPY conf/nginx.conf /etc/nginx/nginx.conf
COPY --from=builder /app/build /usr/share/nginx/html

The keyword as builder is used to give a name to the image. Both package.json and package-lock.json are copied to the already configured work directory /app. Installation of the packages is done with npm install --production, where the --production switch is used to skip the devDependencies. In the current example, Cypress takes a lot of time to install, and it is not needed for a production build. Afterward, all project files are copied to the image. The files configured in .dockerignore are skipped. All source code files are intentionally copied to the image only after the NPM packages installation. Packages installation takes time, and the packages need to be installed only if the package.json file has changed. In case of code changes only, the Docker cache is used for the packages layer, which speeds up the build. The build is initiated with npm run build and takes quite some time. Now the build artifacts are ready. The next stage is to copy the artifacts from the builder image's /app/build folder into the nginx:1.17 image's /usr/share/nginx/html folder. The NGINX configuration file is copied as well.

worker_processes auto;
worker_rlimit_nofile 8192;

events {
  worker_connections 1024;
}

http {
  include /etc/nginx/mime.types;
  sendfile on;
  tcp_nopush on;

  gzip on;
  gzip_static on;
  gzip_types
    text/plain
    text/css
    text/javascript
    application/json
    application/x-javascript
    application/xml+rss;
  gzip_proxied any;
  gzip_vary on;
  gzip_comp_level 6;
  gzip_buffers 16 8k;
  gzip_http_version 1.1;

  server {
    listen 3000;
    server_name localhost;
    root /usr/share/nginx/html;
    auth_basic off;

    location / {
      try_files $uri $uri/ /index.html;
    }

    # 404 if a file is requested (so the main app isn't served)
    location ~ ^.+\..+$ {
      try_files $uri =404;
    }
  }
}

I will not go into NGINX configuration details; the configuration can be checked in detail in the NGINX documentation. What is important in the configuration above is that gzip compression is enabled and NGINX listens on port 3000. Then with try_files unknown routes are redirected to index.html, so React can bootstrap the routes.

Build and run with Caddy

The following Dockerfile describes building the React application with the Yarn package manager and packaging it into a Caddy image.

# ========= BUILD =========
FROM node:8.16.0-alpine as builder

WORKDIR /app

RUN npm install yarn -g

COPY package.json .
COPY yarn.lock .
RUN yarn install --production=true

COPY . .

RUN yarn build

# ========= RUN =========
FROM abiosoft/caddy:1.0.3

COPY conf/Caddyfile /etc/Caddyfile
COPY --from=builder /app/build /usr/share/caddy/html

Absolutely the same logic applies here as above. Yarn is installed as an additional Linux package, then the package.json and yarn.lock files are copied. It is very important to copy yarn.lock, otherwise on every run the latest dependencies will be fetched, and there might be inconsistent behavior. Only production dependencies are installed with yarn install --production=true. After the application is built with yarn build, it is copied to the abiosoft/caddy:1.0.3 image in the /usr/share/caddy/html folder from the builder image. The Caddyfile is copied as well to configure Caddy.

0.0.0.0:3000 {
	gzip
	log / stdout "{method} {path} {status}"
	root /usr/share/caddy/html
	rewrite {
		regexp .*
		to {path} /
	}
}

Caddy is configured to listen to port 3000, gzip compression is enabled and there is rewrite rule which redirects unknown paths to the main path, so React can bootstrap the router.

Conclusion

In the current post, I have shown how to build a React application inside a Docker image with both NPM and Yarn and then pack the build artifacts into an NGINX or Caddy Docker image, which can later be run as a container. This process optimizes the Docker image size and also does not put extra requirements on the build machine to have Node.js installed, as Node.js is inside the builder image.

Create .NET Core health checks with custom response payload

Post summary: How to extend custom .NET Core health checks so the response JSON provides more information.

The code used for this blog post is located in dotnet.core.templates GitHub repository.

Heath checks in .NET Core

Health checks in .NET Core are a middleware that provides the possibility to report an application's health. This allows monitoring of the application and taking corrective actions in case of issues. For example, if an application reports being unhealthy, then the load balancer can exclude it from the infrastructure and appropriate alarms can be raised. More about health checks can be read on the Health checks in ASP.NET Core page.

Adding a health check

In order to add a health check to a .NET Core application, a reference to the Microsoft.AspNetCore.Diagnostics.HealthChecks package has to be added. Health checks themselves are classes implementing the IHealthCheck interface.

public class VersionHealthCheck : IHealthCheck
{
	private readonly AppConfig _config;

	public VersionHealthCheck(IOptions<AppConfig> options)
	{
		_config = options.Value;
	}

	public Task<HealthCheckResult> CheckHealthAsync(HealthCheckContext context,
				CancellationToken cancellationToken = new CancellationToken())
	{
		return Task.FromResult(string.IsNullOrEmpty(_config.Version)
			? HealthCheckResult.Unhealthy()
			: HealthCheckResult.Healthy());
	}
}

Health checks have to be registered in Startup.cs file:

public void ConfigureServices(IServiceCollection services)
{
	services.AddHealthChecks()
		.AddCheck<VersionHealthCheck>("Version Health Check");
}

As a last step health checks endpoint has to be configured in Startup.cs file:

public void Configure(IApplicationBuilder app)
{
	app.UseEndpoints(endpoints =>
	{
		endpoints.MapHealthChecks("/health");
	});
}

Now the health check report is available at the <HOSTNAME>/health URL. If everything is good, the response is 200 OK with content Healthy. In case of issues, the response is 503 Service Unavailable with content Unhealthy.

Extend the health checks response

As stated above, health checks are mainly intended for machine usage. I have had cases in practice in which just looking into the health check allows faster problem solving than looking into the logs. For this reason, investing in more explanatory health checks is worth it. Below is a code snippet showing how to include more information in the health check response payload. A new static class HealthCheckExtensions with a MapCustomHealthChecks method can be added.

public static class HealthCheckExtensions
{
	public static IEndpointConventionBuilder MapCustomHealthChecks(
		this IEndpointRouteBuilder endpoints, string serviceName)
	{
		return endpoints.MapHealthChecks("/health", new HealthCheckOptions
		{
			ResponseWriter = async (context, report) =>
			{
				var result = JsonConvert.SerializeObject(
					new HealthResult
					{
						Name = serviceName,
						Status = report.Status.ToString(),
						Duration = report.TotalDuration,
						Info = report.Entries.Select(e => new HealthInfo
						{
							Key = e.Key,
							Description = e.Value.Description,
							Duration = e.Value.Duration,
							Status = Enum.GetName(typeof(HealthStatus),
													e.Value.Status),
							Error = e.Value.Exception?.Message
						}).ToList()
					}, Formatting.None,
					new JsonSerializerSettings
					{
						NullValueHandling = NullValueHandling.Ignore
					});
				context.Response.ContentType = MediaTypeNames.Application.Json;
				await context.Response.WriteAsync(result);
			}
		});
	}
}

All the formatting in the code depends on two additional data classes HealthInfo and HealthResult.

public class HealthInfo
{
	public string Key { get; set; }
	public string Description { get; set; }
	public TimeSpan Duration { get; set; }
	public string Status { get; set; }
	public string Error { get; set; }
}
public class HealthResult
{
	public string Name { get; set; }
	public string Status { get; set; }
	public TimeSpan Duration { get; set; }
	public ICollection<HealthInfo> Info { get; set; }
}

Registering the endpoint happens with the same code as in the default case, with the difference that the MapCustomHealthChecks extension method is used:

public void Configure(IApplicationBuilder app)
{
	app.UseEndpoints(endpoints =>
	{
		endpoints.MapCustomHealthChecks("Service Name");
	});
}

Now it is possible to have more elaborate health checks, which can, for example, capture an exception and return it as well.

public Task<HealthCheckResult> CheckHealthAsync(HealthCheckContext context,
		CancellationToken cancellationToken = new CancellationToken())
{
	try
	{
		var message = $"Version is healthy: {_config.Version}";
		return Task.FromResult(HealthCheckResult.Healthy(message));
	}
	catch (Exception ex)
	{
		var message = "There is an error with version health check";
		return Task.FromResult(HealthCheckResult.Unhealthy(message, ex));
	}
}

In the case of 503 Service Unavailable, health check gives more details, which in some cases can be enough to resolve the issue without having to dig into the logs.

{
	"Name": "Service Name",
	"Status": "Unhealthy",
	"Duration": "00:00:00.0159186",
	"Info": [
		{
			"Key": "Version Health Check",
			"Description": "There is an error with version health check",
			"Duration": "00:00:00.0010564",
			"Status": "Unhealthy",
			"Error": "Exception's message text"
		}
	]
}

Conclusion

.NET Core health checks are a convenient way to automate service monitoring and take corrective actions. With a small effort, they can be enhanced so that they become useful for people trying to identify what the issues with the services are.

Optimize the size of Docker images

Post summary: How to optimize the size of the Docker images, by using intermediate build image and final runtime image.

The code used for this blog post is located in dotnet.core.templates GitHub repository. The code examples below are for .NET Core 3.0, but principles applied in this article are valid for any programming language, so it is worth reading.

Docker layers and images

A Docker image is an executable version of a given application that runs on top of an operating system's kernel. A Docker image is the result of the execution of a Dockerfile. Usually, a Dockerfile starts from some base image, e.g. an operating system. Then commands are built on top of this base image and the result is a new image. This new image can be used as a base image somewhere else. Each and every command in a Dockerfile results in a layer. This layering system is used for better reusability, as several images can reuse a given layer. The more layers are added to the image, the bigger it gets in size.

All docker images can be listed with docker images command. Size is also present as an output of the command. Then for a given image, it is possible to list all the layers with docker history <IMAGE_NAME> command, which also shows the size of a given layer.

Images are kept in a Docker repository, either public or private. The bigger the image, the more time it takes to upload, to download and the more space it consumes in the repository. It is a good practice to optimize the images in terms of size.

Optimize the size

Usually, when building software, many more resources are needed, such as an SDK, a compiler, or additional libraries, than when the software is run. One strategy for optimization is to build the software on a special build machine and then pack it into a Docker image. In this approach, the build machine should have the needed build software. This puts some demands on the build machine and also makes the image creation process dependent on certain software packages being installed. A more convenient option is to build the application as part of the Docker image creation and then pack it into a separate runtime image. See the Dockerfile below.

FROM mcr.microsoft.com/dotnet/core/aspnet:3.0-buster-slim AS base
WORKDIR /app

FROM mcr.microsoft.com/dotnet/core/sdk:3.0-buster AS build
WORKDIR /src
COPY . .
RUN dotnet restore
RUN dotnet publish -c Release -o /pub

FROM base AS final
WORKDIR /app
COPY --from=build /pub .
ENTRYPOINT ["dotnet", "PROJECT_NAME.dll"]

In short, image sdk:3.0-buster is used to publish the application as it has .NET Core SDK on it, and then application code is copied into aspnet:3.0-buster-slim which has only the .NET Core runtime and is low in size.

No matter how the software is built, the most optimal image in terms of size and capabilities has to be selected to pack the code into. For example, Google provides “Distroless” images that do not contain package managers, shells or any other programs you would expect to find in a standard Linux distribution. This makes images smaller and much more secure. I tried to build the application I am experimenting with into a Distroless image and it comes out at 136MB in size, whereas packed into the .NET 3.0 runtime image it is 209MB. Unfortunately, there is no Distroless image for .NET Core 3.0, so my experimental image fails to run, and I have to use aspnet:3.0-buster-slim in order to run my sample application.

.NET Core different images

.NET Core has different images, which are very well explained on the .NET Core SDK images page. They are:

  • buster – Debian 10
  • alpine – Alpine
  • bionic – Ubuntu 18.04
  • disco – Ubuntu 19.04
  • stretch – Debian 9

.NET Core 3.0 error in stretch images

This section does not directly contribute to the main point of the topic, but it might be helpful to someone. When I experimented, I initially started with stretch base images and got the following errors:

  • System.MissingMethodException: Method not found: ‘Void Microsoft.AspNetCore.WebUtilities.FileBufferingReadStream..ctor(System.IO.Stream, Int32)’
  • System.TypeLoadException: Could not load type ‘Microsoft.AspNetCore.WebUtilities

These errors were not present when switching to buster base images.

Conclusion

In the current post, I describe how to construct Dockerfiles so the build is done in Docker, eliminating the need to have specific software installed in order to pack the images. No matter how the software is built, it is very important to pack it into the smallest possible image in order to save bandwidth and storage space during image usage. Google provides Distroless images that are very lightweight and also secure, as they do not contain package managers, shells or any other programs. Examples in this post are in .NET Core 3.0, but the principles can be applied to different programming languages and technologies.


Create project for .NET Core custom template

Last Updated on by

Post summary: How to create a custom .NET Core template, install it and create projects from it.

The code used for this blog post is located in dotnet.core.templates GitHub repository.

.NET Core

In short, .NET Core is a cross-platform development platform supporting Windows, macOS, and Linux, and can be used in device, cloud, and embedded/IoT scenarios. It is maintained by Microsoft and the .NET community on GitHub. More can be read in the .NET Core Guide.

Why .NET Core templates

When you create a new .NET Core project with the dotnet new command, it is possible to select from a list of predefined templates. This is a very handy feature, but the out-of-the-box templates are not always convenient; additional changes are required afterward to make the project fit for purpose. In a situation where many projects with similar structures are created, such as many micro-services, a custom template is very helpful. Users can define a custom template and easily create new projects out of it. In the current post, I will describe how to create an elaborate custom template.

Create template

What has to be done is just to create a project and customize it according to the needs. Once the code is ready, a file named template.json located in a .template.config folder should be added. The file should conform to the http://json.schemastore.org/template JSON schema. See the file below, it is more or less self-explanatory. An important field is identity, which is basically the unique template name; it is not visible to users though. What is visible are name and shortName. shortName is actually used when creating a project from this template: dotnet new shortName. Guids used in the template are defined in the guids section of the template.json file and they are replaced with a fresh set of guids on each project creation. More details on each and every possible option can be found on the “Runnable-Project”-Templates page.

{
	"$schema": "http://json.schemastore.org/template",
	"author": "Lyudmil Latinov",
	"classifications": [
		"Common",
		"Code"
	],
	"identity": "dotnet.core.micro.service",
	"name": ".NET Core 3.0 micro-service",
	"shortName": "microservice",
	"tags": {
		"language": "C#",
		"type": "item"
	},
	"guids": [
		"9A19103F-16F7-4668-BE54-9A1E7A4F7556"
	]
}

This is it, the template is now ready to be installed and used.

Install template and create project

The template is installed with dotnet new -i <PATH_TO_FOLDER>. Once the template is installed, the dotnet new command lists it along with all predefined templates.

Creating a project is the same as creating one from the standard templates: dotnet new <SHORT_NAME>, dotnet new microservice in this example.

Uninstalling a template is done with the dotnet new -u <PATH_TO_FOLDER> command. There is another command which uninstalls all custom templates from the system: dotnet new --debug:reinit.
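
As a quick reference, the basic cycle might look like this, assuming the template lives in a folder named micro-service-template (the folder name is just an example):

dotnet new -i ./micro-service-template
dotnet new microservice
dotnet new -u ./micro-service-template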

Replace project name

A custom template is a very handy thing and I would like to go one step further in making this template more flexible. When a new project is created, it is good for it to have a proper name, which is reflected in the file names, the folder names, and the namespaces. For this reason, a custom replace task can be added to the template definition. A placeholder PROJECT_NAME is added to namespaces, the Dockerfile, folder names, the solution file name, and .csproj file names. This is done by adding an external required parameter in the symbols section of the template.json file. On project creation, this parameter is provided with a value, which gets replaced in file content and file names.

{
	"$schema": "http://json.schemastore.org/template",
	...
	"symbols": {
		"ProjectName": {
			"type": "parameter",
			"replaces": "PROJECT_NAME",
			"FileRename": "PROJECT_NAME",
			"isRequired": true
		}
	}
}

When dotnet new microservice -h is executed it outputs the possible parameters that are needed to create a project from this template:

.NET Core 3.0 micro-service (C#)
Author: Lyudmil Latinov
Options:
  -P|--ProjectName
                    string - Required

Now, the project cannot be created without providing this parameter. The command dotnet new microservice returns the error Mandatory parameter --ProjectName missing for template .NET Core 3.0 micro-service. The proper command now is dotnet new microservice --ProjectName SampleMicroservice.

Conditional files and features

So far the template is much nicer and looks more real. It can also be further enhanced. For example, suppose there are two major types of projects needed. One option is to have two separate templates, which means maintaining two templates. Another option, in case of insignificant differences, is to have conditional functionality added based on a parameter during project creation. A boolean parameter is added into the symbols section of the template.json file. Another thing to be done is to exclude files depending on this parameter. This is done in the sources section of the template.json file.

{
	"$schema": "http://json.schemastore.org/template",
	...
	"symbols": {
		...
		"AddHealthChecks": {
			"type": "parameter",
			"datatype": "bool",
			"defaultValue": "false"
		}
	},
	"sources": [
		{
			"modifiers": [
				{
					"condition": "(!AddHealthChecks)",
					"exclude": [
						"src/**/HealthChecks/**",
						"test/**/Client/HealthCheckClient.cs",
						"test/**/Tests/HealthCheckTest.cs"
					]
				}
			]
		}
	]
}

Customize parameters switches

If you add several of those parameters, .NET gives them automatic switch names, which may not be very meaningful, so they can be customized. In the example here, which is located in the dotnet.core.templates GitHub repository, the default parameter names are (dotnet new microservice -h):

.NET Core 3.0 micro-service (C#)
Author: Lyudmil Latinov
Options:
  -P|--ProjectName
                         string - Required

  -A|--AddHealthChecks
                         bool - Optional
                         Default: false / (*) true

  -Ad|--AddSqsPublisher
                         bool - Optional
                         Default: false / (*) true

  -p:A|--AddSqsConsumer
                         bool - Optional
                         Default: false / (*) true


* Indicates the value used if the switch is provided without a value.

Short names like -p:A and -Ad do not seem too convenient. Those can be easily customized by adding a dotnetcli.host.json file into the .template.config folder:

{
	"$schema": "http://json.schemastore.org/dotnetcli.host",
	"symbolInfo": {
		"ProjectName": {
			"longName": "ProjectName",
			"shortName": "pn"
		},
		"AddHealthChecks": {
			"longName": "AddHealthChecks",
			"shortName": "ah"
		},
		"AddSqsPublisher": {
			"longName": "AddSqsPublisher",
			"shortName": "ap"
		},
		"AddSqsConsumer": {
			"longName": "AddSqsConsumer",
			"shortName": "ac"
		}
	}
}

Now the options are much different:

.NET Core 3.0 micro-service (C#)
Author: Lyudmil Latinov
Options:
  -pn|--ProjectName
                         string - Required

  -ah|--AddHealthChecks
                         bool - Optional
                         Default: false / (*) true

  -ap|--AddSqsPublisher
                         bool - Optional
                         Default: false / (*) true

  -ac|--AddSqsConsumer
                         bool - Optional
                         Default: false / (*) true


* Indicates the value used if the switch is provided without a value.

Conclusion

.NET Core provides an easy and extensible way to make project templates, which are stored and maintained under version control and can be used for creating new projects. Those templates can be project-specific, product-specific, or company-specific.


Restore deleted Git stash

Last Updated on by

Post summary: How to restore deleted Git stash.

Git

Git is a version control system, which is conceptually different from others. It is a mini file system, which has all the information locally. Git supports fully local work; no internet connection is needed once the project is checked out. All changes are done locally and saved to the local database. Once there is an internet connection, changes can be synced to the server and made available to others as well. See more on the What is Git? page.

Git stash

Git offers so-called stashing. Current work can be temporarily saved, without being committed. When current work is stashed, the repository is reverted back to its original state. One possible use case is when new conflicting changes are coming from the upstream. Current work has to be saved, the remote changes applied, and then the current work can be completed.

Many changes can be stashed. The problem with having several stashes is that there is no easy way to merge them, so it is recommended not to have more than one stash at a time.

Common stash operations

The most important actions that can be done on a stash are:

  • git stash – saves current work to stash
  • git stash list – shows all stashed changes
  • git stash apply – apply the latest stash
  • git stash clear – removes all stashes

More details can be found in git-stash page.

Restore deleted Git stash

It has happened several times that I am working on something important with many changes, and then I need to switch to another thing. I do not want to commit the work, as it is messy, so I stash it. Then I accidentally delete the stash. The worst case was a stash with two weeks of work getting deleted, which was quite upsetting.

Luckily, Git is a really sophisticated version control system and it saves intermediate states, so the stash is not really lost. It can be restored. I have a favorite article on the topic, Recover a dropped Git stash. There are some command line suggestions in it, but what I love is:

gitk --all $(git fsck --no-reflogs | awk '/dangling commit/ {print $3}')

The gitk window clearly shows the stash and the file changes in it. Those changes can then be manually applied.
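
If you prefer to stay on the command line, a rough sketch of the same recovery is shown below. The SHA value is just a placeholder for one of the dangling commits reported by git fsck; inspect it first, then apply it as a stash.

git fsck --no-reflogs | awk '/dangling commit/ {print $3}'
git show <DANGLING_COMMIT_SHA>
git stash apply <DANGLING_COMMIT_SHA>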

Conclusion

Git is a very sophisticated version control system, more like a mini file system. It allows you to stash changes that are not ready to be committed yet. Stashes can be accidentally deleted. The good thing is there is a mechanism to restore deleted stashes.


Ansible playbook example and how to provision it in Vagrant

Last Updated on by

Post summary: Brief introduction to Ansible with a playbook example. Example of provisioning the Ansible playbook into Vagrant.

The code below can be found in the GitHub sample-dropwizard-rest-stub repository in the Vagrantfile-ansible and playbook.yml files. Since Vagrant requires having only one Vagrantfile, if you want to run this example you have to rename Vagrantfile-ansible to Vagrantfile and then run the Vagrant commands described at the end of this post. This post is part of the Vagrant series. All other Vagrant-related posts, as well as more theoretical information on what Vagrant is and why to use it, can be found in the What is Vagrant and why to use it post. I have used both Ansible and Chef for deployments; unlike Chef, Ansible can easily be demonstrated in an offline demo. In order to test your Ansible playbook, it can be easily provisioned into Vagrant. This is demonstrated in the current post.

Ansible introduction

Ansible is a tool used for managing repetitive IT tasks, such as deployments, infrastructure management, etc. It connects to the machines it has to configure via SSH. Commands are written in a playbook, which can be saved under version control. The playbook contains actions that have to be performed; those actions are described in YAML format using so-called modules. A module is a small step that performs a certain action, such as copying a file, executing a bash command, etc. The full list of modules can be found in the Ansible modules index. Playbooks can be reused, which gives flexibility. Full Ansible documentation can be found in the Ansible user guide. If things become way too big to manage, there is Ansible Tower, which can centrally control your whole Ansible environment.

Example Playbook

In the Run Dropwizard Java application on Vagrant post, I have shown how to deploy a single fat JAR Java service in Vagrant. The commands from that post are translated to an Ansible playbook and the service is deployed with Ansible.

---

- name: Deploy 'dropwizard-rest-stub'
  hosts: all
  vars:
    service_file: /etc/init.d/dropwizard
  tasks:

    - name: Check if service is installed
      stat:
        path: '{{ service_file }}'
      register: service_result

    - name: Stop service
      service:
        name: dropwizard
        state: stopped
        use: service
      when: service_result.stat.exists == True

    - name: Install Java 8
      yum:
        name: java-1.8.0-openjdk-devel

    - name: Create folders
      file:
        path: /var/dropwizard-rest-stub/logs
        state: directory

    - name: Copy JAR file
      copy:
        src: target/sample-dropwizard-rest-stub-1.0-SNAPSHOT.jar
        dest: /var/dropwizard-rest-stub/dropwizard-rest-stub.jar

    - name: Copy configuration file
      copy:
        src: config-vagrant.yml
        dest: /var/dropwizard-rest-stub/config.yml

    - name: Copy service file
      copy:
        src: linux_service_file
        dest: '{{ service_file }}'

    - name: Fix service file because of Windows new lines
      replace:
        path: '{{ service_file }}'
        regexp: '\r'
        replace: ''

    - name: Make service file executable
      file:
        path: '{{ service_file }}'
        mode: 755

    - name: Reload services
      systemd:
        daemon_reload: yes

    - name: Start service
      service:
        name: dropwizard
        state: started
        use: service

I will not go into details about the playbook, as each step is self-explanatory. The service file path is declared as a service_file variable, which is later used as ‘{{ service_file }}’. The service has to be shut down in case the playbook is run for an upgrade, this is why a conditional stat module task is used. It checks if the service file exists, and if it does, then the service task tries to stop the service. Then the playbook installs Java 8, copies the JAR and configurations, creates the service and starts it. An important detail in the service task is use: service; otherwise Ansible will try the default systemd, which will produce an error.

Run the Playbook

Running the playbook is a task that requires more effort. Usually, you already have boxes with SSH installed which are ready to be configured with Ansible. In this demo, you have to make such a box on your own. I use Oracle’s VirtualBox. A CentOS 7 image can be downloaded from CentOS boxes; the password for the user is osboxes.org. Once you download the image, you create a VirtualBox instance with an existing hard drive, instructions can be found in the Creating a New Virtual Machine in VirtualBox tutorial. Once you do that, you have to configure the Network to Bridged Adapter.

Then you log in to the virtual machine and install OpenSSH. Install it with the yum -y install openssh-server openssh-clients command. Start the SSH service with the chkconfig sshd on and then service sshd start commands. Edit the configuration with the sudo vi /etc/ssh/sshd_config command and permit root login by adding PermitRootLogin yes in the config file. Note that permitting root login should never be done in a real environment, I do it here just to make our demo easier, otherwise I would have to create a separate user with permissions, which is more effort. Finally, restart the SSH service with the service sshd restart command. Full details can be found in the Install and configure ssh server and client under CentOS post. Find the IP of the virtual machine by executing the ifconfig command inside the box. In my case the IP was 192.168.1.59, and I will be using it in the commands further in the post.

After you have SSH running, you are ready to run the Ansible playbook. Before doing that, you have to add the ECDSA key fingerprint to your known_hosts, otherwise the Ansible connection will fail. To do so, just SSH to the virtual box from your machine with the ssh root@192.168.1.59 command. You will be asked whether to continue connecting, answer with yes and then exit the SSH session.

Create a hosts file with the following content:

[all]
192.168.1.59

In the playbook, the affected hosts are defined with hosts: all, but this is only an abstract definition; this is why you need the hosts file, which provides the actual machines matching that definition. Once you have the hosts file, the playbook is run with:

ansible-playbook -i hosts -u root --ask-pass -e ansible_network_os=vyos playbook.yml

The command runs the playbook.yml file with user -u root and asks for a password with --ask-pass. The command has a configuration passed with -e ansible_network_os=vyos, which basically sets a network protocol for Ansible to communicate with the virtual box.

Once the playbook is executed you can log in to the virtual machine and check http://localhost:9000/person/all in the browser.

Provision into Vagrant

Vagrant is a tool for building and managing virtual machine environments in a single workflow. Vagrant provides an Ansible provisioner. There are two modes of this provisioner: ansible and ansible_local. The difference is that with ansible_local you do not need to have a real Ansible installation on your host operating system. In the current example, ansible_local is used so anyone can go directly to Vagrant provisioning without the need to install Ansible. It takes some more time, as Ansible has to be installed on the Vagrant virtual box, so if you already have Ansible installed you can switch to the ansible provisioner. The Vagrantfile is:

Vagrant.configure('2') do |config|

  config.vm.hostname = 'dropwizard'
  config.vm.box = 'opscode-centos-7.2'
  config.vm.box_url = 'http://opscode-vm-bento.s3.amazonaws.com/vagrant/virtualbox/opscode_centos-7.2_chef-provisionerless.box'

  config.vm.network :forwarded_port, guest: 9000, host: 9200
  config.vm.network :forwarded_port, guest: 9001, host: 9201

  config.vm.provider :virtualbox do |vb|
    vb.name = 'dropwizard-rest-stub-ansible'
  end

  config.vm.provision :ansible_local do |ansible|
    ansible.become = true
    ansible.playbook = 'playbook.yml'
  end

end

The guest’s port 9000 is exposed as port 9200 on your host, so once you provision with Vagrant, you can check http://localhost:9200/person/all. One important piece here is ansible.become = true, which is basically the analog of sudo in the normal commands.

Conclusion

Ansible is an easy way to streamline your configuration changes like deployments, infrastructure configuration, etc. In the current post, I have given an example of a very simple Ansible playbook which deploys a single JAR Java application. In order to test an Ansible playbook, it can be provisioned into Vagrant.



Testing with Cypress – Code coverage with Istanbul

Last Updated on by

Post summary: This article describes how to extract and process Istanbul code coverage, and generate HTML reports.

This post is part of a Cypress series, you can see all posts from the series in Testing with Cypress – lessons learned in a complete framework. Example code is located in the cypress-testing-framework GitHub repository.

Code coverage instrumentation

In Testing with Cypress – Build a React application with Node.js backend it is described how the application is instrumented to track code coverage. This is an essential part; without it, measurement is not possible.

Code coverage capturing data

Capturing of the code coverage results is done in the cypress/support/core/cypress_code_coverage.js file. It is included in the cypress/support/index.js file with the import ‘./core/cypress_code_coverage’; statement. For each test suite, a separate file with coverage data is created. Depending on the application, those files can get pretty big, and writing and reading them slows the tests down. So code coverage is controlled with the TEST_CODE_COVERAGE environment variable. By default, it is set to false. Once all tests are run and the coverage data is saved, it has to be merged. Merging is invoked with the yarn cypress:report command.
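
The idea behind the capturing code is roughly the sketch below. It is not the actual content of cypress_code_coverage.js, just an illustration of how the window.__coverage__ object populated by the Istanbul instrumentation can be written to a per-spec file:

afterEach(function() {
  // in the real implementation this is guarded by the TEST_CODE_COVERAGE variable
  cy.window({ log: false }).then(win => {
    if (win.__coverage__) {
      // one JSON file per spec file; Cypress.spec.name is the current spec file name
      cy.writeFile(`coverage/${Cypress.spec.name}.json`, win.__coverage__);
    }
  });
});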

Code coverage report

An important prerequisite for generating the code coverage report is to have nyc installed as a global NPM package. Since the paths in the container are not the same as the paths locally, in order to read the correct sources there is reprocessing of the paths: DOCKER_CONTAINER_PATH is replaced with the current folder. You can see what the code coverage looks like in Istanbul-report. For this particular example, only save_person_spec.js has been run with the yarn cypress:run --spec='cypress/tests/persons/save_person_spec.js' command.
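
Under the hood, the merging and reporting boils down to nyc commands along these lines; the folder names here are only illustrative, the exact ones are configured in the repository:

nyc merge cypress/coverage .nyc_output/out.json
nyc report --reporter=html --report-dir=coverage-report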

Conclusion

Code coverage is not a crucial part of the whole QA process, but it is a very nice-to-have feature. With code coverage, we can improve our tests and make them cover bits of the code that we have missed during the analysis and creation of the tests themselves.


Testing with Cypress – Custom logging of errors and JUnit results

Last Updated on by

Post summary: Description of the custom error logger and also custom JUnit XML file creator.

This post is part of a Cypress series, you can see all posts from the series in Testing with Cypress – lessons learned in a complete framework. Example code is located in the cypress-testing-framework GitHub repository.

The issue

Cypress is not good at error tracking and reporting. If a test fails, it is hard to understand why. Errors are sometimes vague, and the stack trace is not useful, as it does not lead to the proper line of your code since your code is wrapped into Cypress’ code. Forget about the nice stack traces that Java/C# code produces, where you just go, find, and eliminate the error without even debugging. Debugging errors in tests is much harder with Cypress.

The solution

Gleb Bahmutov, currently a VP of Engineering at Cypress.io, has a nice NPM package called cypress-failed-log. It gathers the commands that Cypress was executing during a test run and, in case of a failure, saves them to a file. You can inspect the file and trace what parts of your test were executed.

Modified solution

I started with that solution but did not enjoy it much. What I did was take the base code and modify it. Those modifications still track the Cypress commands, but they also track the requests and responses exchanged between the application and the backend, so in case of an error you can also inspect the backend response. One important thing is that each test should have a unique name, otherwise overlapping may occur. The logging code is located in the cypress/support/core/cypress_logging.js file; it is registered to Cypress within the cypress/support/index.js file with import ‘./core/cypress_logging’;. The code also copies the screenshot of the test failure for better understanding of the error.

Capturing of the requests/responses between the backend and the frontend can be controlled with the TEST_CAPTURE_RESPONSES environment variable; it is true by default. Sometimes you will need to prevent certain requests/responses from being captured as they are not important. This can be done with the TEST_CAPTURE_RESPONSES_EXCLUDE_PATHS variable, using asterisks to match the URLs. For example, I am testing a Ruby on Rails application which has a profiler enabled, which massively pollutes the logs, so I exclude those with the ‘*/mini-profiler-resources/*’ pattern.

All this data is saved as a file with the name of the test inside a folder with the name of the suite. For example: cypress/logs/logging/multiple_testsuites_mix_spec.js/Test suite mix #1 — test case #2 (failed).json. The name of the JSON file is the same as the name of the automatically generated screenshot on failure.

JUnit results with Cypress

In order to make Cypress output the test results into a JUnit XML file, the following steps have to be done. Add the following configuration into cypress.json. This configuration makes Cypress create a JUnit XML file. The important bit here is [hash] in the file name; otherwise, Cypress will overwrite the files.

{
    "reporter": "junit",
    "reporterOptions": {
        "mochaFile": "results/my-test-output-[hash].xml"
    }
}

If you use some CI tool then you can pass the XML results to it and it will visualize them.

Additionally, you can manipulate the XML results; you can merge them into just one XML file by installing junit-merge as a global NPM package and running junit-merge -d results -o results/merged.xml.

You can generate an HTML report from the XMLs with the xunit-viewer NPM package. In case you have merged the XMLs into one, the command is xunit-viewer --results=results/merged.xml --output=results/merged.html; in case you have not, the command is xunit-viewer --results=results --output=results/merged.html.

Custom JUnit results

Well, the out-of-the-box solution is good but not enough for me. It does not show the skipped tests and it adds one more test suite named Root Suite, which is empty; Jenkins, for example, ignores it, but if you want to visualize the results as HTML then it is a problem. What I have done is to generate the JUnit XML on my own. This happens automatically in the cypress_logging.js file. Files are put into the cypress/logs folder and have the name of the suite. Processing of the custom results is additionally done in the provided code, you can read more in Testing with Cypress – Code coverage with Istanbul post.

Compare of JUnit reports

In this section, I will make a comparison between the Cypress JUnit results and the ones I have created. See the images below for how the HTML reports look. The HTML files can be opened from Cypress-report.html and Custom-report.html. The XML results can be downloaded from xmls.zip.

Cypress standard HTML report

Cypress custom HTML report

I also made a quick Jenkins installation from its Docker container and uploaded the results for comparison. Below are the images of the comparison. Neither JUnit report is visualized very well. Mostly this is because JUnit is a format for Java tests, where we have packages, and Jenkins visualizes the results based on this assumption.

Cypress Jenkins standard

Cypress Jenkins custom

HTML Reports

The HTML report is generated with the xunit-viewer NPM package as described above. It is done by invoking the yarn cypress:report command. Above you can also see how the HTML report looks.

Semaphore file

Apart from the HTML report, there is one more file that is generated. It is named failed.txt. We are using AWS CodeBuild for CI/CD and we just need an indicator whether the build passed or not. If this file is present, then the build failed. The file content shows which suites failed. The whole set of artifacts is zipped and uploaded to an S3 bucket where it can be investigated later.

Conclusion

In the current post, I have described the custom functionality I have for improving the debugging of failed tests by logging more information. Also, I have made a custom JUnit reporting of the test results. An HTML report is generated for better visualization of the results.


Testing with Cypress – Basic API overview

Last Updated on by

Post summary: Basic overview of the Cypress API with code samples for some of the interesting features.

This post is part of a Cypress series, you can see all posts from the series in Testing with Cypress – lessons learned in a complete framework. Example code is located in the cypress-testing-framework GitHub repository.

Cypress API

Cypress is so different from Selenium that it takes some time to get used to the way elements are located and interacted with. I am not going into details about the API here, but will mention some basic things. Methods in the API are kind of self-explanatory; the mainly used ones are: get, find, click, type, first, last, prev, next, children, parent, etc.

Cypress uses jQuery selectors to locate elements, so you can have things like contains, nth-child, .class, #id, [name*=”value”] (and all variations). A very interesting and sometimes useful feature is that you can make Cypress click hidden elements with click({force: true}); Cypress gives you an error that the element is not clickable from a user’s point of view, and you can choose to find another element or just force the click. Also, you can click multiple elements with click({multiple: true}).
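
A few of those calls put together, just to illustrate the style of the API; the selectors below are hypothetical and not part of the sample application:

// click a button that is not clickable from the user's point of view
cy.get('button#save').click({ force: true });

// click every element matched by the selector
cy.get('input[type="checkbox"]').click({ multiple: true });

// locate by text and navigate the DOM
cy.contains('Persons').parent().find('li').first().click();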

Explore Cypress API

When a project is created for the first time, Cypress installs examples for all of its APIs. Those are very good and extensive. I have preserved the examples in the current project and they are available in the cypress/examples folder. You can run all the examples with the yarn cypress:examples:run command. You can explore them one by one in the Test Runner, which can be opened with the yarn cypress:examples:open command.

Page Object Model

As mentioned in the main topic, Cypress recommends using custom commands instead of Page Objects. I do not like this idea, so I use page objects, as I believe they make the code more focused. Here is an example of a page object I am comfortable with:

export default class AboutPage {
  constructor() {
    this.elements = {
      navigation: () => cy.getSilent('a[href$=about]'),
      paragraph: index => cy.getSilent('section.m-3 div p').eq(index),
    };
  }

  goTo() {
    cy.visit('/');
    this.elements.navigation().click();
  }

  /**
   * @param {string} version
   * @param {Date} datetime
   */
  verifyPage(version, datetime) {
    this.elements.paragraph(0)
      .should('text', 'Welcome to the about page.');
    this.elements.paragraph(1)
      .should('text', `Current API version is: ${version}`);
    this.elements.paragraph(2)
      .should('text', `Current time is: ${datetime.toISOString()}`);
  }
}

Clock

Cypress allows you to modify the clock in the browser. For example, the About page of the application under test shows the current time. It is much easier to validate the visualization if you control the current time. Otherwise, you have to parse the time and put some thresholds in the verifications.

Stub response

Another very handy feature is the ability to stub the response that the API is supposed to return. In this way, you can very easily test situations like a timeout, an incorrect response, an error in the response, etc.

Clock and Stub example

I have combined clock and stubbing into one example. The test suite file is cypress/tests/stub/response_and_clock_spec.js. The cy.clock(datetime.getTime()); sets the date to the one you need. The cy.route(‘GET’, ‘/api/version’, version); simulates that the API returns the version as a response. In the current case it is a plain string, but in the general case this is a JSON object.

response_and_clock_spec.js

import AboutPage from '../../pages/about_page';

describe('Check about page', () => {
  it('should show correct stubbed data and clock', () => {
    const aboutPage = new AboutPage();
    const version = '2.33';
    const datetime = new Date('2014-07-22T15:24:00');

    cy.server();
    cy.route('GET', '/api/version', version);
    cy.clock(datetime.getTime());

    aboutPage.goTo();

    aboutPage.verifyPage(version, datetime);
  });
});

about_page.js

  verifyPage(version, datetime) {
    this.elements.paragraph(0)
      .should('text', 'Welcome to the about page.');
    this.elements.paragraph(1)
      .should('text', `Current API version is: ${version}`);
    this.elements.paragraph(2)
      .should('text', `Current time is: ${datetime.toISOString()}`);
  }

Running custom Node.js code

Cypress runs in the browser, which is its biggest strength, as you have direct access to your application and the browser. It is its weakness as well, because the browser is much more restrictive in terms of running code. In order to run custom Node.js code, you have to wrap it as a task. A Cypress task accepts only one argument, so if you need to pass more, you have to wrap them in a JSON object. The task should also return a promise. Tasks are registered in the cypress/plugins/index.js file. See the examples below. The task copyFile is used in cypress_logging.js; the parameter that is passed to it is a JSON object with from and to keys. This task is registered with Cypress in index.js. The implementation is done in tasks.js, where actual Node.js code is used to manipulate the file system and a Promise is returned.

cypress_logging.js

cy.task('copyFile', {
  from: `cypress/screenshots/${screenshotFilename}`,
  to: getFilePath(screenshotFilename),
});

index.js

const tasks = require('./tasks');

module.exports = (on, config) => {
  // `on` is used to hook into various events Cypress emits
  on('task', {
    copyFile: tasks.copyFile,
  });

  // `config` is the resolved Cypress config
  const newConfig = config;
  newConfig.watchForFileChanges = false;

  return newConfig;
};

tasks.js

const fs = require('fs');

const copyFile = args =>
  new Promise(resolve => {
    if (fs.existsSync(args.from)) {
      fs.writeFileSync(args.to, fs.readFileSync(args.from));
      resolve(`File ${args.from} copied to ${args.to}`);
    }
    resolve(`File ${args.from} does not exist`);
  });

module.exports = { copyFile };

Working with promises

Cypress is based on promises. Each Cypress command returns an object which is similar to a promise but is actually different, read more in Commands Are Not Promises. If you want to access the value from the previous operation, you have to unwrap it with a then() method. If you have several dependencies, this nesting becomes bigger and bigger. This is why I have adopted some code from Nicholas Boll to avoid the nesting. The article above is about using async/await, but that is actually not going to work with my custom logging, as I will explain in the next section. Initially, I started using the plugin from Nicholas directly, but I observed strange bugs where a test fails but is not reported as such, so I modified it and it has proved stable now.

See the examples below. The standard way of doing it is by unwrapping the command with the then() method. This works, but can get really ugly if you have too many nested unwrappings. The alternative is to use promisify(), which wraps the Cypress command into a promise. The promise is then resolved only inside some other Cypress command, such as cy.log() or the custom command cy.apiGetPerson(). If you print it directly, the result in the console is a Promise.

with unwrap

it('should work with regular unwrap', () => {
  const person = new Person();
  // This is a command
  cy.apiSavePerson(person).then(personId => {
    // Value is unwrapped and printed properly
    cy.log(personId);
    console.log(personId);

    // Value is passed unwrapped
    cy.apiGetPerson(personId).then(res => cy.log(res));
  });
});

with promisify()

it('should work with promisify', () => {
  const person = new Person();
  // This is a promise
  const personId = cy.apiSavePerson(person).promisify();
  // Cypress internally resolves the promise
  cy.log(personId);
  // Prints a Promise
  console.log(personId);
  // Value is accessible after unwrap
  personId.then(pid => console.log(pid));

  // Cypress internally unwraps the value
  cy.apiGetPerson(personId).then(res => cy.log(res));
});

cypress_promisify.js

function promisify(chain) {
  return new Cypress.Promise((resolve, reject) => {
    chain.then(resolve);
  });
}

before(function() {
  cy.wrap('').__proto__.promisify = function() {
    return promisify(this);
  };
});

Working with async/await

The code above should work with async/await, which is an amazing JavaScript feature. It does work, but it messes up the custom logging I have described in the Testing with Cypress – Custom logging of errors and JUnit results post. The code below works, but the custom logging is not triggered. So I would say, do not use async/await if you need those customizations. The bigger issue is that async/await does not seem to work in Electron; cy.apiGetPerson() is actually not invoked if you run the code in the Electron browser.

it('should work with async/await', async () => {
  const person = new Person();
  // This is a resolved promise
  const personId = await cy.apiSavePerson(person).promisify();
  // Value is wrapped and printed properly
  cy.log(personId);
  console.log(personId);

  // Value is passed unwrapped
  cy.apiGetPerson(personId).then(res => cy.log(res));
});

Full API documentation

Cypress has very good and extensive documentation, you can read more at Cypress API article.

Conclusion

Cypress has a rich API which requires some time investment to get used to. You have good things like controlling the clock of the browser and controlling the API responses from the backend. It is also good that you have a way to run whatever code you want in your tests, but it has to be wrapped as a task; otherwise, you cannot just run arbitrary code in the browser.


Testing with Cypress – Build a React application with Node.js backend

Last Updated on by

Post summary: Short introduction to the application under test that is created for and used in all Cypress examples. It is a React frontend created with the Create React App package. The backend is a Node.js application running on Express.

This post is part of a Cypress series, you can see all posts from the series in Testing with Cypress – lessons learned in a complete framework. Example code is located in the cypress-testing-framework GitHub repository.

Backend

The backend is a simple Node.js application built with the Express web server. It supports several APIs that can save a person, get a person by id, get all persons, or delete the last person in the collection. You can read the full description in the Build a REST API with Express on Node.js and run it on Docker post.

Frontend

The current post is mainly devoted to the frontend. It describes how the React application is built. In order to make this part easy, Create React App is used. The best thing about it is that you do not need to handle lots of configuration and you just focus on your application. In order to create an application, Create React App has to be installed as a global NPM package with npm install -g create-react-app. The application itself is created with create-react-app my-application-name. Once this is done, you can start building your application. See more details on application creation in How to Create a React App with create-react-app. I have added Bootstrap for better styles and Toastr for nicer notifications. I also use Axios for API calls. I am not going into details about how to work with React, as this is a pretty huge topic and I am not really an expert at it. You can inspect the GitHub repository given above to see how the controllers are structured.

Instrumented for code coverage

After having the application ready, I wanted to add support for code coverage. The tool used to measure code coverage is Istanbul. Because of Create React App, adding the configuration is not straightforward, as practically there is no webpack.config.js file; it is hidden.

One option is to eject the application. Maybe for a big project where you need full control over the configurations, this is OK, but for this small application, I would not want to deal with it.

Another option is to use a package that builds on top of Create React App. One such plugin is react-app-rewired. It is installed along with istanbul-instrumenter-loader, the actual code coverage plugin. Once those two are installed, the actual configuration is pretty simple. A file named config-overrides.js is created with the following content:

const path = require('path');
const fs = require('fs');

module.exports = function override(config, env) {
  // do stuff with the webpack config...
  config.module.rules.push({
    test: /\.js$|\.jsx$/,
    enforce: 'post',
    use: {
      loader: 'istanbul-instrumenter-loader',
      options: {
        esModules: true
      }
    },
    include: path.resolve(fs.realpathSync(process.cwd()), 'src')
  });
  return config;
};

Also, package.json has to be changed. The default react-scripts start/build/test scripts are changed to react-app-rewired start/build/test. In order to verify that code coverage is enabled, go to DevTools (hit F12), then go to Console and search for the __coverage__ variable.
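
The scripts section of package.json then looks roughly like this (only the relevant keys are shown):

{
	"scripts": {
		"start": "react-app-rewired start",
		"build": "react-app-rewired build",
		"test": "react-app-rewired test"
	}
}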

Dockerization

In order to make it easy to run, a Dockerfile has been added. It installs Yarn as a package manager, then copies package.json. It is important to copy yarn.lock as well, since the actual resolved dependencies are in it. If it is not copied, every time an install is run the latest dependencies will be picked, which may lead to instability. Then the installation of dependencies is done with the yarn command, short for yarn install. Finally, all local files are copied. This is done at the end so installation is not triggered on every file change, but only on a package.json or yarn.lock change.

FROM node:8.16.0-alpine

ENV APP /app
WORKDIR $APP

RUN npm install yarn -g

COPY package.json $APP
COPY yarn.lock $APP
RUN yarn

COPY . .

The docker-compose.yml file is also very simple. It has two services. The first is the backend, which is exposed on port 9000 of the host. This is needed because the Cypress tests directly access the APIs. It uses the image uploaded to the Docker hub repository: image: llatinov/nodejs-rest-stub. The second service is the frontend. It uses the local Dockerfile: build: .. When the frontend container is started, the yarn start command is executed and the application is exposed on port 3030 of the host machine. One more thing that is added as configuration is the backend API URL, which can be controlled by setting the API_URL environment variable; its value is then set to REACT_APP_API_URL, used by the frontend. If no API_URL is provided, then the default of http://localhost:9000 is taken.

version: '3'

services:
  backend:
    image: llatinov/nodejs-rest-stub
    ports:
      - '9000:3000'
  frontend:
    build: .
    command: yarn start
    environment:
      - REACT_APP_API_URL=${API_URL:-http://localhost:9000}
    ports:
      - '3030:3000'

Run the application

There are several ways to run the application under test in order to try Cypress examples. One way is to download both repositories of the backend and the frontend and run them separately.

The second is to run the backend with the Docker command docker run -p 9000:3000 llatinov/nodejs-rest-stub. The command maps port 3000 of the container to port 9000 of the host; this is where the APIs are available. I have uploaded the backend image to the public Docker hub repository. After the backend is running, the frontend is run with the yarn start command. In this case, the frontend is running on port 3000, so you have to adjust the proper URL in the Cypress configurations.

The third option is to run with docker-compose using the docker-compose up command. This runs the backend on port 9000 and the frontend on port 3030.

Functionality

The application is very simple, it has a few pages where the user can add a person or see the already existing persons in the backend. On each successful action there is a notification; in case of a network error, a message is shown.

Persons list


Add person


Version page



Conclusion

In order to demonstrate the Cypress examples, a separate React application with a backend is created with Create React App package. It is also configured to support code coverage with Istanbul.


Testing with Cypress – lessons learned in a complete framework

Last Updated on by

Post summary: In the current post I will share some lessons I’ve learned using Cypress for quite a long time. Along this journey, I created a framework which solves some of the pain points that Cypress has.

Introduction

More than a year ago I made a bold presentation about Cypress. Back then I had been using Cypress on a small and very nice React application, and I was fascinated by the tool. You can read the presentation content in the Cypress vs. Selenium, is this the end of an era? post. Now, more than a year and 10K lines of test code later, I am still fascinated by Cypress, but I have also discovered several things that were causing me pain during my work. In the current post, I will try to write about some of them; some of them I had truly forgotten. In the course of using Cypress, I decided to change the things I do not like and make them the way I really enjoy. The result of this is a framework, although maybe that is an overstatement; it is more likely a set of helper files which you can pick and directly use in your project. The code is located in the cypress-testing-framework GitHub repository.

Post in the series

This is the first of a series of posts dedicated to testing with Cypress and making your tests easier to write. The other posts from the series are referenced in the sections below.

Application under test

In order to demonstrate some of the features, I have built a very simple React application. It has a backend that manipulates the data and the React application is consuming the backend APIs. More about the application itself can be found in Testing with Cypress – Build a React application with Node.js backend post.

Cypress API

Cypress has a rich API, offering lots of functionality. Many of us are very used to Selenium, so it is a little surprising when you first deal with Cypress. There is some ramp-up time needed. Once you get acquainted, things start to happen pretty fast and easily. Read more about the API, along with some examples, in the Testing with Cypress – Basic API overview post.

Page Object Model

Cypress does not recommend using POM but prefers using Cypress custom commands instead. See a very good and justified post on the topic, named Stop using Page Objects and Start using App Actions. Although the justification seems very logical, I do not agree with that approach. I still use custom commands, but not as a replacement for page objects; I am not giving up the Page Object Model. It gives me more focus, while with custom commands you can easily start duplicating functionality. Check an example of a Page Object Model I am comfortable with in the Testing with Cypress – Basic API overview post.

Test Runner going out of memory

This is my biggest pain. I have tried a lot to overcome this, but I could not find any solution. It happens in case of a long test suite with lots of actions in it. Cypress keeps a before/after version of the page on every action, so memory drains pretty fast and the browser crashes with an Aw, Snap! error.

The most recommended option is to use numTestsKeptInMemory to reduce the memory footprint, but you need it to be at least one, so you can debug and inspect data in the console.
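
The option goes into cypress.json; a value of 1 keeps only the last test in memory (the default is 50):

{
	"numTestsKeptInMemory": 1
}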

I also tried to pass --max-old-space-size to the Node process. If you pass it to Cypress directly, it crashes, so what I did was to rename the node executable to node_exec, and then create a new file named node in which I put node_exec --max-old-space-size $@ to forward all arguments to the node executable. This did not help either.

Finally, I settled on the option to have custom commands to locate elements, which suppress more of the logging with {log: false}. A before/after snapshot of locating an element is not needed; a snapshot is needed after a click or another significant action. Note that this log: false gave me a hard time when using cy.get, because it was resetting the default timeout, so I had to pass the timeout as an option as well.

Cypress.Commands.add('getSilent', locator =>
  cy.get(locator, {
    log: false,
    timeout: Cypress.config('defaultCommandTimeout'),
  }),
);

This workaround did not solve the out-of-memory issue either, it just allowed me to have a longer scenario before the Test Runner crashes.

On the other hand, this limitation is kind of a motivation for you to plan better and make shorter, more focused test suites.

Cypress error logging and JUnit results

Cypress does not provide very good logging; the stack trace is practically useless, as your code is wrapped into Cypress’ code. In order to work around this, I use some custom code which collects Cypress commands and then, when a test fails, dumps the commands to a custom log file along with a screenshot. The same code also creates custom JUnit test result files and inserts the errors collected. The custom files are saved into the cypress/logs folder of the project. You can read more about this custom logging in Testing with Cypress – Custom logging of errors and JUnit results post.

Rerun failed tests

Although Cypress is very stable, it still happens that some tests fail from time to time. I have added a task to rerun failed tests. This is done with yarn cypress:retry. This task iterates over all the custom-created JUnit XMLs described in the previous section and makes a list of all tests that have failed. This list is saved into a file named retry-output.txt in the cypress/logs folder. Those suites are run again. The internal command that is called by the retry code is yarn cypress:run --spec='cypress/tests/TestSuite.js'. You can use the same command manually to run a single test suite, or several by using an asterisk as a wildcard.

Code coverage

Code coverage is not mandatory; it is more of a nice-to-have metric that we try to monitor and improve on. Read more about code coverage in the What about code coverage post. For capturing code coverage, Istanbul is used. Code coverage is described in more detail in Testing with Cypress – Code coverage with Istanbul post.

Generate reports

An HTML report is generated in the end, it is invoked with yarn cypress:report command. This command relies on custom JUnit XMLs generated during the test run. You can read more details in Testing with Cypress – Custom logging of errors and JUnit results post.

Running tests in parallel

Cypress supports running tests in parallel. This is done with the --parallel option when you run your tests. In order to do so, you need a subscription to the Cypress Dashboard. There are various subscription plans, which are quite affordable. The idea is that Cypress records all your test runs and, based on the timing and the available machines, distributes the tests evenly across your machines. You can read more in the Cypress Parallelization article. I have not tried that, and I also do not know how it is going to work with the customizations described in the current post.

Another option is to do the parallelism on your own. For this purpose, xargs Linux command can be used. The command that you run under Linux is:

find ./cypress/tests -name "*_spec.js" | xargs -n1 -P4 bash -c 'yarn cypress:run --spec="$@"' --

Where -P4 is the number of threads you want to have. The command finds all files ending with _spec.js and for each of them invokes Cypress, with the given number of simultaneous threads. Note that this parallelization is not very stable in the case of a Docker container; there are random issues with the Xvfb frame buffer.

What worked for me is to have a docker-compose.yml file with several Cypress services. Each one of them runs a group of the tests, which I split manually. All services share the same volume, so the results are kept in one place. After those services finish, another service is run which retries the failed tests and aggregates the results and the code coverage. This service shares the same volume, so it has access to all the test results.

End to end process

To put the bits together, the process suggested in the current post consists of the following steps, also shown as a plain command sequence after the list:

  • yarn cypress:run – run the tests. During the run, JUnit XML files are generated. In order to speed things up, tests can be run in parallel as well. Set TEST_CODE_COVERAGE=true if code coverage is needed.
  • yarn cypress:retry – retry failed tests, based on the JUnit XMLs generated from the previous step. You can retry twice if you need to.
  • yarn cypress:report – generate the code coverage report, an HTML report with the results, and also a semaphore file that indicates whether the tests passed or not.
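
Put as plain commands, and assuming the yarn scripts from the repository, the flow is:

TEST_CODE_COVERAGE=true yarn cypress:run
yarn cypress:retry
yarn cypress:report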

Conclusion

Cypress is a great tool, I strongly recommend it. It is very stable and reliable. With the improvements you can find in this series of posts, you can make automation with Cypress even more effective, reliable, and enjoyable. A very good article with useful Cypress tips is Bahmutov’s Cypress tips and tricks; I suggest you read it as well.


Code coverage of Ruby on Rails application with simplecov

Last Updated on by

Post summary: How to measure code coverage of Ruby on Rails application with simplecov.

This is going to be a pretty short post. In general, it is very easy to measure code coverage on Ruby on Rails applications with a gem called simplecov.

Add coverage to Gemfile


gem 'simplecov', require: false

By using require: false, the gem gets installed but it is not included when bundler is called. In order to use the gem, you manually have to call require ‘simplecov’.

Start code coverage

Starting the code coverage should be the first thing the application does, so it can be invoked in the config/boot.rb file. Add the following lines:

if ENV['ENABLE_CODE_COVERAGE']
    require 'simplecov'
    SimpleCov.start 'rails'
end

Do the testing

This is a small part of the current post, but a huge part of the actual effort. Run all the tests that you have: manual, automated, etc.

Gather code coverage

There are basically two ways to get the code coverage: shut down the application or somehow call SimpleCov.result.format!. Not much to say about the first one. In many cases, it is more convenient to keep the application running and only produce the coverage report. For the latter, you may have some controller which is available only for non-production environments, which you can hit with an HTTP call so it executes the code. The code coverage report is located in a folder called coverage in the root application folder.
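
Such a controller could look roughly like the sketch below. The controller and action names are hypothetical, and it should be exposed only in non-production environments:

# app/controllers/coverage_controller.rb - hypothetical name, do not expose in production
class CoverageController < ApplicationController
  def generate
    SimpleCov.result.format!
    render plain: 'Coverage report generated'
  end
end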

Conclusion

It is pretty easy to measure code coverage of Ruby on Rails applications with the simplecov gem.


Performance testing in the browser

Last Updated on by

Post summary: Approaches for performance testing in the browser using Puppeteer, Lighthouse, and PerformanceTiming API.

In the current post, I will give some examples of how performance testing can be done in the browser using different metrics. Puppeteer is used as a tool for browser manipulation because it integrates easily with Lighthouse and DevTools Protocol. I have described all the tools before giving any examples. The code can be found in GitHub sample-performance-testing-in-browser repository.

Why?

Many things can be said about why we do performance testing, and why especially in the browser. In the How to do proper performance testing post I have outlined an idea of how to cover the backend. Assuming it is already optimized and the customer experience is still not sufficient, it is time to look at the frontend part. In general, performance testing is done to satisfy customers. It is up to the business to decide whether performance testing will have some ROI or not. In this article, I will give some ideas on how to do performance testing of the frontend part, hence in the browser.

Puppeteer

Puppeteer is a tool by Google which allows you to control Chrome or Chromium browsers. It works over the DevTools Protocol, which I will describe later. Puppeteer allows you to automate your functional tests. In this regard, it is very similar to Selenium, but it offers many more features in terms of control, debugging, and information within the browser. Over the DevTools Protocol, you have programmatic access to all features available in DevTools (the tool that is shown in Chrome when you hit F12). You can check the Puppeteer API documentation or check advanced Puppeteer examples such as JS and CSS code coverage, a site crawler, and a Google search features checker.

Lighthouse

Lighthouse is again a tool by Google, designed to analyze web apps and pages and produce a detailed report about performance, SEO, accessibility, and best practices. The tool can be used inside Chrome's DevTools, standalone from the CLI (command line interface), or programmatically from a Puppeteer project. Google has developed user-centric performance metrics which Lighthouse uses. Here is a Lighthouse report example run on my blog.

PerformanceTiming API

The W3C has a Navigation Timing recommendation which is supported by major browsers. The interesting part is the PerformanceTiming interface, where various timings are exposed.

DevTools Protocol

The DevTools Protocol comes from Google and is a way to communicate programmatically with DevTools within Chrome and Chromium; hence you can instrument, inspect, debug, and profile those browsers.

Examples

Now comes the fun part. I have prepared several examples. All the code is in GitHub sample-performance-testing-in-browser repository.

  • Puppeteer and Lighthouse – Puppeteer is used to log in and then Lighthouse checks pages for the logged-in user.
  • Puppeteer and PerformanceTiming API – Puppeteer navigates the site and gathers PerformanceTiming metrics from the browser.
  • Lighthouse and PerformanceTiming API – comparison between both metrics in Lighthouse and NavigationTiming.
  • Puppeteer and DevTools Protocol – simulate low bandwidth network conditions with DevTools Protocol.

Before proceeding with the examples, I will outline the helper functions used to gather metrics. In the examples, I use Node.js 8 which supports async/await functionality. With it, you can write asynchronous code in a synchronous manner.

Gather single PerformanceTiming metric

async function gatherPerformanceTimingMetric(page, metricName) {
  const metric = await page.evaluate(metric => 
     window.performance.timing[metric], metricName);
  return metric;
}

I will not go into details about the Puppeteer API, I will only describe the functions I have used. The page.evaluate() function executes JavaScript in the browser and can return a result if needed. window.performance.timing returns all the metrics from the browser, and only the one requested by metricName is returned by the current function.

Gather all PerformanceTiming metrics

async function gatherPerformanceTimingMetrics(page) {
  // The values returned from evaluate() function should be JSON serializable.
  const rawMetrics = await page.evaluate(() => 
    JSON.stringify(window.performance.timing));
  const metrics = JSON.parse(rawMetrics);
  return metrics;
}

This one is very similar to the previous one. Instead of just one metric, all of them are returned. The tricky part is the call to JSON.stringify(). The values returned from the page.evaluate() function should be JSON serializable. With JSON.parse() they are converted back to an object.

Extract data from PerformanceTiming metrics

async function processPerformanceTimingMetrics(metrics) {
  return {
    dnsLookup: metrics.domainLookupEnd - metrics.domainLookupStart,
    tcpConnect: metrics.connectEnd - metrics.connectStart,
    request: metrics.responseStart - metrics.requestStart,
    response: metrics.responseEnd - metrics.responseStart,
    domLoaded: metrics.domComplete - metrics.domLoading,
    domInteractive: metrics.domInteractive - metrics.navigationStart,
    pageLoad: metrics.loadEventEnd - metrics.loadEventStart,
    fullTime: metrics.loadEventEnd - metrics.navigationStart
  }
}

Timing data for certain events is compiled from the raw metrics. For example, if DNS lookup or TCP connection times are slow, this could be something network-specific that may not need to be acted upon. If the response time is very high, this is an indicator that the backend might not be performing well and needs to be performance tested further. See the How to do proper performance testing post for more details.

Gather Lighthouse metrics

const lighthouse = require('lighthouse');

async function gatherLighthouseMetrics(page, config) {
  // ws://127.0.0.1:52046/devtools/browser/675a2fad-4ccf-412b-81bb-170fdb2cc39c
  const port = await page.browser().wsEndpoint().split(':')[2].split('/')[0];
  return await lighthouse(page.url(), { port: port }, config).then(results => {
    delete results.artifacts;
    return results;
  });
}

The example above shows how to use Lighthouse programmatically. Lighthouse needs to connect to a browser on a specific port. This port is taken from page.browser().wsEndpoint(), which is in the format ws://127.0.0.1:52046/devtools/browser/{GUID}. It is good to delete results.artifacts because they might get very big in size and are not needed. The result is one huge object; I will talk about it in more detail. Before using Lighthouse, it should be installed in the Node.js project with npm install lighthouse --save-dev.

Puppeteer and Lighthouse

In this example, Puppeteer is used to navigate through the site and authenticate the user, so Lighthouse can be run for a page behind a login. Lighthouse can be run through the CLI as well, but in that case you just pass a URL and Lighthouse checks it as an anonymous visitor.

puppeteer-lighthouse.js

const puppeteer = require('puppeteer');
const perfConfig = require('./config.performance.js');
const fs = require('fs');
const resultsDir = 'results';
const { gatherLighthouseMetrics } = require('./helpers');

(async () => {
  const browser = await puppeteer.launch({
    headless: true,
    // slowMo: 250
  });
  const page = await browser.newPage();

  await page.goto('https://automationrhapsody.com/examples/sample-login/');
  await verify(page, 'page_home');

  await page.click('a');
  await page.waitForSelector('form');
  await page.type('input[name="username"]', 'admin');
  await page.type('input[name="password"]', 'admin');
  await page.click('input[type="submit"]');
  await page.waitForSelector('h2');
  await verify(page, 'page_loggedin');

  await browser.close();
})();

verify()

const perfConfig = require('./config.performance.js');
const fs = require('fs');
const resultsDir = 'results';
const { gatherLighthouseMetrics } = require('./helpers');

async function verify(page, pageName) {
  await createDir(resultsDir);
  await page.screenshot({
    path: `./${resultsDir}/${pageName}.png`,
    fullPage: true
  });
  const metrics = await gatherLighthouseMetrics(page, perfConfig);
  fs.writeFileSync(`./${resultsDir}/${pageName}.json`,
    JSON.stringify(metrics, null, 2));
  return metrics;
}

createDir()

const fs = require('fs');

async function createDir(dirName) {
  if (!fs.existsSync(dirName)) {
    fs.mkdirSync(dirName, '0766');
  }
}

A new browser is launched with puppeteer.launch(); the arguments { headless: true, // slowMo: 250 } are there for debugging purposes. If you want to see what is happening, set headless to false and slow down the actions with slowMo: 250, where the time is in milliseconds. A new page is started with browser.newPage() and navigated to some URL with page.goto('URL'). Then the verify() function is invoked. It is shown on the second tab and will be described in a while. The next piece of functionality is used to log the user in. With page.click('SELECTOR'), where a CSS selector is specified, you can click an element on the page. With page.waitForSelector('SELECTOR') Puppeteer waits for the element with the given CSS selector to be shown. With page.type('SELECTOR', 'TEXT') Puppeteer types the TEXT into the element located by the given CSS selector. Finally, browser.close() closes the browser.

So far only the Puppeteer navigation has been described. Lighthouse is invoked in the verify() function. The results directory is created initially with the createDir() function. Then a screenshot of the full page is taken with the page.screenshot() function. Lighthouse is called with gatherLighthouseMetrics(page, perfConfig). This function was described above. Basically, it gets the port on which the DevTools Protocol is currently running and passes it to the lighthouse() function. Another approach could be to start the browser with a hardcoded debug port of 9222 with puppeteer.launch({ args: ['--remote-debugging-port=9222'] }) and pass nothing to Lighthouse; it will try to connect to this port by default. The lighthouse() function also accepts an optional config parameter. If it is not specified, all Lighthouse checks are done. In the current example, only performance is important, thus a specific config file is created and used. This is the config.performance.js file.
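
The actual config.performance.js is available in the GitHub repository. As a rough, version-dependent illustration only, a performance-only configuration for a more recent Lighthouse release might look similar to the sketch below (older Lighthouse versions use a different format with passes and audits, so adjust it to the version you have installed):

// config.performance.js - a minimal sketch, assuming the Lighthouse 3+ config format.
// Start from the default config and keep only the performance category,
// so the run is faster and the report contains just what we assert on.
module.exports = {
  extends: 'lighthouse:default',
  settings: {
    onlyCategories: ['performance']
  }
};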

Puppeteer and PerformanceTiming API

In this example, Puppeteer is used to navigate the site and extract PerformanceTiming metrics from the browser.

const puppeteer = require('puppeteer');
const { gatherPerformanceTimingMetric,
  gatherPerformanceTimingMetrics,
  processPerformanceTimingMetrics } = require('./helpers');

(async () => {
  const browser = await puppeteer.launch({
    headless: true
  });
  const page = await browser.newPage();
  await page.goto('https://automationrhapsody.com/');

  const rawMetrics = await gatherPerformanceTimingMetrics(page);
  const metrics = await processPerformanceTimingMetrics(rawMetrics);
  console.log(`DNS: ${metrics.dnsLookup}`);
  console.log(`TCP: ${metrics.tcpConnect}`);
  console.log(`Req: ${metrics.request}`);
  console.log(`Res: ${metrics.response}`);
  console.log(`DOM load: ${metrics.domLoaded}`);
  console.log(`DOM interactive: ${metrics.domInteractive}`);
  console.log(`Document load: ${metrics.pageLoad}`);
  console.log(`Full load time: ${metrics.fullTime}`);

  const loadEventEnd = await gatherPerformanceTimingMetric(page, 'loadEventEnd');
  const date = new Date(loadEventEnd);
  console.log(`Page load ended on: ${date}`);

  await browser.close();
})();

Metrics are extracted with the gatherPerformanceTimingMetrics() function described above, and then data is compiled from the metrics with processPerformanceTimingMetrics(). In the end, there is an example of how to extract one metric, such as loadEventEnd, and display it as a date object.

Lighthouse and PerformanceTiming API

const puppeteer = require('puppeteer');
const perfConfig = require('./config.performance.js');
const { gatherPerformanceTimingMetrics,
  gatherLighthouseMetrics } = require('./helpers');

(async () => {
  const browser = await puppeteer.launch({
    headless: true
  });
  const page = await browser.newPage();
  const urls = ['https://automationrhapsody.com/',
    'https://automationrhapsody.com/examples/sample-login/'];

  for (const url of urls) {
    await page.goto(url);

    const lighthouseMetrics = await gatherLighthouseMetrics(page, perfConfig);
    const firstPaint = parseInt(lighthouseMetrics.audits['first-meaningful-paint']['rawValue'], 10);
    const firstInteractive = parseInt(lighthouseMetrics.audits['first-interactive']['rawValue'], 10);
    const navigationMetrics = await gatherPerformanceTimingMetrics(page);
    const domInteractive = navigationMetrics.domInteractive - navigationMetrics.navigationStart;
    const fullLoad = navigationMetrics.loadEventEnd - navigationMetrics.navigationStart;
    console.log(`FirstPaint: ${firstPaint}, FirstInteractive: ${firstInteractive}, 
      DOMInteractive: ${domInteractive}, FullLoad: ${fullLoad}`);
  }

  await browser.close();
})();

This example shows a comparison between Lighthouse metrics and PerformanceTiming API metrics. If you run the example and compare all the timings, you will notice how much slower the site looks according to Lighthouse. This is because it uses 3G (1.6Mbit/s download speed) network settings by default.
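
If you want the Lighthouse numbers to be closer to the PerformanceTiming API ones, newer Lighthouse versions allow tuning or disabling the simulated throttling through the settings object. The snippet below is only a sketch under that assumption; option names differ between Lighthouse versions:

// Assumption: Lighthouse 3+ settings format. 'provided' tells Lighthouse to use
// the connection as it is, instead of simulating a slow 3G network.
const noThrottlingConfig = {
  extends: 'lighthouse:default',
  settings: {
    onlyCategories: ['performance'],
    throttlingMethod: 'provided'
  }
};

// Used the same way as config.performance.js:
// const metrics = await gatherLighthouseMetrics(page, noThrottlingConfig);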

Puppeteer and DevTools Protocol

const puppeteer = require('puppeteer');
const throughputKBs = process.env.throughput || 200;

(async () => {
  const browser = await puppeteer.launch({
    executablePath: 
      'C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe',
    headless: false
  });
  const page = await browser.newPage();
  const client = await page.target().createCDPSession();

  await client.send('Network.emulateNetworkConditions', {
    offline: false,
    latency: 200,
    downloadThroughput: throughputKBs * 1024,
    uploadThroughput: throughputKBs * 1024
  });

  const start = (new Date()).getTime();
  await client.send('Page.navigate', {
    'url': 'https://automationrhapsody.com'
  });
  await page.waitForNavigation({
    timeout: 240000,
    waitUntil: 'load'
  });
  const end = (new Date()).getTime();
  const totalTimeSeconds = (end - start) / 1000;

  console.log(`Page loaded for ${totalTimeSeconds} seconds 
    when connection is ${throughputKBs} KB/s`);

  await browser.close();
})();

In the current example, network conditions with restricted bandwidth are emulated in order to test page load time and perception. With executablePath, Puppeteer launches an instance of the Chrome browser. The path given in the example is for a Windows machine. Then a client for the DevTools Protocol is created with page.target().createCDPSession(). Configurations are sent to the browser with client.send('Network.emulateNetworkConditions', { }). Then the URL is opened in the page with client.send('Page.navigate', { url }). The script can be run with different values for the throughput, passed as an environment variable. The example waits up to 240 seconds for the page to fully load with page.waitForNavigation().

Conclusion

In the current post, I have described several ways to measure the performance of your web application. The main tool used to control the browser is Puppeteer because it integrates very easily with Lighthouse and the DevTools Protocol. All examples can be executed through the CLI, so they can easily be plugged into a CI/CD process. From the various approaches, you can compile your preferred scenario, which can be run on every commit to measure whether the performance of your application has been affected by certain code changes.


What is The Test Mushroom and how to improve your testing


Post summary: In contrast to the Test Pyramid, the Test Mushroom shows a test portfolio which is restricted to costly and slow UI tests only. In the current post, I will describe approaches to act on your Test Mushroom and hence improve your testing.

Test Pyramid

The test pyramid illustrates what a good test portfolio should look like. The important thing about the test pyramid is that the higher you go in it, the more brittle and expensive to maintain the tests are. At the bottom of the pyramid are the unit tests. They are fast and test most of the functionality in your code. By integration tests I mean the following. The de facto standard now for web applications is to use some JavaScript UI framework that manipulates data from different APIs. With integration testing, we want to be in control of this data. We want, for example, to test how the web application behaves when an API returns an error. By stubbing the data the web application works with, it is possible to test it fully. UI/E2E tests are the ones executed against a deployed, configured, integrated, and working web application. They are slow and flaky, thus they should be limited in number. In other versions of the test pyramid, there is a layer called service tests, which is below UI and above integration. Those are API tests performed against a deployed and working backend. I will not go into details about them, because API tests are by definition very stable.

Test Mushroom

I came up with this term when giving a presentation about Cypress, you can see more in the Cypress vs. Selenium, is this the end of an era? post. The term was meant as a funny and ironic description of a test portfolio that makes testing a big pain because the quality of releases totally depends on UI tests. The mushroom leg represents unit and integration tests. It is shown with a dotted line as such tests are totally missing. The UI tests are slow and flaky, and every release sign-off takes a lot of time for debugging failed tests. Every release carries a high risk of failure. This is very similar to the Test Ice-Cream Cone, with the difference that in the case of the Test Mushroom, integration and unit tests are missing.

Need for an action

Whether it is a test mushroom or a test ice-cream cone is not important. What is important is that both represent a situation where product quality depends on brittle and flaky UI tests. This should be acted upon in order to reduce the release-related risk. In the current post, I will suggest some approaches for how to act in this situation and put yourself in a better position. Below is a high-level list of what you can do. Each item from the list will be described in greater detail later in the current post.

  • Refactor and optimize UI tests
  • Provision dedicated test environment
  • Increase integration testing
  • Unit testing and shift left

Refactor and optimize UI tests

The first thing you can do is act on the UI tests, because you have full control over them. Go over the current tests and do a full review of them. In most cases, there are duplicated tests. As time passes, people add new tests and they keep piling up. This happens because it is easier to add a new test than to inspect the already existing ones and fit your test scenario inside. You need to optimize your current tests. It might not happen immediately, it can be done a single step at a time, but you should do it. If several test scenarios can fit into one automated test, then definitely do it. There are theories that say one test should test one thing only. I totally agree with this statement, but it is only relevant for unit tests. For UI/E2E/functional tests this statement is more likely a good wish. UI tests are expensive, so we should optimize them as much as possible.

Classification tree test method

You should look into the classification tree method for more details. I will give a short example. Imagine you have an e-commerce website. On this site, there are 3 main types of products: a single product, a product with variations, e.g. several colors, and a product set. The site offers 3 different deliveries, one international and two domestic shipping methods. Users can pay with PayPal, Visa, MasterCard, and Amex. With the classification tree method you will have 3 different classifications: product type, shipping method, and payment method. The full set of test cases is the Cartesian product of the values of all classifications, in the current example 3 (product types) x 3 (shipping methods) x 4 (payment methods) = 36 combinations. The minimum number of test cases, though, is 4, the size of the classification with the most values. The 4 test cases we definitely must do are listed below; a small sketch after the list shows one way to generate such a minimal set:

  • Single product with international delivery and PayPal
  • Blue variation product with domestic delivery #1 and Visa
  • Red variation product with international delivery and MasterCard
  • A product set with domestic delivery #2 and Amex
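
One simple way to generate such a minimal covering set programmatically is to take as many test cases as the largest classification and cycle through the values of the smaller ones. The sketch below is only an illustration of the idea, not a full classification tree tool:

// Build a minimal set of test cases in which every value of every
// classification appears at least once. The number of cases equals
// the size of the largest classification (4 in the example above).
function minimalTestCases(classifications) {
  const count = Math.max(...classifications.map(values => values.length));
  const cases = [];
  for (let i = 0; i < count; i++) {
    cases.push(classifications.map(values => values[i % values.length]));
  }
  return cases;
}

const cases = minimalTestCases([
  ['single product', 'product with variations', 'product set'],
  ['international', 'domestic #1', 'domestic #2'],
  ['PayPal', 'Visa', 'MasterCard', 'Amex']
]);
console.log(cases); // 4 combinations covering every value at least once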

Soft assertions

Once you have optimized the number of tests and their workflow, you would like to make as many assertions as possible while you are on each page. In this regard, so-called soft assertions can be used. As opposed to traditional unit testing asserts, where the test fails immediately when an error is found, a soft assertion is one that does not fail in case of a non-critical problem. This provides the ability to execute all the steps in the test and then investigate the issues. The Soft assertions that do not fail JUnit test and Soft assertions for C# unit testing frameworks (MSTest, NUnit, xUnit.net) posts can give you more details on how to do it in Java and C#.
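
The concept itself is small enough to sketch in JavaScript as well; the snippet below only illustrates the idea and is not tied to any particular test framework:

// Collects failures instead of throwing on the first one; the test fails
// at the very end with all collected problems, so a single run reveals
// every broken check on the page.
class SoftAssert {
  constructor() {
    this.errors = [];
  }

  check(condition, message) {
    if (!condition) {
      this.errors.push(message);
    }
  }

  assertAll() {
    if (this.errors.length > 0) {
      throw new Error(`Soft assertion failures:\n- ${this.errors.join('\n- ')}`);
    }
  }
}

// Example usage with made-up values:
const pageTitle = 'Login';
const cartItems = 0;
const soft = new SoftAssert();
soft.check(pageTitle === 'Home', `Unexpected title: ${pageTitle}`);
soft.check(cartItems > 0, 'Cart is empty');
soft.assertAll(); // throws once, listing both failed checks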

Rename test methods

It is good to rename your test methods to be as descriptive as possible. It does not matter that the method name will contradict best practices for method naming, because those are test methods. If your tests fail, you can identify them very easily just by the name of the failed method.

Use smarter waits

I get really upset when I see in a test something like Thread.Sleep(5000). You should never ever use such waits. Not only do they slow your tests down, but they will also make the test fail if for some reason the website takes 6 seconds to render. Selenium offers explicit and implicit waits, and you should be very familiar with them. Another approach is to use even smarter mechanisms, like checking for open jQuery connections or for the existence of some kind of loader in your web application. See the Efficient waiting for Ajax call data loading with Selenium WebDriver post for more details. Cypress, on the other hand, eliminates waiting altogether, as it knows what happens in the browser and gives you the element once it is shown.
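
As an illustration with Puppeteer, which is used earlier in this document, the same idea can be expressed with page.waitForFunction(): poll the browser until pending jQuery requests finish and the loader disappears, instead of sleeping a fixed amount of time. The selector and timeout below are assumptions for the example:

// Wait until there are no open jQuery Ajax calls and no visible loader,
// instead of a fixed sleep. Fails fast with a clear timeout error
// if the page never settles.
async function waitForPageToSettle(page, timeoutMs = 10000) {
  await page.waitForFunction(
    () => {
      const jqueryIdle = !window.jQuery || window.jQuery.active === 0;
      const noLoader = !document.querySelector('.loading-spinner'); // assumed selector
      return jqueryIdle && noLoader;
    },
    { timeout: timeoutMs }
  );
}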

Retry failed tests

If you do not have one already, you definitely need a retry mechanism for your tests. In the Retry JUnit failed tests immediately post, I have described how this can be done for JUnit. In the Testing with Cypress – lessons learned in a complete framework post, I have described a way to retry failed tests.

Screenshot failed tests

In order to ease debugging, a screenshot is a must. Along with the screenshot, it is good to have the page source and the URL at which the screenshot was captured.
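
With Puppeteer, for example, such a helper is only a few lines and can be called from a catch block or a test hook; the folder and file naming below are just an example:

const fs = require('fs');

// Saves a full-page screenshot, the page source, and the current URL so a
// failed test can be investigated without re-running it.
async function captureFailureArtifacts(page, testName, dir = 'failures') {
  if (!fs.existsSync(dir)) {
    fs.mkdirSync(dir);
  }
  await page.screenshot({ path: `${dir}/${testName}.png`, fullPage: true });
  fs.writeFileSync(`${dir}/${testName}.html`, await page.content());
  fs.writeFileSync(`${dir}/${testName}.url.txt`, page.url());
}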

Provision dedicated test environment

In the case of a shared environment, it is always possible that someone is doing something while the tests are running. It is very good if you can provision a dedicated test environment. You should at any point know which version of the software under test is deployed on it, and no one should mess with the test environment. If you have an application that consists of a database and an API that is consumed by a UI, then you can relatively easily use Docker to get a running test environment. If you are testing an application which is part of a big microservice ecosystem, then it might not be that easy, because you have to have a dedicated environment for each dependent microservice, and there can be a large number of them.

Control test data in the database

Ideally, you want to have full control over the data in the database. This way you can very easily assert and check for data you know is there. One option, in the case of an application with its own database, is to have a Docker image with test data already preloaded in the database and use it. If you are not using preloaded data, you can still seed the data with API calls prior to the tests.
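
Seeding through the API can be as simple as a few HTTP calls in a global setup step. The endpoint and payload below are purely hypothetical, so replace them with your own application's API; the example also assumes Node.js 18+ where fetch is available globally, otherwise use your preferred HTTP client:

// Hypothetical seeding step executed once before the test run.
async function seedTestData(baseUrl) {
  const products = [
    { name: 'Single product', price: 10 },
    { name: 'Product set', price: 25 }
  ];
  for (const product of products) {
    const response = await fetch(`${baseUrl}/api/test-data/products`, { // hypothetical endpoint
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(product)
    });
    if (!response.ok) {
      throw new Error(`Seeding failed with status ${response.status}`);
    }
  }
}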

A dedicated test environment not only can make your tests more stable but can also make them faster. Check Emanuil Slavov's Need for Speed presentation; this talk is also available in a GTAC 2016 video.

Increase integration testing

So far you have optimized your UI tests. If you are satisfied with the results, then maybe no further steps are needed. Remember, we do certain things not because everybody is doing them but because we need them. If you need more improvements, then you can look into integration testing. By the term integration testing I mean testing of your application, or parts of it, by stubbing or mocking external dependencies. Below are several suggestions on how you can do this.

JavaScript rich web application

In the case of a web application built with some JavaScript framework, which consumes data from external APIs and renders the UI based on that data, there are two approaches to do integration testing. One is to use Cypress, which has a very good feature set for decent integration testing in the browser. See the Cypress vs. Selenium, is this the end of an era? post for more details. The other approach is to use external stubs and have your application under test configured to work with the stubs. You can even make your testing framework start and manage the stubs. See the WireMock and Own stubbing sections below.
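
As a small illustration of the first approach, Cypress 6+ can stub the API directly in the browser with cy.intercept() (older Cypress versions used cy.server() and cy.route()). The route, page, and expected error text below are assumptions for the example:

// Hypothetical API route; the UI under test is forced to render its error state.
describe('product list', () => {
  it('shows an error message when the API fails', () => {
    cy.intercept('GET', '/api/products*', {
      statusCode: 500,
      body: { error: 'Internal Server Error' }
    }).as('getProducts');

    cy.visit('/products');
    cy.wait('@getProducts');
    cy.contains('Something went wrong').should('be.visible'); // assumed error text
  });
});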

API backend application

In the case of a backend application that exposes different APIs for external consumption, the approach for integration testing is to stub or mock its dependencies. A dependency can be a database or an external API that is being called. Database stubbing depends on the type of application and the database used. For a .NET application using Entity Framework, it is possible to mock the framework itself. The good thing about .NET is that it provides the so-called TestHost, which can run your application in memory, and you can also mock some of your dependencies if you have built your application properly to use an inversion of control container. See more in the .NET Core integration testing and mock dependencies post. When speaking with colleagues, they say the Spring framework for Java provides similar functionality, but I do not have experience with it. In terms of the database, it depends which database has been used. If it is MS SQL Server, then one option, besides totally mocking the DB calls, is to use SQL Express (localdb). It runs on a Windows machine and is extremely fast. It is very easy to create a new database and then run your application against it. For MySQL I have seen in presentations that it is possible to run it in memory, but I have not tried this. Mocking dependencies on external applications again can be done either with WireMock or with your own stubbing. You can have an instance of your application installed in a separate integration environment and configured to use stubs instead of the real dependency APIs.

Server-side HTML rendering

Integration testing of a web application whose HTML is rendered on the server and just given to the browser is very similar to the previous section, API backend application.

WireMock

WireMock is a simulator for HTTP-based APIs. Some might consider it a service virtualization tool or a mock server. It enables you to stay productive when an API you depend on doesn't exist or isn't complete. It supports testing of edge cases and failure modes that the real API won't reliably produce. And because it's fast, it can reduce your build time from hours down to minutes. I have shown how WireMock can be used in unit tests in the Mock/Stub REST API with WireMock for better unit testing post. It can be run as a standalone Java application with different endpoints and responses configured, so you can make WireMock reply differently based on the request it receives. This can be synchronized with your tests and you can automate whatever scenarios you need. The challenge with this approach is to keep both the tests and the mocked data in sync.
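
When WireMock runs as a standalone server, stubs can also be registered from the test code over its admin REST API, which helps keep tests and mocked data in one place. A hedged sketch, assuming WireMock standalone on port 8080 and Node.js 18+ for the global fetch:

// Registers a stub so that GET /api/users/1 returns a 500 error,
// letting the application under test exercise its failure handling.
async function stubUserApiFailure() {
  const response = await fetch('http://localhost:8080/__admin/mappings', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      request: { method: 'GET', url: '/api/users/1' },
      response: {
        status: 500,
        jsonBody: { error: 'simulated failure' },
        headers: { 'Content-Type': 'application/json' }
      }
    })
  });
  if (!response.ok) {
    throw new Error(`WireMock stubbing failed: ${response.status}`);
  }
}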

Own stubbing

It is possible to build your own stub and configure it with whatever scenarios you want. I have described how you can do this in Java, .NET, and Node.js in the following posts: Build a RESTful stub server with Dropwizard, Build a REST API with .NET Core 2 and run it on Docker Linux container, and Build a REST API with Express on Node.js and run it on Docker.
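
The linked posts contain complete examples; as a taste of how small such a stub can be, here is a minimal Express sketch in Node.js (the routes, port, and responses are assumptions for the illustration):

const express = require('express');

// A tiny stub that always returns predictable data, so UI tests can rely on it.
const app = express();

app.get('/api/products', (req, res) => {
  res.json([{ id: 1, name: 'Single product', price: 10 }]);
});

app.get('/api/products/:id/fail', (req, res) => {
  res.status(500).json({ error: 'simulated failure' }); // forces the error path in the UI
});

app.listen(3001, () => console.log('Stub server listening on port 3001'));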

Unit testing and shift left

The base of the test pyramid is the unit tests. I have many posts on my blog regarding unit testing, so in the current post I will not go into further details about them, because this is more in the development expertise. From my experience, if you have good UI and integration tests, you can reach a very good level of quality without unit testing. I will refer to Emanuil Slavov's Integration Tests Are Awesome post. He had spent a significant amount of time investigating bugs in their bug tracking system and linking them to a layer of testing. He discovered that only 13% of the bugs could have been caught with unit testing. Another 57% of the bugs would have been caught with API and UI testing. The remaining 30% they discovered could not have been caught with any kind of testing. I guess integration testing can cover some of those 30% uncatchable bugs, because you can stub and mock the dependencies, and this gives you better flexibility. So this is good proof that you can go without unit testing. The real benefit of unit testing, though, is the Shift Left paradigm. It involves developers in the process of building quality into the application. If developers have to write unit tests for their code, they catch and fix bugs almost immediately. Over time, developers learn to build quality into their code. Your UI and integration tests will also catch most of the bugs. More importantly, the process of reporting and fixing bugs caught during UI and integration testing takes more time and effort. This is why writing unit tests is mandatory for any organization that wants to deliver quality products.

Conclusion

This post started with the funny term of the test mushroom. It continued with important guidelines on how you can improve your testing, first by optimizing your UI tests, then by developing integration tests, and finally by describing why unit testing is important to an organization because of the Shift Left paradigm, which involves developers in building quality into the application.
