Best practices in software delivery process


Post summary: A short overview of a software delivery process that I consider very good and worthy of the “best practice” label, as practiced in a very successful software company.

Recently I finished an assignment in a company which I rate as the best I’ve worked for so far in terms of software delivery process, individual professionalism, and company culture. Most of the things I’ve blogged about over the last 2.5 years I heard, saw, learned, and mastered while working for that company. I decided to describe the process because, for me, it is a very successful practice.

Background

The company provides B2B services by exposing a lot of APIs to its clients, which then compose different functionality for their end customers. Business functionality is broken down into a large number of micro-services. Every micro-service is a separate project and is deployed on a separate machine. Those micro-services interconnect with and depend on each other. Micro-services are discovered through Netflix’s Eureka; no endpoint is ever hard-coded, except Eureka’s.
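
To illustrate what “no endpoint is ever hard-coded” means in practice, below is a minimal sketch of how one service could look up another through Eureka using Spring Cloud’s DiscoveryClient. The service name and endpoint path are hypothetical – the post does not show the company’s actual clients.

import java.util.List;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.cloud.client.ServiceInstance;
import org.springframework.cloud.client.discovery.DiscoveryClient;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestTemplate;

@Service
public class AccountsClient {

    // DiscoveryClient is backed by Eureka; only Eureka's own URL is configured,
    // every other endpoint is resolved at runtime by service name.
    @Autowired
    private DiscoveryClient discoveryClient;

    private final RestTemplate restTemplate = new RestTemplate();

    public String fetchAccount(String accountId) {
        // "accounts-service" is a hypothetical service name registered in Eureka
        List<ServiceInstance> instances = discoveryClient.getInstances("accounts-service");
        if (instances.isEmpty()) {
            throw new IllegalStateException("No instance of accounts-service registered in Eureka");
        }
        ServiceInstance instance = instances.get(0);
        // Build the URL from the discovered host/port instead of hard-coding it
        String url = instance.getUri() + "/accounts/" + accountId;
        return restTemplate.getForObject(url, String.class);
    }
}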

Technologies

There are different tools and frameworks used in order to deliver quality software on time. The list of tools consists of the following: Jira for project and issue tracking, Confluence for document collaboration, Bamboo for continuous integration and deployment, Bitbucket (formerly Stash) for code reviews, HipChat or Slack for communication, SonarQube for static code analysis, and Fortify for security static code analysis. The software code is stored in Git, written in Java, built with Gradle, and deployed on Linux servers with Chef or Ansible.

Planning

In order to plan the work, Agile methodologies are followed – Scrum or Kanban. There is an external team of Scrum masters who facilitate the Scrum ceremonies, and Scrum is followed very dogmatically.

Development

Every story from the Jira board is developed in a separate branch. On every commit, there is a Bamboo plan that builds the branch, runs the unit tests, and runs the SonarQube static code analysis. In order to pass the build, different code style rules have to be met, and it is mandatory to have 80% unit test code coverage. On each commit, the Jira issue number is put into the commit message. This provides traceability between Jira and tools like Bamboo and Bitbucket. Built artifacts are uploaded into an Amazon S3 bucket, where they are later used by the Chef deployment. Each branch build can be deployed to a Dev test node and tested by a developer in a real environment. A branch can be merged to master only after two code reviews by other team members. Code reviews are done with Bitbucket.
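
Bamboo takes care of publishing the artifacts itself, but to make that step concrete, here is a minimal sketch, using the AWS SDK for Java, of what uploading a built artifact to an S3 bucket amounts to. The bucket name, key layout, and file path are hypothetical and only for illustration.

import java.io.File;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

public class ArtifactUploader {

    public static void main(String[] args) {
        // Credentials and region are resolved from the default provider chain / environment
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();

        // Hypothetical bucket and key layout: <service>/<branch>/<build-number>/artifact.jar
        String bucket = "build-artifacts";
        String key = "accounts-service/feature-ABC-123/42/accounts-service.jar";

        // Upload the artifact built by Gradle so that the Chef deployment can fetch it later
        s3.putObject(bucket, key, new File("build/libs/accounts-service.jar"));
    }
}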

Testing

The main pillar of quality is unit testing. Although JUnit is the main framework, some teams are using Spock and are very successful with it. The code coverage threshold is 80%: between 75% and 80% SonarQube reports a warning, and below 75% the build fails and you cannot release. Some teams practice mutation testing with PITest to improve their unit testing. This definitely eliminates a lot of bugs, but unit testing alone is not enough. We have reached up to 97% code coverage (JUnit, Spock, and PITest) with unit testing and still seen small bugs in production. Although there are no strict rules about it, every team is required to have automated functional testing. It can be very basic or very advanced, but in order to release, the functional tests have to be green.
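
As a trivial illustration of the kind of unit test that counts toward the coverage gate (and that PITest would try to mutate), here is a minimal JUnit sketch; the class under test is hypothetical.

import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class PriceCalculatorTest {

    // Hypothetical class under test: applies a percentage discount to a price
    static class PriceCalculator {
        double applyDiscount(double price, double discountPercent) {
            return price - price * discountPercent / 100;
        }
    }

    @Test
    public void appliesDiscountToPrice() {
        PriceCalculator calculator = new PriceCalculator();
        // Boundary values matter for mutation testing: a mutant that flips
        // '-' to '+' or changes the constant should be killed by these asserts
        assertEquals(90.0, calculator.applyDiscount(100.0, 10.0), 0.001);
        assertEquals(100.0, calculator.applyDiscount(100.0, 0.0), 0.001);
    }
}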

Deployment

Deployment is fully automatic using Chef. It is the development team’s responsibility to prepare the cookbooks and provision the test environments. Deployment is triggered by a Bamboo deployment plan which calls Chef on the specified node. This provides traceability between which Jira issue is being implemented, when it was built, when it was deployed, to which environment, and in which build number.

Test environments

Apart from production, there are three other test environments: Dev, QA, and Staging. Each test environment can have one or more nodes. Each micro-service provides at least one node in order to make a complete and working B2B solution. Test nodes are in the cloud, and their management is done with Scalr as well as a custom framework that uses the Amazon EC2 API to spin up nodes (a sketch of such a call is shown after the list below). Spinning up a new node is as simple as a button click. Before spinning up a node, the test environment has to be properly configured; this includes the network, the Chef cookbook, hardware capabilities, software setup – every detail needed to have a ready-to-test environment. Each test environment has a different purpose:

  • Dev – used by developers; the main idea is to have some code committed into a branch, build it, and deploy that branch to the Dev environment in order to test a given feature with real dependencies. Most micro-services have their test nodes working. Since there is a lot of ongoing development, it sometimes happens that a micro-service is on the wrong version or is down.
  • QA – used mainly by QAs to verify a build that is a candidate for release. This environment is stable. All micro-services have test nodes, and downtime is exceptional. Data in this environment is dummy and incomplete.
  • Staging – this is the pre-production environment. It is mandatory for each micro-service to have a working node there. Data is in a very mature state and more reliable than in the other environments.
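
The internals of the custom framework are not described in the post, but the core of “spinning up a node” boils down to a call to the EC2 API roughly like the following sketch using the AWS SDK for Java. The AMI ID, instance type, and key name are hypothetical.

import com.amazonaws.services.ec2.AmazonEC2;
import com.amazonaws.services.ec2.AmazonEC2ClientBuilder;
import com.amazonaws.services.ec2.model.InstanceType;
import com.amazonaws.services.ec2.model.RunInstancesRequest;
import com.amazonaws.services.ec2.model.RunInstancesResult;

public class TestNodeLauncher {

    public static void main(String[] args) {
        AmazonEC2 ec2 = AmazonEC2ClientBuilder.defaultClient();

        // Hypothetical AMI with the base OS; Chef provisions the rest of the node
        RunInstancesRequest request = new RunInstancesRequest()
                .withImageId("ami-12345678")
                .withInstanceType(InstanceType.T2Medium)
                .withMinCount(1)
                .withMaxCount(1)
                .withKeyName("qa-environment-key");

        RunInstancesResult result = ec2.runInstances(request);
        String instanceId = result.getReservation().getInstances().get(0).getInstanceId();
        System.out.println("Spun up test node: " + instanceId);
    }
}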

Release process

Once a feature is implemented, code reviewed, and tested, its branch can be merged to master. Once merged, the team can decide to release it to production right away or wait for more features to pile up and then release. In order to release, there is a separate Bamboo build plan that is run manually. It builds the master branch, runs the SonarQube analysis, runs the Fortify security scan, deploys to the QA test environment, and runs the functional tests. Then the build is deployed to Staging and the functional tests are run again. If everything is green at this point, there is a stable release candidate.

In order to release to production, there is a manual step that has to be done. A release slot is negotiated with a DevOps engineer; for every production deployment, a DevOps engineer should be on standby in case something goes wrong. Once DevOps time is provisioned, a release request with the proposed release time is made, stating which Bamboo build plan is being released. This request is managed by a separate team. They check which Jira issues are being implemented, whether all builds are green, and whether the Staging deployment is green. If everything is green, the release is approved.

In the release window, the deployment to production is made by a team member with a single button click in Bamboo. In most cases everything goes well, but in case of issues the DevOps engineer has access to the production nodes and can fix any problem. An important detail is that deployment is done on one node first, and then this node is verified. If there is an issue with the new code, the previous version can be restored on that node and the release is aborted. If the new code is OK, deployment continues on the other nodes at a rate of 2-3-4 nodes at a time. The idea is not to have too many nodes down at any one time.
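
The post does not describe how exactly the first node is verified; purely as an assumption, a simple health check like the sketch below (a plain JDK HTTP call against a hypothetical /health endpoint) would be one way to decide whether to continue the rollout or abort it.

import java.net.HttpURLConnection;
import java.net.URL;

public class NodeSmokeCheck {

    // Returns true if the node answers its health endpoint with HTTP 200
    static boolean isHealthy(String nodeBaseUrl) {
        try {
            // "/health" is a hypothetical endpoint; the actual check is not described in the post
            HttpURLConnection connection =
                    (HttpURLConnection) new URL(nodeBaseUrl + "/health").openConnection();
            connection.setConnectTimeout(5000);
            connection.setReadTimeout(5000);
            return connection.getResponseCode() == 200;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String firstNode = "https://prod-node-1.example.com";
        if (isHealthy(firstNode)) {
            System.out.println("First node OK - continue rolling out 2-3-4 nodes at a time");
        } else {
            System.out.println("First node unhealthy - revert this node and abort the release");
        }
    }
}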

Canary releases

Some features are way too big, way too risky, or way too unpredictable in how they will behave in production. In such cases, there is a practice of canary releases. A real production node is detached from the load balancer and no longer receives live traffic. The new functionality is deployed there, evaluated by product owners, and monitored by DevOps for issues. If the functionality is OK, the node can be attached to the load balancer again and left for some time to see how production traffic influences it.
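
The post does not say which load balancer is used; assuming a classic AWS Elastic Load Balancer, detaching and re-attaching a canary node could look roughly like the sketch below (the load balancer name and instance ID are hypothetical).

import com.amazonaws.services.elasticloadbalancing.AmazonElasticLoadBalancing;
import com.amazonaws.services.elasticloadbalancing.AmazonElasticLoadBalancingClientBuilder;
import com.amazonaws.services.elasticloadbalancing.model.DeregisterInstancesFromLoadBalancerRequest;
import com.amazonaws.services.elasticloadbalancing.model.Instance;
import com.amazonaws.services.elasticloadbalancing.model.RegisterInstancesWithLoadBalancerRequest;

public class CanaryNode {

    private static final AmazonElasticLoadBalancing ELB =
            AmazonElasticLoadBalancingClientBuilder.defaultClient();

    // Take the canary node out of live traffic before deploying the risky feature
    static void detach(String loadBalancerName, String instanceId) {
        ELB.deregisterInstancesFromLoadBalancer(new DeregisterInstancesFromLoadBalancerRequest()
                .withLoadBalancerName(loadBalancerName)
                .withInstances(new Instance(instanceId)));
    }

    // Put the node back once product owners and DevOps are happy with it
    static void attach(String loadBalancerName, String instanceId) {
        ELB.registerInstancesWithLoadBalancer(new RegisterInstancesWithLoadBalancerRequest()
                .withLoadBalancerName(loadBalancerName)
                .withInstances(new Instance(instanceId)));
    }

    public static void main(String[] args) {
        detach("b2b-api-elb", "i-0123456789abcdef0");
        // ... deploy the new functionality and evaluate the canary here ...
        attach("b2b-api-elb", "i-0123456789abcdef0");
    }
}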

Introducing a brand new micro-service

If a new micro-service has to be introduced, it should go through an architectural review. It is evaluated for which technologies it uses, how it operates, and, most importantly, how it fits into the micro-service landscape. There is a team of architects responsible for keeping the landscape tidy and focused. There is an extensive operational requirements checklist, covering questions such as: is HTTPS used, does logging follow company standards, are passwords encrypted in the DB, is sensitive configuration data encrypted on the file system. There are many requirements a service has to cover in order to go live. Even when it goes live, the first stage is a beta release in which the service is exposed to a selected number of partners, who evaluate it first. Then it can be revealed to the general public.
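
The post only lists the checklist questions; as an illustration of the “passwords encrypted in DB” item, here is a minimal sketch that stores a salted PBKDF2 hash instead of the plain password. The company’s actual standard is not described, so treat this purely as an example.

import java.security.SecureRandom;
import java.util.Base64;

import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

public class PasswordStorage {

    // One common way to satisfy a "no plain-text passwords in the DB" requirement:
    // persist a salted, slow PBKDF2 hash instead of the password itself.
    static String hashForStorage(char[] password) throws Exception {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);

        PBEKeySpec spec = new PBEKeySpec(password, salt, 65_536, 256);
        byte[] hash = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                .generateSecret(spec)
                .getEncoded();

        // Store both salt and hash, e.g. as "salt:hash"
        Base64.Encoder encoder = Base64.getEncoder();
        return encoder.encodeToString(salt) + ":" + encoder.encodeToString(hash);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(hashForStorage("s3cr3t".toCharArray()));
    }
}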

Conclusion

I really enjoyed working for this company. It was a great learning opportunity because they keep up to date with new technologies and good practices. Processes and tools are constantly evolving, keeping the quality of the code and the products high. I definitely encourage you to take a deep look, understand the process, and eventually apply something to your own software delivery process. Most important is the traceability, which makes it very transparent which feature is implemented, in which build it was deployed, and so on. And traceability is something ISO auditors care about very much.
