Container Build Pipeline

When using containers as a tool for deployment, a development team faces a series of choices. Container technologies like Docker are often adopted by teams that are in the process of evolving their practices away from more traditional development approaches toward more agile ones. As such, these teams often have established build processes that they need to replace, or retarget to take advantage of the new capabilities that containers offer. At the same time, adopting containers gives teams the opportunity to modernize their development, deployment, and testing processes around tools such as Jenkins or Hudson that support more modern approaches.

How can development teams exploit the capabilities provided by containers and adapt their build and testing processes to take full advantage of those capabilities?

Containers introduce several concepts that make it challenging for teams to adapt existing practices directly to them. One of the most challenging for teams that are used to traditional approaches is the idea that Docker containers, especially those in production, should be immutable. Immutable production systems are not a new concept in software engineering - in fact, a big part of the attraction of infrastructure-as-code was the idea that systems could be made immutable if only they could be constructed entirely from the ground up from a repeatable code base, instead of having to be assembled from an ad-hoc mixture of code and far-too-mutable physical or virtual infrastructure.

However, the infrastructure-as-code approach was often not adopted throughout the entire software development lifecycle. In more traditional software development approaches, an environment is an entire system, often built as one or more Virtual Machines or physical servers, that serves as the location in which one particular step in the software engineering lifecycle, such as application build, integration testing, or acceptance testing, takes place. What we have commonly seen is that development teams draw a distinction between “lower” environments in the development lifecycle, such as build and unit test, which are very open and mutable, and “higher” environments, such as an acceptance test environment, that become progressively more locked down.

This leads to a situation where lower environments are constantly “in flux,” and changes made to their configuration are not picked up in later environments, causing problems that were fixed in earlier environments to recur unexpectedly later. What’s more, the inconsistency of the mechanisms for defining and configuring environments results in wasted time and needless repetition of work.

Therefore,

Build a Continuous Integration and Continuous Delivery Pipeline, using common tools such as Jenkins, in which the output of each pipeline run will be an immutable Container image.

Jenkins is an open source tool that is used throughout the software development industry to define and build Continuous Integration and Delivery pipelines. Jenkins is built on the concept of a stage, which is a conceptually distinct subset of a pipeline. Each stage is made up of steps, which can execute within conditional logic, that automate common tasks such as building a Java JAR file with Maven or running unit tests with an automated tool like JUnit. Each stage can thus map conceptually to a physical or virtual environment of the type described above, such as “Build” or “Unit Test”.
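As a concrete illustration (the project layout, tools, and stage names here are assumptions, not part of the pattern), a minimal declarative Jenkinsfile along these lines might map the first two lifecycle environments to pipeline stages:

```groovy
// Illustrative sketch of a declarative Jenkinsfile, assuming a Maven-based Java project
pipeline {
    agent any
    stages {
        stage('Build') {
            steps {
                // Compile and package the application JAR with Maven
                sh 'mvn -B -DskipTests package'
            }
        }
        stage('Unit Test') {
            steps {
                // Run the JUnit test suite through Maven and publish the results
                sh 'mvn -B test'
                junit 'target/surefire-reports/*.xml'
            }
        }
    }
}
```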

The key here is that you can use a tool like Jenkins combined with Containers to entirely eliminate the need for any of these unique physical or virtual environments. Instead, you build an image from a Dockerfile in the initial setup of the pipeline, and then push this image to the image registry upon successful completion of the pipeline stages. The image is entirely rebuilt on each new run of the pipeline.
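Continuing the same hypothetical Jenkinsfile, the later stages might rebuild the image from the project's Dockerfile and push it to a registry only once everything before has succeeded. The registry address, image name, and credentials ID below are illustrative assumptions, not prescriptions:

```groovy
// Later stages of the same illustrative Jenkinsfile
// (the 'Build' and 'Unit Test' stages shown earlier are omitted here)
pipeline {
    agent any
    stages {
        stage('Build Image') {
            steps {
                // Rebuild the image from the Dockerfile on every run of the pipeline
                sh 'docker build -t registry.example.com/myapp:${BUILD_NUMBER} .'
            }
        }
        stage('Push Image') {
            steps {
                // Push to the registry only after all earlier stages have succeeded
                withCredentials([usernamePassword(credentialsId: 'registry-creds',
                                                  usernameVariable: 'REG_USER',
                                                  passwordVariable: 'REG_PASS')]) {
                    sh 'docker login -u "$REG_USER" -p "$REG_PASS" registry.example.com'
                    sh 'docker push registry.example.com/myapp:${BUILD_NUMBER}'
                }
            }
        }
    }
}
```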

This concept is central to the approach used by more modern Kubernetes-centric DevOps pipeline tools like Jenkins X and Tekton. Tekton is an open-source framework for building CI/CD tools and processes; the tasks that make up a Tekton pipeline each run as Pods in Kubernetes, with their constituent steps running as containers within those Pods. Jenkins X is built on Tekton and adds a full GitOps and pipeline automation implementation on top of it. In every case, however, the output is an image that has been constructed through the steps of the pipeline.

This approach fixes the problem of reintroducing errors into later environments by entirely removing manual configuration changes from the process. In this approach you can’t change the configuration of an image, either intentionally or accidentally, within a single stage – you have to introduce any configuration changes into the Container build process at the very beginning and then let the changes propagate through the entire build pipeline. So, for instance, let’s consider the simple case of changing the version of a Java runtime environment (JRE). In a traditional approach, with separate physical or virtual machines for each development lifecycle stage, updating this configuration would require changing each environment separately, either manually or through a scripted infrastructure-as-code tool such as Chef or Puppet. In the Container approach, you would change the Dockerfile once to include the new definition, and then re-run the pipeline to repeat all the automated steps from the beginning – creating a new, immutable image at the end.
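For example, in a hypothetical Dockerfile for the Java application sketched above, the JRE version appears in exactly one place, so the change amounts to a one-line edit followed by a pipeline run (the base image and paths are illustrative):

```dockerfile
# Illustrative Dockerfile: the JRE version is defined in exactly one place.
# Moving from Java 11 to Java 17 means changing only this FROM line and
# re-running the pipeline; every later stage picks up the new runtime.
FROM eclipse-temurin:17-jre

# Copy the JAR produced by the Maven build stage into the image
COPY target/myapp.jar /opt/app/myapp.jar
ENTRYPOINT ["java", "-jar", "/opt/app/myapp.jar"]
```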

This pattern is well established as a best practice within the Docker community. For instance, the Docker documentation article on Development Pipelines describes a recommended development pipeline very much in line with the recommendations of this pattern. Likewise, the IBM article on Kubernetes DevOps is just one of several examples of such pipelines being built for container projects.

At the heart of this pipeline is the problem of dealing with images appropriately. The first issue to consider with publicly hosted images is that, since they come from a public repository, they could potentially contain malware or other problems that would introduce vulnerabilities into your system. Thus the need for Pipeline Vulnerability Scanning becomes absolutely critical. This results in the need to introduce special stages, such as a Birthing Pool, into your Pipeline in order to make sure that you are not introducing new types of vulnerabilities into your systems.
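To sketch what such a scanning stage could look like, the fragment below extends the hypothetical Jenkinsfile from earlier; it would sit between the 'Build Image' and 'Push Image' stages. Trivy is used here purely as one example of an open-source image scanner, and the image name is again an assumption:

```groovy
        // Illustrative scanning stage; any image scanner could fill this role
        stage('Scan Image') {
            steps {
                // Fail the pipeline run if high or critical severity vulnerabilities are found
                sh 'trivy image --exit-code 1 --severity HIGH,CRITICAL registry.example.com/myapp:${BUILD_NUMBER}'
            }
        }
```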