In my last blog and video, I went through some of the issues that can arise when a single pipeline does both CI and CD for your applications. In this blog we go into HOW you can split these concepts. Mostly, we dive into separation of concerns. Yes, I went there, another buzzword!
I will use Containers and Kubernetes as an example, mostly because these technologies are forcing us to reconsider how we look at our infrastructure and application landscape.
But first! We need a short recap of the issues I discussed previously, and we need to make sure we are on the same page about CI vs CD.
In the previous entry I used the example below:
In common setups, I often see companies use a single pipeline for both CI, which is building, testing and packaging the software, and CD, which is deploying the packaged software, with environment specifics, to each environment.
By environment here I mean the commonly seen test and production environments, with more possible environments in between such as Acceptance and Staging where needed.
The 4 problems:
- Building code and Deploying code are really different from each other.
- Pipelines that do both building and deploying are really complex.
- Pipelines that only run when code is changed have to deal with configuration drift.
- It's never just one pipeline, it's a collection of many applications.
Okay great, now what?
Well, if we think about the titles of this blog and the previous one, the thing we might do to address the issues mentioned is actually split our CI from our CD! Who would have guessed?!
To better understand how to do this, we should look at our pipeline and bring it back to its core functionality. The pipeline we talked about before has the following 'moving parts'.
First, we have the testing of our application code. Before we build and ship anything anywhere, it is vital that we ensure our application adheres to the rules we have defined for it. We do this with code testing.
Once we are happy with our code and are confident that new changes and features work as intended we can build our software.
The goal of our build stage is to produce an artefact: one that contains our latest code and that we can then deploy to any of our servers or environments.
Some software, such as Python, does not need compiling, while other software, such as Go and Java, does. Either way, we can still produce an artefact that we can reuse across environments and servers.
Currently, that is almost always solved by building a Container Image.
Publishing the artefact means uploading it to a central repository, a place from which we can easily download it later.
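To make the build and publish stages concrete, here is a minimal, runnable sketch. In a real pipeline the artefact would be a container image pushed with something like `docker build` and `docker push`; here a tarball and a local directory stand in for the image and the registry, so the example runs anywhere. All names and paths are made up.

```shell
#!/bin/sh
# Sketch of the build and publish stages of a CI pipeline.
# A tarball stands in for a container image, a local directory for the registry.
set -eu

workdir="${TMPDIR:-/tmp}/ci-sketch-demo"
rm -rf "$workdir"
mkdir -p "$workdir/src" "$workdir/registry"

# "Our latest code" (hypothetical application)
echo 'print("hello from my-app")' > "$workdir/src/app.py"

# Build: package the code into a versioned, immutable artefact
version="1.4.2"
tar -czf "$workdir/my-app-$version.tar.gz" -C "$workdir/src" .

# Publish: upload the artefact to the central repository
cp "$workdir/my-app-$version.tar.gz" "$workdir/registry/"

ls "$workdir/registry"
```

The key property is that the artefact is versioned and immutable: the same `my-app-1.4.2.tar.gz` (or image tag) can later be deployed to test, staging and production without rebuilding.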
Deploy, Configure and (re)start the software
We take these three parts together in one block because, in essence, these together have a single goal. Get your latest software running! It doesn't matter how, or where, just as long as the goal is reached.
So how do we split pipelines?!
Good question! Thank you for asking! We do this very simply! Consider this: When do you want to run these steps? The steps are required, one way or the other. But when do you want them?
Testing, building and publishing are usually things you want to do with every code change. Why? Because to deploy the application, you must first have something to deploy! As a tip, if you are wondering when to build and when not to build, just think of it like this: always build!
But it's a test feature!
Still build it.
If you are not wondering, you probably know already at what point you want to build.
So we know that when the code changes, we need to run our test, build and publish steps. Luckily, we already have something extraordinary for this: pipelines! Use pipelines to prepare your software to run! Trigger them when your code changes. But the pipeline should go no further up the chain than publishing. The overarching goal of the pipeline is, therefore, very simple.
Make the code deployable. That's it.
Anything else is wrong… I have an opinion about this…
So we are now in a situation where the code is validated and deployable. What remains? The system we use to manage our servers/environments! The goal of this system is very simple.
Make sure the managed servers/environments look exactly how we describe them.
We have a tendency in IT to go around in circles. We used to have tools like Puppet/Ansible/Chef/Saltstack to manage our server platforms, and it was via those tools that we deployed our code. These tools were usually managed by the infrastructure engineers, who ran the production environment and decided when the latest features could go out.
What these tools really were was a state management system: they would manage the state of your environment and nothing more. The state itself was usually configured in Git repositories. Then Kubernetes became big, and containers with it. We moved to pipelines for everything, because tools like Puppet/Ansible/Chef/Saltstack were not intended to manage constantly moving systems such as Kubernetes.
And now, only a few years after Kubernetes has become so popular that even the cool kids at school use it, we go back to what we used to do. Tools like Puppet/Ansible/Chef/Saltstack!
But not those tools… I mean, come on, it's 2023! The idea behind those tools, though, was spot on: there was one place where engineers described what the environment and servers should look like. Not distributed over many pipelines, but in one tool!
And so we continue going in circles and move away from pipelines, managing environments once again with a tool that is specifically built for that. For Kubernetes clusters, that tool is any tool that implements the GitOps principle. Some examples are Flux, Rancher Fleet and ArgoCD.
I myself am a big ArgoCD fan because of its flexibility.
Yes, GitOps, it's the bee's knees! But what does GitOps really mean? It's state management. That's all it does! Once again, we can describe what that means by drilling down to the goal that GitOps serves: describe in Git what you want your environment to look like (your desired state), and let GitOps make it so!
Should the environment change, then GitOps should change it back immediately to what is described in Git.
Should the Git repo change to describe a change to your environment, GitOps will make it so!
This should make your actual deployment of code as easy as git push!
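Here is what "deployment is just git push" looks like in a runnable sketch. A local bare repository stands in for the hosted environment repo that the GitOps tool watches; the manifest name, image name and tags are all made up for illustration.

```shell
#!/bin/sh
# Sketch: "deploying" by pushing a new image tag to the environment repo.
# A local bare repo stands in for the hosted Git repo a GitOps tool watches.
set -eu

demo="${TMPDIR:-/tmp}/gitops-push-demo"
rm -rf "$demo"
mkdir -p "$demo"

# Stand-in for the hosted environment repository
git init -q --bare "$demo/env-repo.git"
git clone -q "$demo/env-repo.git" "$demo/checkout"
cd "$demo/checkout"
git config user.email "dev@example.com"
git config user.name  "Dev"

# Desired state: which image (and tag) should run in this environment
echo "image: registry.example.com/my-app:1.4.1" > deployment.yaml
git add deployment.yaml
git commit -qm "deploy my-app 1.4.1"
git push -q origin HEAD

# Releasing the new version is just: change the tag, commit, push.
sed -i.bak 's/my-app:1.4.1/my-app:1.4.2/' deployment.yaml && rm -f deployment.yaml.bak
git add deployment.yaml
git commit -qm "deploy my-app 1.4.2"
git push -q origin HEAD
# From here, the GitOps tool notices the change and rolls it out.
```

Notice that nothing in this flow talks to the cluster. The developer's responsibility ends at the push; the GitOps tool does the rest.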
The big difference between GitOps and pipelines is that GitOps will continuously monitor your environments and make sure that it is the same as what you described in your Git repository. It does not trigger when code changes; it watches constantly. This forces you to describe your environment in Git repositories, and the tool will provide a way to see exactly what is deployed, where and how!
Looking at your GitOps tool gives you a single point of entry to see exactly what is running, and the clarity that these tools provide is immeasurable in environments as complex as Kubernetes.
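That continuous monitoring boils down to a reconcile loop: compare the live environment with the desired state from Git and overwrite any drift. The sketch below illustrates the idea with two local directories standing in for the Git checkout and the cluster; a real tool such as Flux or ArgoCD does the same thing against the Kubernetes API, and all names here are invented.

```shell
#!/bin/sh
# Sketch of the reconcile loop at the heart of any GitOps tool.
# "desired" stands in for the Git checkout, "live" for the cluster.
set -eu

demo="${TMPDIR:-/tmp}/gitops-reconcile-demo"
rm -rf "$demo"
mkdir -p "$demo/desired" "$demo/live"

# Desired state, as described in Git
echo "replicas: 3" > "$demo/desired/deployment.yaml"

reconcile() {
  # Anything that differs from the desired state gets put back
  if ! diff -r -q "$demo/desired" "$demo/live" >/dev/null 2>&1; then
    rm -rf "$demo/live"
    cp -R "$demo/desired" "$demo/live"
    echo "drift detected: environment reconciled"
  fi
}

reconcile                                            # first run: applies the state
echo "replicas: 1" > "$demo/live/deployment.yaml"    # someone changes prod by hand
reconcile                                            # the drift is reverted
cat "$demo/live/deployment.yaml"
```

The hand-edited `replicas: 1` never survives: the next reconcile pass puts the environment back to what Git says, which is exactly the behaviour described above.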
Take your pipeline, and distil it to its main moving parts.
After you have all the moving parts, decide per part when you want to run this.
Do you want to run it when the code changes? It goes in the pipeline.
Do you want to run it continuously to ensure the state of your environment? Use GitOps.