Terraform & Jenkins: IaC and CI/CD Explained

Recently, I had a task to build a CI/CD pipeline for a Machine Learning Project. Since our cloud infrastructure is heavily tied into AWS, my first thought was to use CodePipeline. It’s native to AWS and doesn’t require much overhead. However, as I continued to research different options, it was clear we needed an interface that our client could access and monitor builds while being able to scale out if requirements for compute resources increased.

My final decision landed on Jenkins as I have used it from a developer standpoint in the past and am very familiar with it. The new challenge was approaching it from a System Admin or DevOps perspective: deploying a Jenkins server on top of our existing cloud infrastructure and be able to administer it. On top of that, I had to use my previous developer experience and create the build and deploy pipeline based off of the client’s workflow.

CI/CD

CI/CD or Continuous Integration / Continuous Delivery is a method to frequently deliver apps to customers by introducing automation into the stages of app development. The whole nine yards includes building, testing, and deploying.

CI can be loosely associated with the source code’s build process such as using a build tool like gradle or maven for Java which generates a binary artifact or it may involve build a container image using Docker or Podman. CI also encompasses testing, which can involve the use of unit testing tools, such as JUnit, and code analysis, such as SonarQube.

CD can be loosely associated with the deployment process such as using Ansible scripts to take a binary artifact and copy and run it on a server in the cloud or spinning up containers in Kubernetes, etc.

Jenkins

Jenkins is an open source automation server. Jenkins can be used to automate pretty much anything but it is mostly used for CI/CD. Jenkins utilizes a Jenkinsfile which defines a set of stages of steps to execute things. Jenkins uses the Jenkinsfile to execute shell scripts and cli tools just like you would from a terminal. The Jenkinsfile is normally held within a project’s root directory.

pipeline {

    stages {
        stage('Build') {
            steps {
                echo 'Building..'
            }
        }
        stage('Test') {
            steps {
                echo 'Testing..'
            }
        }
        stage('Deploy') {
            steps {
                echo 'Deploying....'
            }
        }
    }
}

As a developer, I’m familiar with writing Jenkinsfiles to automate building, testing, and deploying apps. However, in order to do that, we need to deploy Jenkins and have it run somewhere.

IaC

IaC or Infrastructure as Code is the managing and provisioning of infrastructure through code instead of through manual processes.

Manual process in this case being using the AWS Console (web gui) or the AWS CLI to manually provision the resources we need to stand up Jenkins. If you are just getting started, doing it manually can have some benefit to better understand what is being provisioned visually (which most tutorials online showcase) however, you will always be able to see the result of what you deployed after the fact in the console anyway.

Using IaC is a better practice as it allows you to bundle your cloud configuration in a code repository and makes it easier to create, edit, distribute, and destroy configurations. By codifying and documenting your configuration specifications, IaC aids configuration management and helps you to avoid undocumented, ad-hoc configuration changes.

There are many tools to use for this. In this case, I’m using Terraform.

Terraform

Terraform is an infrastructure-as-code software tool by which you can declare your cloud or data center infrastructure. By default, Terraform uses HCL (Hashicorp Configuration Language) however you are able to swap that out for several languages available via the Cloud Development Kit for Terraform (CDKTF). Two main tools are involved in building infrastructure with Terraform: the Terraform CLI and a Terraform repository containing .tf files where the configuration is written with HCL.

TF files are mainly comprised of resource and data source types.

Resources define new resources you want to create like a vpc:

resource "aws_vpc" "vpc" {
  cidr_block = var.vpc_cidr_block
}

Data sources define existing resources that you want to utilize in your configuration like a pre-existing AWS Route 53 domain name and certificate:

data "aws_route53_zone" "r53zone" {
  name         = "codingwithcarl.com"
  private_zone = false
}

data "aws_acm_certificate" "codingwithcarl" {
  domain   = "*.codingwithcarl.com"
  statuses = ["ISSUED"]
}

There are other types that can make up your cloud configuration. Check out the documentation for more information.

The Task at Hand

Now that the background info and definitions are out of the way, I’m going to attempt a deep dive into the Terraform configuration I created for the client. My files are located here so you can hack around with it and repurpose it for your environment.

My code is going to create two EC2 instances to host the Jenkins server. One for the controller, and another for the agent. The controller is where the Jenkins tool and UI lives which will tell the agent to execute the pipelines. You can get away with one EC2 instance and run the pipelines on the controller, however it’s not recommended. This repo will also stand up an Application Load Balancer so that we can enable HTTPS and DNS i.e. instead of http:10.1.0.400/ our endpoint to access Jenkins will be https://jenkins.codingwithcarl.com - this is also not entirely required if you are self hosting, but I have included it to be as close to a real world example as possible.

In order to get started, ensure you follow the prerequisites located in the README.

Ensure you create Key Pairs to ssh into the EC2 instances. Generate a pem file and it will download to your local machine. You’ll need this later to ssh into your Jenkins box for the initial configuration:

ssh -i "jenkins-controller.pem" ec2-user@XX.XXX.XXX.XXX

I’d recommend creating one for your controller and another for the agent just in case. You can always revoke ssh access later if there is an OpSec concern.

Since my cloud provider is AWS, I need to have the AWS CLI installed and configured, and an S3 bucket to store a .tfstate file which tracks the current state of my cloud infrastructure. When any changes are made, Terraform will read the .tfstate and compare that to the tf files in my repository in order to make the changes and achieve the desired state.

Note: It’s important that you avoid making any manual changes through the AWS Console (frontend web gui) such as creating new resources that can be created via Terraform or editing any resources that were already created via Terraform. This can mess up Terraform’s state tracking and can cause problems such as errors from Terraform, missing configurations, and forgetting that some resources were created manually where you end up with a bill from AWS. All you will need to delete manually are the Key Pairs and the S3 bucket in this example.

A Terraform repository can actually be a single tf file, although best practice is to separate your code by categories of the resources that you are using or any way that makes sense to you. Below is an explanation of the files and code in my Terraform repo to create a Jenkins Server:

.gitignore: Basic gitignore to ignore any files generated by terraform that we do not want to push up to the repo

provider.tf: Defines what cloud provider you are using and where your .tfstate will be stored. Consult the documentation in order to modify this for your environment.

data.tf: I put all my data sources in here. I included an Amazon Machine Image which is needed to spin up the EC2 instances to run Jenkins on. There is also a data source for an existing AWS Route 53 domain name and AWS Certificate Manager to prove that I own the domain name.

main.tf: The EC2 instances for the Jenkins Controller and Agent. They are both hooked up to a subnet and security group that is created. You’ll also see the user_data parameter where we pass in bash scripts to run while the instance is initialized.

jenkins-controller-script.sh: Installs Jenkins and it’s dependencies. This is all that is needed on this instance as the agent will be configured to do all the work.

jenkins-agent-script.sh: Installs any tools we need to run our pipelines. This example installs Apache Maven, Terraform (in case Jenkins will call Terraform infrastructure changes i.e. deploying containers), Docker for building and uploading container images, and kubectl for container orchestration in Kubernetes. Feel free to add any other tools you might need for building, testing, and deploying.

vpc.tf: Defines the Virtual Private Cloud and it’s networking.

https.tf: Defines the resources needed to create an Application Load Balancer (alb), enable HTTPS by redirecting HTTP requests to HTTPS, and creates a new Route 53 A record to map DNS to. If you do not wish to enable HTTPS (for example, you may not own a domain name) then you can comment out this file. You will need to enter the public IP instead to access Jenkins.

variables.tf: Creates variables to be used elsewhere in the configuration. If you reuse values several times in the config, it is better to update the variable values in one place.

terraform.tfvars: Assigns values to the variables created in variables.tf - this is where you will update the values if they need to change.

outputs.tf: Any values that are generated as a result of applying the Terraform config, output it here. For this example, I am outputting the IP addresses of the EC2 instances and the full domain name to access Jenkins.

Once you are familiar with everything and have modified the tf files to your needs, you can go ahead and start the apply process. You’ll need to sign into the AWS CLI - that’s what Terraform essentially does: convert your configurations into AWS CLI commands to provision your infrastructure.

terraform init initializes your Terraform repository. This will need to be re-run anytime you make changes to the code.

terraform plan will output what Terraform will create update or destroy to match your desired state define in the code. This command can be re-run to ensure you have everything you need to be provisioned and should let you know if there are any errors.

terraform apply will apply your configuration to your cloud provider. It will again output your plan, then ask for confirmation to apply the changes.

After everything is applied with no errors, you can check the AWS Console to see all the resources that were created, go to your Jenkins url, ssh into the Jenkins Controller to get the initalAdminPassword, then begin to configure Jenkins through the front end: create user accounts, disable the built in node, and connect the controller to the agent. From there, you can create a Jenkinsfile for your project, create your first pipeline, connect to the source code repository, and test out the agent by building and deploying your project.

If you are doing this as a learning exercise, you can destroy the resources in your AWS account to avoid a large bill by executing terraform destroy.

Wrapping Up

The aim of this post was to aggregate everything I learned from disparate sources into one place. I hope this has helped you in some way and made your job a little easier.


References

What is CI/CD - RedHat

What is Infrastructure as Code - RedHat

CICD Pipeline to deploy Kubernetes Applications using Terraform, EKS, and Jenkins - YouTube

How to create Application Load Balancer using Terraform? (AWS ALB | HTTPS) - Anton Putra

Controller Isolation - Jenkins

Distributed Builds - Jenkins