Migrating Jenkins master from AWS OpsWorks to Amazon EKS

What is Jenkins?

Jenkins is a popular open-source automation server widely used for continuous integration (CI) and continuous deployment (CD) in software development pipelines. In a Jenkins setup, multiple build agents can work in parallel to build and test code, produce artifacts, generate reports, and deploy applications.

Introduction

Let’s walk through the step-by-step process of migrating a Jenkins master from AWS OpsWorks to Amazon EKS. The transition from OpsWorks, a managed configuration service, to EKS, a Kubernetes-based container orchestration platform, promises enhanced scalability and flexibility for the Jenkins infrastructure.

We have migrated 35 Jenkins masters from OpsWorks to EKS. We have also migrated 24 slaves into EKS from different sources, including OpsWorks, Spot group, Nomad, EC2, and ECS.

Why we chose EKS over OpsWorks

We decided to move our Jenkins services from OpsWorks to Amazon Elastic Kubernetes Service (EKS) before AWS announced the end of support for OpsWorks stacks. We chose EKS because it is a managed Kubernetes service provided by AWS that offers multiple features, including:

  1. Managed Kubernetes control plane: Amazon EKS fully manages the Kubernetes control plane, including the API server and etcd, ensuring high availability and scalability
  2. Compatibility: EKS is certified Kubernetes conformant, meaning it is compatible with existing Kubernetes applications and tools
  3. Automatic updates: EKS provides automated updates for the Kubernetes control plane, making it easier to stay up to date with the latest features and security patches
  4. Integrated with AWS services: EKS seamlessly integrates with other AWS services such as Elastic Load Balancing (ELB), Amazon Relational Database Service (RDS), Amazon S3, and more
  5. Multi-AZ and high availability: EKS supports deploying clusters across multiple availability zones (AZs) for high availability and fault tolerance
  6. Security and compliance: EKS integrates with AWS Identity and Access Management (IAM) for fine-grained access control and supports Kubernetes role-based access control (RBAC). It also helps in meeting regulatory compliance requirements
  7. Spot instances support: EKS allows you to use EC2 Spot instances as worker nodes, reducing costs for fault-tolerant and flexible workloads
  8. VPC networking: EKS integrates with Amazon Virtual Private Cloud (VPC), allowing use of VPC networking features, including private networking and security group controls
  9. Logging and monitoring: EKS integrates with Amazon CloudWatch for logging and monitoring Kubernetes applications and infrastructure

Prerequisites

Before proceeding with the migration journey, we need to ensure that we have the following prerequisites in place:

  • Access to the AWS Management Console
  • A backup of critical Jenkins configurations, jobs, and data
  • The kubectl and k9s command-line tools (a quick check follows this list)
  • A GitHub repository to store the Jenkins master YAML manifests, and Argo CD for deployment
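
As a quick sanity check before starting, the commands below verify the tools are available; this is just an illustrative check, not part of the migration itself:

```
# Verify the required CLI tools are installed and on the PATH.
kubectl version --client   # Kubernetes CLI used throughout the migration
k9s version                # terminal UI for inspecting the cluster
aws --version              # AWS CLI, used for EKS and ECR access
```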

1. Assess current Jenkins configuration

  • Initially, our Jenkins setup ran on OpsWorks, where the Jenkins masters were segregated into different layers. Each layer had its own set of EC2 nodes and Elastic Block Store (EBS) volumes, sized according to the usage of its Jenkins master
  • All these Jenkins masters sat behind a common Application Load Balancer (ALB) with path-based routing enabled
  • Our OpsWorks-hosted Jenkins architecture:

2. Set up Amazon EKS cluster

  • Create a cluster in Amazon EKS
  • Map an Amazon Elastic File System (EFS) file system as the base storage for the cluster (a minimal setup sketch follows this list)
  • In EKS, the Jenkins masters are hosted as Deployments, with a separate namespace per team, and all masters share the resources of the EKS cluster
  • We use Argo CD to deploy all the different Jenkins masters, whose YAML files are stored in a Git repo
  • We have a dedicated Argo CD app through which we can update the resources or any configuration changes for a specific Jenkins master
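
A minimal sketch of the cluster and storage setup, assuming eksctl and the EFS CSI driver add-on are used; the cluster name, region, node counts, and file system ID below are placeholders:

```
# Create the EKS cluster (name, region, and sizes are placeholders).
eksctl create cluster --name jenkins-eks --region us-east-1 \
  --nodegroup-name jenkins-nodes --nodes 3

# Install the EFS CSI driver as a managed EKS add-on.
aws eks create-addon --cluster-name jenkins-eks --addon-name aws-efs-csi-driver

# Define a StorageClass backed by the EFS file system so Jenkins
# masters can claim EFS-backed volumes.
cat <<'EOF' | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap             # provision an EFS access point per volume
  fileSystemId: fs-0123456789abcdef0   # placeholder file system ID
  directoryPerms: "700"
EOF
```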

EKS Jenkins architecture: 

3. Gather the public Jenkins image from Docker Hub

  • For each product, pull the Docker image from Docker Hub that matches the Jenkins version currently in use
  • Tag and push the Jenkins Docker image to Amazon Elastic Container Registry (ECR). This avoids Docker Hub rate limiting on image pulls and lets us pull the images within the private network of our AWS account (see the commands after this list)
  • Once an image is pushed to ECR, make use of that image in the kustomization file of the respective Jenkins master
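
The commands below sketch this flow; the Jenkins version tag, AWS account ID, and region are placeholders:

```
# Pull the public image matching the Jenkins version currently in use.
docker pull jenkins/jenkins:2.401.3-lts

# Authenticate Docker against the private ECR registry.
aws ecr get-login-password --region us-east-1 \
  | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com

# Tag and push the image to ECR, then reference the ECR image
# in the kustomization file of the respective Jenkins master.
docker tag jenkins/jenkins:2.401.3-lts \
  123456789012.dkr.ecr.us-east-1.amazonaws.com/jenkins:2.401.3-lts
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/jenkins:2.401.3-lts
```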

4. Migrate Jenkins data

  • Export Jenkins job configurations, settings, and data from OpsWorks
  • Migrating the Jenkins data is a bit time-consuming. A couple of teams have Jenkins masters larger than 300 GB
  • There are different access points for the different Jenkins masters. Mount each one to the corresponding Jenkins master node in OpsWorks and then initiate the data transfer (a transfer sketch follows this list)

Note: Amazon EFS access points are application-specific entry points into an EFS file system that make it easier to manage application access

  • Once the data transfer is finished, remove conflicting files such as jenkins.fingerprints.GlobalFingerprintConfiguration.xml and the fingerprints folder. If those files are not removed, they tend to cause Jenkins master startup failures
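
A sketch of the transfer from an OpsWorks node, assuming amazon-efs-utils is installed; the file system ID, access point ID, and Jenkins home path are placeholders:

```
# Mount the EFS access point on the OpsWorks node.
sudo mount -t efs -o tls,accesspoint=fsap-0123456789abcdef0 \
  fs-0123456789abcdef0:/ /mnt/jenkins-efs

# Copy the Jenkins home in the background; nohup keeps the transfer
# alive if the SSH session drops (see the challenges section below).
nohup rsync -aH --info=progress2 /var/lib/jenkins/ /mnt/jenkins-efs/ \
  > /tmp/jenkins-rsync.log 2>&1 &

# After the transfer, remove the fingerprint data that can cause
# the Jenkins master to fail on startup.
rm -rf /mnt/jenkins-efs/fingerprints \
  /mnt/jenkins-efs/jenkins.fingerprints.GlobalFingerprintConfiguration.xml
```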

5. Update DNS and networking

  • Update DNS records to point to the EKS Jenkins ALB
  • Use an Ingress to set up the networking policies and security groups for the Jenkins masters
  • Use ExternalDNS to enable internal communication between the Jenkins masters and slaves (a sketch combining both follows this list)
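
A minimal sketch of an Ingress handled by the AWS Load Balancer Controller, with an ExternalDNS annotation for the hostname; the namespace, hostnames, service name, and port are placeholders:

```
cat <<'EOF' | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: jenkins
  namespace: team-a                                # placeholder team namespace
  annotations:
    alb.ingress.kubernetes.io/scheme: internal
    alb.ingress.kubernetes.io/target-type: ip
    external-dns.alpha.kubernetes.io/hostname: jenkins-team-a.example.com
spec:
  ingressClassName: alb
  rules:
    - host: jenkins-team-a.example.com             # placeholder hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: jenkins                      # the Jenkins master Service
                port:
                  number: 8080
EOF
```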

6. Configure Jenkins on EKS

  • Deploy Jenkins masters on the EKS cluster using the Argo CD deployment tool
  • Argo CD takes care of configuring the necessary Kubernetes resources such as ConfigMaps, Secrets, pods, Services, etc. (a minimal Application manifest follows this list)
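
A minimal sketch of the Argo CD Application for one Jenkins master; the repo URL, path, and namespaces are placeholders:

```
cat <<'EOF' | kubectl apply -f -
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: jenkins-team-a                 # one Argo CD app per Jenkins master
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example-org/jenkins-masters.git  # placeholder repo
    targetRevision: main
    path: masters/team-a               # directory holding this master's YAML
  destination:
    server: https://kubernetes.default.svc
    namespace: team-a                  # placeholder team namespace
  syncPolicy:
    automated:
      prune: true                      # keep the cluster in sync with Git
      selfHeal: true
EOF
```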

7. Test and validate

  • Once the Jenkins master is ready, log in using one of your sign-in methods (we are using Google sign-in)
  • Conduct comprehensive testing to ensure that Jenkins jobs execute as expected
  • Verify integrations, plugins, and dependencies in the EKS environment (a few quick checks follow this list)
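
A few illustrative checks before handing a master back to its team; the namespace and resource names are placeholders:

```
# Confirm the master pod is running and inspect its startup logs.
kubectl -n team-a get pods
kubectl -n team-a logs deployment/jenkins --tail=50

# Confirm the Ingress picked up the ALB address.
kubectl -n team-a get ingress jenkins
```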

Technical aspects and best practices

  1. Plugin compatibility: Ensure that Jenkins plugins used in OpsWorks are compatible with the EKS environment. Update or replace plugins as needed
  2. Security and access controls: Review and update security policies, IAM roles, and access controls to align with EKS best practices
  3. Volume management and safe data transfer: We opted to use EFS volumes for storage instead of EBS because of advantages such as elastic scalability, no pre-provisioning, cost efficiency for shared workloads, and regional, cross-AZ availability. For data transfer, mount the EFS access point on the OpsWorks node and initiate the transfer so that the process stays private and secure
  4. Version management: Version management and configuration changes are easier with EKS than with OpsWorks
  5. Effective utilization of resources: EKS allows more granular control over resource allocation. We can define resource requests and limits for Jenkins pods to optimize resource usage (see the example after this list)
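
As an illustration of the last point, requests and limits for a Jenkins master can be adjusted per deployment; the namespace and values below are placeholders:

```
# Set CPU/memory requests and limits for one Jenkins master.
kubectl -n team-a set resources deployment/jenkins \
  --requests=cpu=2,memory=8Gi --limits=cpu=4,memory=16Gi
```

In a GitOps setup like ours, these values would normally live in the master's manifests in Git so Argo CD can reconcile them; the command above is just a quick way to experiment.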

Challenges faced during migration

  1. Data transfer: Data transfer is one of the most sensitive parts of the migration when dealing with high data volumes, such as 250 GB. We initiated the transfer/copy as a background process but still observed it being killed intermittently. To avoid this, we used the nohup command, as shown in the data transfer sketch in step 4
  2. Downtime: Since we are not following the high-availability clusters model in our Jenkins environment, product teams faced certain downtime during the migration
  3. Slave connectivity issues: In OpsWorks, we enabled Java Network Launch Protocol (JNLP) connectivity for slaves using the node IP followed by Jenkins’ port 50000. Post-migration, we started observing connectivity issues since we can’t control the IP assigned to a pod, so we used external DNS to enable slave connectivity (a sketch of such a Service follows this list)
  4. Jenkins service abruptly restarting: Post-migration, we observed the Jenkins service abruptly restarting and affecting availability. While debugging, we found the issue was stale fingerprint data carried over from the old setup. After removing jenkins.fingerprints.GlobalFingerprintConfiguration.xml and the fingerprints folder, the service ran without any issues
  5. Performance issues: Post-migration, we observed a couple of Jenkins masters taking longer to load, and job execution times increased sharply compared to the OpsWorks environment. During the analysis, we found that the EKS node instance types came from smaller families with less compute capacity. To overcome this, we created separate auto-scaling group (ASG) nodes with higher-end machines and moved the affected Jenkins masters to those nodes, which improved performance
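
A sketch of the Service used for JNLP slave connectivity, publishing a stable hostname through ExternalDNS so agents connect by name rather than by node IP; the hostname, namespace, and labels are placeholders:

```
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: jenkins-jnlp
  namespace: team-a                    # placeholder team namespace
  annotations:
    # ExternalDNS publishes this stable name for agents to connect to.
    external-dns.alpha.kubernetes.io/hostname: jenkins-jnlp.team-a.internal.example.com
spec:
  selector:
    app: jenkins                       # must match the Jenkins master pod labels
  ports:
    - name: jnlp
      port: 50000                      # the JNLP port agents dial in to
      targetPort: 50000
EOF
```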

Conclusion

Migrating your Jenkins master from AWS OpsWorks to Amazon EKS is a strategic move toward a more scalable and containerized infrastructure. By following the outlined steps and best practices, you can seamlessly transition your Jenkins environment and leverage the benefits offered by Kubernetes.