Migrating Jenkins master from AWS OpsWorks to Amazon EKS

What is Jenkins?

Jenkins is a popular open-source automation server widely used for continuous integration (CI) and continuous deployment (CD) in software development pipelines. In a Jenkins setup, multiple build agents can work in parallel to build and test code, produce artifacts, generate reports, and deploy applications.

Introduction

Let’s walk through the step-by-step process of migrating a Jenkins master from AWS OpsWorks to Amazon EKS. The transition from OpsWorks, a managed configuration service, to EKS, a Kubernetes-based container orchestration platform, promises enhanced scalability and flexibility for the Jenkins infrastructure.

We have migrated 35 Jenkins masters from OpsWorks to EKS. We have also migrated 24 slaves into EKS from different sources, including OpsWorks, Spot group, Nomad, EC2, and ECS.

Why we chose EKS over OpsWorks

We decided to move our Jenkins services from OpsWorks to Amazon Elastic Kubernetes Service (EKS) before AWS announced the end of support for OpsWorks stacks. We chose EKS because it is a managed Kubernetes service provided by AWS that offers multiple features, including:

  1. Managed Kubernetes control plane: Amazon EKS fully manages the Kubernetes control plane, including the API server and etcd, ensuring high availability and scalability
  2. Compatibility: EKS is certified Kubernetes conformant, meaning it is compatible with existing Kubernetes applications and tools
  3. Automatic updates: EKS provides automated updates for the Kubernetes control plane, making it easier to stay up to date with the latest features and security patches
  4. Integrated with AWS services: EKS seamlessly integrates with other AWS services such as Elastic Load Balancing (ELB), Amazon Relational Database Service (RDS), Amazon S3, and more
  5. Multi-AZ and high availability: EKS supports deploying clusters across multiple availability zones (AZs) for high availability and fault tolerance
  6. Security and compliance: EKS integrates with AWS Identity and Access Management (IAM) for fine-grained access control and supports Kubernetes role-based access control (RBAC). It also helps in meeting regulatory compliance requirements
  7. Spot instances support: EKS allows you to use EC2 Spot instances as worker nodes, reducing costs for fault-tolerant and flexible workloads
  8. VPC networking: EKS integrates with Amazon Virtual Private Cloud (VPC), allowing use of VPC networking features, including private networking and security group controls
  9. Logging and monitoring: EKS integrates with Amazon CloudWatch for logging and monitoring Kubernetes applications and infrastructure

Prerequisites

Before proceeding with the migration journey, we need to ensure that we have the following prerequisites in place:

  • Access to the AWS Management Console
  • A backup of critical Jenkins configurations, jobs, and data
  • The kubectl and k9s command-line tools (a quick check follows this list)
  • A GitHub repository to store the Jenkins master YAML manifests, and Argo CD for deployment
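
As a quick sanity check before starting, the commands below verify the tools are available; this is just an illustrative check, not part of the migration itself:

```
# Verify the required CLI tools are installed and on the PATH.
kubectl version --client   # Kubernetes CLI used throughout the migration
k9s version                # terminal UI for inspecting the cluster
aws --version              # AWS CLI, used for EKS and ECR access
```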

1. Assess current Jenkins configuration

  • Initially, our Jenkins setup ran on OpsWorks, where the Jenkins masters were segregated into different layers. Each layer had its own set of EC2 nodes and Elastic Block Store (EBS) volumes, sized according to the usage of its Jenkins master
  • All these Jenkins masters sat behind a common Application Load Balancer (ALB) with path-based routing enabled
  • Our OpsWorks-hosted Jenkins architecture:

2. Set up Amazon EKS cluster

  • Create a cluster in Amazon EKS
  • Map an Amazon Elastic File System (EFS) file system as the base storage for the cluster (a minimal setup sketch follows this list)
  • In EKS, the Jenkins masters are hosted as Deployments, with a separate namespace per team, and all masters share the resources of the EKS cluster
  • We use Argo CD to deploy all the different Jenkins masters, whose YAML files are stored in a Git repo
  • We have a dedicated Argo CD app through which we can update the resources or any configuration changes for a specific Jenkins master
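
A minimal sketch of the cluster and storage setup, assuming eksctl and the EFS CSI driver add-on are used; the cluster name, region, node counts, and file system ID below are placeholders:

```
# Create the EKS cluster (name, region, and sizes are placeholders).
eksctl create cluster --name jenkins-eks --region us-east-1 \
  --nodegroup-name jenkins-nodes --nodes 3

# Install the EFS CSI driver as a managed EKS add-on.
aws eks create-addon --cluster-name jenkins-eks --addon-name aws-efs-csi-driver

# Define a StorageClass backed by the EFS file system so Jenkins
# masters can claim EFS-backed volumes.
cat <<'EOF' | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap             # provision an EFS access point per volume
  fileSystemId: fs-0123456789abcdef0   # placeholder file system ID
  directoryPerms: "700"
EOF
```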

EKS Jenkins architecture: 

3. Gather the public Jenkins image from Docker Hub

  • For each product, pull the Docker image from Docker Hub that matches the Jenkins version currently in use
  • Tag and push the Jenkins Docker image to Amazon Elastic Container Registry (ECR). This avoids Docker Hub rate limiting on image pulls and lets us pull the images within the private network of our AWS account (see the commands after this list)
  • Once an image is pushed to ECR, make use of that image in the kustomization file of the respective Jenkins master
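
The commands below sketch this flow; the Jenkins version tag, AWS account ID, and region are placeholders:

```
# Pull the public image matching the Jenkins version currently in use.
docker pull jenkins/jenkins:2.401.3-lts

# Authenticate Docker against the private ECR registry.
aws ecr get-login-password --region us-east-1 \
  | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com

# Tag and push the image to ECR, then reference the ECR image
# in the kustomization file of the respective Jenkins master.
docker tag jenkins/jenkins:2.401.3-lts \
  123456789012.dkr.ecr.us-east-1.amazonaws.com/jenkins:2.401.3-lts
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/jenkins:2.401.3-lts
```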

4. Migrate Jenkins data

  • Export Jenkins job configurations, settings, and data from OpsWorks
  • Migrating the Jenkins data is a bit time-consuming. A couple of teams have Jenkins masters larger than 300 GB
  • There are different access points for the different Jenkins masters. Mount each one to the corresponding Jenkins master node in OpsWorks and then initiate the data transfer (a transfer sketch follows this list)

Note: Amazon EFS access points are application-specific entry points into an EFS file system that make it easier to manage application access

  • Once the data transfer is finished, remove conflicting files such as jenkins.fingerprints.GlobalFingerprintConfiguration.xml and the fingerprints folder. If those files are not removed, they tend to cause Jenkins master startup failures
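
A sketch of the transfer from an OpsWorks node, assuming amazon-efs-utils is installed; the file system ID, access point ID, and Jenkins home path are placeholders:

```
# Mount the EFS access point on the OpsWorks node.
sudo mount -t efs -o tls,accesspoint=fsap-0123456789abcdef0 \
  fs-0123456789abcdef0:/ /mnt/jenkins-efs

# Copy the Jenkins home in the background; nohup keeps the transfer
# alive if the SSH session drops (see the challenges section below).
nohup rsync -aH --info=progress2 /var/lib/jenkins/ /mnt/jenkins-efs/ \
  > /tmp/jenkins-rsync.log 2>&1 &

# After the transfer, remove the fingerprint data that can cause
# the Jenkins master to fail on startup.
rm -rf /mnt/jenkins-efs/fingerprints \
  /mnt/jenkins-efs/jenkins.fingerprints.GlobalFingerprintConfiguration.xml
```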

5. Update DNS and networking

  • Update DNS records to point to the EKS Jenkins ALB
  • Use an Ingress to set up the networking policies and security groups for the Jenkins masters
  • Use ExternalDNS to enable internal communication between the Jenkins masters and slaves (a sketch combining both follows this list)
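
A minimal sketch of an Ingress handled by the AWS Load Balancer Controller, with an ExternalDNS annotation for the hostname; the namespace, hostnames, service name, and port are placeholders:

```
cat <<'EOF' | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: jenkins
  namespace: team-a                                # placeholder team namespace
  annotations:
    alb.ingress.kubernetes.io/scheme: internal
    alb.ingress.kubernetes.io/target-type: ip
    external-dns.alpha.kubernetes.io/hostname: jenkins-team-a.example.com
spec:
  ingressClassName: alb
  rules:
    - host: jenkins-team-a.example.com             # placeholder hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: jenkins                      # the Jenkins master Service
                port:
                  number: 8080
EOF
```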

6. Configure Jenkins on EKS

  • Deploy Jenkins masters on the EKS cluster using the Argo CD deployment tool
  • Argo CD takes care of configuring the necessary Kubernetes resources such as ConfigMaps, Secrets, pods, Services, etc. (a minimal Application manifest follows this list)
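
A minimal sketch of the Argo CD Application for one Jenkins master; the repo URL, path, and namespaces are placeholders:

```
cat <<'EOF' | kubectl apply -f -
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: jenkins-team-a                 # one Argo CD app per Jenkins master
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example-org/jenkins-masters.git  # placeholder repo
    targetRevision: main
    path: masters/team-a               # directory holding this master's YAML
  destination:
    server: https://kubernetes.default.svc
    namespace: team-a                  # placeholder team namespace
  syncPolicy:
    automated:
      prune: true                      # keep the cluster in sync with Git
      selfHeal: true
EOF
```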

7. Test and validate

  • Once the Jenkins master is ready, log in using one of your sign-in methods (we are using Google sign-in)
  • Conduct comprehensive testing to ensure that Jenkins jobs execute as expected
  • Verify integrations, plugins, and dependencies in the EKS environment (a few quick checks follow this list)
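
A few illustrative checks before handing a master back to its team; the namespace and resource names are placeholders:

```
# Confirm the master pod is running and inspect its startup logs.
kubectl -n team-a get pods
kubectl -n team-a logs deployment/jenkins --tail=50

# Confirm the Ingress picked up the ALB address.
kubectl -n team-a get ingress jenkins
```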

Technical aspects and best practices

  1. Plugin compatibility: Ensure that Jenkins plugins used in OpsWorks are compatible with the EKS environment. Update or replace plugins as needed
  2. Security and access controls: Review and update security policies, IAM roles, and access controls to align with EKS best practices
  3. Volume management and safe data transfer: We opted to use EFS volumes for storage instead of EBS because of advantages such as elastic scalability, no pre-provisioning, cost efficiency for shared workloads, and regional, cross-AZ availability. For data transfer, mount the EFS access point on the OpsWorks node and initiate the transfer so that the process stays private and secure
  4. Version management: Version management and configuration changes are easier with EKS than with OpsWorks
  5. Effective utilization of resources: EKS allows more granular control over resource allocation. We can define resource requests and limits for Jenkins pods to optimize resource usage (see the example after this list)
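
As an illustration of the last point, requests and limits for a Jenkins master can be adjusted per deployment; the namespace and values below are placeholders:

```
# Set CPU/memory requests and limits for one Jenkins master.
kubectl -n team-a set resources deployment/jenkins \
  --requests=cpu=2,memory=8Gi --limits=cpu=4,memory=16Gi
```

In a GitOps setup like ours, these values would normally live in the master's manifests in Git so Argo CD can reconcile them; the command above is just a quick way to experiment.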

Challenges faced during migration

  1. Data transfer: Data transfer is one of the most sensitive parts of the migration when dealing with high data volumes, such as 250 GB. We initiated the transfer/copy as a background process but still observed it being killed intermittently. To avoid this, we used the nohup command, as shown in the data transfer sketch in step 4
  2. Downtime: Since we are not following the high-availability clusters model in our Jenkins environment, product teams faced certain downtime during the migration
  3. Slave connectivity issues: In OpsWorks, we enabled Java Network Launch Protocol (JNLP) connectivity for slaves using the node IP followed by Jenkins’ port 50000. Post-migration, we started observing connectivity issues since we can’t control the IP assigned to a pod, so we used external DNS to enable slave connectivity (a sketch of such a Service follows this list)
  4. Jenkins service abruptly restarting: Post-migration, we observed the Jenkins service abruptly restarting and affecting availability. While debugging, we found the issue was stale fingerprint data carried over from the old setup. After removing jenkins.fingerprints.GlobalFingerprintConfiguration.xml and the fingerprints folder, the service ran without any issues
  5. Performance issues: Post-migration, we observed a couple of Jenkins masters taking longer to load, and job execution times increased sharply compared to the OpsWorks environment. During the analysis, we found that the EKS node instance types came from smaller families with less compute capacity. To overcome this, we created separate auto-scaling group (ASG) nodes with higher-end machines and moved the affected Jenkins masters to those nodes, which improved performance
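
A sketch of the Service used for JNLP slave connectivity, publishing a stable hostname through ExternalDNS so agents connect by name rather than by node IP; the hostname, namespace, and labels are placeholders:

```
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: jenkins-jnlp
  namespace: team-a                    # placeholder team namespace
  annotations:
    # ExternalDNS publishes this stable name for agents to connect to.
    external-dns.alpha.kubernetes.io/hostname: jenkins-jnlp.team-a.internal.example.com
spec:
  selector:
    app: jenkins                       # must match the Jenkins master pod labels
  ports:
    - name: jnlp
      port: 50000                      # the JNLP port agents dial in to
      targetPort: 50000
EOF
```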

Conclusion

Migrating your Jenkins master from AWS OpsWorks to Amazon EKS is a strategic move toward a more scalable and containerized infrastructure. By following the outlined steps and best practices, you can seamlessly transition your Jenkins environment and leverage the benefits offered by Kubernetes.