Provisioning a Production-Ready Multi-Master Kubernetes Cluster with Terraform, Ansible, and Dynamic Inventory

Building a production-ready multi-master Kubernetes cluster is essential for deploying and managing mission-critical applications at scale. In this article, I describe an approach to provisioning such a cluster using Terraform for infrastructure provisioning, Ansible for configuration management, and an Ansible dynamic inventory script to tie the two together, with example scripts illustrating each step of the process.

Infrastructure Provisioning with Terraform

In this section, we will leverage Terraform to provision the robust and scalable infrastructure required for our production-ready Kubernetes cluster. We'll define the necessary resources, such as virtual machines, networks, and load balancers, using Terraform's declarative language. The script below shows the key parts of the Terraform configuration for provisioning production-grade infrastructure:

// skipping some basic Terraform configuration (provider, security group, and the default VPC/subnet data sources referenced below)

variable "master_instance_count" {
  type    = number
  default = 3
}

variable "worker_instance_count" {
  type    = number
  default = 5
}

resource "aws_launch_template" "kcluster_masters" {
  name = "kcluster_masters"
  image_id               = "ami-0c9354388bb36c088"
  instance_type          = "t2.medium"
  key_name               = "cks"
  vpc_security_group_ids = ["${aws_security_group.kcluster.id}"]
  user_data              = base64encode(<<-EOT
    #!/bin/bash
    cp /home/ubuntu/.ssh/authorized_keys /root/.ssh/authorized_keys
    systemctl restart sshd
    EOT
  )
  tag_specifications {
    // tags
  }
}

resource "aws_autoscaling_group" "kcluster_masters" {
  name = "kcluster_master"
  desired_capacity     = var.master_instance_count
  min_size             = var.master_instance_count
  max_size             = var.master_instance_count
  launch_template {
    id      = aws_launch_template.kcluster_masters.id
    version = "$Latest"
  }
  vpc_zone_identifier = data.aws_subnets.default.ids
  target_group_arns = [aws_lb_target_group.kcluster_masters.arn]
}

resource "aws_launch_template" "kcluster_workers" {
// similar to masters
}

resource "aws_autoscaling_group" "kcluster_workers" {
  // similar to masters
}

data "aws_instances" "ec2_instances" {
  depends_on = [ aws_autoscaling_group.kcluster_workers, aws_autoscaling_group.kcluster_masters ]
  filter {
    name   = "instance-state-name"
    values = ["running"]
  }
  filter {
    name   = "tag:Purpose"
    values = ["kcluster"]
  }
}

resource "aws_lb" "kcluster_masters" {
  name               = "masters-balancer"
  internal           = false
  load_balancer_type = "network"
  subnets            = data.aws_subnets.default.ids
  tags = {
    // tags
  }
}

resource "aws_lb_listener" "masters" {
  load_balancer_arn = aws_lb.kcluster_masters.arn
  port              = 6443
  protocol          = "TCP"
  default_action {
    target_group_arn = aws_lb_target_group.kcluster_masters.arn
    type  = "forward"
  }
  tags = {
    // tags
  }
}

resource "aws_lb_target_group" "kcluster_masters" {
  name     = "kcluster-masters"
  port     = 6443
  protocol = "TCP"
  vpc_id   = data.aws_vpc.default.id
  health_check {
    //health checks
  }
  tags = {
    // tags
  }
}

Configuration Management with Ansible

In this section, I will use Ansible to perform robust configuration management tasks on our provisioned infrastructure. Ansible provides a powerful and flexible way to manage the state of our production-ready cluster nodes. I use Ansible playbooks to install necessary software packages, configure networking, and ensure consistent settings across all nodes. The following example demonstrates an Ansible playbook snippet for configuring Kubernetes on master nodes:

---
- name: Common tasks for kcluster
  hosts: all
  gather_facts: False
  tasks:
    - name: Append DNS records to hosts file
      lineinfile:
        path: /etc/hosts
        line: "{{ hostvars[item]['private_ip'] }}  {{ hostvars[item]['inventory_hostname'] }}"
      loop: "{{ groups['all'] }}"

    # some obvious instructions are skipped

    - name: Install kube binaries
      apt:
        name: "{{ item.name }}"
        state: present
        allow_downgrades: yes
      loop:
        - { name: "kubelet=1.24.0-00" }
        - { name: "kubeadm=1.24.0-00" }
        - { name: "kubectl=1.24.0-00" }

- name: Masters init
  hosts: masters
  tasks:
  - name: first master kubeadm init
    shell: kubeadm init --control-plane-endpoint "{{ hostvars[inventory_hostname]['elb'] }}:6443" --upload-certs --pod-network-cidr 192.168.0.0/16 --kubernetes-version 1.24.0
    when: inventory_hostname == play_hosts[0]
    register: kubeadm_init_output

  - name: first master apply calico
    shell: kubectl apply -f https://docs.projectcalico.org/archive/v3.20/manifests/calico.yaml --kubeconfig /etc/kubernetes/admin.conf
    when: inventory_hostname == play_hosts[0]

  - name: rest of masters kubeadm init
    # simplified: the registered init output is assumed to hold the control-plane join
    # command (kubeadm join ... --control-plane --certificate-key ...); in practice it has
    # to be extracted from the full output or regenerated on the first master
    shell: "{{ hostvars[play_hosts[0]]['kubeadm_init_output'].stdout }}"
    when: inventory_hostname != play_hosts[0]

  - name: get config
    ansible.builtin.fetch:
      src: /etc/kubernetes/admin.conf
      dest: ./
      flat: yes
    when: inventory_hostname == play_hosts[0]

- name: Workers init
  hosts: workers
  tasks:
  - name: workers join kcluster
    shell: "{{ hostvars['host_for_var']['kubeadm_init_worker_command'] }}"

Ansible Dynamic Inventory for Kubernetes

To dynamically manage the inventory of our production-ready Kubernetes cluster, I utilize an Ansible dynamic inventory script that retrieves information about the cluster’s nodes directly from the infrastructure provider or Kubernetes API. The dynamic inventory script ensures that Ansible always has an up-to-date view of the cluster’s current state. Below is an example of an Ansible dynamic inventory script for Kubernetes:

#!/usr/bin/env python
import json
import boto3

def get_ec2_by_tag(tag_key, tag_value):
    ec2_client = boto3.client('ec2')

    response = ec2_client.describe_instances(
        Filters=[
            { 'Name': 'instance-state-name', 'Values': ['running'] },
            { 'Name': f'tag:{tag_key}', 'Values': [tag_value] }
        ]
    )

    instances = []
    reservations = response['Reservations']
    for reservation in reservations:
        instances.extend(reservation['Instances'])

    return instances

def get_load_balancer_public_dns(load_balancer_name):
    elbv2_client = boto3.client('elbv2')
    # look up the load balancer by the name it was given in the Terraform configuration
    response = elbv2_client.describe_load_balancers(Names=[load_balancer_name])

    if response.get('LoadBalancers'):
        return response['LoadBalancers'][0].get('DNSName')

    return None

# inventory skeleton expected by Ansible: per-group hosts/vars plus _meta.hostvars
inventory = {
    'masters': {'hosts': [], 'vars': {}},
    'workers': {'hosts': [], 'vars': {}},
    '_meta': {'hostvars': {}}
}

elb_dns = get_load_balancer_public_dns('masters-balancer')  # name of the aws_lb resource above

for instance in get_ec2_by_tag('Purpose', 'kcluster'):
    publicIp = instance.get('PublicIpAddress', '')
    privateIp = instance['PrivateIpAddress']
    privateDns = instance['PrivateDnsName']
    # the 'Role' tag key is an assumption; use whatever tag distinguishes masters from workers
    tags = {t['Key']: t['Value'] for t in instance.get('Tags', [])}
    hostRole = tags.get('Role', '')
    hostName = privateDns[:privateDns.index(".")]

    if hostRole == 'kcluster_master':
        inventory['masters']['hosts'].append(hostName)
        inventory['masters']['vars']['elb'] = elb_dns
    elif hostRole == 'kcluster_worker':
        inventory['workers']['hosts'].append(hostName)

    inventory['_meta']['hostvars'].update({hostName: {'ansible_host': publicIp, 'private_ip': privateIp}})

print(json.dumps(inventory))
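
Ansible calls a dynamic inventory script with --list (and with --host <hostname> for scripts that do not return _meta), so it is good practice to handle those arguments explicitly. The following is a minimal, self-contained sketch of that argument handling; build_inventory() here is only a stand-in for the EC2 logic shown above:

#!/usr/bin/env python
# sketch of the --list / --host protocol Ansible uses when invoking inventory scripts
import argparse
import json

def build_inventory():
    # placeholder: in the real script this dictionary is assembled from the EC2 API as above
    return {'masters': {'hosts': [], 'vars': {}},
            'workers': {'hosts': [], 'vars': {}},
            '_meta': {'hostvars': {}}}

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--list', action='store_true')
    parser.add_argument('--host')
    args = parser.parse_args()

    if args.host:
        # per-host variables are already served under _meta, so --host can return an empty dict
        print(json.dumps({}))
    else:
        print(json.dumps(build_inventory()))

if __name__ == '__main__':
    main()

With the script saved as, say, inventory.py and marked executable, the playbook runs against the live infrastructure with ansible-playbook -i inventory.py cluster.yml (the file names here are illustrative).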

Provisioning a production-ready multi-master Kubernetes cluster requires careful orchestration of infrastructure provisioning, configuration management, and inventory management. By combining Terraform for the infrastructure, Ansible for configuration, and a dynamic inventory script to keep the two in sync, we end up with a robust, scalable, highly available cluster.

To access the full code and scripts for provisioning a production-ready multi-master Kubernetes cluster, you can visit the following GitHub repository: my github

The code and scripts provided in the repository serve as a foundation for your own deployments, enabling you to customize and adapt them to your specific production requirements. With these resources at your disposal, you’ll be well-prepared to provision and manage a highly available and production-ready Kubernetes cluster.

Please note that the GitHub repository contains the complete set of scripts, including Terraform configurations, Ansible playbooks, and the dynamic inventory script for seamless integration.

Querying Nodes with kubectl

After provisioning your multi-master Kubernetes cluster, you can use kubectl – the command-line tool for interacting with Kubernetes clusters – to perform various operations. One common task is querying the nodes in the cluster to verify their status and gather information. Here’s an example of how you can use kubectl to retrieve the list of nodes:

  1. Ensure you have kubectl installed and properly configured to connect to your Kubernetes cluster.
  2. Open a terminal or command prompt and execute the following command to get a list of nodes:
kubectl --kubeconfig admin.conf get nodes
NAME               STATUS   ROLES           AGE     VERSION
ip-172-31-18-209   Ready    control-plane   5m22s   v1.24.0
ip-172-31-2-97     Ready    control-plane   4m51s   v1.24.0
ip-172-31-25-81    Ready    <none>          4m21s   v1.24.0
ip-172-31-29-195   Ready    <none>          4m21s   v1.24.0
ip-172-31-40-170   Ready    <none>          4m21s   v1.24.0
ip-172-31-43-124   Ready    control-plane   6m4s    v1.24.0
ip-172-31-47-210   Ready    <none>          4m21s   v1.24.0
ip-172-31-6-31     Ready    <none>          4m21s   v1.24.0

By using the kubectl get nodes command, you can quickly verify the health and status of the nodes in your provisioned Kubernetes cluster. This information is valuable for monitoring, troubleshooting, and ensuring the proper functioning of your cluster.
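
If you prefer to query the cluster programmatically rather than through kubectl, the same check can be done with the official Kubernetes Python client. This is a minimal sketch that assumes the kubernetes package is installed and that admin.conf is the kubeconfig fetched by the playbook:

#!/usr/bin/env python
# list_nodes.py -- rough equivalent of "kubectl --kubeconfig admin.conf get nodes"
from kubernetes import client, config

config.load_kube_config(config_file="admin.conf")
v1 = client.CoreV1Api()

for node in v1.list_node().items:
    # a node is Ready when its "Ready" condition reports status "True"
    ready = any(c.type == "Ready" and c.status == "True"
                for c in node.status.conditions)
    roles = [label.rsplit("/", 1)[-1] for label in node.metadata.labels
             if label.startswith("node-role.kubernetes.io/")]
    print(node.metadata.name,
          "Ready" if ready else "NotReady",
          ",".join(roles) or "<none>",
          node.status.node_info.kubelet_version)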

Site Reliability Engineering notes

DevOps exists to mitigate the tension between the performance criteria of Devs (release faster to deliver functionality) and Ops (keep the system in its current stable state, since any change can cause an outage).

Practices:

  1. Engineers spend at least 50% of their time on Dev work; the rest goes to Ops for everything else: monitoring, maintenance, automation, on-call.
  2. If there are no Ops tasks, engineers do only Dev work; minimizing Ops toil should be treated as a primary achievement.
  3. Monitoring should produce alerts (require immediate human attention) and tickets (require postponed investigation).
  4. Cloud: distributed computation (Borg/Kubernetes), distributed storage.

Handling risks:

  • Services should be reliable enough, but no more reliable than necessary (if a service's client is itself only 99% reliable, there is no need to be 99.9% reliable; 99% is enough).
  • Availability is better measured not by relative system uptime (which is not very informative for distributed systems), but by the percentage of successful requests.

Service level:

  • Request types can be differentiated by importance; more important requests get a higher SLA.
  • Increasing availability, or making any other improvement to the service, should be economically justified.
  • Error budget: for a given SLA, budget = 100% - SLA(%). The budget can be spent on experimentation, extra releases, increased delivery velocity, or whatever else is useful; there is no need to keep it minimized, it simply must not be exceeded. The budget is calculated by an external monitoring system. Once the budget is used up, no new releases ship until a new budget is available. The budget helps decide whether a new feature can be developed or tech debt should be addressed first (see the sketch after this list).
  • SLI: service level indicators, measurable parameters such as response time; prefer indicators that are valuable, visible, and important to the service's users, business customers, or your own business KPIs. SLO: objectives, such as keeping the response-time SLI below 100 ms in 99% of cases. SLA: an SLO plus the consequences of violating it.
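
To make the error-budget arithmetic concrete, here is a minimal Python sketch; the 99.9% SLO and the ten million monthly requests are made-up numbers for illustration only:

#!/usr/bin/env python
# error_budget.py -- illustrative only; plug in your own SLO and traffic volume
slo = 99.9                     # availability objective, percent (assumed)
monthly_requests = 10_000_000  # assumed request volume for the month

error_budget_pct = 100.0 - slo                             # 0.1% of requests may fail
allowed_failures = monthly_requests * error_budget_pct / 100.0

# the same budget expressed as downtime, for a service that is either fully up or fully down
minutes_per_month = 30 * 24 * 60
allowed_downtime_min = minutes_per_month * error_budget_pct / 100.0

print(f"error budget: {error_budget_pct:.2f}% of requests "
      f"(~{allowed_failures:.0f} failed requests or ~{allowed_downtime_min:.0f} minutes of downtime per month)")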

Monitoring and alerting (like Prometheus):

  • if an alert can be ignored under certain conditions, it should not fire at all under those conditions
  • if the action an alert calls for can be automated, it should be automated and the alert removed
  • alerts are only justified when a mandatory, intelligent human response is required

Automation:

  • automate all possible tasks
  • automation actions should be, as much as possible, idempotent (see the sketch below)
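
As a toy illustration of idempotency, here is a minimal Python sketch in the spirit of Ansible's lineinfile module: running it any number of times leaves the file in the same state (the path and the line are made-up examples):

#!/usr/bin/env python
# idempotent_line.py -- ensure a line exists in a file; safe to run repeatedly
from pathlib import Path

def ensure_line(path: str, line: str) -> bool:
    """Append line to the file only if it is not already present.
    Returns True if the file was changed, False if it was already compliant."""
    p = Path(path)
    existing = p.read_text().splitlines() if p.exists() else []
    if line in existing:
        return False  # already in the desired state, do nothing
    with p.open("a") as f:
        f.write(line + "\n")
    return True

if __name__ == "__main__":
    # hypothetical example: the first call changes the file, the second one does not
    print(ensure_line("/tmp/hosts.example", "10.0.0.10  master-1"))
    print(ensure_line("/tmp/hosts.example", "10.0.0.10  master-1"))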

On-call:

  • effective rotations, less than 25% of working time
  • playbooks
  • postmortem culture, no blame, share postmortem information in all possible ways

Addressing cascading failures:

  • increase resources
  • temporarily stop health checking
  • restart servers
  • traffic throttling
  • switch to a degraded mode