How to Set Up a K3S Cluster in 2025

· 15 mins ·
DevOps Homelab Automation K3S Kubernetes Ansible Self-Hosting
Robert Melcher

It’s already been a year since my first Kubernetes journey. My initial clusters—where I started learning and understanding more about Kubernetes—are now all taken down. This time, I want to build a fully functional, highly available (HA) cluster.

Over the past weeks, I’ve done more research in Kubernetes communities, as well as on subreddits like r/k3s, r/homelab, and r/selfhosted. I discovered that one of the best ways to deploy a cluster these days is by following guides and content from Techno Tim, so I decided to write this blog and share my own approach.

Tip: If you’re new to K3s, subreddits like r/k3s and r/homelab can be great resources to learn from fellow enthusiasts.

What I Want to Achieve
#

  • A fully organized HA cluster on my hardware, so if any of my machines go down, the cluster remains functional. Specifically:
    • 1 x DELL R720, hosting k3s-master-1 and k3s-worker-1
    • 1 x DELL OptiPlex Micro 3050, hosting k3s-master-2 and k3s-worker-2
    • 1 x DELL OptiPlex Micro 3050, hosting k3s-master-3 and k3s-worker-3

How I Will Deploy
#

I will create six virtual machines (VMs) on a Proxmox cluster:

  • 3 x Ubuntu 24.04 Master Nodes
  • 3 x Ubuntu 24.04 Worker Nodes

The goal is to run K3s on these VMs to set up a solid Kubernetes environment with redundancy.


Let’s Begin!
#

In the upcoming sections, I’ll detail each step, from setting up Proxmox VMs to installing and configuring K3s, managing networking, storage, and beyond.

Chapter 1: Preparing DNS and IP Addresses
#

When setting up a Kubernetes cluster, DNS and IP management are crucial. Below is how I handle DHCP, static IP assignments, and DNS entries in my homelab environment.


DHCP Configuration
#

There are two possible scenarios for assigning IP addresses to your VMs:

  1. Use IP addresses outside of your DHCP range
    This method is often preferred, as your machines will keep their manually configured network settings even if your DHCP server goes down.

  2. DHCP Static Mappings
    You can map MAC -> IP in your network services to allocate IP addresses to VMs based on their MAC addresses.

Tip: If you choose the second scenario, make sure you document your static leases carefully. Proper documentation avoids conflicts and confusion later.


My Approach
#

I chose the first scenario, where I use IPs outside the DHCP range. This ensures my network remains stable if the DHCP service is unavailable.

  • IP Range: 10.57.57.30/24 to 10.57.57.35/24 for my VMs

DNS Setup
#

I also set up a DNS entry in my Unbound service on pfSense to easily manage and access my machines. For instance, you can create an A record or similar DNS record type pointing to your VM’s IP address. Below is a simple example:

(Screenshot: host override entries in the Unbound DNS Resolver on pfSense)
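If you manage Unbound as text rather than through the GUI, the same thing boils down to local-data entries. A rough sketch, using hostnames and IPs from my own setup as examples:

# Unbound host overrides for the cluster VMs (goes in the server: section / custom options)
local-data: "k3s-master-1.merox.dev. IN A 10.57.57.30"
local-data: "k3s-worker-1.merox.dev. IN A 10.57.57.33"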


Chapter 2: Automated VM Deployment on Proxmox with Cloud-Init
#

To streamline the next steps, I’ve created a bash script that automates crucial parts of the process, including:

  • Creating a Cloud-Init template
  • Deploying multiple VMs with static or DHCP-based IP addresses
  • Destroying the VMs if needed

If you prefer an even more automated approach using tools like Packer or Terraform, I suggest checking out this related post: Homelab as Code and adapting it to your specific scenario. However, for this blog, I’ll demonstrate a simpler, more direct approach using the script below.

Warning
This script can create or destroy VMs. Use it carefully and always keep backups of critical data.

Prerequisites
#

  • Make sure you have Proxmox up and running.
  • You’ll need to place your SSH public key (e.g., /root/.ssh/id_rsa.pub) on the Proxmox server before running the script.
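If there’s no key pair on the Proxmox host yet, generating one at the path the script expects is a one-liner:

# Create an RSA key pair at /root/.ssh/id_rsa (public key lands at /root/.ssh/id_rsa.pub)
ssh-keygen -t rsa -b 4096 -f /root/.ssh/id_rsa -N ""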

Script Overview
#

Option 1: Create Cloud-Init Template

  • Downloads the Ubuntu Cloud image (currently Ubuntu 24.04, code-named “noble”)
  • Creates a VM based on the Cloud-Init image
  • Converts it into a template

Option 2: Deploy VMs

  • Clones the Cloud-Init template to create the desired number of VMs
  • Configures IP addressing, gateway, DNS, search domain, SSH key, etc.
  • Adjusts CPU, RAM, and disk size to fit your needs

Option 3: Destroy VMs

  • Stops and removes VMs created by this script

During the VM creation process, you’ll be prompted to enter the VM name for each instance (e.g., k3s-master-1, k3s-master-2, etc.).

Tip
To fully automate naming, you could edit the script to increment VM names automatically. However, prompting ensures you can organize VMs with custom naming.

The Bash Script
#

Below is the full script. Feel free to customize it based on your storage, networking, and naming preferences.

#!/bin/bash

# Function to get user input with a default value
get_input() {
    local prompt=$1
    local default=$2
    local input
    read -p "$prompt [$default]: " input
    echo "${input:-$default}"
}

# Ask the user whether they want to create a template, deploy or destroy VMs
echo "Select an option:"
echo "1) Create Cloud-Init Template"
echo "2) Deploy VMs"
echo "3) Destroy VMs"
read -p "Enter your choice (1, 2, or 3): " ACTION

if [[ "$ACTION" != "1" && "$ACTION" != "2" && "$ACTION" != "3" ]]; then
    echo "❌ Invalid choice. Please run the script again and select 1, 2, or 3."
    exit 1
fi

# === OPTION 1: CREATE CLOUD-INIT TEMPLATE ===
if [[ "$ACTION" == "1" ]]; then
    TEMPLATE_ID=$(get_input "Enter the template VM ID" "300")
    STORAGE=$(get_input "Enter the storage name" "local")
    TEMPLATE_NAME=$(get_input "Enter the template name" "ubuntu-cloud")
    IMG_URL="https://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-amd64.img"
    IMG_FILE="/root/noble-server-cloudimg-amd64.img"

    echo "📥 Downloading Ubuntu Cloud image..."
    cd /root
    wget -O $IMG_FILE $IMG_URL || { echo "❌ Failed to download the image"; exit 1; }

    echo "🖥️ Creating VM $TEMPLATE_ID..."
    qm create $TEMPLATE_ID --memory 2048 --cores 2 --name $TEMPLATE_NAME --net0 virtio,bridge=vmbr0

    echo "💾 Importing disk to storage ($STORAGE)..."
    qm disk import $TEMPLATE_ID $IMG_FILE $STORAGE || { echo "❌ Failed to import disk"; exit 1; }

    echo "🔗 Attaching disk..."
    qm set $TEMPLATE_ID --scsihw virtio-scsi-pci --scsi0 $STORAGE:vm-$TEMPLATE_ID-disk-0

    echo "☁️ Adding Cloud-Init drive..."
    qm set $TEMPLATE_ID --ide2 $STORAGE:cloudinit

    echo "🛠️ Configuring boot settings..."
    qm set $TEMPLATE_ID --boot c --bootdisk scsi0

    echo "🖧 Adding serial console..."
    qm set $TEMPLATE_ID --serial0 socket --vga serial0

    echo "📌 Converting VM to template..."
    qm template $TEMPLATE_ID

    echo "✅ Cloud-Init Template created successfully!"
    exit 0
fi

# === OPTION 2: DEPLOY VMs ===
if [[ "$ACTION" == "2" ]]; then
    TEMPLATE_ID=$(get_input "Enter the template VM ID" "300")
    START_ID=$(get_input "Enter the starting VM ID" "301")
    NUM_VMS=$(get_input "Enter the number of VMs to deploy" "6")
    STORAGE=$(get_input "Enter the storage name" "dataz2")
    IP_PREFIX=$(get_input "Enter the IP prefix (e.g., 10.57.57.)" "10.57.57.")
    IP_START=$(get_input "Enter the starting IP last octet" "30")
    GATEWAY=$(get_input "Enter the gateway IP" "10.57.57.1")
    DNS_SERVERS=$(get_input "Enter the DNS servers (space-separated)" "8.8.8.8 1.1.1.1")
    DOMAIN_SEARCH=$(get_input "Enter the search domain" "merox.dev")
    DISK_SIZE=$(get_input "Enter the disk size (e.g., 100G)" "100G")
    RAM_SIZE=$(get_input "Enter the RAM size in MB" "16384")
    CPU_CORES=$(get_input "Enter the number of CPU cores" "4")
    CPU_SOCKETS=$(get_input "Enter the number of CPU sockets" "4")
    SSH_KEY_PATH=$(get_input "Enter the SSH public key file path" "/root/.ssh/id_rsa.pub")

    if [[ ! -f "$SSH_KEY_PATH" ]]; then
        echo "❌ Error: SSH key file not found at $SSH_KEY_PATH"
        exit 1
    fi

    for i in $(seq 0 $((NUM_VMS - 1))); do
        VM_ID=$((START_ID + i))
        IP="$IP_PREFIX$((IP_START + i))/24"
        VM_NAME=$(get_input "Enter the name for VM $VM_ID" "ubuntu-vm-$((i+1))")

        echo "🔹 Creating VM: $VM_ID (Name: $VM_NAME, IP: $IP)"

        if qm status $VM_ID &>/dev/null; then
            echo "⚠️ VM $VM_ID already exists, removing..."
            qm stop $VM_ID &>/dev/null
            qm destroy $VM_ID
        fi

        if ! qm clone $TEMPLATE_ID $VM_ID --full --name $VM_NAME --storage $STORAGE; then
            echo "❌ Failed to clone VM $VM_ID, skipping..."
            continue
        fi

        qm set $VM_ID --memory $RAM_SIZE \
                      --cores $CPU_CORES \
                      --sockets $CPU_SOCKETS \
                      --cpu host \
                      --serial0 socket \
                      --vga serial0 \
                      --ipconfig0 ip=$IP,gw=$GATEWAY \
                      --nameserver "$DNS_SERVERS" \
                      --searchdomain "$DOMAIN_SEARCH" \
                      --sshkeys "$SSH_KEY_PATH"

        qm set $VM_ID --delete ide2 || true
        qm set $VM_ID --ide2 $STORAGE:cloudinit,media=cdrom
        qm cloudinit update $VM_ID

        echo "🔄 Resizing disk to $DISK_SIZE..."
        qm resize $VM_ID scsi0 +$DISK_SIZE

        qm start $VM_ID
        echo "✅ VM $VM_ID ($VM_NAME) created and started!"
    done
    exit 0
fi

# === OPTION 3: DESTROY VMs ===
if [[ "$ACTION" == "3" ]]; then
    START_ID=$(get_input "Enter the starting VM ID to delete" "301")
    NUM_VMS=$(get_input "Enter the number of VMs to delete" "6")

    echo "⚠️ Destroying VMs from $START_ID to $((START_ID + NUM_VMS - 1))..."
    for i in $(seq 0 $((NUM_VMS - 1))); do
        VM_ID=$((START_ID + i))

        if qm status $VM_ID &>/dev/null; then
            echo "🛑 Stopping and destroying VM $VM_ID..."
            qm stop $VM_ID &>/dev/null
            qm destroy $VM_ID
        else
            echo "ℹ️ VM $VM_ID does not exist. Skipping..."
        fi
    done
    echo "✅ Specified VMs have been destroyed."
    exit 0
fi

Verifying Your Deployment
#

After running the script under Option 2, you should see your new VMs listed in the Proxmox web interface. You can now log in via SSH from the machine that holds the corresponding private key:

ssh ubuntu@k3s-master-1

Note: Adjust the hostname or IP as configured during the script prompts.

Chapter 3: Installing K3s with Ansible
#

This chapter will guide you through setting up K3s using Ansible on your Proxmox-based VMs. Ansible helps automate the process across multiple nodes, making the deployment faster and more reliable.


Prerequisites
#

  1. Ensure Ansible is installed on your management machine (Debian/Ubuntu or macOS):

    • Debian/Ubuntu:
      sudo apt update && sudo apt install -y ansible
      
    • macOS:
      brew install ansible
      
  2. Clone the k3s-ansible repository

    This guide is based on Techno Tim’s k3s-ansible repository; here, we’ll use a forked version:

    git clone https://github.com/mer0x/k3s-ansible
    

Pre-Deployment Configuration
#

  1. Set up the Ansible environment:

    cd k3s-ansible
    cp ansible.example.cfg ansible.cfg
    ansible-galaxy install -r ./collections/requirements.yml
    cp -R inventory/sample inventory/my-cluster

  2. Edit inventory/my-cluster/hosts.ini

    Modify this file to match your cluster’s IP addresses. Example:

     [master]
     10.57.57.30
     10.57.57.31
     10.57.57.32

     [node]
     10.57.57.33
     10.57.57.34
     10.57.57.35

     [k3s_cluster:children]
     master
     node

  3. Edit inventory/my-cluster/group_vars/all.yml

    Some critical fields to modify:

    ansible_user:
    #

    • Default VM user is ubuntu with sudo privileges.

    system_timezone:
    #

    • Set to your local timezone (e.g., Europe/Bucharest).

    Networking (Calico vs. Flannel):
    #

    • Comment out flannel_iface: eth0 and set calico_iface: "eth0" if you want Calico and its richer network policies.
    • Flannel is the simpler alternative if you prefer an easier setup.

    apiserver_endpoint: 10.57.57.100:
    #

    • Ensure this is an unused IP in your local network.
    • It serves as the VIP (Virtual IP) for the k3s control plane.

    k3s_token:
    #

    • Use any alphanumeric string.

    metal_lb_ip_range:
    #

    • I use 10.57.57.80-10.57.57.90. Make sure the range:
      • belongs to your local network (LAN)
      • is not already in use by other network services
      • sits outside your DHCP pool range to avoid conflicts
    • MetalLB uses this range to expose K3s services to your network, similar to how Docker ports are exposed on their host IP. See the all.yml excerpt right after this list.
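Put together, the relevant part of my inventory/my-cluster/group_vars/all.yml ends up looking roughly like the excerpt below. The variable names follow the k3s-ansible sample file, so double-check them against your copy of the repo:

ansible_user: ubuntu
system_timezone: "Europe/Bucharest"

# flannel_iface: eth0            # commented out in favour of Calico
calico_iface: "eth0"

apiserver_endpoint: "10.57.57.100"          # unused LAN IP, becomes the control-plane VIP
k3s_token: "some-long-alphanumeric-secret"  # pick your own value

metal_lb_ip_range: "10.57.57.80-10.57.57.90"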
Before running the deployment command in the next step, ensure SSH key authentication is set up between your management machine and all the deployed VMs.

Deploy the Cluster
#

Run the following command to deploy the cluster:

ansible-playbook ./site.yml -i ./inventory/my-cluster/hosts.ini

Once the playbook execution completes, you can verify the cluster’s status:

# Copy the kubeconfig file from the first master node
scp ubuntu@10.57.57.30:~/.kube/config .

# Move it to the correct location
mkdir -p ~/.kube
mv config ~/.kube/

# Check if the cluster nodes are properly registered
kubectl get nodes

If the setup was successful, kubectl get nodes should display the cluster’s nodes and their statuses.
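With the six VMs from earlier, that output should look roughly like this (names, ages, and the exact K3s version will differ on your setup):

NAME           STATUS   ROLES                       AGE   VERSION
k3s-master-1   Ready    control-plane,etcd,master   10m   v1.29.x+k3s1
k3s-master-2   Ready    control-plane,etcd,master   9m    v1.29.x+k3s1
k3s-master-3   Ready    control-plane,etcd,master   9m    v1.29.x+k3s1
k3s-worker-1   Ready    <none>                      8m    v1.29.x+k3s1
k3s-worker-2   Ready    <none>                      8m    v1.29.x+k3s1
k3s-worker-3   Ready    <none>                      8m    v1.29.x+k3s1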


What’s Next?
#

With K3s successfully deployed, the next steps involve setting up additional tools such as Rancher, Traefik, and Longhorn for cluster management, ingress control, and persistent storage.

Chapter 4: K3S Apps Deployment
#

Deploying Traefik
#

Install Helm Package Manager for Kubernetes
#

curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
chmod 700 get_helm.sh
./get_helm.sh

Create Namespace for Traefik
#

kubectl create namespace traefik

Add Helm Repository and Update
#

helm repo add traefik https://helm.traefik.io/traefik
helm repo update

Clone TechnoTim Launchpad Repository
#

git clone https://github.com/techno-tim/launchpad

Configure values.yaml for Traefik
#

Open the launchpad/kubernetes/traefik-cert-manager/ directory and check values.yaml. Most configurations are already set; you only need to specify the IP for the LoadBalancer service. Choose an unused IP from the MetalLB range you defined earlier.
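For orientation, the part of values.yaml you’re after looks roughly like this in the chart version I used; field names vary between Traefik chart releases, so verify against the file in the repo:

service:
  enabled: true
  type: LoadBalancer
  spec:
    loadBalancerIP: 10.57.57.80   # an unused IP from the MetalLB range above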

Install Traefik Using Helm
#

helm install --namespace=traefik traefik traefik/traefik --values=values.yaml

Verify Deployment
#

kubectl get svc --all-namespaces -o wide

Expected output:

NAMESPACE          NAME                              TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                                    AGE     SELECTOR
calico-system      calico-typha                      ClusterIP      10.43.80.131    <none>        5473/TCP                                   2d20h   k8s-app=calico-typha
traefik            traefik                           LoadBalancer   10.43.185.67    10.57.57.80   80:32195/TCP,443:31598/TCP,443:31598/UDP   53s     app.kubernetes.io/instance=traefik,app.kubernetes.io/name=traefik

Apply Middleware
#

kubectl apply -f default-headers.yaml
kubectl get middleware

Expected output:

NAME              AGE
default-headers   4s
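For reference, default-headers.yaml (part of the launchpad repo) defines a Traefik Middleware that adds common security headers. A trimmed sketch of that kind of middleware looks like this; compare it with the actual file in the repo:

apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: default-headers
  namespace: default
spec:
  headers:
    browserXssFilter: true
    contentTypeNosniff: true
    forceSTSHeader: true
    stsIncludeSubdomains: true
    stsPreload: true
    stsSeconds: 15552000
    customFrameOptionsValue: SAMEORIGIN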

Deploying Traefik Dashboard
#

Install htpasswd
#

sudo apt-get update
sudo apt-get install apache2-utils

Generate a Base64-Encoded Credential
#

htpasswd -nb merox password | openssl base64
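The encoded string ends up in secret-dashboard.yaml as the users field. A rough sketch is below; the secret name is illustrative and has to match what middleware.yaml references, so check the files in the traefik/dashboard folder:

apiVersion: v1
kind: Secret
metadata:
  name: traefik-dashboard-auth     # illustrative name, match your middleware.yaml
  namespace: traefik
type: Opaque
data:
  users: <base64-output-from-the-htpasswd-command-above>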

Configure DNS Resolver
#

Ensure that your DNS server resolves the dashboard hostname to the MetalLB IP you set in values.yaml.

Example entry for pfSense DNS Resolver:

(Screenshot: host override for the dashboard hostname in the pfSense DNS Resolver)

The hostname must match the one used in the dashboard IngressRoute, e.g.:

routes:
  - match: Host(`traefik.k3s.merox.dev`)

Apply Kubernetes Resources
#

Run the following from the traefik/dashboard folder:

kubectl apply -f secret-dashboard.yaml
kubectl get secrets --namespace traefik
kubectl apply -f middleware.yaml
kubectl apply -f ingress.yaml

At this point, you should be able to access the DNS entry you created. However, it will use a self-signed SSL certificate generated by Traefik. In the next steps, we will configure Let’s Encrypt certificates using Cloudflare as the provider.


Deploying Cert-Manager
#

The following steps use the files from the traefik-cert-manager/cert-manager folder.

Add Jetstack Helm Repository
#

helm repo add jetstack https://charts.jetstack.io
helm repo update

Create Namespace for Cert-Manager
#

kubectl create namespace cert-manager

Apply CRDs (Custom Resource Definitions)
#

Note: Ensure you use the latest version of Cert-Manager.

kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.17.0/cert-manager.crds.yaml

Install Cert-Manager Using Helm
#

helm install cert-manager jetstack/cert-manager --namespace cert-manager --values=values.yaml --version v1.17.0

Apply Cloudflare API Secret
#

If you use Cloudflare, make sure you generate the correct API token (an API Token, not the Global API Key).

kubectl apply -f issuers/secret-cf-token.yaml
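For orientation, secret-cf-token.yaml is just a regular Secret carrying that token. A sketch; the names are illustrative and must match what the issuer references:

apiVersion: v1
kind: Secret
metadata:
  name: cloudflare-token-secret
  namespace: cert-manager
type: Opaque
stringData:
  cloudflare-token: <your-cloudflare-api-token>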

Deploy Production Certificates
#

  • Edit the following fields first:

    issuers/letsencrypt-production.yaml

    • email, dnsZones

    certificates/production/your-domain-com.yaml

    • name, secretName, commonName, dnsNames

Then apply both manifests:

kubectl apply -f issuers/letsencrypt-production.yaml
kubectl apply -f certificates/production/your-domain-com.yaml
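For reference, the letsencrypt-production.yaml you just edited defines a ClusterIssuer that solves DNS-01 challenges through Cloudflare. A trimmed sketch, reusing the same illustrative secret name as above:

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-production
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: you@example.com                     # your email
    privateKeySecretRef:
      name: letsencrypt-production
    solvers:
      - dns01:
          cloudflare:
            apiTokenSecretRef:
              name: cloudflare-token-secret
              key: cloudflare-token
        selector:
          dnsZones:
            - "merox.dev"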

Verify Logs and Challenges
#

kubectl logs -n cert-manager -f cert-manager-<your-pod-name>
kubectl get challenges

With these steps completed, your K3s cluster now runs Traefik as an ingress controller, supports HTTPS with Let’s Encrypt, and manages certificates automatically. This setup ensures secure traffic routing and efficient load balancing for your Kubernetes applications.

(Screenshot: Traefik dashboard served over HTTPS)
✨ Nailed it!

Deploying Rancher
#

Add Rancher Helm Repository and Create Namespace
#

helm repo add rancher-stable https://releases.rancher.com/server-charts/stable
kubectl create namespace cattle-system

Since Traefik is already deployed, Rancher will utilize it for ingress. Deploy Rancher with Helm:

helm install rancher rancher-stable/rancher \
  --namespace cattle-system \
  --set hostname=rancher.k3s.merox.dev \
  --set tls=external \
  --set replicas=3

Create Ingress for Rancher
#

Create an ingress.yml file with the following configuration:

apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: rancher
  namespace: cattle-system
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`rancher.k3s.merox.dev`)
      kind: Rule
      services:
        - name: rancher
          port: 443
      middlewares:
        - name: default-headers
  tls:
    secretName: k3s-merox-dev-tls

Apply the ingress configuration:

kubectl apply -f ingress.yml

Now, you should be able to manage your cluster from https://rancher.k3s.merox.dev.

(Screenshot: Rancher cluster management UI)

Deploying Longhorn
#

If you want distributed, replicated block storage shared across the cluster, follow these steps:

Install Required Packages
#

Run this only on the nodes where you plan to run Longhorn:

sudo apt update && sudo apt install -y open-iscsi nfs-common

Enable iSCSI
#

sudo systemctl enable iscsid
sudo systemctl start iscsid

Add Longhorn Label on Nodes
#

A minimum of three nodes is required for high availability. In this setup, we will use the three worker nodes:

kubectl label node k3s-worker-1 storage.longhorn.io/node=true
kubectl label node k3s-worker-2 storage.longhorn.io/node=true
kubectl label node k3s-worker-3 storage.longhorn.io/node=true
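You can confirm the labels were applied with a label column on the node listing:

kubectl get nodes -L storage.longhorn.io/node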

Deploy Longhorn
#

This manifest is modified to schedule Longhorn only on nodes carrying the storage.longhorn.io/node=true label:

kubectl apply -f https://raw.githubusercontent.com/mer0x/merox.docs/refs/heads/master/K3S/cluster-deployment/longhorn.yaml

Verify Deployment
#

kubectl get pods --namespace longhorn-system --watch

Print Confirmation
#

kubectl get nodes
kubectl get svc -n longhorn-system

Exposing Longhorn with Traefik
#

Create Middleware Configuration
#

Create a middleware.yml file:

apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: longhorn-headers
  namespace: longhorn-system
spec:
  headers:
    customRequestHeaders:
      X-Forwarded-Proto: "https"

Setup Ingress
#

Create an ingress.yml file:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: longhorn-ingress
  namespace: longhorn-system
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: websecure
    traefik.ingress.kubernetes.io/router.tls: "true"
    traefik.ingress.kubernetes.io/router.middlewares: longhorn-system-longhorn-headers@kubernetescrd
spec:
  rules:
  - host: storage.k3s.merox.dev
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: longhorn-frontend
            port:
              number: 80
  tls:
  - hosts:
    - storage.k3s.merox.dev
    secretName: k3s-merox-dev-tls
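With both files in place, apply them (assuming you saved them as middleware.yml and ingress.yml, as above):

kubectl apply -f middleware.yml
kubectl apply -f ingress.yml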

(Screenshot: Longhorn dashboard)

Using NFS Storage
#

If you want to use NFS storage in your cluster, follow this guide: Merox Docs - NFS Storage Guide

Monitoring Your Cluster
#

A great monitoring tool for your cluster is Netdata.

You can also try deploying Prometheus and Grafana from Rancher. However, if you don’t fine-tune the setup, you might notice a high resource usage due to the large number of queries processed by Prometheus.

Continuous Deployment with ArgoCD
#

ArgoCD is an excellent tool for continuous deployment. You can find more details here.

Upgrading Your Cluster
#

If you need to upgrade your cluster, I put some notes here: How to Upgrade K3s.

Final Thoughts
#

When I first deployed a K3s/RKE2 cluster (about a year ago), I struggled to find a single source of documentation that covered everything needed for at least a homelab, if not even for production use. Unfortunately, I couldn’t find anything comprehensive, so I decided to write this article to consolidate all the necessary information in one place.

If this guide helped you and you’d like to see more information added, please leave a comment, and I will do my best to update this post.

How Have You Deployed Your Clusters?
#

Let me know in the comments!

Special Thanks
#
