Running OpenClaw as a Homelab Infrastructure Agent

By Merox, HPC Sysadmin
4 min read · AI · Intermediate

At 2am when a pod crashes, I don’t want to open a laptop. I want to send a message on Telegram and get an answer.

A few weeks ago I wrote about deploying OpenClaw on Proxmox — that post was mostly exploratory, testing the install process and seeing how it behaves as a generalist assistant. What I actually wanted was to replace merox-agent — my custom Python bot that handles the actual infrastructure.

merox-agent worked fine. But maintaining a custom Telegram bot, FastAPI server, session management, and systemd service across server migrations gets old. OpenClaw consolidates all of it into one tool.


What it does

Send a message, get a result:

  • what pods are down? → kubectl get pods -A, filtered
  • show traefik logs → last 50 lines
  • reconcile flux-system → flux reconcile, output back to you
  • how much disk is left? → df -h, memory, load
  • what changed in the infra repo? → git log
  • restart jellyfin → asks for confirmation, then does it
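The first item is the easiest to picture. A minimal sketch of what "filtered" could mean, assuming the default column layout of `kubectl get pods -A` (STATUS in the fourth column). `unhealthy_pods` is a hypothetical helper name, not something OpenClaw ships:

```shell
# Hypothetical helper: keep the header plus every pod whose STATUS is
# neither Running nor Completed. Reads `kubectl get pods -A` output on stdin.
unhealthy_pods() {
  awk 'NR == 1 || ($4 != "Running" && $4 != "Completed")'
}

# Usage:
#   kubectl get pods -A | unhealthy_pods
```

Filtering on the STATUS column rather than the pod phase matters here: a pod stuck in CrashLoopBackOff still reports the phase Running, so a phase-based selector would miss it.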

It responds in whatever language you write in.

[Screenshot: Infrastructure status overview from Telegram]
[Screenshot: Jellyfin logs check from Telegram]
[Screenshot: Confirmation before destructive operation]

How it’s set up

Phone / Laptop
└── Telegram ──────────────────────► OpenClaw Gateway (loopback:18789)
                                     Tailscale Serve (tailnet)
                                     Claude API (Anthropic)
                                     infra skill (SKILL.md)
                          ┌──────────────────┴──────────────────┐
                   kubectl / flux                          docker / git

OpenClaw runs directly on the VPS. The Gateway binds to loopback only — no public port. Tailscale Serve exposes the Control UI on my tailnet. The Telegram bot connects directly to Telegram’s API.

The instance has shell access to the infra and a single purpose: infrastructure operations. Nothing else runs through it.


Prerequisites

Warning

OpenClaw uses the Anthropic API directly — not the Claude Code CLI OAuth flow that merox-agent used. You need a separate API key and pay per token. For a personal infra assistant with a few queries a day it’s around $1–3/month on claude-sonnet-4-6.
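To sanity-check that estimate against your own usage, here is the arithmetic as a sketch. Every number in it is an assumption: the per-token prices are placeholders in the range typical for Sonnet-class models, and the token counts per query are guesses. Check the current pricing page before trusting the output.

```shell
# All values below are assumptions for illustration, not measured figures.
QUERIES_PER_DAY=4
INPUT_TOKENS=4000      # per query: skill text + history + command output (guess)
OUTPUT_TOKENS=400      # per query (guess)
USD_PER_MTOK_IN=3      # assumed input price per million tokens
USD_PER_MTOK_OUT=15    # assumed output price per million tokens

# monthly_cost QUERIES_PER_DAY IN_TOKENS OUT_TOKENS PRICE_IN PRICE_OUT
# Prints the estimated cost in USD for a 30-day month.
monthly_cost() {
  awk -v q="$1" -v i="$2" -v o="$3" -v pi="$4" -v po="$5" \
    'BEGIN { printf "%.2f\n", q * 30 * (i * pi + o * po) / 1000000 }'
}

monthly_cost "$QUERIES_PER_DAY" "$INPUT_TOKENS" "$OUTPUT_TOKENS" \
  "$USD_PER_MTOK_IN" "$USD_PER_MTOK_OUT"    # prints 2.16
```

Input tokens dominate quickly if you ask for long logs, so the "last 50 lines" style of request is also the cheap one.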


Setup

1. Install Node.js 24 + OpenClaw

Terminal window
curl -fsSL https://deb.nodesource.com/setup_24.x | sudo -E bash -
sudo apt install -y nodejs
sudo npm install -g openclaw@latest
openclaw --version

2. Configure secrets

Terminal window
mkdir -p ~/.openclaw && chmod 700 ~/.openclaw
cat > ~/.openclaw/.env << 'EOF'
ANTHROPIC_API_KEY=sk-ant-...
TELEGRAM_BOT_TOKEN=123456:AAAA...
TELEGRAM_USER_ID=123456789
EOF
chmod 600 ~/.openclaw/.env

To find your Telegram user ID: once the gateway service from step 5 is running, send a message to your bot, follow the gateway logs with journalctl -u openclaw-gateway -f, and look for from.id.
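An alternative that doesn't depend on the gateway's log format is to ask the Telegram Bot API directly with `getUpdates`. This is a sketch under a few assumptions: the gateway is not polling yet (concurrent `getUpdates` calls on the same token conflict), you have just sent the bot a message, and the response is the compact JSON Telegram normally returns. `extract_from_ids` is a hypothetical helper. With `jq` installed, `jq '.result[].message.from.id'` is the more robust way to do the same thing.

```shell
# Hypothetical helper: pull the numeric sender IDs out of a getUpdates
# response read from stdin. Relies on "id" being the first key of "from".
extract_from_ids() {
  grep -o '"from":{"id":[0-9][0-9]*' | grep -o '[0-9][0-9]*$' | sort -u
}

# Usage (before the gateway starts):
#   . ~/.openclaw/.env
#   curl -fsS "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/getUpdates" | extract_from_ids
```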

3. Configure the Gateway

Create ~/.openclaw/openclaw.json:

{
  agents: {
    defaults: {
      model: {
        primary: "anthropic/claude-sonnet-4-6",
      },
      workspace: "/root/.openclaw/workspace",
    },
  },
  channels: {
    telegram: {
      botToken: "${TELEGRAM_BOT_TOKEN}",
      dmPolicy: "allowlist",
      allowFrom: ["${TELEGRAM_USER_ID}"],
    },
  },
  gateway: {
    bind: "loopback",
    auth: {
      mode: "token",
    },
    tailscale: {
      mode: "serve",
    },
  },
}

Terminal window
chmod 600 ~/.openclaw/openclaw.json
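Both files hold or reference credentials, so it's worth re-checking their modes after a restore or migration. A small sketch, assuming GNU `stat` (the `-c %a` form, as shipped on Debian and Ubuntu). `check_mode` is a hypothetical helper, not an OpenClaw command:

```shell
# Hypothetical helper: succeed only if every given path has exactly the
# expected octal mode. Usage: check_mode MODE PATH...
check_mode() {
  want="$1"; shift
  for f in "$@"; do
    got="$(stat -c '%a' "$f")" || return 1
    if [ "$got" != "$want" ]; then
      echo "$f: mode $got, expected $want" >&2
      return 1
    fi
  done
}

# Usage:
#   check_mode 600 ~/.openclaw/.env ~/.openclaw/openclaw.json
#   check_mode 700 ~/.openclaw
```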

dmPolicy: "allowlist" with your numeric user ID means only you can talk to it. Same as the old TELEGRAM_USER_ID check in merox-agent.

4. Write the infra skill

Skills are Markdown files that tell the agent what it’s managing. This is the direct equivalent of prompt.py from merox-agent — same idea, different format.

Create ~/.openclaw/workspace/skills/infra/SKILL.md:

---
name: infra
description: Kubernetes and Docker infrastructure management for merox.dev homelab
---

You are an infrastructure assistant for merox.dev homelab.

## Kubernetes Cluster

Stack: Talos Linux nodes + FluxCD GitOps.
Manifests: /srv/kubernetes/infrastructure

Common operations:
- kubectl get nodes
- kubectl get pods -A
- flux check
- cd /srv/kubernetes/infrastructure && task reconcile

## VPS Docker Services

Managed via Ansible. Services: Traefik, Pi-hole, Portainer, Homepage, Netdata, Garage S3.
- cd /srv/kubernetes/infrastructure/cloudlab-infrastructure && make health-check
- make setup # full redeploy, idempotent

## Rules

- Confirm before destructive operations (delete, reset, drain, rollout)
- Prefer GitOps over direct kubectl apply for persistent changes
- Check pod health before and after changes
- Never touch: age.key, *.sops.yaml, kubeconfig, .env files
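Keep in mind that these rules are instructions to the model, not a technical barrier. If you want a hard stop as well, one option is a small guard in front of whatever executes the commands. The following is a hypothetical sketch, not something this setup uses. `needs_confirmation` and its word list simply mirror the first rule above:

```shell
# Hypothetical guard: return 0 if the command line contains one of the
# verbs the skill treats as destructive, 1 otherwise.
needs_confirmation() {
  case " $1 " in
    *" delete "*|*" reset "*|*" drain "*|*" rollout "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   cmd="kubectl rollout restart deployment/jellyfin -n media"
#   if needs_confirmation "$cmd"; then echo "confirm first: $cmd"; fi
```

Matching whole words padded with spaces keeps read-only commands such as `kubectl get pods -A` out of the net. It is deliberately crude and would not catch a destructive verb hidden inside a script or alias.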

Verify it loaded:

Terminal window
openclaw skills list | grep infra

The skill is versioned in the same git repo as the infra manifests it describes. Moving servers means copying one Markdown file.
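If the skill doesn't show up in the list, the front matter is a reasonable first thing to check. A small sketch, assuming the only hard requirements are the `---` delimiters plus non-empty `name` and `description` fields, as in the file above. That assumption comes from the example, not from OpenClaw's parser, and `skill_frontmatter_ok` is a hypothetical helper:

```shell
# Hypothetical pre-flight check: the file must open with a `---` line and
# declare both `name:` and `description:` before the closing `---`.
skill_frontmatter_ok() {
  awk '
    NR == 1            { if ($0 != "---") exit 1; next }
    $0 == "---"        { closed = 1; exit }
    /^name: *[^ ]/        { name = 1 }
    /^description: *[^ ]/ { desc = 1 }
    END { exit ((closed && name && desc) ? 0 : 1) }
  ' "$1"
}

# Usage:
#   skill_frontmatter_ok ~/.openclaw/workspace/skills/infra/SKILL.md && echo ok
```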

5. Start as a system service

Terminal window
cat > /etc/systemd/system/openclaw-gateway.service << 'EOF'
[Unit]
Description=OpenClaw Gateway
After=network-online.target tailscaled.service
Wants=network-online.target
[Service]
Type=simple
User=root
EnvironmentFile=/root/.openclaw/.env
ExecStart=/usr/bin/openclaw gateway --port 18789
Restart=on-failure
RestartSec=5
WorkingDirectory=/root
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable openclaw-gateway
systemctl start openclaw-gateway
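Once the service is up, it's worth confirming that the port really is loopback-only. A sketch that parses `ss -ltn` output, assuming the usual iproute2 column layout (local address and port in the fourth column). `loopback_only` is a hypothetical helper:

```shell
# Hypothetical check: succeed only if at least one listener exists on the
# given port and every listener on it is bound to 127.x.x.x or [::1].
# Reads `ss -ltn` output on stdin. Usage: ss -ltn | loopback_only PORT
loopback_only() {
  awk -v p=":$1" '
    NR > 1 && $4 ~ (p "$") {
      found = 1
      if ($4 !~ /^(127\.|\[::1\])/) exposed = 1
    }
    END { exit ((found && !exposed) ? 0 : 1) }
  '
}

# Usage:
#   systemctl is-active --quiet openclaw-gateway && ss -ltn | loopback_only 18789 \
#     && echo "gateway up, loopback only"
```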

6. Security audit

Terminal window
openclaw security audit
openclaw doctor

For a single-user allowlist setup you should see 0 critical issues. Fix any warnings before leaving it running.

7. Tailscale Serve

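Exposes the Control UI on your tailnet, not the public internet. The command below uses the older Tailscale CLI syntax; on releases that have moved to the newer serve syntax, the equivalent is tailscale serve --bg 18789: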

Terminal window
tailscale serve https / proxy 18789

Skills vs prompt.py

The old merox-agent had a prompt.py that built a system prompt injected into every Claude session. It hardcoded server IPs, service names, kubeconfig paths, and rules.

OpenClaw’s SKILL.md does the same thing. The difference: it’s a file you can edit without touching the agent, it’s hot-reloaded without a service restart, and it lives in git alongside the manifests it describes.


merox-agent vs OpenClaw

I used merox-agent for months and it worked. Here’s what actually changed:

Better:

  • No Telegram bot boilerplate to maintain (grammY session handling, polling, message editing)
  • Security model is explicit and auditable (openclaw security audit)
  • Active project with regular releases — someone else fixes the bugs

Worse:

  • Requires a separate Anthropic API key — merox-agent used Claude Code’s OAuth, included in Claude Pro. Real cost difference, worth knowing upfront
  • Heavier install (Node.js ecosystem) vs merox-agent’s handful of Python packages

Same:

  • Single tool for all infra ops (kubectl, docker, flux, git)
  • Tailscale for network isolation
  • Confirm before destructive ops

If you want zero extra API cost and don’t mind maintaining a custom bot, merox-agent is still valid. If you want something that survives server migrations without code changes, OpenClaw is worth the few dollars a month.


What to back up

  • Agent config template + skill → in infra git repo ✅
  • Gateway secrets → ~/.openclaw/.env ⚠️ back up manually
  • AGE key → /srv/kubernetes/infrastructure/age.key ⚠️ losing it = losing all SOPS secrets

Rebuilding from scratch: install openclaw, copy config from repo, restore .env, start service.
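The manual part is the `.env` file. A minimal sketch of a backup helper, assuming a destination directory you control (the path in the usage line is only an example). `backup_env` is a hypothetical name:

```shell
# Hypothetical helper: copy the secrets file into a private, dated backup.
# Usage: backup_env SOURCE_FILE DEST_DIR
backup_env() {
  src="$1"; dest_dir="$2"
  (
    umask 077    # new directory and copy are readable by the owner only
    mkdir -p "$dest_dir" &&
      cp "$src" "$dest_dir/openclaw.env.$(date +%Y%m%d)"
  )
}

# Usage:
#   backup_env ~/.openclaw/.env /mnt/backup/openclaw
```

The subshell keeps the tightened umask from leaking into the rest of your session.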


Config template and infra skill: github.com/meroxdotdev/infrastructure/agent/


Want to see how OpenClaw installs and behaves as a generalist assistant? See the Proxmox deployment guide.
