Grafana + Prometheus Monitoring for Your Homelab
Set up Prometheus to collect metrics and Grafana to visualize them: CPU, memory, disk, and container stats across your entire homelab.
Uptime Kuma tells you when something is down. Grafana and Prometheus tell you everything else — CPU usage trends, memory pressure over time, disk filling up, container resource usage, network throughput. Once you have it running, you’ll start noticing things: a container that spikes memory every night, a disk that’s been filling at a steady rate for three months, a service that’s been quietly hitting 100% CPU for an hour every day.
This is not a quick ten-minute install. It’s more complex than a single-container deployment. But the stack is well-documented, the defaults are good, and the dashboards you get at the end are worth the setup time.
The Stack
Three components:
Prometheus — scrapes metrics from exporters on a schedule and stores them in a time-series database. It doesn’t visualize anything — it just collects and stores.
Node Exporter — runs on each machine you want to monitor and exposes system metrics (CPU, memory, disk, network) at an HTTP endpoint that Prometheus scrapes.
Grafana — connects to Prometheus as a data source and displays metrics as dashboards and graphs.
Directory Setup
mkdir -p /opt/monitoring/{prometheus,grafana}
mkdir -p /opt/monitoring/prometheus/config
Prometheus Configuration
Create /opt/monitoring/prometheus/config/prometheus.yml:
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "node"
    static_configs:
      - targets: ["node-exporter:9100"]

  - job_name: "cadvisor"
    static_configs:
      - targets: ["cadvisor:8080"]
scrape_interval: 15s tells Prometheus to collect metrics from each target every 15 seconds. That is fine for homelab use. Raise it to 30s if you’re concerned about storage; a longer interval means fewer samples on disk.
cadvisor — cAdvisor exposes per-container resource metrics (CPU, memory, network per container). Optional but useful if you want container-level visibility.
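YAML indentation mistakes are the most common reason Prometheus refuses to start, so it can be worth checking the file before bringing the stack up. The sketch below assumes Docker is already installed and that the config sits at the path used in this guide; check_prometheus_config is a hypothetical helper name, while promtool is the validation tool shipped inside the prom/prometheus image.

```shell
# Sketch: validate prometheus.yml with promtool before starting the stack.
# Assumption: the config directory is the one created earlier in this guide.
check_prometheus_config() {
  config_dir="${1:-/opt/monitoring/prometheus/config}"
  docker run --rm \
    --entrypoint /bin/promtool \
    -v "$config_dir":/etc/prometheus:ro \
    prom/prometheus:latest \
    check config /etc/prometheus/prometheus.yml
}
```

Running check_prometheus_config with no arguments reports whether the file parses; a non-zero exit status means something needs fixing first.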
Docker Compose
Create /opt/monitoring/docker-compose.yml:
services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    volumes:
      - /opt/monitoring/prometheus/config:/etc/prometheus
      - prometheus_data:/prometheus
    command:
      - "--config.file=/etc/prometheus/prometheus.yml"
      - "--storage.tsdb.path=/prometheus"
      - "--storage.tsdb.retention.time=30d"
      - "--web.console.libraries=/usr/share/prometheus/console_libraries"
      - "--web.console.templates=/usr/share/prometheus/consoles"
    ports:
      - "9090:9090"
    restart: unless-stopped

  node-exporter:
    image: prom/node-exporter:latest
    container_name: node-exporter
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - "--path.procfs=/host/proc"
      - "--path.sysfs=/host/sys"
      # Point the filesystem collector at the host root mounted above;
      # without this flag the /rootfs mount is never used.
      - "--path.rootfs=/rootfs"
      # "$$" is Compose's escape for a literal "$" in the regex.
      - "--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)"
    ports:
      - "9100:9100"
    restart: unless-stopped

  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    container_name: cadvisor
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
      - /dev/disk/:/dev/disk:ro
    ports:
      - "8080:8080"
    privileged: true
    devices:
      - /dev/kmsg
    restart: unless-stopped

  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    volumes:
      - grafana_data:/var/lib/grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=changeme
      - GF_USERS_ALLOW_SIGN_UP=false
    ports:
      - "3000:3000"
    restart: unless-stopped

volumes:
  prometheus_data:
  grafana_data:
--storage.tsdb.retention.time=30d — Prometheus keeps 30 days of metrics. Increase if you want more history; decrease if you’re tight on disk. Each day of metrics for a typical homelab is roughly 50–100MB.
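GF_SECURITY_ADMIN_PASSWORD sets the Grafana admin password. Change it before the first start: Grafana applies this value only when it initializes its database, so editing it later has no effect on an existing grafana_data volume.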
GF_USERS_ALLOW_SIGN_UP=false — Disables public user registration. You don’t want anyone who can reach port 3000 to create a Grafana account.
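To get a feel for what a retention setting costs in disk, you can multiply it out. This is only a back-of-the-envelope sketch using the 50–100MB per day ballpark quoted above; estimate_storage_mb is a hypothetical helper, and your real usage depends on how many targets and containers you scrape.

```shell
# Rough disk estimate for Prometheus: retention days x MB written per day.
# The per-day figures are this guide's ballpark, not measurements.
estimate_storage_mb() {
  retention_days="$1"
  mb_per_day="$2"
  echo $(( retention_days * mb_per_day ))
}

echo "30d retention: $(estimate_storage_mb 30 50)-$(estimate_storage_mb 30 100) MB"
echo "90d retention: $(estimate_storage_mb 90 50)-$(estimate_storage_mb 90 100) MB"
```

With those assumptions, the default 30 days lands somewhere between 1.5 and 3 GB, and tripling retention to 90 days triples the range.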
Start the Stack
cd /opt/monitoring
docker compose up -d
Give it a minute to start. Prometheus begins scraping metrics immediately. Grafana is available at http://YOUR_SERVER_IP:3000.
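Before opening Grafana, you may want to confirm that Prometheus is actually reaching its targets. Prometheus exposes target status as JSON at /api/v1/targets; the sketch below just counts the entries marked healthy. count_up_targets is a hypothetical helper, and the plain grep match is an assumption that avoids needing a JSON parser.

```shell
# Count scrape targets that Prometheus reports as healthy.
# Reads the JSON body of /api/v1/targets on stdin.
count_up_targets() {
  grep -oE '"health"[[:space:]]*:[[:space:]]*"up"' | wc -l | tr -d ' '
}
```

On the server, curl -s http://localhost:9090/api/v1/targets | count_up_targets should report one healthy target per job in prometheus.yml once everything is up. The same information is visible in the Prometheus web UI under Status > Targets.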
Configure Grafana
Log in to Grafana with admin and the password you set.
Add Prometheus as a data source:
- Connections > Data sources > Add data source (in older Grafana versions: Settings, the gear icon, > Data Sources)
- Select Prometheus
- URL: http://prometheus:9090
- Click Save & Test; you should see a success message such as “Data source is working”
The prometheus hostname works because both containers are attached to the network Compose creates for the project by default, where service names resolve through Docker's built-in DNS.
Import Dashboards
Building dashboards from scratch is possible but unnecessary. Grafana has a community dashboard library with pre-built dashboards for Node Exporter and cAdvisor.
Import Node Exporter Full dashboard (ID: 1860):
- Dashboards > Import
- Enter dashboard ID: 1860
- Select your Prometheus data source
- Click Import
You now have a comprehensive system dashboard showing CPU usage, memory, disk I/O, network throughput, and system load — all historical, with time range controls.
Import Docker/cAdvisor dashboard (ID: 193 or 14282):
Same process, using dashboard ID 193 (14282 is an alternative if you prefer its layout). This shows per-container CPU and memory usage, network I/O per container, and overall Docker resource consumption.
Monitoring Multiple Hosts
To monitor more than one machine, install Node Exporter on each additional host:
# On the additional host
docker run -d \
  --name node-exporter \
  --restart unless-stopped \
  -v /proc:/host/proc:ro \
  -v /sys:/host/sys:ro \
  -v /:/rootfs:ro \
  -p 9100:9100 \
  prom/node-exporter:latest \
  --path.procfs=/host/proc \
  --path.sysfs=/host/sys \
  --path.rootfs=/rootfs
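It can save some head-scratching to confirm the exporter is reachable from the Prometheus server before touching the config. A minimal sketch: Node Exporter serves plain-text metrics whose names start with node_, so checking for one such line is enough. has_node_metrics is a hypothetical helper name, and 192.168.1.101 is just the example address used in this guide.

```shell
# Succeeds if the text on stdin contains at least one Node Exporter sample
# (metric names exposed by Node Exporter start with "node_").
has_node_metrics() {
  grep -q '^node_'
}
```

From the monitoring server, curl -s http://192.168.1.101:9100/metrics | has_node_metrics && echo reachable tells you whether the port is open and the exporter is answering. If it prints nothing, check the host's firewall first.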
Then add the new hosts to the existing node job in your Prometheus config (this replaces the single-target version from earlier):
  - job_name: "node"
    static_configs:
      - targets:
          - "node-exporter:9100"
          - "192.168.1.101:9100"
          - "192.168.1.102:9100"
Restart Prometheus:
cd /opt/monitoring
docker compose restart prometheus
The Node Exporter Full dashboard has a host selector dropdown — all your machines appear there automatically once Prometheus is scraping them.
Setting Up Alerts
Grafana can send alerts when metrics cross thresholds. Go to Alerting > Contact Points to configure a notification channel (Telegram, Slack, email, etc.). Then edit any panel and add an alert rule.
For homelab use, useful alerts:
- Disk over 80% full
- Memory sustained above 90% for 10 minutes
- Any container down for more than 5 minutes (combine with Uptime Kuma for this)
Start simple. Add one or two alerts. Don’t build a complex alerting tree — you’ll tune it down to nothing just to stop the noise.
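As a starting point for the disk and memory alerts in the list above, here is one way the conditions could be written in PromQL and tried out against the Prometheus HTTP API before you put them in a Grafana rule. This is a sketch under assumptions: the metric names are the ones Node Exporter exposes, the fstype filter is my own guess at what you would want to exclude, and prom_query and the variable names are hypothetical.

```shell
# Hypothetical starting points for the first two alerts, as PromQL.
# Metric names come from Node Exporter; thresholds mirror the list above.
DISK_OVER_80='100 * (1 - node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"} / node_filesystem_size_bytes{fstype!~"tmpfs|overlay"}) > 80'
MEM_OVER_90='100 * (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) > 90'

# Run a query against the Prometheus HTTP API (assumes port 9090 on localhost).
prom_query() {
  curl -s -G --data-urlencode "query=$1" http://localhost:9090/api/v1/query
}
```

prom_query "$DISK_OVER_80" returns only the filesystems currently above the threshold, so an empty result means the alert would not fire right now. In Grafana you can paste the expression into the rule's query, or drop the trailing comparison and let the rule's threshold condition do that part; the "sustained for 10 minutes" requirement belongs in the rule's pending period rather than in the query.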
Backups
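Back up the named volumes. Compose prefixes volume names with the project name, which defaults to the directory name, so the volumes here are monitoring_prometheus_data and monitoring_grafana_data (confirm with docker volume ls):
docker run --rm \
  -v monitoring_prometheus_data:/source:ro \
  -v /backup:/backup \
  alpine tar -czf /backup/prometheus-data-$(date +%Y%m%d).tar.gz /source
docker run --rm \
  -v monitoring_grafana_data:/source:ro \
  -v /backup:/backup \
  alpine tar -czf /backup/grafana-data-$(date +%Y%m%d).tar.gz /source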
The Prometheus data is the time-series metrics history. The Grafana data contains your dashboards, users, and data source configurations. Both are worth keeping.
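A backup you have never opened is only a hope, so it may be worth checking each archive after it is written. The sketch below simply asks tar to list the archive and treats any error as a failed backup; verify_backup is a hypothetical helper name, and it does not check that the contents are complete.

```shell
# Succeeds if tar can read and list the gzip-compressed archive at $1.
verify_backup() {
  tar -tzf "$1" >/dev/null 2>&1
}
```

For example, verify_backup /backup/grafana-data-$(date +%Y%m%d).tar.gz && echo ok run right after the backup commands confirms that today's Grafana archive is at least readable.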