Grafana + Prometheus Monitoring for Your Homelab

Uptime Kuma tells you when something is down. Grafana and Prometheus tell you everything else: CPU usage trends, memory pressure over time, disk filling up, container resource usage, network throughput. Once you have it running, you’ll start noticing things: a container that spikes memory every night, a disk that’s been filling at a steady rate for three months, a service that’s been quietly hitting 100% CPU for an hour every day.

This is not a quick ten-minute install. It’s more complex than a single-container deployment. But the stack is well-documented, the defaults are good, and the dashboards you get at the end are worth the setup time.

The Stack

Three core components, plus the optional cAdvisor covered in the configuration below:

Prometheus: scrapes metrics from exporters on a schedule and stores them in a time-series database. It doesn’t visualize anything; it just collects and stores.

Node Exporter: runs on each machine you want to monitor and exposes system metrics (CPU, memory, disk, network) at an HTTP endpoint that Prometheus scrapes.

Grafana: connects to Prometheus as a data source and displays metrics as dashboards and graphs.
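
To make that division of labor concrete: an exporter endpoint serves plain text, one metric name and value per line, and everything downstream is arithmetic on those numbers. The sketch below parses a hand-written sample and computes memory usage roughly the way a dashboard panel would. The two metric names are the ones Node Exporter uses for memory; the values are invented for illustration.

```shell
# Illustrative sample of what an exporter serves at /metrics.
# The numbers are made up; real output has hundreds of lines.
sample_metrics() {
  cat <<'EOF'
# TYPE node_memory_MemTotal_bytes gauge
node_memory_MemTotal_bytes 8e+09
# TYPE node_memory_MemAvailable_bytes gauge
node_memory_MemAvailable_bytes 2e+09
EOF
}

# Compute used-memory percentage from exposition-format text on stdin.
memory_used_percent() {
  awk '
    $1 == "node_memory_MemTotal_bytes"     { total = $2 + 0 }
    $1 == "node_memory_MemAvailable_bytes" { avail = $2 + 0 }
    END { if (total > 0) printf "%.1f\n", 100 * (1 - avail / total) }
  '
}

sample_metrics | memory_used_percent
```

Once the stack is running, the same function should work against a live exporter: curl -s http://localhost:9100/metrics | memory_used_percent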

Directory Setup

mkdir -p /opt/monitoring/{prometheus,grafana}
mkdir -p /opt/monitoring/prometheus/config

Prometheus Configuration

Create /opt/monitoring/prometheus/config/prometheus.yml:

global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "node"
    static_configs:
      - targets: ["node-exporter:9100"]

  - job_name: "cadvisor"
    static_configs:
      - targets: ["cadvisor:8080"]
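
YAML indentation mistakes are the most common reason Prometheus refuses to start, so it can be worth validating the file first. A minimal sketch, assuming Docker is installed and the config lives at the path used above: promtool ships inside the prom/prometheus image, and its check config subcommand parses the file without starting the server. The run helper and DRY_RUN variable are conveniences introduced here, not part of any tool.

```shell
# Print the command instead of running it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Validate prometheus.yml using the promtool binary from the Prometheus image.
# Optional argument: the host directory that holds prometheus.yml.
check_prom_config() {
  config_dir="${1:-/opt/monitoring/prometheus/config}"
  run docker run --rm \
    -v "$config_dir:/etc/prometheus:ro" \
    --entrypoint promtool \
    prom/prometheus:latest \
    check config /etc/prometheus/prometheus.yml
}
```

Call check_prom_config to run the check, or DRY_RUN=1 check_prom_config to preview the Docker command first.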

scrape_interval: 15s: Prometheus collects metrics from each target every 15 seconds. Fine for homelab use. Raise it to 30s if you’re concerned about storage; half as many samples means roughly half the disk usage.

cadvisor: cAdvisor exposes per-container resource metrics (CPU, memory, network per container). Optional but useful if you want container-level visibility.
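
To put rough numbers on the storage trade-off mentioned for scrape_interval: the Prometheus documentation gives a sizing rule of thumb, disk space = retention in seconds × samples ingested per second × bytes per sample, at roughly 1 to 2 bytes per sample. The sketch below applies it with integer arithmetic. The figure of 5000 active series is an assumption for a small homelab, not a measurement; the prometheus_tsdb_head_series metric reports your real count.

```shell
# Rough TSDB size estimate in bytes.
# usage: estimate_tsdb_bytes SERIES SCRAPE_INTERVAL_S RETENTION_DAYS [BYTES_PER_SAMPLE]
estimate_tsdb_bytes() {
  series=$1 interval=$2 days=$3 bytes_per_sample=${4:-2}
  echo $(( series * 86400 / interval * days * bytes_per_sample ))
}

# Assumed: 5000 active series (hypothetical), 15s interval, 30 days kept.
estimate_tsdb_bytes 5000 15 30   # 1728000000 bytes, about 1.7 GB
# Same setup scraped every 30s.
estimate_tsdb_bytes 5000 30 30   # 864000000 bytes, about 0.9 GB
```

Treat the result as an order-of-magnitude guide; actual usage depends on how well your series compress.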

Docker Compose

Create /opt/monitoring/docker-compose.yml:

services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    volumes:
      - /opt/monitoring/prometheus/config:/etc/prometheus
      - prometheus_data:/prometheus
    command:
      - "--config.file=/etc/prometheus/prometheus.yml"
      - "--storage.tsdb.path=/prometheus"
      - "--storage.tsdb.retention.time=30d"
      - "--web.console.libraries=/usr/share/prometheus/console_libraries"
      - "--web.console.templates=/usr/share/prometheus/consoles"
    ports:
      - "9090:9090"
    restart: unless-stopped

  node-exporter:
    image: prom/node-exporter:latest
    container_name: node-exporter
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - "--path.procfs=/host/proc"
      - "--path.sysfs=/host/sys"
      - "--path.rootfs=/rootfs"
      - "--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)"
    ports:
      - "9100:9100"
    restart: unless-stopped

  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    container_name: cadvisor
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
      - /dev/disk/:/dev/disk:ro
    ports:
      - "8080:8080"
    privileged: true
    devices:
      - /dev/kmsg
    restart: unless-stopped

  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    volumes:
      - grafana_data:/var/lib/grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=changeme
      - GF_USERS_ALLOW_SIGN_UP=false
    ports:
      - "3000:3000"
    restart: unless-stopped

volumes:
  prometheus_data:
  grafana_data:

--storage.tsdb.retention.time=30d: Prometheus keeps 30 days of metrics. Increase if you want more history; decrease if you’re tight on disk. Each day of metrics for a typical homelab is roughly 50-100MB.

GF_SECURITY_ADMIN_PASSWORD: The Grafana admin password. Change this before starting.

GF_USERS_ALLOW_SIGN_UP=false: Disables public user registration. You don’t want anyone who can reach port 3000 to create a Grafana account.
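
Rather than typing a password into the Compose file, one option is to generate a random one and keep it in a .env file next to docker-compose.yml, which Compose reads for variable substitution. A minimal sketch; the variable name GRAFANA_ADMIN_PASSWORD and the function names are choices made here for illustration.

```shell
# Generate a random alphanumeric password from the kernel's random source.
generate_password() {
  head -c 24 /dev/urandom | base64 | tr -d '/+=\n'
  echo
}

# Write the password to a Compose .env file readable only by its owner.
# Optional argument: path of the .env file.
write_env_file() {
  env_file="${1:-/opt/monitoring/.env}"
  ( umask 077
    printf 'GRAFANA_ADMIN_PASSWORD=%s\n' "$(generate_password)" > "$env_file" )
}
```

After running write_env_file, the environment entry in the Compose file would become GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_ADMIN_PASSWORD}. Grafana applies this value when it first creates the admin user, so set it before the first start.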

Start the Stack

cd /opt/monitoring
docker compose up -d

Give it a minute to start. Prometheus begins scraping metrics immediately. Grafana is available at http://YOUR_SERVER_IP:3000.
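
A quick way to confirm all four containers are answering is to probe the ports the Compose file publishes. The sketch below assumes curl is installed and uses each service's own health or metrics path: /-/ready for Prometheus, /metrics for Node Exporter, /healthz for cAdvisor, and /api/health for Grafana. The function names are hypothetical.

```shell
# URLs for the four published ports. Optional argument: server address.
stack_endpoints() {
  server="${1:-localhost}"
  printf '%s\n' \
    "http://$server:9090/-/ready" \
    "http://$server:9100/metrics" \
    "http://$server:8080/healthz" \
    "http://$server:3000/api/health"
}

# Probe each endpoint and print OK or FAIL per URL.
# Returns non-zero if any endpoint failed.
check_stack() {
  failed=0
  for url in $(stack_endpoints "${1:-localhost}"); do
    if curl -fsS -o /dev/null --max-time 5 "$url"; then
      echo "OK   $url"
    else
      echo "FAIL $url"
      failed=1
    fi
  done
  return "$failed"
}
```

Run check_stack on the server itself, or check_stack YOUR_SERVER_IP from another machine. The Targets page in the Prometheus UI (Status > Targets) shows the same information from the scraper's point of view.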

Configure Grafana

Add Prometheus as a data source:

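Connections > Data sources > Add new data source (on older Grafana versions: the gear icon > Data Sources > Add data source)
Select Prometheus
URL: http://prometheus:9090
Click Save & test. You should see a success message such as “Data source is working”; the exact wording varies by Grafana version.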

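The prometheus hostname works because both containers are attached to the same network: Compose creates a project-specific bridge network by default, and on that network Docker resolves service names through its built-in DNS. This is not Docker’s own default bridge network, which does not resolve container names.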
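
If the data source test fails, it helps to know what that network is called. Compose names its default network and named volumes after the project, and the project name defaults to the directory holding the Compose file. The sketch below derives those names; it simplifies Compose's actual normalization rule to lowercasing the directory name, which is enough for a directory called monitoring.

```shell
# Default Compose project name: the project directory's name, lowercased
# (a simplification of Compose's full normalization rule).
compose_project_name() {
  basename "$1" | tr '[:upper:]' '[:lower:]'
}

# The default network and named volumes follow the pattern <project>_<name>.
compose_network_name() { echo "$(compose_project_name "$1")_default"; }
compose_volume_name()  { echo "$(compose_project_name "$1")_$2"; }

compose_network_name /opt/monitoring               # monitoring_default
compose_volume_name  /opt/monitoring grafana_data  # monitoring_grafana_data
```

With the name in hand, docker network inspect monitoring_default lists the containers attached to it; Prometheus and Grafana should both appear.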

Import Dashboards

Building dashboards from scratch is possible but unnecessary. Grafana has a community dashboard library with pre-built dashboards for Node Exporter and cAdvisor.

Import Node Exporter Full dashboard (ID: 1860):

Dashboards > Import
Enter dashboard ID: 1860
Select your Prometheus data source
Click Import

You now have a comprehensive system dashboard showing CPU usage, memory, disk I/O, network throughput, and system load, all historical with time range controls.

Import a Docker/cAdvisor dashboard (ID: 193 or 14282):

Same process, entering 193 as the dashboard ID (14282 is an alternative if you prefer a different layout). The dashboard shows per-container CPU and memory usage, network I/O per container, and overall Docker resource consumption.

Monitoring Multiple Hosts

To monitor more than one machine, install Node Exporter on each additional host:

# On the additional host
docker run -d \
  --name node-exporter \
  --restart unless-stopped \
  -v /proc:/host/proc:ro \
  -v /sys:/host/sys:ro \
  -v /:/rootfs:ro \
  -p 9100:9100 \
  prom/node-exporter:latest \
  --path.procfs=/host/proc \
  --path.sysfs=/host/sys \
  --path.rootfs=/rootfs

Then add the new host to your Prometheus config:

  - job_name: "node"
    static_configs:
      - targets:
          - "node-exporter:9100"
          - "192.168.1.101:9100"
          - "192.168.1.102:9100"

Restart Prometheus:

cd /opt/monitoring
docker compose restart prometheus
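
A restart works, but Prometheus also re-reads its configuration when it receives SIGHUP, which avoids the brief gap in scraping. A minimal sketch, assuming the Compose file path used in this guide; the run helper and DRY_RUN variable are conveniences introduced here.

```shell
# Print the command instead of running it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Ask the running Prometheus container to reload prometheus.yml in place.
# Optional argument: the directory containing docker-compose.yml.
reload_prometheus() {
  project_dir="${1:-/opt/monitoring}"
  run docker compose -f "$project_dir/docker-compose.yml" kill -s SIGHUP prometheus
}
```

If the new config has an error, Prometheus logs it and keeps running with the previous one, so check docker compose logs prometheus after reloading.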

The Node Exporter Full dashboard has a host selector dropdown. All your machines appear there automatically once Prometheus is scraping them.
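
If you add hosts often, the target list shown earlier in this section can be generated rather than hand-edited. The sketch below prints a node job that always includes the local node-exporter container and then one entry per address you pass in; the function name is hypothetical, and port 9100 is assumed on every host.

```shell
# Print a "node" scrape job for the local exporter plus the given hosts.
node_job_yaml() {
  printf '  - job_name: "node"\n'
  printf '    static_configs:\n'
  printf '      - targets:\n'
  printf '          - "node-exporter:9100"\n'
  for host in "$@"; do
    printf '          - "%s:9100"\n' "$host"
  done
}

node_job_yaml 192.168.1.101 192.168.1.102
```

The output matches the indentation of the scrape_configs entries above, so it can be pasted over the existing node job.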

Setting Up Alerts

Grafana can send alerts when metrics cross thresholds. Go to Alerting > Contact Points to configure a notification channel (Telegram, Slack, email, etc.). Then edit any panel and add an alert rule.

For homelab use, useful alerts:

Disk over 80% full
Memory sustained above 90% for 10 minutes
Any container down for more than 5 minutes (combine with Uptime Kuma for this)

Start simple. Add one or two alerts. Don’t build a complex alerting tree; you’ll tune it down to nothing just to stop the noise.
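
Before wiring a threshold into an alert rule, it can help to run the query by hand and see what it returns. The sketch below holds one possible PromQL expression for each alert listed above and a small helper that sends an instant query to the Prometheus HTTP API. The metric names come from Node Exporter and cAdvisor; the filesystem filter, the variable names, and the helper functions are assumptions to adapt. The "for 10 minutes" part is set as the pending period on the Grafana rule, not in the query.

```shell
# Candidate alert expressions (thresholds match the list above).
DISK_EXPR='100 * (1 - node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"} / node_filesystem_size_bytes{fstype!~"tmpfs|overlay"}) > 80'
MEM_EXPR='100 * (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) > 90'
CONTAINER_EXPR='time() - container_last_seen{name!=""} > 300'

# Print the command instead of running it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Run an instant query against Prometheus and print the JSON response.
# PROM_URL defaults to the port published by the Compose file.
prom_query() {
  run curl -fsS -G "${PROM_URL:-http://localhost:9090}/api/v1/query" \
    --data-urlencode "query=$1"
}
```

For example, prom_query "$DISK_EXPR" returns an empty result list when no filesystem is over 80%, which is what you want to see before enabling the alert.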

Backups

Back up the named volumes:

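# Compose prefixes named volumes with the project name (the directory
# name, here "monitoring"). Confirm the exact names with: docker volume ls
docker run --rm \
  -v monitoring_prometheus_data:/source:ro \
  -v /backup:/backup \
  alpine tar -czf /backup/prometheus-data-$(date +%Y%m%d).tar.gz -C /source .

docker run --rm \
  -v monitoring_grafana_data:/source:ro \
  -v /backup:/backup \
  alpine tar -czf /backup/grafana-data-$(date +%Y%m%d).tar.gz -C /source .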

The Prometheus data is the time-series metrics history. The Grafana data contains your dashboards, users, and data source configurations. Both are worth keeping.
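
A backup is only useful if it can be read back, and dated archives pile up quickly. The sketch below adds two small helpers: one that checks an archive is a readable, non-empty gzip tarball, and one that deletes archives older than a cutoff. The function names and the 14-day default are arbitrary choices, and the -delete option assumes GNU, BSD, or BusyBox find.

```shell
# Succeeds only if the file is a readable gzip tarball with at least one entry.
verify_backup() {
  [ -s "$1" ] && [ $(tar -tzf "$1" 2>/dev/null | wc -l) -gt 0 ]
}

# Delete backup archives older than a number of days.
# usage: prune_backups [BACKUP_DIR] [KEEP_DAYS]
prune_backups() {
  find "${1:-/backup}" -name '*-data-*.tar.gz' -mtime +"${2:-14}" -delete
}
```

For example: verify_backup /backup/grafana-data-$(date +%Y%m%d).tar.gz && prune_backups /backup 14 prunes old archives only after today's archive has been confirmed readable.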

Once this stack is running, pair it with Uptime Kuma for service availability monitoring. Uptime Kuma handles “is it up?” while Grafana handles everything else. For container-level health and keeping your images current, Watchtower automates the update side. See everything running in this homelab at /stack/.

One thing to keep an eye on once you’re watching real numbers: how much work your hardware is actually doing. Node Exporter won’t show you wattage directly, but if you’re running on a mini PC, CPU idle percentages and container resource usage give you a clear picture of what’s driving power draw. The low-power homelab guide covers how to build a setup that runs 15+ services under 25W from the start, which is the kind of thing that matters once you see your first month of real metrics.