KubeLedger

KubeLedger is the System of Record that tracks the full picture of Kubernetes costs — revealing the 30% hidden in non-allocatable overhead for precise, per-namespace accounting.

Note: KubeLedger was formerly known as Kubernetes Opex Analytics aka kube-opex-analytics. Read more about this change in our announcement blog post. To handle the migration in a straightforward way, we have provided a migration procedure.

Overview
Key Features
Quick Start
Architecture
Documentation
Configuration
Troubleshooting
License
Support & Contributions

Overview

KubeLedger is a usage accounting tool that helps organizations track, analyze, and optimize CPU, Memory, and GPU resources on Kubernetes clusters over time (hourly, daily, monthly).

It acts as a System of Record for your cluster resources, providing insightful usage analytics and charts that engineering and financial teams can use as key indicators for cost optimization decisions.

Tracked Resources

CPU - Core usage and requests per namespace
Memory - RAM consumption and requests per namespace
GPU - NVIDIA GPU utilization via DCGM integration

kubeledger-overview

Multi-cluster Integration: KubeLedger tracks usage for a single Kubernetes cluster. For centralized multi-cluster analytics, see Krossboard Kubernetes Operator (demo video).

Key Features

| Feature | Description | |---------|-------------| | Hourly/Daily/Monthly Trends | Tracks actual usage and requested capacities per namespace, collected every 5 minutes and consolidated hourly | | Non-allocatable Capacity Tracking | Highlights system overhead (OS, kubelets) vs. usable application capacity at node and cluster levels | | Cluster Capacity Planning | Visualize consumed capacity globally, instantly, and over time | | Usage Efficiency Analysis | Compare resource requests against actual usage to identify over/under-provisioning | | Cost Allocation & Chargeback | Automatic resource usage accounting per namespace for billing and showback | | Prometheus Integration | Native exporter at /metrics for Grafana dashboards and alerting |

Quick Start

Prerequisites

Kubernetes cluster v1.19+ (or OpenShift 4.x+)
kubectl configured with cluster access
Helm 3.x (fine-tuned installation) or kubectl for a basic opinionated deployment
Cluster permissions: read access to pods, nodes, and namespaces
Kubernetes Metrics Server deployed in your cluster (required for CPU and memory metrics)
NVIDIA DCGM Exporter deployed in your cluster (required for GPU metrics, optional if no GPUs)

Verify Metrics Server

Before installing, ensure metrics-server is running in your cluster:

# Check if metrics-server is deployed
kubectl -n kube-system get deploy | grep metrics-server

# Verify it's working
kubectl top nodes

# If not installed, deploy with kubectl
kubectl apply -f [https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml](https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml)

Verify DCGM Exporter (GPU metrics)

If your cluster has NVIDIA GPUs and you want GPU metrics, ensure DCGM Exporter is running:

# Check if DCGM Exporter is deployed
kubectl get daemonset -A | grep dcgm

# If not installed, deploy with Helm (requires NVIDIA GPU Operator or drivers)
helm repo add gpu-helm-charts https://nvidia.github.io/dcgm-exporter/helm-charts
helm install dcgm-exporter gpu-helm-charts/dcgm-exporter \
  --namespace gpu-operator \
  --create-namespace

Clone the Repository

git clone https://github.com/realopslabs/kubeledger.git --depth=1
cd kubeledger

Installation on Kubernetes/OpenShift Cluster

Install with Kustomize (Fast Path)

OpenShift users: Skip this section and use Helm installation with OpenShift-specific settings.

# Create namespace
kubectl create namespace kubeledger

# Deploy using Kustomize
kubectl apply -k ./manifests/kubeledger/kustomize -n kubeledger

# Watch pod status
kubectl get pods -n kubeledger -w

Install with Helm (Advanced)

The following steps covers the following scenarios of advanced customization (see manifests/kubeledger/helm/values.yaml for more options):

OpenShift: Set securityContext.openshift: true
Custom storage: Set dataVolume.storageClass and dataVolume.capacity
DCGM Integration: Set dcgm.enable: true and dcgm.endpoint

# Create namespace
kubectl create namespace kubeledger

# Install with Helm on Kubernetes
helm upgrade --install kubeledger ./manifests/kubeledger/helm -n kubeledger

# Install with Helm on Kubernetes with GPU support
helm upgrade --install kubeledger ./manifests/kubeledger/helm -n kubeledger \
  --set dcgm.enable=true \
  --set dcgm.endpoint="dcgm-exporter.monotiring.svc.cluster.local:9400"

# Install with Helm on OpenShift
helm upgrade --install kubeledger ./manifests/kubeledger/helm -n kubeledger --set securityContext.openshift=true

# Install with Helm on OpenShift with GPU support
helm upgrade --install kubeledger ./manifests/kubeledger/helm -n kubeledger \
  --set securityContext.openshift=true \
  --set dcgm.enable=true \
  --set dcgm.endpoint="dcgm-exporter.monotiring.svc.cluster.local:9400"

# Watch pod status
kubectl get pods -n kubeledger -w

Access the Dashboard on Kubernetes/OpenShift Cluster

# Port-forward to access the UI
kubectl port-forward svc/kubeledger 5483:80 -n kubeledger

# Open http://localhost:5483 in your browser

Installation on Local Machine

Install with Docker

Requires kubectl proxy running locally to provide API access:

# Start kubectl proxy in background
kubectl proxy &

# Run KubeLedger
docker run -d \
  --net="host" \
  --name kubeledger \
  -v /var/lib/kubeledger:/data \
  -e KL_DB_LOCATION=/data/db \
  -e KL_K8S_API_ENDPOINT=http://127.0.0.1:8001 \
  ghcr.io/realopslabs/kubeledger

Access the Dashboard on Local Machine

The dashboard is available at http://localhost:5483.

Architecture

┌───────────────────┐
│  Metrics Server   │──┐
│  (CPU/Memory)     │  │    ┌───────────────────────────────────────┐
└───────────────────┘  ├───>│         KubeLedger                    │
┌───────────────────┐  │    │  ┌─────────┐  ┌────────┐  ┌─────────┐ │
│  DCGM Exporter    │──┘    │  │ Poller  │─>│RRD DBs │─>│ API     │ │
│  (GPU metrics)    │       │  │ (5 min) │  │        │  │         │ │
└───────────────────┘       │  └─────────┘  └────────┘  └────┬────┘ │
                            └────────────────────────────────┼──────┘
                                                             │
                            ┌────────────────────────────────┼───────┐
                            │                                v       │
                            │  ┌────────────┐    ┌──────────────┐    │
                            │  │  Web UI    │    │  /metrics    │    │
                            │  │  (D3.js)   │    │ (Prometheus) │    │
                            │  └────────────┘    └──────────────┘    │
                            └────────────────────────────────────────┘
                                     │                  │
                                     v                  v
                              Built-in Dashboards   Grafana/Alerting

Data Flow:

Metrics polled every 5 minutes (configurable):
- CPU/Memory from Kubernetes Metrics Server
- GPU from NVIDIA DCGM Exporter
Metrics are processed and stored in internal lightweight time-series databases (round-robin DBs)
Data is consolidated into hourly, daily, and monthly aggregates
API serves data to the built-in web UI and Prometheus scraper

Documentation

| Topic | Link | |-------|------| | Installation on Kubernetes and OpenShift | https://kubeledger.io/docs/installation-on-kubernetes-and-openshift/ | | Installation on Docker | https://kubeledger.io/docs/installation-on-docker/ | | Built-in Dashboards and Charts of KubeLedger | https://kubeledger.io/docs/built-in-dashboards-and-charts/ | | Prometheus Exporter and Grafana dashboards | https://kubeledger.io/docs/prometheus-exporter-grafana-dashboard/ | | KubeLedger Configuration Settings | https://kubeledger.io/docs/configuration-settings/ | | Design Fundamentals | https://kubeledger.io/docs/design-fundamentals/ |

Configuration

Migration Note: All environment variables now use the KL_ prefix. Old KOA_ variables are deprecated but will be supported for backward compatibility for 6 months.

Key environment variables:

| Variable | Description | Default | |----------|-------------|---------| | KL_K8S_API_ENDPOINT | Kubernetes API server URL | Required | | KL_K8S_AUTH_TOKEN | Service account token | Auto-detected in-cluster |

Kubeledger

Install / Use

README