# Koku
An open source solution for cost management of cloud and hybrid cloud environments.
## About
Koku's goal is to provide an open source solution for cost management of cloud and hybrid cloud environments. This solution is offered via a web interface that exposes resource consumption and cost data in easily digestible and filterable views. The project also aims to provide insight into this data and ultimately provide suggested optimizations for reducing cost and eliminating unnecessary resource usage.
Full documentation is available in the docs folder.
To submit an issue please visit https://issues.redhat.com/projects/COST/.
## Getting Started
This project is developed using Python 3.11. Make sure you have at least this version installed.
### Prerequisites
- Docker or Rancher Desktop
- (macOS only) Install Homebrew
### Development
To get started developing against Koku, you first need to clone local copies of the git repositories:

```
git clone https://github.com/project-koku/koku
git clone https://github.com/project-koku/nise
```
This project is developed using the Django web framework. Many configuration settings can be read in from a `.env` file. To configure, do the following:

1. Copy `.env.example` into a `.env` and update the following in your `.env`:

   ```
   AWS_RESOURCE_NAME=YOUR_COST_MANAGEMENT_AWS_ARN
   ```

   For on-premise deployments without access to Red Hat SaaS services, also set:

   ```
   ONPREM=True
   ```

2. Copy `dev/credentials/aws.example` into `dev/credentials/aws`, obtain AWS credentials, then update the credentials file:

   ```
   [default]
   aws_access_key_id=YOUR_AWS_ACCESS_KEY_ID
   aws_secret_access_key=YOUR_AWS_SECRET_ACCESS_KEY
   ```

3. (macOS only) Install libraries for building wheels on ARM:

   ```
   brew install openssl librdkafka postgresql@16
   ```

4. (Fedora only) Install libraries for building wheels on Linux:

   ```
   dnf install openssl-devel libpq-devel postgresql golang-sigs-k8s-kustomize
   ```

5. (macOS only) Also add the following to your `.env` or shell profile:

   ```
   LDFLAGS="-L$(brew --prefix openssl)/lib -L$(brew --prefix librdkafka)/lib"
   CPPFLAGS="-I$(brew --prefix openssl)/include -I$(brew --prefix librdkafka)/include"
   PATH="$PATH:$(brew --prefix postgresql@16)/bin"
   ```

6. Developing inside a virtual environment is recommended. A Pipfile is provided, and Pipenv is recommended for combining virtual environment and dependency management. To install `pipenv`:

   ```
   pip3 install pipenv
   ```

7. The project dependencies and a virtual environment can then be created using:

   ```
   pipenv install --dev
   ```

8. To activate the virtual environment, run:

   ```
   pipenv shell
   ```

9. Install the pre-commit hooks for the repository:

   ```
   pre-commit install
   ```
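Before running the steps above, it can help to sanity-check the local toolchain. A minimal sketch in Python; the required interpreter version (3.11) comes from this README, while the `preflight` helper and its tool list are illustrative assumptions, not something Koku ships:

```python
# Hypothetical pre-flight check before running the setup steps above.
# Python 3.11 is the version this README requires; the tool list is
# an illustrative assumption.
import shutil
import sys

def preflight(tools=("git", "make")):
    """Report whether the interpreter and common tools are available."""
    report = {"python_ok": sys.version_info >= (3, 11)}
    for tool in tools:
        report[tool] = shutil.which(tool) is not None
    return report

print(preflight())
```

If any entry is `False`, install the missing tool before continuing with `pipenv install --dev`.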
## Developing with Docker Compose

This section explains how to start the server and its dependencies using Docker (or Rancher Desktop), create AWS/OCP sources, and view reports. It does not cover every API or scenario, but it should give you an end-to-end flow.
### Starting Koku using Docker Compose
Note

In order for the `koku_base` image to build correctly, BuildKit must be enabled by setting `DOCKER_BUILDKIT=1`. This is set in the `.env` file, but if you are having issues building the `koku_base` image, make sure BuildKit is enabled.
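One quick way to confirm the setting is to inspect your `.env` file. A minimal sketch, assuming simple `KEY=VALUE` lines; the `parse_env` helper is illustrative and skips the quoting and `export` edge cases a real dotenv parser handles:

```python
# Simplified .env parser used only to confirm DOCKER_BUILDKIT is set.
# Ignores comments and blank lines; does not handle quoting or export.
def parse_env(text: str) -> dict:
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            result[key.strip()] = value.strip()
    return result

sample = "# local settings\nDOCKER_BUILDKIT=1\nONPREM=True\n"
print(parse_env(sample)["DOCKER_BUILDKIT"])  # → 1
```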
1. Start the containers:

   ```
   make docker-up-min
   ```

2. Display log output from the containers. It is recommended that logs be kept in a second terminal:

   ```
   make docker-logs
   ```
With all containers running, any source added will be processed by saving its CSV files in MinIO and storing the converted Parquet files in MinIO. The source's data will then be summarized via Trino, and the summarized data will land in the appropriate `daily_summary` table for the source type for consumption by the API.
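As a rough sketch of that flow, CSV and Parquet objects end up under per-source prefixes in MinIO. The key pattern below is an illustrative assumption for the example, not Koku's exact object-store schema:

```python
# Illustrative object-store key builder; the prefix layout is assumed
# for the example and does not reflect Koku's real key schema.
def parquet_key(schema: str, source_uuid: str, year: str, month: str, name: str) -> str:
    return f"data/parquet/{schema}/{source_uuid}/{year}/{month}/{name}"

print(parquet_key("org1234567", "abcd-1234", "2023", "02", "part-0.parquet"))
# → data/parquet/org1234567/abcd-1234/2023/02/part-0.parquet
```

Partitioning keys by year and month is what lets Trino prune partitions when summarizing a single billing period.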
#### Multi-Worker Support

Koku supports running multiple Celery workers locally:

```
# Start with multiple workers (e.g., 3 workers)
make docker-up-min scale=3

# Or with Trino
make docker-up-min-trino-no-build scale=3

# View logs from all workers
make docker-logs
```
The `scale` parameter works with any docker-compose target:

- `scale=1` (default): a single worker, with the debug port available for VS Code debugging
- `scale>1`: multiple workers without the debug port (prevents port conflicts)

Note: Debug port 5678 is automatically managed; it is included for a single worker (`scale=1`) and excluded for multiple workers to prevent conflicts.
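The port rule above can be sketched as a small function. This is a hypothetical illustration; the real behavior lives in the docker-compose targets, and only port 5678 comes from this README:

```python
# Sketch of the scale rule: the debugger port is published only when a
# single worker runs. Port 5678 is from the README; the function is illustrative.
def worker_ports(scale: int, debug_port: int = 5678) -> list:
    return [debug_port] if scale == 1 else []

print(worker_ports(1))  # → [5678]
print(worker_ports(3))  # → []
```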
To add test sources and data:

```
make create-test-customer
make load-test-customer-data  # Optional parameters: start={start_date} end={end_date} test_source=AWS
```
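The optional parameters accepted by `load-test-customer-data` can be assembled like this. The helper is hypothetical; only the parameter names come from the comment above:

```python
# Illustrative builder for the optional make parameters shown above.
def load_cmd(start=None, end=None, test_source=None):
    cmd = ["make", "load-test-customer-data"]
    for key, value in (("start", start), ("end", end), ("test_source", test_source)):
        if value is not None:
            cmd.append(f"{key}={value}")
    return " ".join(cmd)

print(load_cmd(start="2023-02-01", end="2023-02-28", test_source="AWS"))
# → make load-test-customer-data start=2023-02-01 end=2023-02-28 test_source=AWS
```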
The MinIO UI will be available at http://127.0.0.1:9090/minio/. Use the `S3_ACCESS_KEY` and `S3_SECRET` set in your `.env` file as login credentials.
The Trino UI will be available at http://127.0.0.1:8080/ui/, where details on queries can be found. This is particularly useful for troubleshooting failures.
Access the Trino CLI using the following command:

```
docker exec -it trino trino --server 127.0.0.1:8080 --catalog hive --schema org1234567 --user admin --debug
```

Example usage:

```sql
SHOW tables;
SELECT * from aws_line_items WHERE source='{source}' AND year='2023' AND month='02' LIMIT 100;
```
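Queries against the line-item tables should filter on the `source`, `year`, and `month` partition columns, as in the example above, so Trino scans only one partition. A hypothetical helper that builds such a query string (illustrative only; it performs no quoting or escaping of inputs):

```python
# Builds a partition-filtered Trino query like the example above.
# Illustrative only; inputs are interpolated without escaping.
def line_items_query(source: str, year: str, month: str, limit: int = 100) -> str:
    return (
        "SELECT * FROM aws_line_items "
        f"WHERE source='{source}' AND year='{year}' AND month='{month}' "
        f"LIMIT {limit}"
    )

print(line_items_query("my-source", "2023", "02"))
```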
### Run AWS Scenario

1. Create an AWS source:

   ```
   make aws-source aws_name=AWS-SOURCE-001 bucket=cost-usage-bucket
   ```

2. Verify the source exists by visiting http://127.0.0.1:8000/api/cost-management/v1/sources/?name=AWS-SOURCE-001

3. Trigger MASU processing by visiting http://127.0.0.1:5042/api/cost-management/v1/download/

4. Wait for processing to complete.

5. Verify data exists using the AWS API endpoints:
   - http://127.0.0.1:8000/api/cost-management/v1/reports/aws/instance-types/
   - http://127.0.0.1:8000/api/cost-management/v1/reports/aws/costs/
   - http://127.0.0.1:8000/api/cost-management/v1/reports/aws/storage/
### Run OCP Scenario

1. Create an OCP source:

   ```
   make ocp-source-from-yaml cluster_id=my_test_cluster srf_yaml=../nise/example_ocp_static_data.yml ocp_name=my_ocp_name
   ```

2. Verify the provider exists by visiting http://127.0.0.1:8000/api/cost-management/v1/sources/?name=my_ocp_name

3. Trigger MASU processing by visiting http://127.0.0.1:5042/api/cost-management/v1/download/

4. Wait for processing to complete.

5. Verify data exists using the API endpoints:
   - http://127.0.0.1:8000/api/cost-management/v1/reports/openshift/volumes/
   - http://127.0.0.1:8000/api/cost-management/v1/reports/openshift/memory/
   - http://127.0.0.1:8000/api/cost-management/v1/reports/openshift/compute/
### Run GCP Scenario

1. Set environment variables:

   - `GCP_DATASET` - the name of the BigQuery dataset in your GCP setup
   - `GCP_TABLE_ID` - the identifier of the table you are pulling the billing information from
   - `GCP_PROJECT_ID` - the identifier of the GCP project

2. Create a GCP source:

   ```
   make gcp-source gcp_name=my_gcp_source
   ```

3. Verify the provider exists by visiting http://127.0.0.1:8000/api/cost-management/v1/sources/?name=my_gcp_source
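The three variables above can be collected into a single settings mapping before creating the source. A minimal sketch; the `gcp_settings` helper is illustrative, and only the variable names come from this README:

```python
import os

# Collects the GCP variables named above from an environment mapping.
# Illustrative helper; only the variable names are from the README.
def gcp_settings(env=None):
    env = os.environ if env is None else env
    return {
        "dataset": env.get("GCP_DATASET"),
        "table_id": env.get("GCP_TABLE_ID"),
        "project_id": env.get("GCP_PROJECT_ID"),
    }

print(gcp_settings({"GCP_DATASET": "billing_ds", "GCP_TABLE_ID": "t1", "GCP_PROJECT_ID": "p1"}))
```

A `None` value in the result means the corresponding variable is unset and the source creation will likely fail.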
### Stopping Koku using Docker Compose

To bring down all the docker containers, run the following command:

```
make docker-down
```
## Database

PostgreSQL is used as the database backend for Koku. A docker-compose file is provided for creating a local database container. Assuming the default `.env` file values are used, to access the database directly using psql run:

```
PGPASSWORD=postgres psql postgres -U postgres -h 127.0.0.1 -p 15432
```
Note

There is a known limitation with Docker Compose on Linux environments with SELinux enabled. You may see the following error during the postgres container deployment:

```
mkdir: cannot create directory '/var/lib/pgsql/data/userdata': Permission denied
```

It can be resolved by granting `./pg_data` ownership permissions to uid 26 (the postgres user in `centos/postgresql-96-centos7`). If you see this error, run the following command from the project top-level directory:

```
setfacl -m u:26:-wx ./pg_data
```
### Database Query Monitoring

A basic level of query monitoring has been included, leveraging a local Grafana container that is built with the `docker-up` make target.
To use the monitor, open a new web browser tab or window and enter the following URL: http://127.0.0.1:3001
You will be presented with the Grafana login page. For this monitor, use the following credentials:
User: admin
Password: admin12
Once you have logged into the server, you will be taken straight to the main dashboard, which has five panels:

<table> <tr> <td colspan="2">Query Statistics</td> </tr> <tr> <td>Connect States</td><td>Active Queries</td> </tr> <tr> <td>Lock Types</td><td>Lock Detail</td> </tr> </table>

- Query Statistics - The max execution time, the mean execution time, the number of calls, and the query text
- Connect States - Shows the connection states (active, idle, idle in transaction, etc)
- Active Queries - Shows the approximate run time (based on the probe time) and the query text of queries detected
- Lock Types - Shows the discrete lock types detected during the probe
- Lock Detail - Shows any detailed information for the lock and the affected query.
The Query Statistics panel is cumulative. The remaining panels are ephemeral.
Information about PostgreSQL statistics can be found here: https://www.postgresql.org/docs/12/monitoring-stats.html
Information about Grafana dashboards can be found here: https://grafana.com/docs/grafana/latest/features/dashboard/dashboards/