# Bolna

Conversational voice AI agents
> [!NOTE]
> We are actively looking for maintainers.
## Introduction

Bolna is an end-to-end, open-source, production-ready framework for quickly building LLM-based, voice-driven conversational applications.
## Demo
https://github.com/bolna-ai/bolna/assets/1313096/2237f64f-1c5b-4723-b7e7-d11466e9b226
## What is this repository?

This repository contains the entire orchestration platform for building voice AI applications. It orchestrates voice conversations over WebSockets using a combination of different ASR, LLM, and TTS providers and models.
## Components

Bolna consists of three components:
- Orchestration platform (this open source repository)
- Hosted APIs (https://docs.bolna.ai/api-reference/introduction) built on top of this orchestration platform [currently closed source]
- No-code UI playground at https://platform.bolna.ai/ using the hosted APIs + tailwind CSS [currently closed source]
## Development philosophy

- Any integration, enhancement, or feature lands first in this open-source package, since it forms the backbone of our hosted APIs and dashboard
- Next, we expose new APIs or update existing ones as required
- Finally, we surface the changes in the UI dashboard
```mermaid
graph LR;
    A[Bolna open source] --> B[Hosted APIs];
    B[Hosted APIs] --> C[Hosted Playground]
```
## Supported providers and models

Bolna helps you create AI voice agents which can be instructed to do tasks beginning with:

- Initiating a phone call using telephony providers like Twilio, Plivo, Exotel (coming soon), Vonage (coming soon), etc.
- Transcribing the conversations using Deepgram, Azure, etc.
- Using LLMs like OpenAI, DeepSeek, Llama, Cohere, Mistral, etc. to handle conversations
- Synthesizing LLM responses back to telephony using AWS Polly, ElevenLabs, Deepgram, OpenAI, Azure, Cartesia, Smallest, etc.
Refer to the docs for a deep dive into all supported providers.
## Local example setup [will be moved to a different repository]

A basic local setup uses Twilio or Plivo for telephony. The setup is dockerized in local_setup/. You will need to populate a .env file using .env.sample as a reference.
The setup consists of four containers:

- Telephony web server:
  - Twilio: for initiating calls, you will need to set up a Twilio account
  - Plivo: for initiating calls, you will need to set up a Plivo account
- Bolna server: for creating and handling agents
- ngrok: for tunneling; you will need to add the authtoken to ngrok-config.yml
- redis: for persisting agent and prompt data
### Quick Start

The easiest way to get started is to use the provided script:

```sh
cd local_setup
chmod +x start.sh
./start.sh
```
This script will check for Docker dependencies, build all services with BuildKit enabled, and start them in detached mode.
### Manual Setup

Alternatively, you can manually build and run the services:

1. Make sure you have Docker with Docker Compose V2 installed
2. Enable BuildKit for faster builds:

   ```sh
   export DOCKER_BUILDKIT=1
   export COMPOSE_DOCKER_CLI_BUILD=1
   ```

3. Build the images:

   ```sh
   docker compose build
   ```

4. Run the services:

   ```sh
   docker compose up -d
   ```
To run specific services only:

```sh
docker compose up -d bolna-app twilio-app
# or
docker compose up -d bolna-app plivo-app
```
Once the Docker containers are up, you can start creating agents and instructing them to initiate calls.

## Example agents to create, use and start making calls
You may try out different agents from example.bolna.dev.
## Programmatic usage (minimal example)
You can also build and run an agent directly in Python without the local telephony setup.
Example script: `examples/simple_assistant.py`

```python
import asyncio

from bolna.assistant import Assistant
from bolna.models import (
    Transcriber,
    Synthesizer,
    ElevenLabsConfig,
    LlmAgent,
    SimpleLlmAgent,
)


async def main():
    assistant = Assistant(name="demo_agent")

    # Configure audio input (ASR)
    transcriber = Transcriber(provider="deepgram", model="nova-2", stream=True, language="en")

    # Configure LLM
    llm_agent = LlmAgent(
        agent_type="simple_llm_agent",
        agent_flow_type="streaming",
        llm_config=SimpleLlmAgent(
            provider="openai",
            model="gpt-4o-mini",
            temperature=0.3,
        ),
    )

    # Configure audio output (TTS)
    synthesizer = Synthesizer(
        provider="elevenlabs",
        provider_config=ElevenLabsConfig(
            voice="George", voice_id="JBFqnCBsd6RMkjVDRZzb", model="eleven_turbo_v2_5"
        ),
        stream=True,
        audio_format="wav",
    )

    # Build a single coherent pipeline: transcriber -> llm -> synthesizer
    assistant.add_task(
        task_type="conversation",
        llm_agent=llm_agent,
        transcriber=transcriber,
        synthesizer=synthesizer,
        enable_textual_input=False,
    )

    # Stream results
    async for chunk in assistant.execute():
        print(chunk)


if __name__ == "__main__":
    asyncio.run(main())
```
How to run:

```sh
export OPENAI_API_KEY=...
export DEEPGRAM_AUTH_TOKEN=...
export ELEVENLABS_API_KEY=...

python examples/simple_assistant.py
```
This demonstrates orchestration and streaming output. For telephony, use the services in local_setup/.
Note: For REST-based usage (Agent CRUD over HTTP), see API.md in the repo root.
Expected output shape: assistant.execute() is an async generator yielding per-task result dicts (event-like chunks). The exact keys depend on configured tools/providers; treat it as a stream and process incrementally.
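Because `assistant.execute()` behaves as an async generator, chunks should be handled as they arrive rather than collected into one buffer. A minimal sketch of that consumption pattern, using a hypothetical stand-in generator (`fake_execute` and its chunk keys are illustrative, not part of Bolna's API):

```python
import asyncio


# Hypothetical stand-in for assistant.execute(): any async generator of
# event-like dicts can be consumed with the same pattern.
async def fake_execute():
    for i in range(3):
        yield {"type": "llm_response", "data": f"chunk-{i}"}


async def consume():
    texts = []
    async for chunk in fake_execute():
        # Process each chunk incrementally instead of buffering the stream
        if chunk.get("type") == "llm_response":
            texts.append(chunk["data"])
    return texts


print(asyncio.run(consume()))  # ['chunk-0', 'chunk-1', 'chunk-2']
```

The same `async for` loop works unchanged against a real `assistant.execute()` stream; only the chunk-key checks would need to match the events your configuration actually emits.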
## Text-only pipeline example
If you want a text-only flow (no transcriber/synthesizer), you can enable a text-only pipeline:
Example script: `examples/text_only_assistant.py`

```python
import asyncio

from bolna.assistant import Assistant
from bolna.models import LlmAgent, SimpleLlmAgent


async def main():
    assistant = Assistant(name="text_only_agent")

    llm_agent = LlmAgent(
        agent_type="simple_llm_agent",
        agent_flow_type="streaming",
        llm_config=SimpleLlmAgent(
            provider="openai",
            model="gpt-4o-mini",
            temperature=0.2,
        ),
    )

    # No transcriber/synthesizer; enable a text-only pipeline
    assistant.add_task(
        task_type="conversation",
        llm_agent=llm_agent,
        enable_textual_input=True,
    )

    async for chunk in assistant.execute():
        print(chunk)


if __name__ == "__main__":
    asyncio.run(main())
```
How to run (text-only):

```sh
export OPENAI_API_KEY=...

python examples/text_only_assistant.py
```
Expected output shape: assistant.execute() yields streaming dicts per task step; fields vary by configuration. Handle chunk-by-chunk.
## Using your own providers

You can populate the .env file with your own provider keys.

Transcription (ASR) providers:

| Provider | Environment variable to be added in .env file |
|----------|-----------------------------------------------|
| Deepgram | DEEPGRAM_AUTH_TOKEN                           |
These are the currently supported LLM provider families: https://github.com/bolna-ai/bolna/blob/10fa26e5985d342eedb5a8985642f12f1cf92a4b/bolna/providers.py#L30-L47
For LiteLLM-based LLMs, add whichever of the following applies to the .env file, depending on your use case:

- LITELLM_MODEL_API_KEY: API key of the LLM
- LITELLM_MODEL_API_BASE: URL of the hosted LLM
- LITELLM_MODEL_API_VERSION: API version for LLMs like Azure

For LLMs hosted via vLLM, add the following to the .env file:

- VLLM_SERVER_BASE_URL: URL of the LLM hosted using vLLM
Synthesis (TTS) providers:

| Provider   | Environment variable to be added in .env file    |
|------------|--------------------------------------------------|
| AWS Polly  | Accessed from system-wide credentials via ~/.aws |
| ElevenLabs | ELEVENLABS_API_KEY                               |
| OpenAI     | OPENAI_API_KEY                                   |
| Deepgram   | DEEPGRAM_AUTH_TOKEN                              |
| Cartesia   | CARTESIA_API_KEY                                 |
| Smallest   | SMALLEST_API_KEY                                 |
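Before launching agents, it can help to verify that the keys your chosen providers need are actually set. A small sketch of such a check (the `REQUIRED_KEYS` list and `missing_keys` helper below are illustrative, not part of Bolna; adjust the list to the providers you use):

```python
import os

# Illustrative list: adjust to the providers you actually use
REQUIRED_KEYS = ["OPENAI_API_KEY", "DEEPGRAM_AUTH_TOKEN", "ELEVENLABS_API_KEY"]


def missing_keys(env=None):
    """Return the required keys that are unset or empty."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]


# Example with a partial environment: only the OpenAI key is set
print(missing_keys({"OPENAI_API_KEY": "sk-test"}))
# ['DEEPGRAM_AUTH_TOKEN', 'ELEVENLABS_API_KEY']
```

Running a check like this before `docker compose up` gives a clearer failure message than a provider error surfacing mid-call.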