Agents

A framework for building realtime voice AI agents 🤖🎙️📹


<!--BEGIN_BANNER_IMAGE--> <picture> <source media="(prefers-color-scheme: dark)" srcset="/.github/banner_dark.png"> <source media="(prefers-color-scheme: light)" srcset="/.github/banner_light.png"> <img style="width:100%;" alt="The LiveKit icon, the name of the repository and some sample code in the background." src="https://raw.githubusercontent.com/livekit/agents/main/.github/banner_light.png"> </picture> <!--END_BANNER_IMAGE--> <br />


<br />

Looking for the JS/TS library? Check out AgentsJS

What is Agents?

<!--BEGIN_DESCRIPTION-->

The Agent Framework is designed for building realtime, programmable participants that run on servers. Use it to create conversational, multi-modal voice agents that can see, hear, and understand.

<!--END_DESCRIPTION-->

Features

  • Flexible integrations: A comprehensive ecosystem to mix and match the right STT, LLM, TTS, and Realtime API to suit your use case.
  • Integrated job scheduling: Built-in task scheduling and distribution with dispatch APIs to connect end users to agents.
  • Extensive WebRTC clients: Build client applications using LiveKit's open-source SDK ecosystem, supporting all major platforms.
  • Telephony integration: Works seamlessly with LiveKit's telephony stack, allowing your agent to make calls to or receive calls from phones.
  • Exchange data with clients: Use RPCs and other Data APIs to seamlessly exchange data with clients.
  • Semantic turn detection: Uses a transformer model to detect when a user is done with their turn, helping to reduce interruptions.
  • MCP support: Native support for MCP. Integrate tools provided by MCP servers with a single line of code.
  • Built-in test framework: Write tests and use judges to ensure your agent is performing as expected.
  • Open-source: Fully open-source, allowing you to run the entire stack on your own servers, including LiveKit server, one of the most widely used WebRTC media servers.

Installation

To install the core Agents library, along with plugins for popular model providers:

pip install "livekit-agents[openai,silero,deepgram,cartesia,turn-detector]~=1.4"

Docs and guides

Documentation on the framework and how to use it can be found here

Building with AI coding agents

If you're using an AI coding assistant to build with LiveKit Agents, we recommend the following setup for the best results:

  1. Install the LiveKit Docs MCP server — Gives your coding agent access to up-to-date LiveKit documentation, code search across LiveKit repositories, and working examples.

  2. Install the LiveKit Agent Skill — Provides your coding agent with architectural guidance and best practices for building voice AI applications, including workflow design, handoffs, tasks, and testing patterns.

    npx skills add livekit/agent-skills --skill livekit-agents
    

The Agent Skill works best alongside the MCP server: the skill teaches your agent how to approach building with LiveKit, while the MCP server provides the current API details to implement it correctly.

Core concepts

  • Agent: An LLM-based application with defined instructions.
  • AgentSession: A container for agents that manages interactions with end users.
  • entrypoint: The starting point for an interactive session, similar to a request handler in a web server.
  • AgentServer: The main process that coordinates job scheduling and launches agents for user sessions.

Usage

Simple voice agent


from livekit.agents import (
    Agent,
    AgentServer,
    AgentSession,
    JobContext,
    RunContext,
    cli,
    function_tool,
    inference,
)
from livekit.plugins import silero


@function_tool
async def lookup_weather(
    context: RunContext,
    location: str,
):
    """Used to look up weather information."""

    return {"weather": "sunny", "temperature": 70}


server = AgentServer()


@server.rtc_session()
async def entrypoint(ctx: JobContext):
    session = AgentSession(
        vad=silero.VAD.load(),
        # any combination of STT, LLM, TTS, or realtime API can be used
        # this example shows LiveKit Inference, a unified API to access different models via LiveKit Cloud
        # to use model provider keys directly, replace with the following:
        # from livekit.plugins import deepgram, openai, cartesia
        # stt=deepgram.STT(model="nova-3"),
        # llm=openai.LLM(model="gpt-4.1-mini"),
        # tts=cartesia.TTS(model="sonic-3", voice="9626c31c-bec5-4cca-baa8-f8ba9e84c8bc"),
        stt=inference.STT("deepgram/nova-3", language="multi"),
        llm=inference.LLM("openai/gpt-4.1-mini"),
        tts=inference.TTS("cartesia/sonic-3", voice="9626c31c-bec5-4cca-baa8-f8ba9e84c8bc"),
    )

    agent = Agent(
        instructions="You are a friendly voice assistant built by LiveKit.",
        tools=[lookup_weather],
    )

    await session.start(agent=agent, room=ctx.room)
    await session.generate_reply(instructions="greet the user and ask about their day")


if __name__ == "__main__":
    cli.run_app(server)

You'll need the following environment variables for this example:

  • LIVEKIT_URL
  • LIVEKIT_API_KEY
  • LIVEKIT_API_SECRET
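
For local development, one common setup is to keep these in a `.env` file and load it with python-dotenv before starting the agent. The values below are placeholders; substitute your own project's credentials:

```shell
# .env — placeholder credentials, replace with your project's values
LIVEKIT_URL="wss://<your-project>.livekit.cloud"
LIVEKIT_API_KEY="<api-key>"
LIVEKIT_API_SECRET="<api-secret>"
```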

Multi-agent handoff


This code snippet is abbreviated. For the full example, see multi_agent.py

...
class IntroAgent(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions="You are a storyteller. Your goal is to gather a few pieces of information from the user to make the story personalized and engaging. "
            "Ask the user for their name and where they are from."
        )

    async def on_enter(self):
        self.session.generate_reply(instructions="greet the user and gather information")

    @function_tool
    async def information_gathered(
        self,
        context: RunContext,
        name: str,
        location: str,
    ):
        """Called when the user has provided the information needed to make the story personalized and engaging.

        Args:
            name: The name of the user
            location: The location of the user
        """

        context.userdata.name = name
        context.userdata.location = location

        story_agent = StoryAgent(name, location)
        return story_agent, "Let's start the story!"


class StoryAgent(Agent):
    def __init__(self, name: str, location: str) -> None:
        super().__init__(
            instructions="You are a storyteller. Use the user's information in order to make the story personalized. "
            f"The user's name is {name}, from {location}.",
            # override the default model, switching to the Realtime API from standard LLMs
            llm=openai.realtime.RealtimeModel(voice="echo"),
            chat_ctx=chat_ctx,
        )

    async def on_enter(self):
        self.session.generate_reply()


@server.rtc_session()
async def entrypoint(ctx: JobContext):
    userdata = StoryData()
    session = AgentSession[StoryData](
        vad=silero.VAD.load(),
        stt="deepgram/nova-3",
        llm="openai/gpt-4.1-mini",
        tts="cartesia/sonic-3:9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
        userdata=userdata,
    )

    await session.start(
        agent=IntroAgent(),
        room=ctx.room,
    )
...
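
The `StoryData` userdata container passed to `AgentSession[StoryData]` above is defined in the full example; a minimal sketch (field names assumed from the `information_gathered` tool arguments) might look like:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoryData:
    # populated by IntroAgent.information_gathered before the handoff to StoryAgent
    name: Optional[str] = None
    location: Optional[str] = None
```

With `userdata=StoryData()` on the session, any tool can read or write these fields via `context.userdata`, which is how state survives the handoff between agents.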

Testing

Automated tests are essential for building reliable agents, especially given the non-deterministic behavior of LLMs. LiveKit Agents includes native test integration to help you create dependable agents.

import pytest

from livekit.agents import AgentSession
from livekit.plugins import google


@pytest.mark.asyncio
async def test_no_availability() -> None:
    llm = google.LLM()
    async with AgentSession(llm=llm) as sess:
        await sess.start(MyAgent())
        result = await sess.run(
            user_input="Hello, I need to place an order."
        )
        result.expect.skip_next_event_if(type="message", role="assistant")
        result.expect.next_event().is_function_call(name="start_order")
        result.expect.next_event().is_function_call_output()
        await (
            result.expect.next_event()
            .is_message(role="assistant")
            .judge(llm, intent="assistant should be asking the user what they would like")
        )

Examples

For more examples and detailed setup instructions, see the examples directory. For even more examples, see the python-agents-examples repository.

<table> <tr> <td width="50%"> <h3>🎙️ Starter Agent</h3> <p>A starter agent optimized for voice conversations.</p> <p> <a href="examples/voice_agents/basic_agent.py">Code</a> </p> </td> <td width="50%"> <h3>🔄 Multi-user push to talk</h3> <p>Responds to multiple users in the room via push-to-talk.</p> <p> <a href="examples/voice_agents/push_to_talk.py">Code</a> </p> </td> </tr> <tr> <td width="50%"> <h3>🎵 Background audio</h3> <p>Background ambient and thinking audio to improve realism.</p> <p> <a href="examples/voice_agents/bac