SkillAgentSearch skills...

LazyLLM

Easiest and laziest way for building multi-agent LLMs applications.

Install / Use

/learn @LazyAGI/LazyLLM

README

<div align="center"> <img src="https://raw.githubusercontent.com/LazyAGI/LazyLLM/main/docs/assets/LazyLLM-logo.png" width="100%"/> </div>

LazyLLM: A Low-code Development Tool For Building Multi-agent LLMs Applications.

中文 | EN

CI License GitHub star chart

What is LazyLLM?

LazyLLM is a low-code development tool for building multi-agent large language model applications. It assists developers in creating complex AI applications at very low costs and enables continuous iterative optimization. LazyLLM offers a convenient workflow for application building and provides numerous standard processes and tools for various stages of the application development process.<br> The AI application development process based on LazyLLM follows prototype building -> data feedback -> iterative optimization, which means you can quickly build a prototype application using LazyLLM, then analyze bad cases using task-specific data, and subsequently iterate on algorithms and fine-tune models at critical stages of the application to gradually improve the overall application performance.<br> LazyLLM is committed to the unity of agility and efficiency. Developers can efficiently iterate algorithms and then apply the iterated algorithms to industrial production, supporting multiple users, fault tolerance, and high concurrency. User Documentation: https://docs.lazyllm.ai/ <br> Scan the QR code below with WeChat to join the group chat(left) or learn more by watching a video(right)<br>

<p align="center"> <img src="https://github.com/user-attachments/assets/8ad8fd14-b218-48b3-80a4-7334b2a32c5a" width=250/> <img src="https://github.com/user-attachments/assets/7a042a97-1339-459e-a451-4bcd6cf64c12" width=250/> </p>

Features

Convenient AI Application Assembly Process: Even if you are not familiar with large models, you can still easily assemble AI applications with multiple agents using our built-in data flow and functional modules, just like Lego building.

One-Click Deployment of Complex Applications: We offer the capability to deploy all modules with a single click. Specifically, during the POC (Proof of Concept) phase, LazyLLM simplifies the deployment process of multi-agent applications through a lightweight gateway mechanism, solving the problem of sequentially starting each submodule service (such as LLM, Embedding, etc.) and configuring URLs, making the entire process smoother and more efficient. In the application release phase, LazyLLM provides the ability to package images with one click, making it easy to utilize Kubernetes' gateway, load balancing, and fault tolerance capabilities.

Cross-Platform Compatibility: Switch IaaS platforms with one click without modifying code, compatible with bare-metal servers, development machines, Slurm clusters, public clouds, etc. This allows developed applications to be seamlessly migrated to other IaaS platforms, greatly reducing the workload of code modification.<br>

Unified User Experience for Different Technical Choices: We provide a unified user experience for online models from different service providers and locally deployed models, allowing developers to freely switch and upgrade their models for experimentation. In addition, we also unify the user experience for mainstream inference frameworks, fine-tuning frameworks, relational databases, vector databases, and document databases.<br>

Efficient Model Fine-Tuning: Support fine-tuning models within applications to continuously improve application performance. Automatically select the best fine-tuning framework and model splitting strategy based on the fine-tuning scenario. This not only simplifies the maintenance of model iterations but also allows algorithm researchers to focus more on algorithm and data iteration, without handling tedious engineering tasks.<br>

What can you build with Lazyllm

LazyLLM can be used to build common artificial intelligence applications. Here are some examples.

3.1 ChatBots

This is a simple example of a chat bot.

# set environment variable: LAZYLLM_OPENAI_API_KEY=xx 
# or you can make a config file(~/.lazyllm/config.json) and add openai_api_key=xx
import lazyllm
chat = lazyllm.OnlineChatModule()
lazyllm.WebModule(chat).start().wait()

If you want to use a locally deployed model, please ensure you have installed at least one inference framework (lightllm or vllm), and then use the following code

import lazyllm
# Model will be downloaded automatically if you have an internet connection.
chat = lazyllm.TrainableModule('internlm2-chat-7b')
lazyllm.WebModule(chat, port=23466).start().wait()

If you installed lazyllm using pip and ensured that the bin directory of your Python environment is in your $PATH, you can quickly start a chatbot by executing lazyllm run chatbot. If you want to use a local model, you need to specify the model name with the --model parameter. For example, you can start a chatbot based on a local model by using lazyllm run chatbot --model=internlm2-chat-7b.

This is an advanced bot example with multimodality and intent recognition.

Demo Multimodal bot

<details> <summary>click to look up prompts and imports</summary>
from lazyllm import TrainableModule, WebModule, deploy, pipeline
from lazyllm.tools import IntentClassifier

painter_prompt = 'Now you are a master of drawing prompts, capable of converting any Chinese content entered by the user into English drawing prompts. In this task, you need to convert any input content into English drawing prompts, and you can enrich and expand the prompt content.'
musician_prompt = 'Now you are a master of music composition prompts, capable of converting any Chinese content entered by the user into English music composition prompts. In this task, you need to convert any input content into English music composition prompts, and you can enrich and expand the prompt content.'
</details>
base = TrainableModule('internlm2-chat-7b')
with IntentClassifier(base) as ic:
    ic.case['Chat', base]
    ic.case['Speech Recognition', TrainableModule('SenseVoiceSmall')]
    ic.case['Image QA', TrainableModule('InternVL3_5-1B').deploy_method(deploy.LMDeploy)]
    ic.case['Drawing', pipeline(base.share().prompt(painter_prompt), TrainableModule('stable-diffusion-3-medium'))]
    ic.case['Generate Music', pipeline(base.share().prompt(musician_prompt), TrainableModule('musicgen-small'))]
    ic.case['Text to Speech', TrainableModule('ChatTTS')]
WebModule(ic, history=[base], audio=True, port=8847).start().wait()

3.2 Retrieval-Augmented Generation

Demo RAG

<details> <summary>Click to view imports and prompt</summary>

import os
import lazyllm
from lazyllm import pipeline, parallel, bind, SentenceSplitter, Document, Retriever, Reranker

prompt = 'You will play the role of an AI Q&A assistant and complete a dialogue task. In this task, you need to provide your answer based on the given context and question.'
</details>

This is an online deployment example:

documents = Document(dataset_path="your data path", embed=lazyllm.OnlineEmbeddingModule(), manager=False)
documents.create_node_group(name="sentences", transform=SentenceSplitter, chunk_size=1024, chunk_overlap=100)
with pipeline() as ppl:
    with parallel().sum as ppl.prl:
        prl.retriever1 = Retriever(documents, group_name="sentences", similarity="cosine", topk=3)
        prl.retriever2 = Retriever(documents, "CoarseChunk", "bm25_chinese", 0.003, topk=3)

    ppl.reranker = Reranker("ModuleReranker", model="bge-reranker-large", topk=1) | bind(query=ppl.input)
    ppl.formatter = (lambda nodes, query: dict(context_str="".join([node.get_content() for node in nodes]), query=query)) | bind(query=ppl.input)
    ppl.llm = lazyllm.OnlineChatModule(stream=False).prompt(lazyllm.ChatPrompter(prompt, extra_keys=["context_str"]))

lazyllm.WebModule(ppl, port=23466).start().wait()

Here is an example of a local deployment:

documents = Document(dataset_path='/file/to/yourpath', embed=lazyllm.TrainableModule('bge-large-zh-v1.5'))
documents.create_node_group(name="sentences", transform=SentenceSplitter, chunk_size=1024, chunk_overlap=100)

with pipeline() as ppl:
    with parallel().sum as ppl.prl:
        prl.retriever1 = Retriever(documents, group_name="sentences", similarity="cosine", topk=3)
        prl.retriever2 = Retriever(documents, "CoarseChunk", "bm25_chinese", 0.003, topk=3)

    ppl.reranker = Reranker("ModuleReranker", model="bge-reranker-large", topk=1) | bind(query=ppl.input)
    ppl.formatter = (lambda nodes, query: dict(context_str="".join([node.get_content() for node in nodes]), query=query)) | bind(query=ppl.input)
    ppl.llm = lazyllm.TrainableModule("internlm2-chat-7b").prompt(lazyllm.ChatPrompter(prompt, extra_keys=["context_str"]))

lazyllm.WebModule(ppl, port=23456).start().wait()

https://github.com/LazyAGI/LazyLLM/assets/12124621/77267adc-6e40-47b8-96a8-895df165b0ce

If you installed lazyllm using pip and ensured that the bin directory of your Python environment is in your $PATH, you can quickly start a retrieval-augmented bot by executing lazyllm run rag --documents=/file/to/yourpath. If you want to use a local model, you need to specify the model name with the --model parameter. For example, you can start a retrieval-augmented bot based on a

Related Skills

View on GitHub
GitHub Stars3.8k
CategoryEducation
Updated5h ago
Forks369

Languages

Python

Security Score

100/100

Audited on Mar 22, 2026

No findings