# LazyLLM
The easiest and laziest way to build multi-agent LLM applications.
## What is LazyLLM?
LazyLLM is a low-code development tool for building multi-agent large language model applications. It helps developers create complex AI applications at very low cost and enables continuous iterative optimization. LazyLLM offers a convenient workflow for application assembly and provides numerous standard processes and tools for every stage of the application development process.

The AI application development process based on LazyLLM follows prototype building -> data feedback -> iterative optimization: you quickly build a prototype application with LazyLLM, analyze bad cases using task-specific data, and then iterate on algorithms and fine-tune models at critical stages of the application to gradually improve overall performance.

LazyLLM is committed to uniting agility with efficiency: developers can iterate on algorithms efficiently and then apply the resulting algorithms in industrial production, with support for multiple users, fault tolerance, and high concurrency.

User documentation: https://docs.lazyllm.ai/

Scan the QR code below with WeChat to join the group chat (left) or learn more by watching a video (right):
<p align="center"> <img src="https://github.com/user-attachments/assets/8ad8fd14-b218-48b3-80a4-7334b2a32c5a" width=250/> <img src="https://github.com/user-attachments/assets/7a042a97-1339-459e-a451-4bcd6cf64c12" width=250/> </p>

## Features
- Convenient AI Application Assembly Process: Even if you are not familiar with large models, you can still easily assemble AI applications with multiple agents using our built-in data flows and functional modules, much like building with Lego bricks (see the sketch after this list).
- One-Click Deployment of Complex Applications: We offer the capability to deploy all modules with a single click. Specifically, during the POC (Proof of Concept) phase, LazyLLM simplifies the deployment of multi-agent applications through a lightweight gateway mechanism, removing the need to start each submodule service (such as LLM and embedding services) in sequence and configure their URLs, which makes the entire process smoother and more efficient. In the application release phase, LazyLLM can package images with one click, making it easy to use Kubernetes' gateway, load balancing, and fault tolerance capabilities.
- Cross-Platform Compatibility: Switch IaaS platforms with one click without modifying code; compatible with bare-metal servers, development machines, Slurm clusters, public clouds, etc. This allows developed applications to be seamlessly migrated to other IaaS platforms, greatly reducing the workload of code modification.
- Unified User Experience for Different Technical Choices: We provide a unified user experience for online models from different service providers and for locally deployed models, allowing developers to freely switch and upgrade their models for experimentation. In addition, we also unify the user experience for mainstream inference frameworks, fine-tuning frameworks, relational databases, vector databases, and document databases.
- Efficient Model Fine-Tuning: Supports fine-tuning models within applications to continuously improve application performance, automatically selecting the best fine-tuning framework and model-splitting strategy for the fine-tuning scenario. This not only simplifies maintaining model iterations but also lets algorithm researchers focus on algorithm and data iteration rather than tedious engineering tasks.
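
To make the Lego analogy and the unified module interface concrete, here is a minimal sketch based only on the flow and module APIs used in the examples below. The two-stage flow, the prompt text, and the direct call at the end are illustrative, and it assumes an OpenAI key is configured as described in Section 3.1:

```python
import lazyllm
from lazyllm import pipeline

# The online and local chat modules expose the same interface, so they are
# interchangeable without touching the rest of the flow:
chat = lazyllm.OnlineChatModule()                      # online model service
# chat = lazyllm.TrainableModule('internlm2-chat-7b')  # locally deployed model

# Snap modules together like Lego bricks: a plain Python callable feeds the
# chat module through a two-stage data flow.
with pipeline() as ppl:
    ppl.rewrite = lambda query: f'Answer briefly: {query}'
    ppl.chat = chat

print(ppl('What is a multi-agent application?'))
```

Swapping in the commented-out line replaces the online service with a local model while the assembled flow stays unchanged.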
## What can you build with LazyLLM
LazyLLM can be used to build common artificial intelligence applications. Here are some examples.
### 3.1 ChatBots
This is a simple example of a chatbot.
```python
# Set the environment variable LAZYLLM_OPENAI_API_KEY=xx,
# or create a config file (~/.lazyllm/config.json) and add openai_api_key=xx.
import lazyllm

chat = lazyllm.OnlineChatModule()
lazyllm.WebModule(chat).start().wait()
```
If you want to use a locally deployed model, please ensure you have installed at least one inference framework (lightllm or vllm), and then use the following code:
```python
import lazyllm

# Model will be downloaded automatically if you have an internet connection.
chat = lazyllm.TrainableModule('internlm2-chat-7b')
lazyllm.WebModule(chat, port=23466).start().wait()
```
If you installed LazyLLM with pip and the bin directory of your Python environment is in your `$PATH`, you can quickly start a chatbot by executing `lazyllm run chatbot`. To use a local model, specify the model name with the `--model` parameter, e.g. `lazyllm run chatbot --model=internlm2-chat-7b`.
This is an advanced bot example with multimodal support and intent recognition.
```python
from lazyllm import TrainableModule, WebModule, deploy, pipeline
from lazyllm.tools import IntentClassifier

painter_prompt = 'Now you are a master of drawing prompts, capable of converting any Chinese content entered by the user into English drawing prompts. In this task, you need to convert any input content into English drawing prompts, and you can enrich and expand the prompt content.'
musician_prompt = 'Now you are a master of music composition prompts, capable of converting any Chinese content entered by the user into English music composition prompts. In this task, you need to convert any input content into English music composition prompts, and you can enrich and expand the prompt content.'

base = TrainableModule('internlm2-chat-7b')
# The classifier routes each request to the module registered for its intent.
with IntentClassifier(base) as ic:
    ic.case['Chat', base]
    ic.case['Speech Recognition', TrainableModule('SenseVoiceSmall')]
    ic.case['Image QA', TrainableModule('InternVL3_5-1B').deploy_method(deploy.LMDeploy)]
    ic.case['Drawing', pipeline(base.share().prompt(painter_prompt), TrainableModule('stable-diffusion-3-medium'))]
    ic.case['Generate Music', pipeline(base.share().prompt(musician_prompt), TrainableModule('musicgen-small'))]
    ic.case['Text to Speech', TrainableModule('ChatTTS')]
WebModule(ic, history=[base], audio=True, port=8847).start().wait()
```
### 3.2 Retrieval-Augmented Generation
Both of the following examples share these imports and this prompt:

```python
import lazyllm
from lazyllm import pipeline, parallel, bind, SentenceSplitter, Document, Retriever, Reranker

prompt = 'You will play the role of an AI Q&A assistant and complete a dialogue task. In this task, you need to provide your answer based on the given context and question.'
```
This is an online deployment example:
documents = Document(dataset_path="your data path", embed=lazyllm.OnlineEmbeddingModule(), manager=False)
documents.create_node_group(name="sentences", transform=SentenceSplitter, chunk_size=1024, chunk_overlap=100)
with pipeline() as ppl:
with parallel().sum as ppl.prl:
prl.retriever1 = Retriever(documents, group_name="sentences", similarity="cosine", topk=3)
prl.retriever2 = Retriever(documents, "CoarseChunk", "bm25_chinese", 0.003, topk=3)
ppl.reranker = Reranker("ModuleReranker", model="bge-reranker-large", topk=1) | bind(query=ppl.input)
ppl.formatter = (lambda nodes, query: dict(context_str="".join([node.get_content() for node in nodes]), query=query)) | bind(query=ppl.input)
ppl.llm = lazyllm.OnlineChatModule(stream=False).prompt(lazyllm.ChatPrompter(prompt, extra_keys=["context_str"]))
lazyllm.WebModule(ppl, port=23466).start().wait()
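
The `| bind(query=ppl.input)` pattern used above forwards the pipeline's original input to a later stage alongside the previous stage's output. Here is a minimal toy sketch of that mechanism, mirroring the lambda-plus-`bind` pattern from the example above; the stage names and the direct pipeline call are illustrative:

```python
from lazyllm import pipeline, bind

with pipeline() as ppl:
    ppl.double = lambda x: x * 2
    # The previous stage's output arrives as the first argument (`doubled`);
    # bind() fills `original` with the pipeline's initial input at call time.
    ppl.describe = (lambda doubled, original: f'{original} doubled is {doubled}') | bind(original=ppl.input)

print(ppl(3))  # expected output: '3 doubled is 6'
```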
Here is an example of a local deployment:
```python
documents = Document(dataset_path='/file/to/yourpath', embed=lazyllm.TrainableModule('bge-large-zh-v1.5'))
documents.create_node_group(name="sentences", transform=SentenceSplitter, chunk_size=1024, chunk_overlap=100)

with pipeline() as ppl:
    with parallel().sum as ppl.prl:
        prl.retriever1 = Retriever(documents, group_name="sentences", similarity="cosine", topk=3)
        prl.retriever2 = Retriever(documents, "CoarseChunk", "bm25_chinese", 0.003, topk=3)
    ppl.reranker = Reranker("ModuleReranker", model="bge-reranker-large", topk=1) | bind(query=ppl.input)
    ppl.formatter = (lambda nodes, query: dict(context_str="".join([node.get_content() for node in nodes]), query=query)) | bind(query=ppl.input)
    ppl.llm = lazyllm.TrainableModule("internlm2-chat-7b").prompt(lazyllm.ChatPrompter(prompt, extra_keys=["context_str"]))

lazyllm.WebModule(ppl, port=23456).start().wait()
```
https://github.com/LazyAGI/LazyLLM/assets/12124621/77267adc-6e40-47b8-96a8-895df165b0ce
If you installed LazyLLM with pip and the bin directory of your Python environment is in your `$PATH`, you can quickly start a retrieval-augmented bot by executing `lazyllm run rag --documents=/file/to/yourpath`. To use a local model, specify the model name with the `--model` parameter, e.g. `lazyllm run rag --documents=/file/to/yourpath --model=internlm2-chat-7b`.
