MultiAgent

Generative web browsing chat agent with text + vision input. Powered by MultiOn, llama-3, llava, qwen, Next.js, FastAPI, and Supabase. Landed me an internship at MultiOn :)


🎥 Video demo

Autonomous Web Browsing AI Agent with Vision - powered by MultiOn

As of May 6, MultiOnChat has been renamed MultiAgent, and the frontend and backend repositories have been merged!

Local setup 💻

Prerequisites

git
Python 3.11
pipenv
Node 21
pnpm

Clone repository

Clone this GitHub repository:

git clone https://github.com/justinsunyt/multiagent.git

Create Supabase project

Create an account on Supabase if you don't have one already.
Create a project.
In Table Editor, create a table called chats with the following columns:

Make sure last_chatted and session_id are nullable.

Create the following auth policy for chats:

In Authentication -> Providers, enable Email as an auth provider.

Launch backend

Navigate to the backend/ folder:

cd backend

Create a .env file in the backend/ folder and store the following variables:

SUPABASE_URL="<Supabase project URL>"
SUPABASE_KEY="<Supabase anon key>"
SUPABASE_JWT_SECRET="<Supabase JWT secret>"
SUPABASE_JWT_ISSUER="<Supabase project URL>/auth/v1"
REPLICATE_API_TOKEN="<Replicate API token>"
GROQ_API_KEY="<Groq API key>"
MULTION_API_KEY="<MultiOn API key>"
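
Before starting the server, it can help to confirm that every variable above is actually set. The following is a minimal sketch, not part of the repository: the helper names missing_vars and issuer_matches are hypothetical, and it assumes the variables are already in the process environment (pipenv loads a .env file automatically for pipenv shell and pipenv run).

```python
import os

# Variable names are taken from the .env listing above.
REQUIRED_VARS = [
    "SUPABASE_URL",
    "SUPABASE_KEY",
    "SUPABASE_JWT_SECRET",
    "SUPABASE_JWT_ISSUER",
    "REPLICATE_API_TOKEN",
    "GROQ_API_KEY",
    "MULTION_API_KEY",
]


def missing_vars(env):
    """Return the required variable names that are absent or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


def issuer_matches(env):
    """Check that SUPABASE_JWT_ISSUER is the project URL followed by /auth/v1."""
    expected = env.get("SUPABASE_URL", "").rstrip("/") + "/auth/v1"
    return env.get("SUPABASE_JWT_ISSUER", "") == expected


if __name__ == "__main__":
    missing = missing_vars(os.environ)
    if missing:
        print("Missing variables:", ", ".join(missing))
    elif not issuer_matches(os.environ):
        print("SUPABASE_JWT_ISSUER should be <Supabase project URL>/auth/v1")
    else:
        print("Backend environment looks complete.")
```

Saved under any file name in backend/ (check_env.py, for example) and run inside the pipenv shell, this prints which variables still need a value.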

Launch pipenv environment:

pipenv shell

Install required packages:

pipenv install

Run the FastAPI development server:

uvicorn main:app --reload

Launch frontend

In a new terminal, navigate to the frontend/ folder from the repository root:

cd frontend

Create a .env.local file in the frontend/ folder and store the following variables:

NEXT_PUBLIC_SUPABASE_URL="<Supabase project URL>"
NEXT_PUBLIC_SUPABASE_ANON_KEY="<Supabase anon key>"
NEXT_PUBLIC_PLATFORM_URL="<deployed backend URL, will only be used in production>"

Install required packages:

pnpm install

Run the Next.js development server:

pnpm dev

Finally, open http://localhost:3000 in your browser to start using MultiAgent!

Tech stack ⚙️

Client: Next.js, TanStack Query
UI: Tailwind, shadcn/ui, Framer Motion, Lucide, Sonner, Spline
Server: FastAPI
Database: Supabase
AI: MultiOn, Replicate, Groq

Features 🔍

Autonomously browse the internet with your own AI agent using only an image and a command - order a Big Mac, schedule events, and shop for outfits!
Supabase database and email authentication, with JWT verification so that row-level security (RLS) policies apply to stored chats
Currently supports llama3-70b, llava-13b, llava-v1.6-34b, qwen-vl-chat
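
To illustrate the JWT verification mentioned in the features above, here is a rough standard-library sketch of checking a token against SUPABASE_JWT_SECRET and SUPABASE_JWT_ISSUER. It is not the project's actual code: it assumes the token is signed with HS256, and the function names sign_hs256 and verify_hs256 are hypothetical. A real backend would more likely rely on a maintained library such as PyJWT.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_decode(segment: str) -> bytes:
    # JWT segments drop base64 padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _hs256(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode(), signing_input.encode("ascii"), hashlib.sha256).digest()


def sign_hs256(claims: dict, secret: str) -> str:
    """Build an HS256 token; only used here to produce a token for the demo."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    signature = _b64url_encode(_hs256(f"{header}.{payload}", secret))
    return f"{header}.{payload}.{signature}"


def verify_hs256(token: str, secret: str, issuer: str) -> dict:
    """Return the claims if the signature, issuer and expiry all check out."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(signature_b64)
    except ValueError as exc:  # wrong segment count, bad base64 or bad JSON
        raise ValueError("malformed token") from exc
    if not isinstance(header, dict) or not isinstance(claims, dict):
        raise ValueError("malformed token")
    if header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(signature, _hs256(f"{header_b64}.{payload_b64}", secret)):
        raise ValueError("bad signature")
    if claims.get("iss") != issuer:
        raise ValueError("wrong issuer")
    if claims.get("exp", 0) <= time.time():
        raise ValueError("token expired")
    return claims
```

In this sketch the caller would pass the bearer token from the request together with the two environment values, and use the returned sub claim as the user ID that the RLS policy compares against.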

What's next? 💪

Llama tool calling to activate agent whenever appropriate
Refine image prompt recursively with Llama
Chat selection menu to choose between any combination of LLMs and VLMs
Toggle MultiOn local mode
Deploy! (You will have to use your own API keys)

Credits 🙏

Special thanks to MultiOn for the epic agent package and auroregmbt for the Spline animation!
