<p align="center">
<img src="docs/images/logo.png" alt="AskTube's Logo"/>
</p>
<p align="center">
<strong>AskTube - An AI-powered YouTube video summarizer and QA assistant powered by Retrieval Augmented Generation (RAG) 🤖</strong>
</p>
<p align="center">
<i>Run it entirely on your local machine with Ollama, or cloud-based models like Claude, OpenAI, Gemini, Mistral, and more</i>
</p>
## 🏃🏽➡️ Demo & Screenshot
<p align="center"> <img src="docs/images/demo-21.png" alt="Demo 21"/> </p>
<p align="center"> <img src="docs/images/demo-22.png" alt="Demo 22"/> </p>
<p align="center"> <img src="docs/images/demo-23.png" alt="Demo 23"/> </p>

Watch "AskTube First Demo" on YouTube:
https://github.com/user-attachments/assets/610ec00b-e25a-4ac5-900c-145c8485675f
## 💤 Features
- [x] Works even with videos that have no subtitles
- [x] No limit on video length
- [x] Supports multiple AI vendors
- [x] Focuses on the RAG implementation
- [x] Can run fully on your local machine
## 🤷🏽 Why does this project exist?
- I’ve seen several GitHub repositories offering AI-powered summaries for YouTube videos, but none include Q&A functionality.
- I want to implement a more comprehensive solution while also gaining experience with AI to build my own RAG application.
## 🔨 Technology
- Language: Python, JavaScript
- Server: Python@v3.10, Bun@v1
- Framework/Lib: Sanic, Peewee, Pytubefix, Sentence Transformers, SQLite, Chroma, NuxtJS/DaisyUI, etc.
- Embedding Provider (Analysis Provider):
  - [x] OpenAI
  - [x] Gemini
  - [x] VoyageAI
  - [x] Mistral
  - [x] Sentence Transformers (Local)
- AI Provider:
  - [x] OpenAI
  - [x] Claude
  - [x] Gemini
  - [x] Mistral
  - [x] Ollama (Local)
- Speech To Text:
  - [x] Faster-Whisper (Local)
  - [x] AssemblyAI
  - [x] OpenAI
  - [x] Gemini
## 🗓️ Next Todo Tasks
- [ ] Implement Speech To Text for cloud models
  - [ ] AssemblyAI
  - [ ] OpenAI
  - [ ] Gemini
- [ ] Enhance
  - [x] ~~Skip using RAG for short videos~~
  - [ ] Chat prompts, chat messages by context limit
- [ ] RAG: Implement Query Translation
  - [x] ~~Multiquery~~
  - [ ] Fusion
  - [ ] Decomposition
  - [ ] Step back
  - [ ] HyDE
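The Multiquery technique checked off above can be sketched with plain Python: generate a few reworded variants of the question, retrieve documents for each variant, and merge the union of the results. This is an illustrative toy, not AskTube's actual code; `paraphrase` stubs what would normally be an LLM call, and `retrieve` is a trivial word-overlap ranker standing in for a real vector search.

```python
def paraphrase(question: str) -> list[str]:
    # Stub: a real implementation would ask an LLM for reworded variants.
    return [
        question,
        question.replace("video", "clip"),
        f"In the video, {question.lower()}",
    ]

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    # Stub retriever: rank documents by how many words they share with the query.
    qwords = set(query.lower().split())
    ranked = sorted(docs, key=lambda d: len(qwords & set(d.lower().split())), reverse=True)
    return ranked[:k]

def multiquery_retrieve(question: str, docs: list[str]) -> list[str]:
    # Union of per-variant results, de-duplicated while preserving order.
    seen, merged = set(), []
    for variant in paraphrase(question):
        for doc in retrieve(variant, docs):
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged

docs = ["this clip covers cooking", "this video covers history"]
print(multiquery_retrieve("what does the video show", docs))
```

The payoff is visible in the example: the original question only matches the "video" document, but the "clip" paraphrase also surfaces the other one, so the merged result covers both.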
## 🚀 How to run?

The first run may be a bit slow because the program needs to download the local models.
### Run on your machine

1. Ensure you have installed:
   - **Python 3.10**
     - Windows users: please download it here
     - Linux/macOS users: use `homebrew` or your package install command (`apt`, `dnf`, etc.), or use `conda`
   - **Poetry**
     - Windows users: open `PowerShell` and run:
       ```powershell
       (Invoke-WebRequest -Uri https://install.python-poetry.org -UseBasicParsing).Content | py -
       ```
     - Linux/macOS users: open a `Terminal` and run:
       ```shell
       curl -sSL https://install.python-poetry.org | python3 -
       ```
   - **Bun** (via the official installer): Windows users open `PowerShell` and run `powershell -c "irm bun.sh/install.ps1 | iex"`; Linux/macOS users run `curl -fsSL https://bun.sh/install | bash`
   - **FFmpeg**
     - macOS users: `brew install ffmpeg`
     - Linux users:
       ```shell
       # Ubuntu
       sudo apt install ffmpeg
       # Fedora
       sudo dnf install -y ffmpeg
       ```
     - Windows users: please follow this tutorial: Install ffmpeg for Windows
2. Clone the repository:
   ```shell
   git clone https://github.com/jonaskahn/asktube.git
   ```
3. Create a `.env` file in the `asktube/engine` directory.
4. Run the program:
   - You may need to run first: `poetry env use python`
   - Open a `terminal/cmd/powershell` in the `asktube/engine` directory, then run:
     ```shell
     poetry install && poetry run python engine/server.py
     ```
   - Open a `terminal/cmd/powershell` in the `asktube/web` directory, then run:
     ```shell
     bun install && bun run dev
     ```
5. Open the web UI: http://localhost:3000
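The `.env` file in `asktube/engine` selects which providers are used. The exact keys are defined by the project and are not reproduced here; the variable names below are hypothetical placeholders showing the general shape of such a file, so check the repository for the real ones.

```env
# Hypothetical keys; check the repository for the actual variable names
EMBEDDING_PROVIDER=local          # e.g. local | openai | gemini | voyageai | mistral
AI_PROVIDER=ollama                # e.g. ollama | openai | claude | gemini | mistral
OPENAI_API_KEY=
GEMINI_API_KEY=
VOYAGEAI_API_KEY=
```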
### With Docker (in progress)

#### Before you start

- These services are published as Docker images, but if you want to build local images, please run `build.local.bat` for Windows, or `build.local.amd64.sh` / `build.local.aarch64.sh` for macOS/Linux.
- If you have a GPU (CUDA or ROCm), please refer to the ENV settings above and change the params accordingly.

#### Locally

- Use the local.yaml compose file to start.
- Open a `terminal/cmd/powershell` in the `asktube` directory, then run:
  ```shell
  docker compose -f compose/local.yaml pull && docker compose -f compose/local.yaml up -d
  ```
- After startup, you need to install the `Ollama` models `qwen2` and `llama3.1` for QA. Pull them inside the already-running `ollama` container (using `docker run` would start a fresh container instead):
  ```shell
  docker exec -it ollama ollama pull qwen2
  docker exec -it ollama ollama pull llama3.1
  ```
#### Free (with rate limits)

- Go to Google Gemini and VoyageAI to register accounts and generate your own API keys:
  - Gemini is free with your Google Account.
  - VoyageAI (recommended by Claude) gives you 50M free tokens (a huge amount), but you need to add your credit card first.
- Replace your ENV settings in the free docker file and start Docker.
- Open a `terminal/cmd/powershell` in the `asktube` directory, then run:
  ```shell
  docker compose -f compose/free.yaml pull && docker compose -f compose/free.yaml up -d
  ```
#### Ideal

- Use `VoyageAI` for embedding texts.
- Use `OpenAI` and `Claude` for QA; register accounts and generate your own API keys.
- Replace your ENV settings in the ideal docker file and start Docker.
- Open a `terminal/cmd/powershell` in the `asktube` directory, then run:
  ```shell
  docker compose -f compose/ideal.yaml pull && docker compose -f compose/ideal.yaml up -d
  ```
#### Result
- Open web: http://localhost:8080
## 💡 Architecture

The actual implementation might differ from these diagrams due to its complexity.
### 1️⃣ Extract data from the given URL

### 2️⃣ Store embedded chapter subtitles

### 3️⃣ Asking (includes question enrichment)

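The three stages above can be condensed into a toy end-to-end sketch. This is not the project's actual code: `embed` fakes an embedding with word counts where AskTube uses Sentence Transformers, the in-memory `index` list stands in for Chroma, and the final LLM call is left as a comment.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; the real app uses Sentence Transformers.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# 1. "Extract" chapter subtitles from a video (hard-coded here).
chapters = [
    "intro: this video explains retrieval augmented generation",
    "setup: install python poetry and bun before running",
    "demo: ask questions about the video and get answers",
]

# 2. Store one embedding per chapter (Chroma plays this role in AskTube).
index = [(embed(c), c) for c in chapters]

# 3. Answer a question: retrieve the most similar chapter, then prompt an LLM.
question = "how do I install the project"
best = max(index, key=lambda item: cosine(item[0], embed(question)))[1]
prompt = f"Context: {best}\nQuestion: {question}"  # would go to Ollama/OpenAI/etc.
print(best)
```

The retrieval step picks the setup chapter because it shares the word "install" with the question; the real pipeline does the same thing with dense embeddings and an enriched question.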
## 🪧 Notice

- Do not use this in production. It is aimed at end users running it on their local machines.
- Do not request advanced management features.
