Nojoin
A self-hosted meeting transcription app that doesn't need to join your meetings as a bot.
📚 Table of Contents
- Why Nojoin?
- Pre-Requisites
- Quick Start
- Hardware Requirements
- Features
- System Architecture
- API Keys & Configuration
- User Management
- Installation & Setup
- Updating Nojoin
- Troubleshooting
- Reverse Proxy
- Roadmap
- Contributing
- Editions
- Legal
❔ Why Nojoin?
Most meeting assistants require users to invite bots to join meetings or upload sensitive business conversations to the cloud. Nojoin offers a different approach.
- Configurable Privacy: Audio and transcripts remain on your server. Using remote LLM features will send transcripts to external providers. For 100% privacy, configure a local Ollama instance.
- Unlimited: No monthly limits on recording minutes.
- Smart: Utilizes OpenAI Whisper (Turbo) for transcription and Pyannote for speaker identification.
- Interactive: Enables chat with meetings using ChatGPT, Claude, Gemini, or Ollama.
- Non-Intrusive: Nojoin does not require joining meetings as a participant.
📋 Pre-Requisites
Before installing Nojoin, ensure your system meets the following requirements:
General
- Docker: Docker Desktop or Docker Engine (Linux).
- Git (Optional, only if cloning the repository).
For NVIDIA GPU Support (Highly Recommended)
Nojoin relies on GPU acceleration for efficient audio transcription and speaker diarization.
Linux Requirements:
- NVIDIA Drivers: Ensure the proprietary NVIDIA drivers are installed on your host system.
- Ubuntu (or the latest available driver version):
sudo apt install nvidia-driver-580
- Verify with:
nvidia-smi
- NVIDIA Container Toolkit: Required for Docker to access the GPU.
- Installation Guide
- Configuration command:
sudo nvidia-ctk runtime configure --runtime=docker && sudo systemctl restart docker
Windows Requirements:
- WSL 2: Ensure you are using the WSL 2 backend for Docker Desktop.
- NVIDIA Drivers: Install the latest NVIDIA drivers for Windows. The drivers are automatically propagated to WSL 2.
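The prerequisites above can be probed from a shell before moving on. The following is a minimal sketch rather than anything shipped with Nojoin; the helper names `have` and `check_prereqs` are made up for illustration.

```shell
# Hypothetical helpers (not part of Nojoin) for checking prerequisites.

# Succeeds if the named command can be found on PATH.
have() {
    command -v "$1" >/dev/null 2>&1
}

# Prints one status line per command and returns 1 if any is missing.
check_prereqs() {
    missing=0
    for cmd in "$@"; do
        if have "$cmd"; then
            echo "ok: $cmd"
        else
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}
```

Running `check_prereqs docker git nvidia-smi` lists what is present on the host. To confirm that containers can actually see the GPU on Linux, NVIDIA's Container Toolkit documentation suggests a sample workload along the lines of `sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi`.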
⚡ Quick Start
[⚠️WARNING⚠️]
Nojoin is still in development so updates may break instances. I will do my best to fix these issues ASAP but users should create regular backups just in case.
Please bear in mind that this is my first open source project so you may see things which you find to be suboptimal. I would be grateful if you could provide feedback and suggestions for improvement or even submit a pull request if you have the skills to do so.
- Clone:
git clone https://github.com/Valtora/Nojoin
cd Nojoin
- Setup:
cp docker-compose.example.yml docker-compose.yml
- Launch (pulls pre-built images from GHCR):
docker compose up -d
- Use: Open https://localhost:14443 and accept the self-signed certificate warning.
- Configure: Follow the first-run wizard to set up API keys and preferences.
- Note: If you configured environment variables in .env, these fields will be pre-filled.
- Companion App: Navigate to the Releases page to download, install, and connect the companion app on client machines to start recording audio.
- See Installation & Setup for CPU-only mode and configuration details.
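After launching, the web interface may not answer immediately while the containers start. A minimal sketch of waiting for it, assuming `curl` is installed on the host; `retry` is a hypothetical helper, not part of Nojoin:

```shell
# Hypothetical helper: run a command until it succeeds or attempts run out.
# Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        # Stop as soon as the command succeeds.
        if "$@"; then
            return 0
        fi
        n=$((n + 1))
        # Only sleep if another attempt is still to come.
        if [ "$n" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

For example, `retry 30 2 curl -ksf -o /dev/null https://localhost:14443` tries up to 30 times, two seconds apart. The `-k` flag skips certificate verification, which is acceptable here only because the certificate is self-signed and the target is localhost.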
🖥️ Hardware Requirements
- Backend Server:
- Recommended: Windows 11 (with WSL2) or Linux system with a compatible NVIDIA GPU (CUDA 12.x support).
- Minimum: 8GB VRAM for optimal performance (Whisper Turbo + Pyannote).
- macOS Hosting: Hosting the backend on macOS via Docker is not recommended.
- Docker on macOS cannot pass through the Apple Silicon GPU (Metal) to containers. This forces the system to run in CPU-only mode, which is significantly slower for transcription and diarization.
- Companion App:
- Currently supported on Windows only.
- macOS and Linux companion apps are not yet available. Contributors are welcome to help build support for these platforms!
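To compare a GPU against the VRAM guideline above, the total memory reported by `nvidia-smi` can be checked with a small helper. This is a sketch under assumptions: `meets_vram_min` is a hypothetical name, and the default threshold of 8192 MiB is only a stand-in for the 8GB figure.

```shell
# Hypothetical helper (not part of Nojoin): succeeds if a VRAM figure in MiB
# meets a minimum. The 8192 MiB default is an assumption standing in for the
# 8GB guideline; pass a second argument to use a different threshold.
meets_vram_min() {
    vram_mib=$1
    min_mib=${2:-8192}
    [ "$vram_mib" -ge "$min_mib" ]
}
```

On a host with NVIDIA drivers, `nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits` prints the total memory of each GPU in MiB, one per line, so the first GPU can be checked with `meets_vram_min "$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | head -n 1)"`. Some cards marketed as 8GB may report slightly less, in which case a lower threshold can be passed as the second argument.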
✨ Features
- Distributed Architecture:
- Server: Dockerized backend handling heavy AI processing (Whisper, Pyannote).
- Web Client: Modern Next.js interface for managing meetings from anywhere.
- Companion App: Lightweight Rust system tray app for capturing audio on client machines.
- Advanced Audio Processing:
- Local-First Transcription: Uses OpenAI's Whisper (default Turbo) for accurate, private transcription.
- Speaker Diarization: Automatically identifies distinct speakers using Pyannote.
- System Audio Capture: Captures both system audio out and microphone input.
- Meeting Intelligence:
- LLM-Powered Notes: Generate summaries, action items, and key takeaways using OpenAI, Anthropic, Google Gemini, or Ollama.
- Chat Q&A: "Chat with your meeting" to ask specific questions about the content or make edits to notes.
- Documents: Upload documents to be processed by the LLM.
- Cross-Meeting Context: Select tags to include meetings, notes, and documents from across all meetings with the same tag(s).
- Organization & Search:
- Global Speaker Library: Centralized management of speaker identities across all recordings.
- Voiceprint Recalibration: Manually improve speaker identification by selecting high-quality samples.
- Full-Text Search: Instantly find content across transcripts, titles, and notes.
- Tagging: Organize meetings with custom tags.
- User Management & Security:
- Role-Based Access: Owner, Admin, and User roles with granular permissions.
- Invitation System: Secure registration via invite links with expiration and usage limits.
- User Data: Complete data cleanup on user deletion (files, database records, and logs).
🏗️ System Architecture
Nojoin is composed of three distinct subsystems:
- The Server (Dockerized):
- Hosted on a machine with NVIDIA GPU capabilities.
- Runs the API (FastAPI), Worker (Celery), Database (PostgreSQL), and Broker (Redis).
- Handles all heavy lifting: VAD, Transcription, and Diarization.
- The Web Client (Next.js):
- The primary interface for users.
- Provides a dashboard for playback, transcript editing, and system configuration.
- The Companion App (Rust):
- Runs on Windows client machines.
- Sits in the system tray and handles audio capture.
- Uploads audio to the server for processing.
- Platform Support: Currently Windows-only. Community contributions are welcome for macOS and Linux support!
🔑 API Keys & Configuration
Nojoin requires certain API keys to function fully. The first-run wizard will request these keys, but they can also be entered in the Settings -> AI Services page of the web interface after installation.
Tip: You can pre-fill these values by setting them in your .env file before starting the application. See Deployment > Environment Variables for a full list of available options.
Hugging Face Token (Required for Diarization)
To enable speaker diarization (identifying who is speaking), a Hugging Face token is required.
Privacy Note: This token is only used to download the model weights from Hugging Face. All audio processing and diarization happens locally on the server. No audio data is sent to Hugging Face.
- Create an account on Hugging Face.
- Generate an Access Token. A token with fine-grained permissions can be used:
- Select "Read access to contents of selected repos".
- Select the following repositories:
- pyannote/speaker-diarization-community-1
- pyannote/wespeaker-voxceleb-resnet34-LM
- Accept the user conditions for the following models:
- Enter this token in the Nojoin Settings > AI Settings.
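Before pasting the token into Nojoin, it can be sanity-checked from a shell. The sketch below is not part of Nojoin and rests on two assumptions: that current Hugging Face access tokens begin with `hf_`, and that the Hub's `whoami-v2` API endpoint accepts the token as a Bearer header. Both helper names are hypothetical.

```shell
# Hypothetical helper: succeeds if the argument has the "hf_" prefix
# (assumed here to be the prefix of current Hugging Face access tokens)
# followed by at least one more character.
looks_like_hf_token() {
    case "$1" in
        hf_?*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Hypothetical helper: asks the Hugging Face API which account the token
# belongs to. Needs curl and network access; fails on an HTTP error.
hf_whoami() {
    curl -sf -H "Authorization: Bearer $1" https://huggingface.co/api/whoami-v2
}
```

With the token in a variable, `looks_like_hf_token "$HF_TOKEN" && hf_whoami "$HF_TOKEN"` prints the account details as JSON if the token is valid. This confirms only that the token works, not that the model conditions have been accepted.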
LLM Providers (Optional but Recommended)
To use the meeting note generation, speaker/title inference, and meeting chat features, an API key from one of the supported providers is required.
Privacy Note: Configuring a cloud-based LLM provider (OpenAI, Anthropic, Google Gemini) trades absolute privacy for these features, as meeting transcripts and notes will be sent to th
