# MasterSelects

A real-time media editor in the browser, with no backend.
<h3>Browser-based Video Compositor & 3D Engine</h3> <br> <table><tr><td align="center" style="border:none;background:#0d1117;"> <h1>⚡ 660 KB <sub>gzip</sub></h1> <sup><b>initial load</b></sup> </td></tr></table> <p> GPU-first editing with <b>30 effects</b>, <b>37 blend modes</b>, <b>76 AI tools</b>, <b>real 3D via Three.js</b>, and only <b>14 dependencies</b>.<br> Built from scratch in <b>2,400+ lines of WGSL</b> and <b>138k lines of TypeScript</b>.<br> Import <b>OBJ, glTF, GLB, FBX</b> models directly into the timeline. </p> <p> <a href="https://github.com/Sportinger/MasterSelects/releases"><img src="https://img.shields.io/badge/version-1.4.2-blue.svg" alt="Version"></a> <a href="LICENSE"><img src="https://img.shields.io/badge/license-MIT-green.svg" alt="License"></a> <a href="https://app.fossa.com/projects/custom%2b61097%2fmasterselects"><img src="https://app.fossa.com/api/projects/custom%2b61097%2fmasterselects.svg?type=shield" alt="FOSSA Status"></a> </p> <p> <a href="#"><img src="https://img.shields.io/badge/WebGPU-990000?style=flat-square&logo=webgpu&logoColor=white" alt="WebGPU"></a> <a href="#"><img src="https://img.shields.io/badge/Three.js-000000?style=flat-square&logo=threedotjs&logoColor=white" alt="Three.js"></a> <a href="#"><img src="https://img.shields.io/badge/React_19-61DAFB?style=flat-square&logo=react&logoColor=black" alt="React 19"></a> <a href="#"><img src="https://img.shields.io/badge/TypeScript-3178C6?style=flat-square&logo=typescript&logoColor=white" alt="TypeScript"></a> <a href="#"><img src="https://img.shields.io/badge/Vite-646CFF?style=flat-square&logo=vite&logoColor=white" alt="Vite"></a> <a href="#native-helper"><img src="https://img.shields.io/badge/Rust-000000?style=flat-square&logo=rust&logoColor=white" alt="Rust"></a> </p> <br><video src="https://github.com/user-attachments/assets/24966b2a-064f-49c8-bc7f-88472a5e4cb0" autoplay loop muted playsinline width="100%"></video>
## Supported Formats
Decoding depends on what the browser supports — the container is just the wrapper; the codec inside is what matters.
<table> <tr><th colspan="2">Import (Decode)</th></tr> <tr><td><b>Containers</b></td><td>MP4, MOV, WebM, MKV, AVI, M4V</td></tr> <tr><td><b>Video codecs</b></td><td>H.264 (AVC), H.265 (HEVC)¹, VP8, VP9, AV1</td></tr> <tr><td><b>Audio codecs</b></td><td>AAC, MP3, Opus, Vorbis, FLAC, WAV/PCM</td></tr> <tr><td><b>Image</b></td><td>PNG, JPG, WebP, GIF, BMP, AVIF, SVG</td></tr> <tr><td><b>3D Models</b></td><td>OBJ, glTF, GLB, FBX — rendered via Three.js with lighting</td></tr> <tr><td><b>Download</b></td><td>YouTube, TikTok, Instagram, Twitter/X, Vimeo + <a href="https://github.com/yt-dlp/yt-dlp/blob/master/supportedsites.md">all yt-dlp sites</a> via Native Helper</td></tr> <tr><th colspan="2">Export (Encode)</th></tr> <tr><td><b>Containers</b></td><td>MP4, WebM</td></tr> <tr><td><b>Video codecs</b></td><td>H.264, H.265¹, VP9, AV1 — GPU-accelerated via WebCodecs</td></tr> <tr><td><b>Audio codecs</b></td><td>AAC (MP4), Opus (WebM)</td></tr> <tr><td><b>Interchange</b></td><td>FCPXML (Final Cut Pro / DaVinci Resolve), PNG sequence</td></tr> </table>

¹ H.265 decode/encode depends on OS & hardware — full support on Windows, partial on macOS/Linux.
MOV files work because they share the same ISO BMFF container as MP4 — any MOV with H.264/H.265 inside plays fine. MKV works if it contains browser-decodable codecs (H.264, VP9, etc.). Files with unsupported codecs (e.g. ProRes in MOV) fall back to the Native Helper decode path when available.
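The container/codec distinction shows up directly in WebCodecs-style codec strings. As an illustration (the function name `codecFamily` is hypothetical, not from this repo), here is a minimal sketch that pulls the codec family out of a MIME type; a real support check should go through `VideoDecoder.isConfigSupported()` instead of string matching:

```typescript
// Map a MIME string like 'video/mp4; codecs="avc1.42E01E"' to the codec
// family that actually determines decodability. Illustrative only.
function codecFamily(mime: string): string {
  const m = mime.match(/codecs="?([^";]+)"?/);
  if (!m) return "unknown";
  const codec = m[1].split(",")[0].trim();
  if (codec.startsWith("avc1") || codec.startsWith("avc3")) return "H.264";
  if (codec.startsWith("hvc1") || codec.startsWith("hev1")) return "H.265";
  if (codec.startsWith("vp09") || codec === "vp9") return "VP9";
  if (codec === "vp8") return "VP8";
  if (codec.startsWith("av01")) return "AV1";
  return codec; // e.g. ProRes inside a MOV surfaces here as unsupported
}
```

This is why a MOV with `avc1` inside plays fine while a MOV with ProRes does not, regardless of the container.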
## What Makes This Different
Most browser-based video editors share a pattern: Canvas 2D compositing, heavyweight dependency trees, and CPU-bound rendering that falls apart at scale. This project takes a fundamentally different approach.
GPU-first architecture. Preview, scrubbing, and export all run through the same WebGPU ping-pong compositor. Video textures are imported as texture_external (zero-copy, no CPU roundtrip). 37 blend modes, 3D rotation, and inline color effects all execute in a single WGSL composite shader per layer. Three.js is lazily loaded only for 3D model rendering — no GSAP, no Canvas 2D fallback in the hot path.
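The blend modes themselves are straight per-pixel math. As an illustration only (the repo's actual implementation is WGSL running on the GPU, not this TypeScript), here is the classic "screen" blend on normalized 0–1 channels:

```typescript
type RGB = [number, number, number];

// Per-channel "screen" blend: s = 1 - (1 - base) * (1 - blend),
// then mixed with the base by layer opacity. In MasterSelects this math
// lives in the WGSL composite shader; this version only shows the formula.
function screenBlend(base: RGB, blend: RGB, opacity: number): RGB {
  return base.map((b, i) => {
    const s = 1 - (1 - b) * (1 - blend[i]);
    return b + (s - b) * opacity; // lerp toward the blended value
  }) as RGB;
}
```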
Zero-copy export pipeline. Frames are captured as new VideoFrame(offscreenCanvas) directly from the GPU canvas. No readPixels(), no getImageData(), no staging buffers in the default path. The GPU renders, WebCodecs encodes. That's it.
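A hedged sketch of what such a capture step can look like with standard WebCodecs APIs (the function names, the keyframe interval, and the encoder wiring are illustrative assumptions, not code from this repo):

```typescript
// VideoFrame and VideoEncoder work in microseconds.
function frameTimestampUs(frameIndex: number, fps: number): number {
  return Math.round((frameIndex * 1_000_000) / fps);
}

// Illustrative, browser-only capture step: the canvas is wrapped directly
// into a VideoFrame, so pixels never leave the GPU in the default path.
// VideoFrame is accessed via globalThis because it is a WebCodecs global.
async function encodeFrame(
  canvas: any, frameIndex: number, fps: number, encoder: any,
): Promise<void> {
  const VF = (globalThis as any).VideoFrame;
  const frame = new VF(canvas, { timestamp: frameTimestampUs(frameIndex, fps) });
  encoder.encode(frame, { keyFrame: frameIndex % 150 === 0 }); // GOP is a guess
  frame.close(); // release immediately; the encoder holds its own reference
}
```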
3-tier scrubbing cache. 300 GPU textures in VRAM for instant scrub (Tier 1), per-video last-frame cache for seek transitions (Tier 2), and a 900-frame RAM Preview with CPU/GPU promotion (Tier 3). When the cache is warm, scrubbing doesn't decode at all.
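A toy model of the tier-order lookup (the class and field names are mine, and the real cache stores GPU textures rather than plain values):

```typescript
// Toy 3-tier frame cache: hot GPU tier first, then the per-video
// last-frame tier, then RAM Preview; report which tier hit.
class TieredFrameCache<T> {
  gpu = new Map<number, T>();        // Tier 1: frames resident in VRAM
  lastFrame = new Map<string, T>();  // Tier 2: last decoded frame per video
  ram = new Map<number, T>();        // Tier 3: RAM Preview frames

  lookup(frame: number, videoId: string): { tier: 1 | 2 | 3; value: T } | null {
    const t1 = this.gpu.get(frame);
    if (t1 !== undefined) return { tier: 1, value: t1 };
    const t2 = this.lastFrame.get(videoId);
    if (t2 !== undefined) return { tier: 2, value: t2 };
    const t3 = this.ram.get(frame);
    if (t3 !== undefined) return { tier: 3, value: t3 };
    return null; // cache miss: only now does the decoder run
  }
}
```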
13 production dependencies. React, Zustand, mp4box, mp4/webm muxers, HuggingFace Transformers, ONNX Runtime, SoundTouch, WebGPU types, plus an experimental FFmpeg WASM path. Everything else is custom-built from scratch: the entire WebGPU compositor, all 30 effect shaders, the keyframe animation system, the export engine, the audio mixer, the text renderer, the mask engine, the video scope renderers, the dock/panel system, the timeline UI. Zero runtime abstraction layers between your timeline and the GPU.
Nested composition rendering. Compositions within compositions, each with their own resolution. Rendered to pooled GPU textures with frame-level caching, composited in the parent's ping-pong pass, all in a single device.queue.submit().
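Rendering children before their parents is essentially a post-order walk of the composition graph. A minimal sketch with hypothetical names (the dedup mirrors the frame-level caching, where a comp referenced by several parents is rendered once):

```typescript
interface Comp { id: string; children: Comp[]; }

// Nested comps must be rendered to their own pooled textures before the
// parent composites them: post-order over the composition tree, visiting
// each comp at most once per frame.
function renderOrder(root: Comp, seen = new Set<string>()): string[] {
  if (seen.has(root.id)) return [];
  seen.add(root.id);
  const order: string[] = [];
  for (const child of root.children) order.push(...renderOrder(child, seen));
  order.push(root.id);
  return order;
}
```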
On-device AI. SAM2 (Segment Anything Model 2) runs entirely in-browser via ONNX Runtime. Click to select objects in the preview, propagate masks across frames. No server, no API key, no upload. ~220MB model loaded on demand.
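One small piece of the click-to-mask flow that can be shown outside the browser: SAM-family models take point prompts in model-input pixel space, with the image resized so its longest side fits the model resolution. The 1024 default below is an assumption about the exported model, not a fact from this repo:

```typescript
// Convert a preview click into model-input coordinates by applying the
// same longest-side scale used to resize the image (modelSize assumed 1024).
function toModelCoords(
  x: number, y: number, imgW: number, imgH: number, modelSize = 1024,
): { x: number; y: number } {
  const scale = modelSize / Math.max(imgW, imgH);
  return { x: x * scale, y: y * scale };
}
```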
## Why I Built This
No Adobe subscription, no patience for cracks, and every free online editor felt like garbage. I wanted something that actually works: fast in the browser, GPU-first, built for real editing instead of templates, and open enough that AI can steer the timeline instead of just suggesting ideas.
The vision: an editor where AI can directly operate the tool. The built-in chat connects to OpenAI and exposes 76 editing tools, external agents can steer the running editor over a local HTTP bridge, and the Vite bridge is still available in development. Live outputs matter just as much: I've been doing video art for 16 years, so multi-output routing was never optional.
Built with Claude as my pair-programmer. Every feature gets debugged, refactored, and beaten into shape until it does what I need. ~120k lines of TypeScript, ~2,500 lines of WGSL, and a Rust native helper for the stuff browsers still can't do cleanly.
## AI Control
MasterSelects is being built around the idea that AI should be able to do the edit, not just talk about it.
- Built-in editor chat: OpenAI-powered, with direct access to 76 timeline/media/editing tools
- External agent bridge: Claude Code or any other agent can drive the running editor via the Native Helper HTTP bridge
- Experimental multicam AI: Claude/Anthropic generates edit decision lists for the multicam workflow
- On-device AI: SAM2 segmentation in-browser via ONNX Runtime, plus local Whisper transcription via Transformers.js
Example Native Helper bridge call:

```bash
curl -X POST http://127.0.0.1:9877/api/ai-tools \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <startup-token>" \
  -d '{"tool":"_list","args":{}}'
```
This requires the Native Helper to be running, a MasterSelects editor tab to be connected, and the helper startup token. The Vite /api/ai-tools bridge still exists in development, but it is now gated by a per-session token as well.
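The same call from TypeScript, for agents that prefer `fetch` over curl (the wrapper name `callBridgeTool` is mine; the endpoint, headers, and `{tool, args}` payload shape come from the curl example):

```typescript
// Build the JSON body for a Native Helper tool call; "_list" enumerates
// the available tools.
function bridgePayload(tool: string, args: Record<string, unknown> = {}): string {
  return JSON.stringify({ tool, args });
}

// Illustrative wrapper, not an API shipped with this repo. Requires the
// helper to be running and its startup token.
async function callBridgeTool(tool: string, token: string): Promise<unknown> {
  const res = await fetch("http://127.0.0.1:9877/api/ai-tools", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${token}`, // helper startup token
    },
    body: bridgePayload(tool),
  });
  return res.json();
}
```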
## What It Does
| Feature | Description |
|---|---|
| Multi-track Timeline | Cut, copy, paste, multi-select, JKL shuttle, nested compositions |
| 30 GPU Effects | Color correction, blur, distort, stylize, keying - all real-time |
| Video Scopes | GPU-accelerated Histogram, Vectorscope, Waveform monitor |
| Keyframe Animation | Bezier curves, copy/paste, tick marks, 5 easing modes |
| Vector Masks | Pen tool, edge dragging, feathering, multiple masks per clip |
| SAM2 Segmentation | AI object selection in preview - click to mask, propagate across frames |
| Transitions | Crossfade transitions with GPU-accelerated rendering (experimental) |
| AI Integration | Built-in OpenAI chat, 76 tool-callable edit actions, and a local bridge for external agents |
| Multicam AI | Sync cameras, transcribe footage, and generate Claude-powered multicam EDLs (experimental) |
| Export Pipeline | WebCodecs Fast, HTMLVideo Precise, FFmpeg WASM (experimental / WIP), FCPXML |
| Live EQ & Audio | 10-band parametric EQ with real-time Web Audio preview |
| Download Panel | YouTube, TikTok, Instagram, Twitter/X, Vimeo, and other yt-dlp-supported sites via Native Helper |
| Text & Solids | 50 Google Fonts, stroke, shadow, solid color clips |
| Proxy System | GPU-accelerated proxies with resume and cache indicator |
| Output Manager | Multi-window outputs, source routing, corner pin warping, slice masks |
| Slot Grid | Resolume-style 12x4 grid with multi-layer live playback |
| Preview & Playback | RAM Preview, transform handles, multiple render targets |
