# VisionDepth3D
Generates 3D from 2D videos in multiple output formats using AI-powered depth mapping.
## Install / Use
`/learn @VisionDepth/VisionDepth3DREADME`
## Notice
VisionDepth3D is licensed under a proprietary, no-derivatives license.
Forking, redistributing, modifying, or creating derivative works is strictly prohibited.
<h2 align="center">All-in-One 3D Suite</h2>

<h3 align="center">3D Generator (Stereo Composer)</h3>

<p align="center">
  <img width="700" height="598" alt="3Dtab" src="https://github.com/user-attachments/assets/f6c82115-c3ba-464f-91c1-7a623ed11007" />
  <br>
  <em>(3D Generator Tab)</em>
</p>

- GPU-accelerated stereo warping: per-pixel, depth-aware parallax shifting (CUDA + PyTorch)
- Built on the VisionDepth3D Method, including:
  - Depth shaping (Pop Controls): percentile stretch + subject recenter + curve shaping for natural separation
  - Subject-anchored convergence: EMA-stabilized zero-parallax tracking for comfort and consistency
  - Dynamic stereo scaling (IPD): scene-aware intensity that adapts to depth variance
  - Edge-aware masking + feathering: suppresses halos and cleans up subject boundaries
  - Floating window (DFW): cinematic edge protection to prevent window violations
  - Occlusion healing: fills stereo gaps and reduces edge artifacts
- Live preview + diagnostics: anaglyph, SBS, heatmaps, edge/mask inspection, stereo difference views
- Clip-range rendering for fast testing on difficult scenes before full renders
- Export formats: Half-SBS, Full-SBS, VR (SBS 1440×1600 per eye), Anaglyph, Passive Interlaced
- Encoding pipeline: FFmpeg with CPU and hardware encoders (NVENC/AMF/QSV) plus quality controls (CRF/CQ)
Result: A production-ready 2D-to-3D engine with real-time tuning tools, stability features, and flexible export formats for VR and cinema workflows.
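As an illustration of the "percentile stretch + curve shaping" idea behind the Pop Controls, here is a minimal sketch. The function name `shape_depth` and the parameters `lo_pct`, `hi_pct`, and `gamma` are hypothetical, not VisionDepth3D's actual API:

```python
import numpy as np

def shape_depth(depth, lo_pct=2.0, hi_pct=98.0, gamma=0.8):
    """Illustrative depth shaping: percentile stretch plus a gamma curve.

    Clipping to the 2nd/98th percentiles discards outlier depth values so
    the usable range isn't wasted; gamma < 1 then lifts mid-depth values,
    increasing perceived subject separation.
    """
    lo, hi = np.percentile(depth, [lo_pct, hi_pct])
    stretched = np.clip((depth - lo) / max(hi - lo, 1e-6), 0.0, 1.0)
    return stretched ** gamma
```

The same two-stage pattern (robust normalization, then a tone curve) is a common way to make raw model depth behave predictably before stereo warping.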
<h3 align="center">Depth Estimation (AI Depth Engine)</h3>

<p align="center">
  <img width="700" height="598" alt="Depthtab" src="https://github.com/user-attachments/assets/c2b32af2-5be5-4d3f-9b41-cceb77b30785" />
  <br>
  <em>(Depth Estimation Tab)</em>
</p>

- 25+ supported depth models (ZoeDepth, MiDaS, DPT/BEiT, DINOv2, DepthPro, Depth Anything V1/V2, Distill-Any-Depth, Marigold, and more)
- One-click model switching with auto-download + local caching
- Multiple inference backends:
  - PyTorch (Transformers / TorchHub)
  - ONNXRuntime (CUDA / TensorRT)
  - Diffusers FP16 (for diffusion-based depth)
- Image + video + batch workflows:
  - Single image
  - Image folder batch
  - Full video depth rendering
  - Video folder batch
- Optional high precision output (when supported) for cleaner disparity and stronger stability in post
- Built-in preview modes + colormaps for fast inspection
- Stability + safety features: resolution/shape handling, codec probing, and fallback behavior to avoid common crashes
Result: Fast, flexible depth generation for everything from quick tests to full-length depth videos ready for stereo conversion.
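The fallback behavior across backends can be sketched as a preference-ordered pick over whatever execution providers are available. The helper `pick_providers` is illustrative, not part of the actual loader, but the provider names are ONNXRuntime's real ones:

```python
def pick_providers(available):
    """Choose ONNXRuntime execution providers in preference order.

    TensorRT first for best throughput, then CUDA, then CPU as the
    always-available fallback. `available` would normally come from
    onnxruntime.get_available_providers().
    """
    preferred = [
        "TensorrtExecutionProvider",
        "CUDAExecutionProvider",
        "CPUExecutionProvider",
    ]
    return [p for p in preferred if p in available]
```

Passing the filtered list to `onnxruntime.InferenceSession(..., providers=...)` lets the runtime fall through to the next provider if session creation fails.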
<h3 align="center">FPS / Upscale Enhancer (RIFE + Real-ESRGAN)</h3>

<p align="center">
  <img width="700" height="598" alt="frametools" src="https://github.com/user-attachments/assets/4abdc68f-b878-47b6-b185-2e39ace1ba1a" />
  <br>
  <em>(FPS / Upscale Enhancer Tab)</em>
</p>

- RIFE interpolation (ONNX): 2× / 4× / 8× FPS generation with GPU acceleration
- Real-ESRGAN upscaling (ONNX): high-quality super-resolution with optional FP16
- Two processing pipelines:
  - Merged (stable, low memory)
  - Threaded (higher throughput, better utilization)
- Full video workflow support:
  - Optional scene splitting for long videos
  - Rebuild output with correct resolution, FPS, and encoding settings
- Render feedback: progress, FPS, ETA, logs, and safe cancel handling
Result: Turn low-res or low-FPS sources into clean, smooth outputs built for VR playback and high refresh displays.
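The 2× / 4× / 8× factors come from repeating a midpoint-interpolation pass over the frame sequence. A rough sketch of that scheduling, with `midpoint(a, b)` standing in for a RIFE inference call (the function and its name are illustrative):

```python
def interpolate_sequence(frames, factor, midpoint):
    """Compose 2x passes into 4x/8x frame-rate multiplication.

    Each pass inserts one synthesized frame between every neighbouring
    pair, so a sequence of n frames grows to 2n - 1 per pass.
    """
    passes = {2: 1, 4: 2, 8: 3}[factor]
    for _ in range(passes):
        out = []
        for a, b in zip(frames, frames[1:]):
            out.extend([a, midpoint(a, b)])  # original frame + synthesized midpoint
        out.append(frames[-1])               # keep the final frame
        frames = out
    return frames
```

With real RIFE, `midpoint` would run the ONNX model on the two frames; here any pairwise function (e.g. an average) demonstrates the scheduling.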
<h3 align="center">Depth Blender (Multi-Source Depth Fusion)</h3>

<p align="center">
  <img width="700" height="598" alt="DepthBlendTab" src="https://github.com/user-attachments/assets/89c61a02-55e8-4ff6-8ed0-bd3bd739d04e" />
  <br>
  <em>(Depth Blender Tab)</em>
</p>

- Blend two depth sources into one cleaner, more stable depth map/video
- Frames or video mode:
  - Pair two PNG frame folders
  - Or pair two depth videos
- Live preview + scrubber: side-by-side (Base vs Blended) with fast frame navigation
- Edge-focused blend controls:
  - White strength injection
  - Feather blur smoothing
  - CLAHE contrast shaping
  - Bilateral edge-preserving denoise
  - Normalization back to base for consistent depth scale
- Batch output options: overwrite base, output to new folder, or export a blended video
Result: Cleaner edges, stronger subject separation, and more consistent parallax behavior across full sequences.
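The "normalization back to base" step can be sketched as follows; the feathering, CLAHE, and bilateral stages are omitted, and `blend_depth` with its `strength` parameter is a hypothetical name, not the tab's real control:

```python
import numpy as np

def blend_depth(base, detail, strength=0.5):
    """Mix two depth sources, then renormalize to the base map's range.

    Rescaling the blend back to the base's min/max keeps the depth scale
    consistent across a sequence, so parallax strength doesn't drift
    between frames that used different blend inputs.
    """
    mixed = base.astype(np.float32) * (1.0 - strength) + detail.astype(np.float32) * strength
    lo, hi = float(base.min()), float(base.max())      # target range (base)
    m_lo, m_hi = float(mixed.min()), float(mixed.max())
    mixed = (mixed - m_lo) / max(m_hi - m_lo, 1e-6) * (hi - lo) + lo
    return mixed.astype(base.dtype)
```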
<h3 align="center">Audio Tool (Rip, Attach, Attach + Stitch)</h3>

<p align="center">
  <img width="558" height="587" alt="AudioTool" src="https://github.com/user-attachments/assets/bd7775a3-f625-4e77-be53-0f820a5f1b0b" />
  <br>
  <em>(Audio Tool)</em>
</p>

- Rip: extract audio tracks from videos (copy or re-encode)
- Attach: mux audio back into processed clips (fast copy by default)
- Attach + Stitch: batch attach audio per clip, then stitch into one final gapless export
- Smart matching: auto-match audio to clips by filename patterns or use one audio track for all
- Audio offset control for sync fixes
- Codec control for both per-clip muxing and final output encoding
- Logging + progress via FFmpeg runner
Result: A practical post stage for restoring audio, correcting sync, and finishing multi-clip renders into one clean final movie.
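Rip and Attach boil down to FFmpeg invocations. A sketch of the command construction (the helper names are illustrative, but the flags are standard FFmpeg):

```python
def rip_cmd(video, audio_out):
    # "Rip": stream-copy the audio track out of a video without re-encoding.
    return ["ffmpeg", "-y", "-i", video, "-vn", "-acodec", "copy", audio_out]

def attach_cmd(video, audio, out, offset=0.0):
    # "Attach": mux audio onto a processed clip. -itsoffset (placed before
    # the audio input it applies to) shifts that track for sync fixes, and
    # "-c copy" keeps both streams untouched for a fast remux.
    return [
        "ffmpeg", "-y",
        "-i", video,
        "-itsoffset", str(offset), "-i", audio,
        "-map", "0:v:0", "-map", "1:a:0",
        "-c", "copy",
        out,
    ]
```

Either list can be handed to `subprocess.run(...)`; a runner that captures stderr provides the logging and progress feedback described above.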
<h3 align="center">Preview + Format Testing</h3>

<p align="center">
  <img width="700" height="587" alt="3Dpreview" src="https://github.com/user-attachments/assets/4ce33583-8db7-40c2-a35d-d5c78efd26d9" />
  <br>
  <em>(Live 3D Preview)</em>
</p>

- Real-time preview modes: Anaglyph, SBS, Passive Interlaced, Depth and Shift Heatmaps
- On-frame tuning: convergence and parallax checks without committing to long renders
- Save preview frames for quick comparisons and sharing
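A red/cyan anaglyph preview, for instance, is just a channel recombination of the two eye images. A minimal sketch, assuming RGB channel order:

```python
import numpy as np

def red_cyan_anaglyph(left, right):
    """Compose a red/cyan anaglyph from left/right RGB eye images:
    red channel from the left eye, green + blue from the right."""
    out = right.copy()
    out[..., 0] = left[..., 0]  # assumes channel 0 is red (RGB order)
    return out
```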
<h3 align="center">Smart GUI + Workflow</h3>

<p align="center">
  <img width="89" height="97" alt="image" src="https://github.com/user-attachments/assets/cb7dc3e9-403a-4e54-af0d-ac44120d1a8c" />
  <img width="89" height="97" alt="HelpHotkeys" src="https://github.com/user-attachments/assets/9324a4e9-9f10-4de1-a1e9-0596711410c7" />
  <img width="89" height="97" alt="Hotkeys" src="https://github.com/user-attachments/assets/45592879-ec6c-4e59-b9d5-f50144db40d9" />
</p>

- Multi-tab interface with persistent settings
- Help menu + hotkeys
- Pause, resume, and cancel for long GPU jobs
- Multi-language UI support (EN, FR, ES, DE, JA)
- Hardware encoding options integrated into export workflow
<h3 align="center">Output Formats & Aspect Ratios</h3>
- Stereo formats: Half-SBS, Full-SBS, VR180, Anaglyph, Passive Interlaced
- Aspect ratios: 16:9, 2.39:1, 2.76:1, 4:3, 21:9, 1:1, 2.35:1
- Containers: MP4, MKV, AVI
- Encoders: CPU + FFmpeg hardware options (NVENC/AMF/QSV) when available
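The difference between Full-SBS and Half-SBS is just how the two eye images are packed into a frame. A naive sketch (a real pipeline would resize each eye properly rather than drop alternate columns):

```python
import numpy as np

def full_sbs(left, right):
    # Full-SBS: both eyes at native width, side by side (double-width frame).
    return np.hstack([left, right])

def half_sbs(left, right):
    # Half-SBS: each eye squeezed to half width so the packed frame keeps
    # the original resolution; column subsampling stands in for a resize.
    return np.hstack([left[:, ::2], right[:, ::2]])
```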
<h2 align="center">Guide Sheet: Install</h2>

### 📌 System Requirements
- ✔️ Runs on Python 3.13
- ✔️ Tested with CUDA 12.8
- ✔️ Conda (optional, recommended for simplicity)
### 📌 Step 1: Download the VisionDepth3D Program

- 1️⃣ Download the VisionDepth3D zip file from the official download source (green button).
- 2️⃣ Extract the zip file to your desired folder (e.g., C:\user\VisionDepth3D).
- 3️⃣ Download the models Here and extract the weights folder into the VisionDepth3D main folder.
- 4️⃣ Download the Distill Any Depth ONNX models here (if you want to use them) and place the Distill Any Depth folder inside the weights folder.
### 📌 Step 2: Create Env and Install Required Dependencies

#### 🟢 Option 1: Install via pip (Standard CMD Method)

- 1️⃣ Press Win + R, type cmd, and hit Enter.
