ProcTap
Python library for capturing audio from a specific process (PID). Uses WASAPI process loopback on Windows, PipeWire/PulseAudio on Linux, and ScreenCaptureKit on macOS.
📡 ProcTap
Cross-Platform Per-Process Audio Capture
ProcTap is a Python library for per-process audio capture with platform-specific backends.
Capture audio from a specific process only — without system sounds or other app audio mixed in. Ideal for VRChat, games, DAWs, browsers, and AI audio analysis pipelines.
Platform Support
| Platform | Status | Backend | Notes |
|----------|--------|---------|-------|
| Windows | ✅ Fully Supported | WASAPI (C++ native) | Windows 10/11 (20H1+) |
| Linux | ✅ Fully Supported | PipeWire Native / PulseAudio | Per-process isolation, auto-fallback (v0.3.0+) |
| macOS | ✅ Officially Supported | ScreenCaptureKit | macOS 13+ (Ventura), bundleID-based (v0.4.0+) |
<sub>* Linux is fully supported with PipeWire/PulseAudio (v0.3.0+). macOS is officially supported with ScreenCaptureKit (v0.4.0+).</sub>
🚀 Features
- 🎧 Capture audio from a single target process (VRChat, games, browsers, Discord, DAWs, streaming tools, etc.)
- 🌍 Cross-platform architecture → Windows (fully supported) | Linux (fully supported, v0.3.0+) | macOS (officially supported, v0.4.0+)
- ⚡ Platform-optimized backends
  → Windows: ActivateAudioInterfaceAsync (modern WASAPI)
  → Linux: PipeWire Native API / PulseAudio (fully supported, v0.3.0+)
  → macOS: ScreenCaptureKit API (macOS 13+, bundleID-based, v0.4.0+)
- 🧵 Low-latency, thread-safe audio engine → 48 kHz / stereo / float32 format (Windows)
- 🐍 Python-friendly high-level API
  - Callback-based streaming
  - Async generator streaming (async for)
- 🔌 Native extensions for high-performance → C++ extension on Windows for optimal throughput
📦 Installation
From PyPI:
pip install proc-tap
Platform-specific dependencies are automatically installed:
- Windows: No additional dependencies
- Linux: pulsectl is automatically installed, but you also need system packages:
# Ubuntu/Debian
sudo apt-get install pulseaudio-utils
# Fedora/RHEL
sudo dnf install pulseaudio-utils
Optional: High-Quality Audio Resampling (74% faster / 3.8x speedup for sample rate conversion):
pip install proc-tap[hq-resample]
Performance: With libsamplerate, resampling achieves 0.66ms per 10ms chunk (vs 2.6ms with scipy-only).
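The relationship between these figures can be checked with a little arithmetic. The sketch below uses only the two rounded timings quoted above; the variable names are illustrative. Because the inputs are rounded, the results land close to, but not exactly on, the quoted 3.8x and 74%.

```python
# Timings quoted above: CPU milliseconds spent resampling one 10 ms chunk.
CHUNK_MS = 10.0
LIBSAMPLERATE_MS = 0.66
SCIPY_MS = 2.6

# "Nx speedup": how many times faster libsamplerate is than scipy.
speedup = SCIPY_MS / LIBSAMPLERATE_MS

# "N% faster" read as the fraction of processing time removed.
time_saved = 1.0 - LIBSAMPLERATE_MS / SCIPY_MS

# Share of the real-time budget (10 ms of audio) each path consumes.
budget_libsamplerate = LIBSAMPLERATE_MS / CHUNK_MS
budget_scipy = SCIPY_MS / CHUNK_MS

print(f"speedup: {speedup:.2f}x, time saved: {time_saved:.1%}")
print(f"budget used: {budget_libsamplerate:.1%} vs {budget_scipy:.1%}")
```

Either way, both paths stay well inside the 10 ms real-time budget per chunk.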
Compatibility Notes:
- ✅ Python 3.10-3.12: Works on all platforms
- ✅ Linux/macOS + Python 3.13+: Should work (you can try it!)
- ⚠️ Windows + Python 3.13+: May fail to build (as of 2025-01)
  - If it fails, the library automatically falls back to scipy's polyphase filtering
  - Still provides excellent audio quality, but resampling takes roughly 3.8x as long
  - You can still try installing: if it works, great! If not, no harm done.
📚 Read the Full Documentation for detailed guides and API reference.
From TestPyPI (for testing pre-releases):
pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ proctap
From Source:
git clone https://github.com/m96-chan/ProcTap
cd ProcTap
pip install -e .
🎬 CLI Usage (Pipe to FFmpeg)
ProcTap includes a CLI for piping audio directly to FFmpeg or other tools:
# Pipe to FFmpeg (MP3 encoding) - Direct command
proctap --pid 12345 --stdout | ffmpeg -f s16le -ar 48000 -ac 2 -i pipe:0 output.mp3
# Or using python -m
python -m proctap --pid 12345 --stdout | ffmpeg -f s16le -ar 48000 -ac 2 -i pipe:0 output.mp3
# Using process name instead of PID
proctap --name "VRChat.exe" --stdout | ffmpeg -f s16le -ar 48000 -ac 2 -i pipe:0 output.mp3
# FLAC encoding (lossless)
proctap --pid 12345 --stdout | ffmpeg -f s16le -ar 48000 -ac 2 -i pipe:0 output.flac
# Native float32 output (no conversion)
proctap --pid 12345 --format float32 --stdout | ffmpeg -f f32le -ar 48000 -ac 2 -i pipe:0 output.mp3
CLI Options:
| Option | Description |
|--------|-------------|
| --pid PID | Process ID to capture (required if --name not used) |
| --name NAME | Process name to capture (e.g., VRChat.exe or VRChat) |
| --stdout | Output raw PCM to stdout for piping (required) |
| --format {int16,float32} | Output format: int16 or float32 (default: int16) |
| --verbose | Enable verbose logging to stderr |
| --list-audio-procs | List all processes currently playing audio |
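The shell pipelines above can also be driven from Python. The following is a minimal sketch, assuming the `proctap` and `ffmpeg` executables are on `PATH`; `build_pipeline` and `run_pipeline` are hypothetical helper names, not part of ProcTap, and only the flags documented in this section are used.

```python
import subprocess
from typing import List, Tuple

# Maps the CLI's --format value to FFmpeg's raw PCM input format name.
FFMPEG_INPUT_FORMAT = {"int16": "s16le", "float32": "f32le"}


def build_pipeline(pid: int, output_path: str,
                   fmt: str = "int16") -> Tuple[List[str], List[str]]:
    """Return (proctap_cmd, ffmpeg_cmd) argument lists for one capture."""
    if fmt not in FFMPEG_INPUT_FORMAT:
        raise ValueError(f"unsupported format: {fmt!r}")
    proctap_cmd = ["proctap", "--pid", str(pid), "--format", fmt, "--stdout"]
    ffmpeg_cmd = ["ffmpeg", "-f", FFMPEG_INPUT_FORMAT[fmt],
                  "-ar", "48000", "-ac", "2", "-i", "pipe:0", output_path]
    return proctap_cmd, ffmpeg_cmd


def run_pipeline(pid: int, output_path: str, fmt: str = "int16") -> int:
    """Connect proctap's stdout to ffmpeg's stdin and wait for ffmpeg."""
    proctap_cmd, ffmpeg_cmd = build_pipeline(pid, output_path, fmt)
    producer = subprocess.Popen(proctap_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(ffmpeg_cmd, stdin=producer.stdout)
    # Drop our copy of the pipe so only ffmpeg holds the read end.
    producer.stdout.close()
    consumer.wait()
    producer.wait()
    return consumer.returncode
```

Keeping the command construction separate from process launching makes the FFmpeg input format easy to keep in sync with `--format`.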
Finding Process IDs:
# Windows
tasklist | findstr "VRChat"
# Linux/macOS
ps aux | grep VRChat
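For scripts, the same lookup can be done from Python on Linux/macOS. This is a standard-library sketch, not a ProcTap feature: `parse_ps_output` and `find_pids` are hypothetical helpers, and the `ps -A -o pid,comm` invocation is an assumption about the local `ps` (it is the POSIX form).

```python
import subprocess
from typing import List


def parse_ps_output(ps_text: str, name: str) -> List[int]:
    """Return PIDs whose command contains `name` (case-insensitive)."""
    pids = []
    for line in ps_text.splitlines():
        parts = line.split(None, 1)  # "<pid> <command>"
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skips the header row and blank lines
        if name.lower() in parts[1].lower():
            pids.append(int(parts[0]))
    return pids


def find_pids(name: str) -> List[int]:
    """Linux/macOS only: list matching PIDs using `ps -A -o pid,comm`."""
    result = subprocess.run(["ps", "-A", "-o", "pid,comm"],
                            capture_output=True, text=True, check=True)
    return parse_ps_output(result.stdout, name)
```

Several PIDs may match (helper processes, multiple windows), so it is worth checking which one is actually producing audio, for example with `--list-audio-procs`.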
FFmpeg Format Arguments:
The CLI outputs raw PCM at 48kHz stereo. FFmpeg needs these arguments based on --format:
int16 (default):
- -f s16le: Signed 16-bit little-endian PCM
- -ar 48000: Sample rate (48kHz, fixed)
- -ac 2: Channels (stereo, fixed)
- -i pipe:0: Read from stdin
float32:
- -f f32le: 32-bit float little-endian PCM
- -ar 48000: Sample rate (48kHz, fixed)
- -ac 2: Channels (stereo, fixed)
- -i pipe:0: Read from stdin
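Because the rate and channel count are fixed, the data rate of the raw stream follows directly from the sample format. The helper names below are illustrative only; the constants come from the format description above.

```python
SAMPLE_RATE = 48_000   # Hz, fixed by the CLI
CHANNELS = 2           # stereo, fixed by the CLI
BYTES_PER_SAMPLE = {"int16": 2, "float32": 4}


def bytes_per_second(fmt: str) -> int:
    """Raw PCM data rate for the given --format value."""
    return SAMPLE_RATE * CHANNELS * BYTES_PER_SAMPLE[fmt]


def duration_seconds(num_bytes: int, fmt: str) -> float:
    """Length of audio represented by `num_bytes` of raw PCM."""
    return num_bytes / bytes_per_second(fmt)


print(bytes_per_second("int16"))    # 192000
print(bytes_per_second("float32"))  # 384000
```

This is handy for sizing buffers or sanity-checking that a captured file has the expected length.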
🛠 Requirements
Windows (Fully Supported):
- Windows 10 / 11 (20H1 or later)
- Python 3.10+
- WASAPI support
- No admin privileges required
Linux (Fully Supported - v0.3.0+):
- Linux with PulseAudio or PipeWire
- Python 3.10+
- Auto-detection: Automatically selects best available backend
- Native PipeWire API (in development, experimental):
  - libpipewire-0.3-dev: sudo apt-get install libpipewire-0.3-dev
  - Target latency: ~2-5ms (when fully implemented)
  - Auto-selected when available (may fall back to subprocess)
- PipeWire subprocess:
  - pw-record: install with sudo apt-get install pipewire-media-session
- PulseAudio fallback:
  - pulsectl library: automatically installed
  - parec command: sudo apt-get install pulseaudio-utils
- ✅ Per-process isolation using null-sink strategy
- ✅ Graceful fallback chain: Native → PipeWire subprocess → PulseAudio
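The fallback order described above can be illustrated with a small sketch. This is not ProcTap's actual selection code: `pick_backend`, its return labels, and the way native availability is passed in are all assumptions made for illustration. Only the `pw-record` and `parec` command names come from the requirements listed here.

```python
import shutil
from typing import Callable, Optional


def _on_path(command: str) -> bool:
    """True if `command` can be found on PATH."""
    return shutil.which(command) is not None


def pick_backend(native_available: bool,
                 has_command: Callable[[str], bool] = _on_path) -> Optional[str]:
    """Choose a backend in the order Native -> PipeWire subprocess -> PulseAudio.

    The returned labels are hypothetical names used only in this sketch.
    """
    if native_available:
        return "pipewire-native"
    if has_command("pw-record"):
        return "pipewire-subprocess"
    if has_command("parec"):
        return "pulseaudio"
    return None  # no usable backend found
```

Passing `has_command` as a parameter keeps the ordering logic testable without any audio tools installed.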
macOS (Officially Supported - v0.4.0+):
- macOS 13.0 (Ventura) or later
- Python 3.10+
- Swift helper binary (screencapture-audio)
- Screen Recording permission (automatically prompted)
- ✅ ScreenCaptureKit Backend: Apple Silicon compatible, no AMFI/SIP hacks needed
- ✅ Simple Permissions: Screen Recording only (no Microphone/TCC hacks)
- ✅ Low Latency: ~10-15ms audio capture
🧰 Basic Usage (Callback API)
from proctap import ProcTap, StreamConfig

def on_chunk(pcm: bytes, frames: int):
    print(f"Received {len(pcm)} bytes ({frames} frames)")

pid = 12345  # Target process ID
tap = ProcTap(pid, StreamConfig(), on_data=on_chunk)
tap.start()
input("Recording... Press Enter to stop.\n")
tap.close()
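In practice a callback usually accumulates data rather than printing it. The sketch below shows one way to do that with a callable object matching the `(pcm, frames)` signature used above. `ChunkCollector` is a hypothetical helper, not part of ProcTap, and the lock assumes the callback may be invoked from a capture thread, which this README does not state explicitly.

```python
import threading
from typing import Tuple


class ChunkCollector:
    """Thread-safe accumulator usable as an on_data callback."""

    def __init__(self, sample_rate: int = 48_000) -> None:
        self._lock = threading.Lock()
        self._buffer = bytearray()
        self._frames = 0
        self._sample_rate = sample_rate

    def __call__(self, pcm: bytes, frames: int) -> None:
        # Keep the critical section short: just append and count.
        with self._lock:
            self._buffer.extend(pcm)
            self._frames += frames

    def snapshot(self) -> Tuple[bytes, float]:
        """Return (all bytes so far, captured duration in seconds)."""
        with self._lock:
            return bytes(self._buffer), self._frames / self._sample_rate


# Wiring it up, following the example above (requires proctap):
#   collector = ChunkCollector()
#   tap = ProcTap(pid, StreamConfig(), on_data=collector)
```

Heavy work such as encoding or file I/O is better done outside the callback, on the data returned by `snapshot()`.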
🔁 Async Usage (Async Generator)
import asyncio
from proctap import ProcTap

async def main():
    tap = ProcTap(pid=12345)
    tap.start()
    try:
        async for chunk in tap.iter_chunks():
            print(f"PCM chunk size: {len(chunk)} bytes")
    finally:
        tap.close()  # release native resources even if the loop is interrupted

asyncio.run(main())
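The loop above runs until the stream ends or the task is cancelled. To capture for a fixed duration, one option is to bound each wait with a deadline. The helper below is a generic sketch that works with any async iterator of byte chunks; `collect_for` is a hypothetical name, and passing `tap.iter_chunks()` to it is assumed to work like any other async generator.

```python
import asyncio
from typing import AsyncIterator, List


async def collect_for(chunks: AsyncIterator[bytes], seconds: float) -> List[bytes]:
    """Collect chunks until `seconds` have elapsed or the stream ends."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + seconds
    collected: List[bytes] = []
    iterator = chunks.__aiter__()
    while True:
        remaining = deadline - loop.time()
        if remaining <= 0:
            break
        try:
            # Wait for the next chunk, but never past the deadline.
            chunk = await asyncio.wait_for(iterator.__anext__(), timeout=remaining)
        except (asyncio.TimeoutError, StopAsyncIteration):
            break
        collected.append(chunk)
    return collected


# Example with ProcTap (requires proctap):
#   chunks = await collect_for(tap.iter_chunks(), seconds=5.0)
```

Using `asyncio.wait_for` rather than `asyncio.timeout` keeps the sketch compatible with Python 3.10.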
📄 API Overview
class ProcTap
Control Methods:
| Method | Description |
|--------|-------------|
| start() | Start per-process capture |
| stop() | Stop capture |
| close() | Release native resources |
Data Access:
| Method | Description |
|--------|-------------|
| iter_chunks() | Async generator yielding PCM chunks |
| read(timeout=1.0) | Synchronous: read one chunk (blocking) |
Properties:
| Property | Type | Description |
|----------|------|-------------|
| is_running | bool | Check if capture is active |
| pid | int | Get target process ID |
| config | StreamConfig | Get stream configuration |
Utility Methods:
| Method | Description |
|--------|-------------|
| set_callback(callback) | Change or remove audio callback |
| get_format() | Get audio format info (dict) |
Audio Format
Windows Backend Format (WASAPI, returned to Python):
| Parameter | Value | Description |
|-----------|-------|-------------|
| Sample Rate | 48,000 Hz | Professional audio quality |
| Channels | 2 | Stereo |
| Format | float32 | IEEE 754 floating point (-1.0 to +1.0) |
| Fallback | 44.1kHz int16 | Auto-converted to 48kHz float32 if float32 init fails |
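To make the format concrete, here is a standard-library sketch that splits a stereo float32 buffer into left and right channels and measures its peak level. It assumes the samples are interleaved (L, R, L, R, ...) in native byte order, which is typical for PCM streams but is not stated in the table; `split_stereo` and `peak` are illustrative names.

```python
from array import array
from typing import Tuple

CHANNELS = 2
BYTES_PER_SAMPLE = 4  # float32


def split_stereo(pcm: bytes) -> Tuple[array, array]:
    """Split interleaved float32 stereo PCM into (left, right) arrays."""
    if len(pcm) % (CHANNELS * BYTES_PER_SAMPLE) != 0:
        raise ValueError("buffer does not contain a whole number of frames")
    samples = array("f")
    samples.frombytes(pcm)  # interprets bytes as native-endian float32
    return samples[0::CHANNELS], samples[1::CHANNELS]


def peak(samples: array) -> float:
    """Largest absolute sample value, or 0.0 for an empty buffer."""
    return max((abs(s) for s in samples), default=0.0)
```

A 10 ms chunk at this format is 480 frames, or 480 × 2 × 4 = 3,840 bytes.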
Important Note: For WAV file output, you must convert float32 to int16:
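import wave
import numpy as np

wav = wave.open("output.wav", "wb")  # example path; close it after capture stops
wav.setnchannels(2)
wav.setsampwidth(2)  # 2 bytes per sample = int16
wav.setframerate(48000)

def on_data(pcm: bytes, frames: int):
    # Convert float32 to int16 for WAV files
    float_samples = np.frombuffer(pcm, dtype=np.float32)
    int16_samples = (np.clip(float_samples, -1.0, 1.0) * 32767).astype(np.int16)
    wav.writeframes(int16_samples.tobytes())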
🎯 Use Cases
- 🎮 R
