Image2reverb

[ICCV 2021] Image2Reverb: Cross-Modal Reverb Impulse Response Synthesis.

Nikhil Singh, Jeff Mentch, Jerry Ng, Matthew Beveridge, Iddo Drori

Project Page

Code for the ICCV 2021 paper [arXiv]. Image2Reverb generates an audio impulse response from a 2D image of an environment, simulating that environment's acoustic reverberation.

Dependencies

Model/Data:

PyTorch>=1.7.0
PyTorch Lightning
torchvision
torchaudio
librosa
PyRoomAcoustics
PIL

Eval/Preprocessing:

PySoundfile
SciPy
Scikit-Learn
python-acoustics
google-images-download
matplotlib

Resources

Model Checkpoint

Code Acknowledgements

We borrow and adapt code snippets from GANSynth (and this PyTorch re-implementation), additional snippets from this PGGAN implementation, monodepth2, this GradCAM implementation, and more.

Citation

If you find the code, data, or models useful for your research, please consider citing our paper:

@InProceedings{Singh_2021_ICCV,
    author    = {Singh, Nikhil and Mentch, Jeff and Ng, Jerry and Beveridge, Matthew and Drori, Iddo},
    title     = {Image2Reverb: Cross-Modal Reverb Impulse Response Synthesis},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {286-295}
}
