PeerRTF
Robust Relative Transfer Function (RTF) Estimation using Graph Neural Networks
peerRTF: Robust MVDR Beamforming Using Graph Convolutional Network
<div align="center">Paper | Project Page | Introduction | Training | Evaluation | Citation
</div>
Introduction
This repository contains the implementation of the method introduced in our paper, which presents a novel approach for accurately and robustly estimating Relative Transfer Functions (RTFs). Accurate RTF estimation is crucial for designing effective microphone array beamformers, particularly in challenging noisy and reverberant environments.
Overview
The proposed method leverages prior knowledge of the acoustic environment to enhance the robustness of RTF estimation by learning the RTF manifold. The key innovation in this work is the use of a Graph Convolutional Network (GCN) to learn and infer a robust representation of the RTFs within a confined area. This approach significantly improves the performance of beamformers by providing more reliable RTF estimation in complex acoustic settings.
Installation
To create the environment, run the following command:
conda create --name your_environment_name --file requirements.txt
Training
For training, run the following command:
python main.py
Note that you need to collect the data and put it in the data folder. You can estimate the RTFs using the code provided in this repository.
The data should be organized as follows:
data
├── train
│   └── noisy graphs
│       ├── graph_data_1.pt
│       ├── graph_data_2.pt
│       └── ...
├── val
│   └── noisy graphs
│       ├── graph_data_1.pt
│       ├── graph_data_2.pt
│       └── ...
└── test
    └── noisy graphs
        ├── graph_data_1.pt
        ├── graph_data_2.pt
        └── ...
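As a quick sanity check before training, the layout above can be scanned with a few lines of standard-library Python. This is only a sketch: `list_graph_files` and `build_demo_tree` are hypothetical helpers, not part of this repository, and the folder name `noisy graphs` is taken literally from the tree.

```python
from pathlib import Path
import tempfile

SPLITS = ("train", "val", "test")
GRAPH_DIR = "noisy graphs"  # assumption: folder name copied literally from the tree


def list_graph_files(data_root):
    """Return {split: sorted list of graph_data_*.pt paths} for each split."""
    root = Path(data_root)
    found = {}
    for split in SPLITS:
        folder = root / split / GRAPH_DIR
        # A missing folder yields an empty list instead of raising an error.
        found[split] = sorted(folder.glob("graph_data_*.pt")) if folder.is_dir() else []
    return found


def build_demo_tree(root, n=2):
    """Create empty placeholder files that mimic the expected layout (demo only)."""
    for split in SPLITS:
        folder = Path(root) / split / GRAPH_DIR
        folder.mkdir(parents=True)
        for i in range(1, n + 1):
            (folder / f"graph_data_{i}.pt").touch()


# Demo on a temporary directory; point list_graph_files at "data" for real use.
with tempfile.TemporaryDirectory() as tmp:
    build_demo_tree(tmp)
    for split, files in list_graph_files(tmp).items():
        print(split, len(files))
```

An empty list for any split is a sign that the corresponding folder is missing or misnamed.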
Each graph_data file should contain the following:
{
    'graph': graph,            # the graph
    'edge_index': edge_index,  # the edge index of the graph
    'RTF': RTF,                # the RTFs estimated from the noisy signals (the nodes of the graph)
    'clean': clean,            # the target (oracle) RTF
    'y': noisy_data,           # the noisy signal (M channels)
    'x': clean_data,           # the clean signal (M channels)
    'n': noise_data,           # the noise signal (M channels)
    'index': index,            # the index of the node in the graph
}
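A loaded sample can be checked against the key list above before it is passed to training. The sketch below is an illustration only: `missing_keys` is a hypothetical helper, and it assumes the sample has already been loaded into a plain dictionary (for `.pt` files this would typically be done with `torch.load`, which is an assumption about how the files were saved).

```python
# Keys listed in the data format description above.
REQUIRED_KEYS = ("graph", "edge_index", "RTF", "clean", "y", "x", "n", "index")


def missing_keys(sample):
    """Return the required keys that are absent from a loaded graph_data dict."""
    return [key for key in REQUIRED_KEYS if key not in sample]


# Example with a deliberately incomplete sample (placeholder values only).
incomplete = {"graph": None, "RTF": None}
print(missing_keys(incomplete))
```

An empty result means the sample carries every field the format requires; it says nothing about shapes or dtypes, which would need a separate check.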
The code will create a model, train it, and save it in the models folder.
Evaluation
For evaluation, run the following command:
cd evaluation
python evaluation.py
During the evaluation, the code creates new noisy examples and estimates the RTFs using the GEVD. These estimated RTFs are then connected to the graphs using the KNN algorithm. The trained model is used to estimate the robust RTFs. Finally, the estimated RTFs are used to estimate the speech signal using the MVDR beamformer. The SNR, STOI, ESTOI, and DNSMOS scores for the estimated speech signal are then calculated.
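The KNN step can be illustrated with a small standard-library sketch. It is not the repository's implementation: `knn_edge_index` is a hypothetical helper, the Euclidean distance on real-valued feature vectors is an assumption (complex RTFs would first need to be mapped to real features, for example by stacking real and imaginary parts), and the two-row source/target layout of the result is likewise assumed.

```python
import math


def knn_edge_index(features, k):
    """Connect every node to its k nearest neighbours (Euclidean distance).

    Returns [sources, targets]; each edge points from a neighbour to the node.
    """
    n = len(features)
    sources, targets = [], []
    for i in range(n):
        # Distances from node i to every other node (self excluded).
        dists = [(math.dist(features[i], features[j]), j) for j in range(n) if j != i]
        # Sorting by (distance, index) breaks ties in favour of the lower index.
        for _, j in sorted(dists)[:k]:
            sources.append(j)
            targets.append(i)
    return [sources, targets]


# Toy example: four nodes on a line, forming two well-separated pairs.
toy_features = [[0.0, 0.0], [1.0, 0.0], [5.0, 0.0], [6.0, 0.0]]
print(knn_edge_index(toy_features, k=1))
```

This brute-force search costs O(n²) distance evaluations, which is acceptable for a sketch but may be too slow for large graphs.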
Example output:
SNR in: -6.00
SNR out GEVD: 17.48
SNR out peerRTF: 19.30
STOI in: 31.82
STOI out GEVD: 59.85
STOI out peerRTF: 59.70
ESTOI in: 20.52
ESTOI out GEVD: 43.14
ESTOI out peerRTF: 43.68
DNSMOS results:
referance signal 3.08
noisy signal 2.25
peerRTF 2.55
GEVD 2.44
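In the example above, peerRTF raises the SNR from -6.00 dB to 19.30 dB, a gain of 25.30 dB over the input and 1.82 dB over the GEVD baseline. For reference, a common definition of SNR in dB is sketched below; `snr_db` is a hypothetical helper, and the evaluation script may compute the metric differently (for example per channel or per frame).

```python
import math


def snr_db(signal, noise):
    """SNR in dB: 10 * log10(mean signal power / mean noise power)."""
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(v * v for v in noise) / len(noise)
    return 10.0 * math.log10(p_signal / p_noise)


# Toy example: the noise amplitude is twice the signal amplitude.
speech = [1.0, -1.0, 1.0, -1.0]
noise = [2.0, -2.0, 2.0, -2.0]
print(round(snr_db(speech, noise), 2))
```

The function assumes the speech and noise components are available separately, as they are in the `x` and `n` fields of the data format.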
Citation
If you use this code in your research, please cite our paper:
@article{levi2025peerrtf,
title={{peerRTF: Robust MVDR} Beamforming Using Graph Convolutional Network},
author={Levi, Daniel and Sofer, Amit and Gannot, Sharon},
journal={IEEE/ACM Transactions on Audio, Speech, and Language Processing},
year={2025},
volume={33},
pages={1349--1363},
publisher={IEEE}
}
