RawNetLite: Lightweight End-to-End Audio Deepfake Detection

This repository contains the official implementation of the paper:

End-to-end Audio Deepfake Detection from RAW Waveforms: a RawNet-Based Approach with Cross-Dataset Evaluation
Andrea Di Pierno, Luca Guarnera, Dario Allegra, Sebastiano Battiato
In Proceedings of the VERIMEDIA Workshop at IJCNN 2025, Rome, Italy.

🧠 Overview

RawNetLite is a lightweight convolutional-recurrent model designed to detect audio deepfakes directly from raw waveforms, without relying on handcrafted features or large pretrained models.

The model is trained and evaluated in both in-domain and cross-dataset scenarios using three public datasets: FakeOrReal (FOR), AVSpoof2021, and CodecFake.

We introduce a training pipeline based on:

Raw waveform input
Domain-mix learning
Focal Loss optimization
Waveform-level audio augmentations
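To illustrate why Focal Loss is paired with domain-mix learning, the sketch below compares it with plain binary cross-entropy on a single prediction. It is a minimal pure-Python illustration, not the code in focal_loss.py: the function names, the choice of class 1 as the positive class, and the defaults gamma=2.0 and alpha=0.25 are assumptions made for this example.

```python
import math


def binary_cross_entropy(p, y, eps=1e-7):
    """BCE for one sample; p is the predicted probability of class 1, y is 0 or 1."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -math.log(p if y == 1 else 1.0 - p)


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Focal loss for one sample (gamma and alpha defaults are assumed, not the repo's)."""
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p                # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha    # class-balancing weight
    # (1 - p_t) ** gamma shrinks the loss of samples that are already well classified
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    for p in (0.9, 0.6, 0.1):
        print(p, binary_cross_entropy(p, 1), binary_focal_loss(p, 1))
```

With gamma set to 0 the modulating factor disappears and the loss reduces to alpha-weighted cross-entropy, so gamma controls how strongly easy samples are down-weighted.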

📄 Paper

If you use this code, please cite our paper:

@inproceedings{dipierno2025rawnetlite,
  title     = {End-to-end Audio Deepfake Detection from RAW Waveforms: a RawNet-Based Approach with Cross-Dataset Evaluation},
  author    = {Andrea Di Pierno and Luca Guarnera and Dario Allegra and Sebastiano Battiato},
  booktitle = {International Joint Conference on Neural Networks (IJCNN) - VERIMEDIA Workshop},
  year      = {2025}
}

📂 Directory Structure

.
├── models/                   # Pretrained models
│   ├── rawnet_lite.pt                                      # Basic RawNetLite model
│   ├── cross_domain_rawnet_lite.pt                         # Cross-domain RawNetLite model
│   ├── cross_domain_focal_rawnet_lite.pt                   # Cross-domain RawNetLite with Focal Loss
│   ├── triple_cross_domain_focal_rawnet_lite.pt            # Triple cross-domain RawNetLite with Focal Loss
│   └── augmented_triple_cross_domain_focal_rawnet_lite.pt  # Augmented triple cross-domain RawNetLite with Focal Loss
├── .gitignore                # Git ignore file
├── .gitattributes            # Git attributes file
├── audio_preprocessor.py     # Audio preprocessing module
├── AVSpoof_dataset.py        # AVSpoof PyTorch dataset
├── CodecFake_dataset.py      # CodecFake PyTorch dataset
├── focal_loss.py             # Focal loss implementation
├── FOR_dataset.py            # FakeOrReal PyTorch dataset
├── LICENSE                   # License file
├── Mixed_dataset.py          # Mixed-domain PyTorch datasets
├── RawNetLite.py             # Main model architecture
├── README.md                 # This file
├── requirements.txt          # Dependencies
├── tester.py                 # Testing script
└── trainer.py                # Training script

🛠 Setup

Clone the repository:

git clone https://github.com/adipiz99/rawnetlite.git
cd rawnetlite

Install the required packages:

pip install -r requirements.txt

🔁 Preprocessing

To preprocess your dataset into waveform tensors (.pt):

python audio_preprocessor.py \
    --csv_path metadata.csv \
    --input_dir path/to/audio \
    --output_root data/audio_processed/

This will create real_processed/ and fake_processed/ folders with normalized, trimmed, and resampled audio waveforms.
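As a rough picture of what "normalized" and "trimmed" can mean for a raw waveform, the sketch below peak-normalizes a list of samples and forces it to a fixed length. It is not the logic of audio_preprocessor.py: the function names, the 16 kHz rate, and the 3-second duration are hypothetical values chosen for the example, and resampling is left out.

```python
def peak_normalize(samples, eps=1e-9):
    """Scale samples so the largest absolute value is 1.0 (silence is returned unchanged)."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak < eps:
        return list(samples)
    return [s / peak for s in samples]


def fix_length(samples, target_len):
    """Truncate to target_len samples, or right-pad with zeros if the clip is shorter."""
    if len(samples) >= target_len:
        return list(samples[:target_len])
    return list(samples) + [0.0] * (target_len - len(samples))


def preprocess(samples, sample_rate=16000, seconds=3.0):
    """Normalize, then force a fixed duration (rate and duration are assumed values)."""
    target_len = int(sample_rate * seconds)
    return fix_length(peak_normalize(samples), target_len)


if __name__ == "__main__":
    print(preprocess([0.5, -0.25], sample_rate=4, seconds=1.0))
```

A fixed length keeps every tensor the same shape, which makes batching straightforward for a model that consumes raw waveforms.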

🧪 Training & Evaluation

To run the training script, set the parameters in trainer.py and use:

python trainer.py

The script outputs:

Training loss, validation accuracy and F1 score
Classification metrics (Precision, Recall, F1) for the validation set
Equal Error Rate (EER) and threshold for the validation set
A model trained following the specified parameters
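For readers unfamiliar with the Equal Error Rate, the sketch below shows one simple way to estimate it and its threshold from a list of scores. It is an illustrative pure-Python version, not the computation in trainer.py or tester.py; the function name and the convention that label 1 is the positive class with higher scores are assumptions.

```python
def compute_eer(scores, labels):
    """Return (eer, threshold) where false-accept and false-reject rates are closest.

    Assumes labels are 0/1 and a higher score means "more likely class 1".
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes are required to compute an EER")

    best = None
    for thr in sorted(set(scores)):
        # A sample is predicted positive when its score is >= thr.
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < thr and y == 1)
        far = fp / n_neg  # false acceptance rate
        frr = fn / n_pos  # false rejection rate
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, thr)
    return best[1], best[2]


if __name__ == "__main__":
    print(compute_eer([0.1, 0.6, 0.4, 0.9], [0, 0, 1, 1]))
```

This version rescans all samples for every candidate threshold, which is fine for a small validation set; a sorted cumulative sweep would be preferable for large ones.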

To run the test bench and evaluate all models across all datasets, set the parameters in tester.py and use:

python tester.py

The script outputs:

Classification metrics (Precision, Recall, F1)
Equal Error Rate (EER) and threshold
Results for the FakeOrReal, AVSpoof2021, CodecFake, and mixed-domain evaluations

Note that the training and testing scripts must be run on disjoint data, so that no test sample is seen during training.
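A quick way to check that two splits do not share clips is to compare file names before launching the scripts. The sketch below is a hypothetical helper, not part of this repository; it assumes that a clip keeps the same base name across splits and formats (for example x.wav and x.pt).

```python
from pathlib import Path


def file_stems(paths):
    """Base names without directory or extension, e.g. 'data/x.pt' becomes 'x'."""
    return {Path(p).stem for p in paths}


def find_overlap(train_paths, test_paths):
    """Return the sorted base names that appear in both splits."""
    return sorted(file_stems(train_paths) & file_stems(test_paths))


if __name__ == "__main__":
    train = ["train/real_processed/x.pt", "train/fake_processed/y.pt"]
    test = ["test/fake_processed/y.wav", "test/real_processed/z.pt"]
    print(find_overlap(train, test))
```

An empty result means the splits are disjoint under this naming assumption; if different datasets reuse the same file names, compare full relative paths instead.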

🎯 Pretrained Models

Pretrained models are available in the models/ folder:

rawnet_lite.pt: Basic RawNetLite model trained on the FakeOrReal (FOR) dataset with BCE Loss.
cross_domain_rawnet_lite.pt: Cross-domain RawNetLite model trained on the FOR and AVSpoof2021 datasets with BCE Loss.
cross_domain_focal_rawnet_lite.pt: Cross-domain RawNetLite model trained on the FOR and AVSpoof2021 datasets with Focal Loss.
triple_cross_domain_focal_rawnet_lite.pt: Triple cross-domain RawNetLite model trained on the FOR, AVSpoof2021, and CodecFake datasets with Focal Loss.
augmented_triple_cross_domain_focal_rawnet_lite.pt: Augmented triple cross-domain RawNetLite model trained on the FOR, AVSpoof2021, and CodecFake datasets with Focal Loss and waveform-level augmentation.

🗂 Datasets

This repository supports the following datasets:

FakeOrReal (FOR)
AVSpoof2021
CodecFake

Each dataset must be preprocessed with audio_preprocessor.py. Make sure the training and evaluation splits do not overlap.

⚖ License

This project is licensed under the MIT License. See the LICENSE file for details.

📬 Contact

If you have questions or find this project useful, feel free to contact us:

Andrea Di Pierno — andrea.dipierno@imtlucca.it

📌 Acknowledgments

This study has been partially supported by SERICS (PE00000014) under the MUR National Recovery and Resilience Plan funded by the European Union - NextGenerationEU.
