<h1 align="center"> FastVC </h1>

Overview

FastVC is a fast and efficient, non-parallel, any-to-any voice conversion (VC) tool. VC modifies the voice of a source speaker so that it sounds like that of a target speaker, without changing the linguistic content of the sentence. Our tool performs this task by cascading an Automatic Speech Recognition (ASR) model and a Text-To-Speech (TTS) model.

<p align="center"> <img src="https://user-images.githubusercontent.com/17434626/122674111-014c9480-d1d4-11eb-9310-0b50250caeab.png" width="85%" /> </p>

The ASR is based on Wav2vec 2.0 and is used to transcribe the speech from a source speaker. The TTS is based on SV2TTS and is used to generate the output speech from a target speaker embedding.
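The cascade described above can be sketched in a few lines of Python. This is only an illustration of the data flow, not the code in this repository: the class `CascadeVC` and its three callables (`transcribe`, `embed_speaker`, `synthesize`) are hypothetical names, and we assume the target speaker embedding is computed from a reference recording of the target speaker.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

Audio = Sequence[float]      # placeholder type for a mono waveform
Embedding = Sequence[float]  # placeholder type for a speaker embedding


@dataclass
class CascadeVC:
    """Hypothetical wrapper around the two stages; names are illustrative."""

    transcribe: Callable[[Audio], str]             # ASR stage (Wav2vec 2.0 based in FastVC)
    embed_speaker: Callable[[Audio], Embedding]    # assumed speaker encoder for the target voice
    synthesize: Callable[[str, Embedding], Audio]  # TTS stage (SV2TTS based in FastVC)

    def convert(self, source: Audio, target_reference: Audio) -> Audio:
        # 1. Keep only the linguistic content of the source utterance.
        text = self.transcribe(source)
        # 2. Summarize the target speaker's voice as an embedding.
        embedding = self.embed_speaker(target_reference)
        # 3. Speak the same text conditioned on the target embedding.
        return self.synthesize(text, embedding)
```

Because the only information passed from the first stage to the second is the transcribed text, no parallel recordings of source and target speakers are needed, which is what makes the approach non-parallel and any-to-any.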

For a more detailed explanation, see the paper for our project. A demo page is available here.

Installation & usage

The software was implemented using Python 3.9.4.

  1. Clone the repository (`git clone https://github.com/fmiotello/fastVC.git`) and enter the directory (`cd fastVC`).
  2. (Optional) Create a virtual environment and activate it: `python -m venv env`, then `source env/bin/activate` (macOS/Linux) or `.\env\Scripts\activate` (Windows).
  3. Upgrade pip: `python -m pip install --upgrade pip`
  4. Install the dependencies: `python -m pip install -r requirements.txt`
  5. Download the pretrained models (encoder, synthesizer, vocoder) and put them in the following locations:
     - `./src/encoder/saved_models/pretrained.pt`
     - `./src/synthesizer/saved_models/pretrained/pretrained.pt`
     - `./src/vocoder/saved_models/pretrained/pretrained.pt`
  6. Run the main script: `python src/main.py` (use `--help` to display the available options). The output audio is written to `./src/audio/audio_out.wav`.
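
Before running the main script, it can help to confirm that the three pretrained models are where the steps above expect them. The following is a minimal sketch, not part of the repository: the helper `missing_models` is a hypothetical name, and it assumes it is run from the repository root.

```python
from pathlib import Path

# Paths copied from the installation steps, relative to the repository root.
MODEL_PATHS = (
    "src/encoder/saved_models/pretrained.pt",
    "src/synthesizer/saved_models/pretrained/pretrained.pt",
    "src/vocoder/saved_models/pretrained/pretrained.pt",
)


def missing_models(repo_root="."):
    """Return the expected model files that are not present under repo_root."""
    root = Path(repo_root)
    return [p for p in MODEL_PATHS if not (root / p).is_file()]


if __name__ == "__main__":
    missing = missing_models()
    if missing:
        print("Missing pretrained models:")
        for path in missing:
            print(f"  {path}")
    else:
        print("All pretrained models found.")
```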

More instructions can be found here.

Notes

This application was developed as a project at Politecnico di Milano (MSc in Music and Acoustic Engineering).

Luigi Attorresi<br> Federico Miotello<br> Eugenio Poliuti<br>
