DiffSingerMiniEngine

A minimal inference engine for DiffSinger in MIDI-less mode.

Getting Started

1. Install onnxruntime following the official guidance.
2. Install the other dependencies with pip install PyYAML soundfile.
3. Download the ONNX version of the NSF-HiFiGAN vocoder from here and unzip it into the assets/vocoder directory.
4. Download an ONNX rhythm predictor from here and put it into the assets/rhythmizer directory.
5. Put your ONNX acoustic models into the assets/acoustic directory.
6. Edit configs/default.yaml, or create another config file, to match your preferences and local environment.
7. Run the server with python server.py, or with python server.py --config <YOUR_CONFIG> to use a custom config file.
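
Before starting the server, it can help to confirm that the three asset directories named in the steps above actually contain models. The following is a minimal sketch, not part of the engine itself: the helper names find_onnx_models and missing_assets are hypothetical, and it assumes the model files carry an .onnx extension and that the script runs from the repository root.

```python
from pathlib import Path

# Directory names come from the setup steps; everything else here is an
# illustrative assumption, not code from DiffSingerMiniEngine.
ASSET_DIRS = ("vocoder", "rhythmizer", "acoustic")


def find_onnx_models(assets_root="assets"):
    """Map each expected asset subdirectory to the .onnx files found beneath it."""
    root = Path(assets_root)
    found = {}
    for name in ASSET_DIRS:
        subdir = root / name
        # rglob also finds models inside a nested folder created by unzipping.
        found[name] = sorted(subdir.rglob("*.onnx")) if subdir.is_dir() else []
    return found


def missing_assets(assets_root="assets"):
    """Return the asset subdirectories that contain no .onnx file."""
    models = find_onnx_models(assets_root)
    return [name for name, files in models.items() if not files]


if __name__ == "__main__":
    missing = missing_assets()
    if missing:
        print("No ONNX model found under:", ", ".join(f"assets/{m}" for m in missing))
    else:
        print("All asset directories contain at least one ONNX model.")
```

The check only looks at file names; it does not load the models or verify that they match the settings in the config file.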

API Specification

TBD

How to Obtain Acoustic Models

1. Train with your own dataset or download pretrained checkpoints from here.
2. Export the PyTorch checkpoints to ONNX format. See instructions here.
