EmoVoice

Build your own Real-time Speech Emotion Recognizer

EmoVoice is a set of tools that lets you build your own real-time emotion recognizer based on the acoustic properties of speech, without using word information.

Platform

Windows

Installation

Make sure the Visual Studio 2015 Redistributable is installed on your machine. Then run install.cmd to download the core binaries and install an embedded version of Python.

If you plan to extract SoundNet features, you also have to run install_tensorflow.cmd and download the file sound8.npy into the chains folder.
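
Before extracting SoundNet features, it can help to confirm that the downloaded file ended up where it is expected. The following is a minimal sketch, not part of EmoVoice: the helper name missing_soundnet_files is hypothetical, and it assumes the chains folder sits directly under the directory you pass in.

```python
from pathlib import Path


def missing_soundnet_files(repo_root):
    """Return the expected SoundNet files that are not yet in place.

    Hypothetical helper: assumes the chains folder is a direct child
    of repo_root, as the installation notes suggest.
    """
    root = Path(repo_root)
    expected = [root / "chains" / "sound8.npy"]
    return [path for path in expected if not path.is_file()]


if __name__ == "__main__":
    # Run from the root of the EmoVoice checkout.
    missing = missing_soundnet_files(".")
    if missing:
        for path in missing:
            print(f"missing: {path}")
    else:
        print("SoundNet weights found")
```

An empty result only means the file exists at that path; it does not verify that the download is complete or that TensorFlow was installed successfully.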

Documentation

https://rawgit.com/hcmlab/emovoice/master/docs/index.html

Credits

SSI -- Social Signal Interpretation Framework
LIBSVM -- A Library for Support Vector Machines
LIBLINEAR -- A Library for Large Linear Classification
openSMILE -- The Munich Versatile and Fast Open-Source Audio Feature Extractor
Emo-DB -- Berlin Database of Emotional Speech
SoundNet -- TensorFlow implementation of "SoundNet"

Reference

@inproceedings{Wagner13,
 author = {Wagner, Johannes and Lingenfelser, Florian and Baur, Tobias and Damian, Ionut and Kistler, Felix and Andr{\'e}, Elisabeth},
 title = {The social signal interpretation (SSI) framework: multimodal signal processing and recognition in real-time},
 booktitle = {Proceedings of the 21st ACM international conference on Multimedia},
 series = {MM '13},
 year = {2013},
 isbn = {978-1-4503-2404-5},
 location = {Barcelona, Spain},
 pages = {831--834},
 numpages = {4},
 url = {http://doi.acm.org/10.1145/2502081.2502223},
 doi = {10.1145/2502081.2502223},
 acmid = {2502223},
 publisher = {ACM},
 address = {New York, NY, USA},
 keywords = {multimodal fusion, open source framework, real-time pattern recognition, social signal processing},
}

License

The framework is released under LGPL (see LICENSE). Please note custom license files for the plug-ins (see LICENSE.*).

Author

Johannes Wagner, Lab for Human Centered Multimedia, 2018
