EmoVoice
Build your own Real-time Speech Emotion Recognizer
EmoVoice is a set of tools that let you build your own real-time emotion recognizer based on the acoustic properties of speech (word information is not used).
Platform
Windows
Installation
Make sure the Visual Studio 2015 Redistributable is installed on your machine. Then run install.cmd to download the core binaries and install an embedded version of Python.
If you plan to extract SoundNet features, also run install_tensorflow.cmd and download the file sound8.npy into the chains folder.
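Before extracting SoundNet features, it may help to confirm that sound8.npy ended up in the right place. The following is a minimal sketch, not part of EmoVoice: the helper name soundnet_weights_present is hypothetical, and it assumes the chains folder sits directly under the directory you pass in (by default, the current directory).

```python
from pathlib import Path

# Assumption: the "chains" folder sits directly under the EmoVoice root folder.
SOUNDNET_WEIGHTS = Path("chains") / "sound8.npy"


def soundnet_weights_present(root="."):
    """Return True if sound8.npy exists in the chains folder under root."""
    return (Path(root) / SOUNDNET_WEIGHTS).is_file()


if __name__ == "__main__":
    if soundnet_weights_present():
        print("sound8.npy found in the chains folder")
    else:
        print("sound8.npy is missing; download it into the chains folder")
```

The check only looks for the file. It does not verify that the download is complete or that the file contents are valid.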
Documentation
https://rawgit.com/hcmlab/emovoice/master/docs/index.html
Credits
- SSI -- Social Signal Interpretation Framework
- LIBSVM -- A Library for Support Vector Machines
- LIBLINEAR -- A Library for Large Linear Classification
- openSMILE -- The Munich Versatile and Fast Open-Source Audio Feature Extractor
- Emo-DB -- Berlin Database of Emotional Speech
- SoundNet -- TensorFlow implementation of "SoundNet"
Reference
@inproceedings{Wagner13,
author = {Wagner, Johannes and Lingenfelser, Florian and Baur, Tobias and Damian, Ionut and Kistler, Felix and Andr{\'e}, Elisabeth},
title = {The social signal interpretation (SSI) framework: multimodal signal processing and recognition in real-time},
booktitle = {Proceedings of the 21st ACM international conference on Multimedia},
series = {MM '13},
year = {2013},
isbn = {978-1-4503-2404-5},
location = {Barcelona, Spain},
pages = {831--834},
numpages = {4},
url = {http://doi.acm.org/10.1145/2502081.2502223},
doi = {10.1145/2502081.2502223},
acmid = {2502223},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {multimodal fusion, open source framework, real-time pattern recognition, social signal processing},
}
License
The framework is released under LGPL (see LICENSE). Please note custom license files for the plug-ins (see LICENSE.*).
Author
Johannes Wagner, Lab for Human Centered Multimedia, 2018
