# ResidualMaskingNetwork

ICPR 2020: Facial Expression Recognition using Residual Masking Network

State-of-the-art Facial Expression Recognition / Emotion Detection.
<p align="center"> <img width=1000 src="https://user-images.githubusercontent.com/24642166/284939631-ee2909f0-f084-47bb-8262-2c1728166fba.jpg"/> </p>

## Installation
- Install from pip:

```bash
pip install rmn
```

- Or build from source:

```bash
git clone git@github.com:phamquiluan/ResidualMaskingNetwork.git
cd ResidualMaskingNetwork
pip install -e .
```
## Quick Start

```python
from rmn import RMN
import cv2

# Initialize the model
m = RMN()

# Detect emotions in an image
image = cv2.imread("your-image.png")
results = m.detect_emotion_for_single_frame(image)
print(results)

# Draw the results on the image
image = m.draw(image, results)
cv2.imwrite("output.png", image)
```
## Webcam Demo

```python
from rmn import RMN

m = RMN()
m.video_demo()
```
<p align="center">
<img width="41%" src= "https://user-images.githubusercontent.com/24642166/117097030-d4176480-ad94-11eb-8c65-097a62ede067.png"/>
<img width="58%" src= "https://user-images.githubusercontent.com/24642166/72135777-da244d80-33b9-11ea-90ee-706b25c0a5a9.png"/>
</p>
## Table of Contents
- <a href='#benchmarking_fer2013'>Benchmarking on FER2013</a>
- <a href='#benchmarking_imagenet'>Benchmarking on ImageNet</a>
- <a href='#datasets'>Download datasets</a>
- <a href='#train_fer'>Training on FER2013</a>
- <a href='#train_imagenet'>Training on ImageNet</a>
- <a href='#eval'>Evaluation results</a>
- <a href='#docs'>Download dissertation and slide</a>
<p id="benchmarking_fer2013"></p>

## Benchmarking on FER2013

We benchmark our code thoroughly on two datasets: FER2013 and VEMO. Below are the results and trained weights:
| Model | Accuracy |
| ----- | -------- |
| VGG19 | 70.80 |
| EfficientNet_b2b | 70.80 |
| Googlenet | 71.97 |
| Resnet34 | 72.42 |
| Inception_v3 | 72.72 |
| Bam_Resnet50 | 73.14 |
| Densenet121 | 73.16 |
| Resnet152 | 73.22 |
| Cbam_Resnet50 | 73.39 |
| ResMaskingNet | 74.14 |
| ResMaskingNet + 6 | 76.82 |
Results on the VEMO dataset can be found in my thesis or slides (attached below).
<p id="benchmarking_imagenet"></p>

## Benchmarking on ImageNet

We also benchmark our model on the ImageNet dataset.
| Model | Top-1 Accuracy | Top-5 Accuracy |
| ----- | -------------- | -------------- |
| Resnet34 | 72.59 | 90.92 |
| CBAM Resnet34 | 73.77 | 91.72 |
| ResidualMaskingNetwork | 74.16 | 91.91 |
<p id="datasets"></p>

## Datasets

- FER2013 Dataset (locate it in `saved/data/fer2013`, e.g. `saved/data/fer2013/train.csv`)
- ImageNet 1K Dataset (ensure it can be loaded by `torchvision.datasets.ImageNet`)
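If you are unsure whether your `train.csv` is in the expected shape, the standard FER2013 CSV stores each 48x48 grayscale face as a space-separated string of 2304 pixel values under a `pixels` column, with the class index under `emotion`. A minimal stdlib-only sanity check (the tiny sample row below is synthetic, not real data):

```python
import csv
import io

# A tiny synthetic stand-in for saved/data/fer2013/train.csv:
# one face whose 2304 pixels are all 128.
sample_csv = "emotion,pixels\n0," + " ".join(["128"] * 48 * 48) + "\n"

def load_fer2013(fp):
    """Yield (label, 48x48 pixel grid) pairs from a FER2013-style CSV."""
    for row in csv.DictReader(fp):
        label = int(row["emotion"])
        flat = [int(p) for p in row["pixels"].split()]
        grid = [flat[i * 48:(i + 1) * 48] for i in range(48)]
        yield label, grid

label, grid = next(load_fer2013(io.StringIO(sample_csv)))
print(label, len(grid), len(grid[0]))  # prints: 0 48 48
```

Swap `io.StringIO(sample_csv)` for `open("saved/data/fer2013/train.csv")` to check the real file.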
<p id="train_fer"></p>

## Training on FER2013

To train the networks, specify the model name and other hyperparameters in a config file (located at `configs/*`), make sure it is loaded in the main file, then start training by running that main file, for example:

```bash
python main_fer.py  # e.g. for the fer2013_config.json file
```

The best checkpoint is chosen by best validation accuracy and saved under `saved/checkpoints`. By default, the `alexnet` model is trained; you can switch to another model (`resnet18`, `cbam_resnet50`, or my network `resmasking_dropout1`) by editing the `configs/fer2013_config.json` file.
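For orientation, switching models boils down to changing the architecture field in the JSON config. The exact schema lives in `configs/fer2013_config.json` in the repository; the field names below are illustrative assumptions, not the actual file:

```json
{
  "arch": "resmasking_dropout1",
  "lr": 0.001,
  "batch_size": 48
}
```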
<p id="train_imagenet"></p>

## Training on ImageNet

To train `resnet34` on 4 V100 GPUs on a single machine:

```bash
python ./main_imagenet.py -a resnet34 --dist-url 'tcp://127.0.0.1:12345' --dist-backend 'nccl' --multiprocessing-distributed --world-size 1 --rank 0
```
<p id="eval"></p>

## Ensemble method

I used an unweighted average ensemble to fuse 7 different models. To reproduce the results:

- Download all the needed trained weights and place them in the `./saved/checkpoints/` directory. The download links can be found in the Benchmarking section.
- Edit the `gen_results` file and run it to generate offline results for each model.
- Run the `gen_ensemble.py` file to compute the accuracy of the example ensemble methods.
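The unweighted-average fusion itself is simple: average each model's per-class probabilities, then predict the class with the highest fused score. A minimal sketch of that idea (the scores below are made up for illustration, not outputs of the repository's models):

```python
def fuse(predictions):
    """Unweighted-average ensemble.

    predictions: list of per-model probability vectors, all the same length.
    Returns (predicted class index, fused probability vector).
    """
    num_classes = len(predictions[0])
    fused = [sum(p[c] for p in predictions) / len(predictions)
             for c in range(num_classes)]
    return fused.index(max(fused)), fused

# Three hypothetical models scoring 3 emotion classes.
model_probs = [
    [0.2, 0.5, 0.3],
    [0.1, 0.6, 0.3],
    [0.4, 0.3, 0.3],
]
label, fused = fuse(model_probs)
print(label)  # prints: 1  (fused scores ~ [0.233, 0.467, 0.300])
```

`gen_ensemble.py` applies the same averaging to the per-model result files produced offline in the previous step.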
<p id="docs"></p>

## Dissertation and Slide

- Dissertation PDF (in Vietnamese)
- Dissertation Overleaf Source
- Presentation slide PDF (in English, with full appendix)
- Presentation slide Overleaf Source
- ICPR Paper
- ICPR Poster Overleaf Source
## Citation

Pham Luan, The Huynh Vu, and Tuan Anh Tran. "Facial Expression Recognition using Residual Masking Network". In: Proc. ICPR. 2020.

```bibtex
@inproceedings{pham2021facial,
  title={Facial expression recognition using residual masking network},
  author={Pham, Luan and Vu, The Huynh and Tran, Tuan Anh},
  booktitle={2020 25th International Conference on Pattern Recognition (ICPR)},
  pages={4513--4519},
  year={2021},
  organization={IEEE}
}
```
## Star History
## License

This project is licensed under the MIT License - see the LICENSE file for details.