MFER

Multiscale Facial Expression Recognition Based on Dynamic Global and Static Local Attention, published in IEEE Transactions on Affective Computing.


<div align=center> Multiscale Facial Expression Recognition Based on Dynamic Global and Static Local Attention </div>

<div align=center> Jie Xu<sup>1</sup>; Yang Li<sup>1</sup>; Guanci Yang<sup>1*</sup>; Ling He<sup>1</sup>; Kexin Luo<sup>1</sup> </div>

<div align=center> 1. Key Laboratory of Advanced Manufacturing Technology of the Ministry of Education </div>

<div align=center>

Fig. 1 Architecture of Multiscale Facial Expression Recognition based on Dynamic Global and Static Local Attention

</div> <div align=center> <img src="./asset/Fig_2_Architecture_of_DS_attention.png" width="400" height="420" />

Fig. 2 Architecture of Dynamic Global and Static Local Attention

</div>

1、Preparation

Download the MS-Celeb dataset for self-supervised training.
Download the RAF-DB dataset and extract the raf-basic directory to ./datasets.
Download the AffectNet dataset and extract the AffectNet directory to ./datasets.
Then preprocess the datasets as follows:

2、Data preparation:

We first align the face images using the face alignment code in face.evl.
The aligned images are organized as follows:

  - data/raf-db/
      train/
          train_00001_aligned.jpg    # aligned by MTCNN
          train_00002_aligned.jpg    # aligned by MTCNN
          ...
      valid/
          test_0001_aligned.jpg    # aligned by MTCNN
          test_0002_aligned.jpg    # aligned by MTCNN
          ...
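
Before training, it can help to confirm that the aligned images actually follow the layout above. The following is a minimal sketch, not part of the repository: the helper names `list_aligned_images` and `summarize_layout` are hypothetical, and the root path is assumed to be `data/raf-db` as shown in the tree.

```python
from pathlib import Path


def list_aligned_images(root, split):
    """Return the sorted file names of aligned JPEGs in <root>/<split>/."""
    split_dir = Path(root) / split
    if not split_dir.is_dir():
        raise FileNotFoundError(f"missing split directory: {split_dir}")
    # Only files following the "*_aligned.jpg" naming pattern are counted.
    return sorted(p.name for p in split_dir.glob("*_aligned.jpg"))


def summarize_layout(root, splits=("train", "valid")):
    """Map each split name to the number of aligned images found in it."""
    return {split: len(list_aligned_images(root, split)) for split in splits}
```

Calling `summarize_layout("data/raf-db")` returns a dictionary of image counts per split and raises `FileNotFoundError` if either the `train` or `valid` directory is missing.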

3、Note:

The remaining code will be updated as soon as possible.

4、Training

CUDA_VISIBLE_DEVICES=0,1 python train.py --help
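
The `CUDA_VISIBLE_DEVICES=0,1` prefix in the command above limits the process to two GPUs. As a rough illustration of how a script could inspect that setting, here is a small sketch; the helper `visible_gpu_ids` is hypothetical and is not claimed to exist in `train.py`.

```python
import os


def visible_gpu_ids(environ=None):
    """Parse CUDA_VISIBLE_DEVICES into a list of device identifiers.

    Returns None when the variable is unset (no restriction is applied)
    and an empty list when it is set to an empty string (no GPU visible).
    """
    env = os.environ if environ is None else environ
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None
    # Identifiers are kept as strings because the variable may also hold UUIDs.
    return [item.strip() for item in raw.split(",") if item.strip()]
```

With the command above, `visible_gpu_ids()` would return `["0", "1"]`. Inside the process, the selected GPUs are renumbered starting from 0.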

5、Models

Pre-trained models can be downloaded for evaluation as follows:

6、Data distribution of RAF-DB

<div align=center> Baseline model for data distribution on RAF-DB </div> <div align=center> <img src="./asset/Fig_3(a).png" width="260" height="260" /> <img src="./asset/Fig_3(b).png" width="260" height="260" /> <img src="./asset/Fig_3(c).png" width="260" height="260" />

Fig. 3 (a) without feature loss; (b) with LGM loss; (c) with DSF loss

</div> <div align=center> MFER model for data distribution on RAF-DB </div> <div align=center> <img src="./asset/Fig_4(a).png" width="260" height="260" /> <img src="./asset/Fig_4(b).png" width="260" height="260" /> <img src="./asset/Fig_4(c).png" width="260" height="260" />

Fig. 4 (a) without feature loss; (b) with LGM loss; (c) with DSF loss

</div>

7、Confusion Matrices for MFER

<div align=center> Confusion Matrices for MFER on RAF-DB, AffectNet-7, AffectNet-8 and FERPlus </div> <div align=center> <img src="./asset/Fig_7(a).png" width="200" height="200" /> <img src="./asset/Fig_7(b).png" width="200" height="200" /> <img src="./asset/Fig_7(c).png" width="200" height="200" /> <img src="./asset/Fig_7(c).png" width="200" height="200" />

Fig. 7 (a) RAF-DB; (b) AffectNet-7; (c) AffectNet-8; (d) FERPlus

</div>
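
For readers who want to produce comparable matrices from their own predictions, the sketch below shows one common way to build a confusion matrix and normalize it by row. It assumes integer class labels in `range(num_classes)` and assumes row normalization; the figures above may have been generated differently, and the function names are illustrative only.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count matrix: entry [t][p] is the number of samples of class t predicted as p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def row_normalize(matrix):
    """Divide each row by its sum so the diagonal holds per-class recall."""
    normalized = []
    for row in matrix:
        total = sum(row)
        # Classes with no samples keep an all-zero row instead of dividing by zero.
        normalized.append([count / total if total else 0.0 for count in row])
    return normalized
```

Row normalization makes classes with very different sample counts directly comparable, which matters for imbalanced datasets.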

8、Grad-CAM for different expressions on example faces from the RAF-DB dataset

<div align=center> Grad-CAM for MFER on RAF-DB dataset </div> <div align=center> <img src="./asset/Fig_8_Grad-CAM.png" width="500" height="600" />

Fig. 8 Grad-CAM

</div>
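
As background, the core Grad-CAM computation can be sketched in a few lines of NumPy. This is a generic illustration of the technique, not the code used to produce Fig. 8: it assumes the feature maps and their gradients with respect to the target class score have already been extracted from one convolutional layer, and the function name `grad_cam_map` is hypothetical.

```python
import numpy as np


def grad_cam_map(activations, gradients):
    """Combine feature maps and their gradients into a Grad-CAM heatmap.

    activations, gradients: arrays of shape (K, H, W) taken from one
    convolutional layer for a single image and a single target class.
    """
    activations = np.asarray(activations, dtype=np.float64)
    gradients = np.asarray(gradients, dtype=np.float64)
    # Channel weights: global average of the gradients over the spatial dims.
    weights = gradients.mean(axis=(1, 2))
    # Weighted sum of the feature maps over channels, followed by ReLU.
    cam = np.maximum(np.tensordot(weights, activations, axes=1), 0.0)
    peak = cam.max()
    # Scale to [0, 1] for display; an all-zero map is returned unchanged.
    return cam / peak if peak > 0 else cam
```

The resulting map has the spatial size of the chosen layer, so it is typically resized to the input resolution before being overlaid on the face image.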

License

Our research code is released under the MIT license. See LICENSE for details.

Reference

If you find this work useful, you may want to cite:

@ARTICLE{10678884,
  author={Xu, Jie and Li, Yang and Yang, Guanci and He, Ling and Luo, Kexin},
  journal={IEEE Transactions on Affective Computing}, 
  title={Multiscale Facial Expression Recognition Based on Dynamic Global and Static Local Attention}, 
  year={2025},
  volume={16},
  number={2},
  pages={683-696},
  keywords={Feature extraction;Attention mechanisms;Context modeling;Facial features;Face recognition;Accuracy;Semantics;Facial Expression Recognition;attention mechanism;feature loss function;multiscale classifier;deep learning},
  doi={10.1109/TAFFC.2024.3458464}}

Acknowledgement

We thank the authors of the following code:
ConvNeXt and WZMIAOMIAO
