HeadPoseEstimate
Head pose estimation based on 3DMM.
Introduction
This is a head pose estimation system based on 3D facial landmarks. Please note that it is not the most advanced method in this field: by the time this repository was created, end-to-end solutions already existed.

Usage
For an image, run python estimate_head_pose.py -i <path of image> --onnx.
For a video, run python estimate_head_pose_video.py -i <path of video> --onnx.
How it works
1. Get the 3D facial landmarks
First, thanks to cleardusk for the excellent work on 3DDFA_V2. With the TDDFA model, we can get 3D facial landmarks quickly and precisely.
2. Determine the direction of the face
The horizontal direction hd and the vertical direction vd of the face can be determined by PCA. Let fd denote the facial orientation; then fd = hd x vd, where x denotes the cross product.
Here is an example. The original image (from the Biwi Kinect Head Pose Database):

The following image shows 68 landmarks.
Red axis: X
Green axis: Y
Blue axis: Z
The three yellow arrows are hd, vd and fd.
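The PCA step described above can be sketched roughly as follows. This is a minimal illustration with NumPy, not the repository's actual code: the function names `principal_directions` and `face_directions` are hypothetical, and treating the first principal axis as hd and the second as vd is an assumption. PCA also leaves the sign of each axis undetermined, so a real implementation would need an extra rule to orient the vectors consistently.

```python
import numpy as np


def principal_directions(points):
    """Return the principal axes of an (N, 3) point set as rows,
    ordered by decreasing variance."""
    centered = points - points.mean(axis=0)
    # Rows of vt are unit-length principal axes of the centered points.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt


def face_directions(landmarks):
    """Hypothetical helper: derive hd, vd and fd from 3D landmarks.

    Assumption: the first principal axis is taken as hd and the
    second as vd; the real assignment and signs may differ.
    """
    axes = principal_directions(landmarks)
    hd, vd = axes[0], axes[1]
    fd = np.cross(hd, vd)          # fd = hd x vd
    fd = fd / np.linalg.norm(fd)   # keep fd a unit vector
    return hd, vd, fd


if __name__ == "__main__":
    # Synthetic stand-in for landmarks: a flat grid in the XY plane.
    xs, ys = np.meshgrid(np.linspace(-3, 3, 7), np.linspace(-2, 2, 5))
    points = np.stack([xs.ravel(), ys.ravel(), np.zeros(xs.size)], axis=1)
    hd, vd, fd = face_directions(points)
    print("hd:", hd)
    print("vd:", vd)
    print("fd:", fd)
```

For the flat grid used here, hd and vd lie in the XY plane and fd is parallel to the Z axis, up to sign.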

3. Estimate the rotation
Normalize hd, vd and fd so that they are unit vectors.
The rotation matrix can then be estimated with the Kabsch algorithm.
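As an illustration of this step, here is a minimal sketch of the Kabsch algorithm with NumPy. The function name `kabsch_rotation` is hypothetical, and using the canonical X, Y, Z axes as the reference frame is an assumption; the repository may use a different reference. Because the inputs are direction vectors rather than point clouds, the sketch skips the usual centering step.

```python
import numpy as np


def kabsch_rotation(source, target):
    """Return the 3x3 rotation R that best maps the rows of `source`
    onto the rows of `target` in the least-squares sense,
    i.e. R @ source[i] is close to target[i]."""
    h = source.T @ target                  # 3x3 cross-covariance matrix
    u, _, vt = np.linalg.svd(h)
    # Flip the last axis if needed so that R is a proper rotation
    # (determinant +1) and not a reflection.
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    return vt.T @ np.diag([1.0, 1.0, d]) @ u.T


if __name__ == "__main__":
    # Assumed reference frame: the canonical X, Y, Z unit vectors.
    reference = np.eye(3)

    # Stand-in for the normalized hd, vd, fd: the reference axes
    # rotated by 30 degrees about the Y axis.
    angle = np.deg2rad(30.0)
    true_rotation = np.array([
        [np.cos(angle), 0.0, np.sin(angle)],
        [0.0, 1.0, 0.0],
        [-np.sin(angle), 0.0, np.cos(angle)],
    ])
    face_axes = reference @ true_rotation.T   # rows play the role of hd, vd, fd

    estimated = kabsch_rotation(reference, face_axes)
    print(np.allclose(estimated, true_rotation))
```

Since hd, vd and fd = hd x vd already form a right-handed orthonormal triple when hd and vd are orthogonal, the reflection correction is usually inactive here; it is kept so the sketch stays a complete instance of the algorithm.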
Citation
@inproceedings{guo2020towards,
title = {Towards Fast, Accurate and Stable 3D Dense Face Alignment},
author = {Guo, Jianzhu and Zhu, Xiangyu and Yang, Yang and Yang, Fan and Lei, Zhen and Li, Stan Z},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2020}
}
@misc{3ddfa_cleardusk,
author = {Guo, Jianzhu and Zhu, Xiangyu and Lei, Zhen},
title = {3DDFA},
howpublished = {\url{https://github.com/cleardusk/3DDFA}},
year = {2018}
}
