
MocapNET



MocapNET Project


One-click deployment in Google Colab: Open MocapNET v4 In Colab

News


12-06-2025

Running MediaPipe + MocapNET v4.0 on a Lenovo IdeaPad Pro 5 14IMH9, the system sustains 30 Hz while tackling body + hands + face! Video Link

14-2-2024 Last Wednesday (8-2-2024) I successfully defended the MocapNET Ph.D. thesis! I am currently in the process of finalizing the thesis text, and there is also good news for the project regarding the continuation of its development and funding in the context of the Greece4.0 call. Thank you for your patience with the recent repository inactivity, which I am working hard to improve..! MocapNET PhD Thesis photo

2-11-2023 I am currently finalizing my PhD thesis and preparing my defence, so repository maintenance is subpar at this time. However, I hope to soon share my ~316-page thesis, which covers in detail all of the development effort and research behind MocapNET and this repository.

2-10-2023 MocapNET v4 was successfully presented at the AMFG 2023 ICCV workshop MocapNET 4

13-9-2023

MocapNET v4 has been accepted at the AMFG 2023 ICCV workshop with an oral and poster presentation. It can now also retrieve 3D gaze and BVH facial configurations. The whole codebase has been rewritten from scratch in Python to hopefully make it more useful for the community! This is a very big change that includes handling all 3D rendering through Blender Python scripts. To keep this repository manageable, the new version lives in its own mnet4 branch

We have also implemented a one-click Google Colab setup, which you can use to quickly and easily test the method: Open In Colab

Hope that you find it useful and hopefully see you at ICCV 2023!

30-12-2022

MocapNET has a new plugin/script for the Blender 3D editor that, when combined with MPFB2 (the MakeHuman addon for Blender), can produce 3D animations of custom skinned humans from MocapNET's output BVH files in a few clicks. The code targets recent Blender versions (3.4+). Watch this video to learn how to install MPFB2 and this video to learn how to connect the provided plugin with the MocapNET output BVH file and the generated MakeHuman armature.

MocapNET Blender Plugin

YouTube Link
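MocapNET's output is a standard BVH file, which is what the Blender plugin above consumes. As a rough, hand-written illustration of the format (the two-joint skeleton and motion values below are invented, not actual MocapNET output), a minimal pure-Python pass can split such a file into its joint hierarchy and per-frame channel values:

```python
# A toy BVH file: invented skeleton and values, for illustration only.
SAMPLE_BVH = """HIERARCHY
ROOT Hips
{
    OFFSET 0.0 0.0 0.0
    CHANNELS 6 Xposition Yposition Zposition Zrotation Xrotation Yrotation
    JOINT Chest
    {
        OFFSET 0.0 10.0 0.0
        CHANNELS 3 Zrotation Xrotation Yrotation
        End Site
        {
            OFFSET 0.0 10.0 0.0
        }
    }
}
MOTION
Frames: 2
Frame Time: 0.033333
0.0 90.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 90.0 0.0 0.0 5.0 0.0 0.0 0.0 0.0
"""

def parse_bvh(text):
    """Split a BVH file into joint names and per-frame channel values."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # Joint declarations start with ROOT or JOINT in the HIERARCHY section.
    joints = [ln.split()[1] for ln in lines if ln.startswith(("ROOT", "JOINT"))]
    motion_at = lines.index("MOTION")
    n_frames = int(lines[motion_at + 1].split()[1])        # "Frames: 2"
    frame_time = float(lines[motion_at + 2].split()[2])    # "Frame Time: 0.033333"
    frames = [[float(v) for v in ln.split()] for ln in lines[motion_at + 3:]]
    return joints, n_frames, frame_time, frames

joints, n_frames, frame_time, frames = parse_bvh(SAMPLE_BVH)
print(joints, n_frames, round(1.0 / frame_time))  # frame rate in Hz
```

Each motion row carries one value per declared channel (6 for the root, 3 for the child here), which is why every frame line has 9 numbers. Real MocapNET skeletons have many more joints, but the file layout is the same.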

30-9-2022

MocapNET was demonstrated at the Foundation for Research and Technology - Hellas (FORTH) as part of the European Researchers' Night 2022 event.

Researcher's Night 2022

1-6-2022

A new Python/MediaPipe utility can now help with generating 2D data for experiments! It can help you create datasets that include hands, which can be processed using MocapNET v3
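MediaPipe reports landmarks in normalized image coordinates (each in [0, 1]), while 2D joint datasets are usually in pixels. A minimal sketch of that conversion, assuming made-up landmark names and values (this is not the repository's actual utility):

```python
# Hypothetical MediaPipe-style normalized landmarks: (x, y) pairs in [0, 1].
# The joint names and values are invented for illustration.
landmarks = {"nose": (0.50, 0.20), "l_wrist": (0.72, 0.55), "r_wrist": (0.28, 0.55)}

def to_pixels(landmarks, width, height):
    """Scale normalized landmark coordinates to pixel coordinates for a given image size."""
    return {name: (round(x * width), round(y * height))
            for name, (x, y) in landmarks.items()}

joints2d = to_pixels(landmarks, width=1920, height=1080)
print(joints2d["nose"])  # e.g. (960, 216) for a 1920x1080 frame
```

The same scaling applies per frame of a video, so a dataset can be built by running the 2D estimator frame by frame and writing the pixel coordinates out.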

7-4-2022

The BonsAPPs (https://bonsapps.eu/) open call for AI talents received 126 proposals from 31 EU countries, of which 30 were accepted. Of the 30 running BonsAPPs projects, 10 were selected yesterday to continue into phase 2. I am very happy to report that our MocapNET-based AUTO-MNET work made it into the top ten!

BonsAPPs Hackathon/Stage 2 selection

9-3-2022

MocapNET was one of the selected projects in the BonsAPPs Open Call for AI talents. We are now preparing a version of MocapNET called AUTO-MNET, tailored for 3D body tracking in automotive use cases.

Due to our limited resources this has currently pushed back the merging of the mnet3 branch; however, we hope to soon have a working MocapNET in the Bonseyes platform.

8-11-2021

MocapNET v3 with hand pose estimation support has landed in this repository! The latest version, accepted at BMVC 2021, is now committed in the mnet3 branch. However, since considerable code polish is still missing and the currently offered 2D joint estimator does not cover hands, a transition to a 2D joint estimator like MediaPipe Holistic is needed for a better live webcam demo. MocapNET v3 will appear in the 32nd British Machine Vision Conference, which will be held virtually and is free to attend this year!

An upgraded 2020 version of MocapNET has landed! It contains a long list of improvements carried out during 2020 over the original work, allowing higher accuracy, smoother BVH output and better occlusion robustness while maintaining real-time performance. MocapNET v2 will appear in the 25th International Conference on Pattern Recognition

If you are interested in the older MocapNET v1 release you can find it in the mnet1 branch.

Visualization Example: With MocapNET v2, an RGB video feed like this can be converted to BVH motion frames in real time. The result can be easily used in your favourite 3D engine or application.

Sample run

Example Output:

| YouTube Video | MocapNET Output | Editing in Blender |
| ------------- | --------------- | ------------------ |
| YouTube Link  | BVH File        | Blender Video      |

Ensemble of SNN Encoders for 3D Human Pose Estimation in RGB Images


We present MocapNET v2, a real-time method that estimates the 3D human pose directly in the popular Bio Vision Hierarchy (BVH) format, given estimations of the 2D body joints originating from monocular color images.

Our contributions include:

  • A novel and compact 2D pose NSRM representation.
  • A human body orientation classifier and an ensemble of orientation-tuned neural networks that regress the 3D human pose by also allowing for the decomposition of the body to an upper and lower kinematic hierarchy. This permits the recovery of the human pose even in the case of significant occlusions.
  • An efficient Inverse Kinematics solver that refines the neural-network-based solution providing 3D human pose estimations that are consistent with the limb sizes of a target person (if known).

All the above yield a 33% accuracy improvement on the Human 3.6 Million (H3.6M) dataset compared to the baseline method (MocapNET v1) while maintaining real-time performance (70 fps in CPU-only execution).
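The staged design described above (classify the body orientation, hand off to an orientation-tuned regressor, then refine with IK) can be sketched in pure Python. Every name and heuristic below is illustrative and hypothetical, not the actual MocapNET API; the orientation rule and the dummy angles exist only to show the control flow:

```python
# Illustrative sketch of the MocapNET staging; all names/values are invented.

def classify_orientation(joints2d):
    """Coarse front/back/left/right guess from shoulder x-order (toy heuristic)."""
    lsx = joints2d["l_shoulder"][0]
    rsx = joints2d["r_shoulder"][0]
    if abs(lsx - rsx) < 0.05:  # shoulders overlap in x -> profile view
        return "left" if lsx < joints2d["hip"][0] else "right"
    return "front" if lsx > rsx else "back"

def regress_pose(joints2d, orientation):
    """Stand-in for the orientation-tuned ensemble member (returns dummy BVH angles)."""
    return {"hip_rot_y": {"front": 0.0, "back": 180.0,
                          "left": 90.0, "right": -90.0}[orientation]}

def refine_with_ik(pose, limb_lengths=None):
    """Stand-in for the IK pass; a real solver would adjust the pose so the
    reprojected skeleton matches the 2D joints and the target limb sizes."""
    return pose  # no-op in this sketch

def estimate_3d(joints2d):
    orientation = classify_orientation(joints2d)
    pose = regress_pose(joints2d, orientation)
    return refine_with_ik(pose), orientation

# Toy normalized 2D joints for a person facing the camera.
joints = {"l_shoulder": (0.6, 0.3), "r_shoulder": (0.4, 0.3), "hip": (0.5, 0.6)}
pose, facing = estimate_3d(joints)
print(facing, pose["hip_rot_y"])
```

The point of the dispatch is that each ensemble member only ever sees poses from its own orientation bin, which is a much easier regression problem than covering all viewpoints with one network.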

MocapNET

YouTube Videos

