Deep3DFaceReconstruction
Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set (CVPRW 2019)
<p align="center"> <img src="/images/example.gif"> </p>

***07/20/2021: A PyTorch implementation, which has much better performance and is much easier to use, is available now. This repo will not be maintained in the future.***

This is a TensorFlow implementation of the following paper:
Y. Deng, J. Yang, S. Xu, D. Chen, Y. Jia, and X. Tong, Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set, IEEE Computer Vision and Pattern Recognition Workshop (CVPRW) on Analysis and Modeling of Faces and Gestures (AMFG), 2019. (Best Paper Award!)
The method enforces hybrid-level weakly-supervised training for CNN-based 3D face reconstruction. It is fast, accurate, and robust to pose and occlusions. It achieves state-of-the-art performance on multiple datasets such as FaceWarehouse, MICC Florence, and BU-3DFE.
Features
● Accurate shapes
The method reconstructs faces with high accuracy. Quantitative evaluations (shape errors in mm) on several benchmarks show its state-of-the-art performance:
|Method|FaceWareHouse|Florence|BU-3DFE|
|:---:|:---:|:---:|:---:|
|Tewari et al. 17|2.19±0.54|-|-|
|Tewari et al. 18|1.84±0.38|-|-|
|Genova et al. 18|-|1.77±0.53|-|
|Sela et al. 17|-|-|2.91±0.60|
|PRN 18|-|-|1.86±0.47|
|Ours|1.81±0.50|1.67±0.50|1.40±0.31|
(Please refer to our paper for more details about these results)
● High fidelity textures
The method produces high-fidelity face textures while preserving the identity information of the input images. Scene illumination is also disentangled to generate a pure albedo.
<p align="center"> <img src="/images/albedo.png"> </p>

● Robust
The method can provide reasonable results under extreme conditions such as large pose and occlusions.
<p align="center"> <img src="/images/extreme.png"> </p>

● Aligned with images
Our method aligns the reconstructed faces with the input images. It also provides face pose estimation and 68 facial landmarks, which are useful for other tasks. We evaluate alignment on the AFLW_2000 dataset (NME), as shown in the table below:
<p align="center"> <img src="/images/alignment.png"> </p>

|Method|[0°,30°]|[30°,60°]|[60°,90°]|Overall|
|:---:|:---:|:---:|:---:|:---:|
|3DDFA 16|3.78|4.54|7.93|5.42|
|3DDFA+SDM 16|3.43|4.24|7.17|4.94|
|Bulat et al. 17|2.47|3.01|4.31|3.26|
|PRN 18|2.75|3.51|4.61|3.62|
|Ours|2.56|3.11|4.45|3.37|
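For reference, the NME numbers in the table above are typically computed as the mean point-to-point landmark distance normalized by the bounding-box size of the ground-truth landmarks. A minimal sketch (the exact evaluation protocol may differ in detail):

```python
import numpy as np

def nme(pred, gt):
    """Normalized Mean Error over a set of 2D landmarks.

    pred, gt: (N, 2) arrays. The mean point-to-point distance is
    normalized by sqrt(w * h) of the ground-truth bounding box,
    as is common practice on AFLW2000-3D.
    """
    w, h = gt.max(axis=0) - gt.min(axis=0)
    dist = np.linalg.norm(pred - gt, axis=1).mean()
    return dist / np.sqrt(w * h)
```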
● Easy and Fast
Faces are represented with the Basel Face Model 2009, which makes further manipulation (e.g., expression transfer) easy. ResNet-50 is used as the backbone network, achieving over 50 fps (on a GTX 1080) for reconstruction.
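The BFM-based representation is linear, which is what makes manipulations like expression transfer straightforward: a face shape is the mean shape plus identity and expression bases weighted by predicted coefficients. A toy numpy sketch with random stand-ins for the real BFM arrays (the 80 identity and 64 expression coefficient dimensions follow the paper; the vertex count here is arbitrary):

```python
import numpy as np

# Toy 3DMM sketch: random stand-ins for the real BFM09 data.
n_vertices = 1000                                # the real model has tens of thousands
mu = np.random.randn(3 * n_vertices)             # mean shape
id_basis = np.random.randn(3 * n_vertices, 80)   # identity basis (BFM09)
exp_basis = np.random.randn(3 * n_vertices, 64)  # expression basis (FaceWarehouse-derived)

id_coeff = np.random.randn(80) * 0.1             # would come from R-Net
exp_coeff = np.random.randn(64) * 0.1

# Linear combination, then reshape to per-vertex (x, y, z) positions.
face_shape = (mu + id_basis @ id_coeff + exp_basis @ exp_coeff).reshape(-1, 3)

# Expression transfer would reuse id_coeff with exp_coeff from another image.
```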
Getting Started
Testing Requirements
- Reconstructions can be done on both Windows and Linux. However, we suggest running on Linux because the rendering process is only supported on Linux.
- Python 3.6 (numpy, scipy, pillow, argparse).
- Tensorflow 1.12.
- Basel Face Model 2009 (BFM09).
- Expression Basis (transferred from FaceWarehouse by Guo et al.). The original BFM09 model does not handle expression variations, so an extra expression basis is needed.
- tf mesh renderer. We use this library to render the reconstructed images. Note that the rendering tool can only be used on Linux.
Installation
1. Clone the repository
```
git clone https://github.com/Microsoft/Deep3DFaceReconstruction --recursive
cd Deep3DFaceReconstruction
```
2. Set up the python environment
If you use anaconda, run the following:
```
conda create -n deep3d python=3.6
source activate deep3d
conda install tensorflow-gpu==1.12.0 scipy
pip install pillow argparse
```
Alternatively, you can install tensorflow via pip (in this case, you need to link /usr/local/cuda to cuda-9.0):

```
pip install tensorflow-gpu==1.12.0
```
3. Compile tf_mesh_renderer
If you install tensorflow using pip, we provide a pre-compiled binary file (rasterize_triangles_kernel.so) of the library. Note that the pre-compiled file can only be run with tensorflow 1.12.
If you install tensorflow using conda, you have to compile tf_mesh_renderer from source with Bazel. Set -D_GLIBCXX_USE_CXX11_ABI=1 in ./mesh_renderer/kernels/BUILD before compiling:

```
cd tf_mesh_renderer
git checkout ba27ea1798
git checkout master WORKSPACE
bazel test ...
cd ..
```
If the library is compiled correctly, there should be a file named "rasterize_triangles_kernel.so" in ./tf_mesh_renderer/bazel-bin/mesh_renderer/kernels.
After compilation, copy the corresponding files from the repository root into the ./renderer subfolder:

```
cp ./tf_mesh_renderer/mesh_renderer/{camera_utils.py,mesh_renderer.py,rasterize_triangles.py} ./renderer/
cp ./tf_mesh_renderer/bazel-bin/mesh_renderer/kernels/rasterize_triangles_kernel.so ./renderer/
```
If you download our pre-compiled binary file, put it into ./renderer subfolder as well.
Replace the library path in Line 26 in ./renderer/rasterize_triangles.py with "./renderer/rasterize_triangles_kernel.so".
Replace "xrange" function in Line 109 in ./renderer/rasterize_triangles.py with "range" function for compatibility with python3.
Testing with pre-trained network
- Download the Basel Face Model. Due to the license agreement of the Basel Face Model, you have to download the BFM09 model after submitting an application on its home page. After getting access to the BFM data, download "01_MorphableModel.mat" and put it into the ./BFM subfolder.
- Download the Expression Basis provided by Guo et al. You can find a link named "CoarseData" in the first row of the Introduction part of their repository. Download and unzip Coarse_Dataset.zip, then put "Exp_Pca.bin" into the ./BFM subfolder. The expression basis is constructed from FaceWarehouse data and transferred to the BFM topology.
- Download the pre-trained reconstruction network, unzip it, and put "FaceReconModel.pb" into the ./network subfolder.
- Run the demo code:

```
python demo.py
```
- The ./input subfolder contains several test images and the ./output subfolder stores their reconstruction results. For each input test image, two output files are produced:
  - "xxx.mat":
    - cropped_img: an RGB image after alignment, which is the input to R-Net.
    - recon_img: an RGBA reconstruction image aligned with the input image (only on Linux).
    - coeff: output coefficients of R-Net.
    - face_shape: vertex positions of the 3D face in world coordinates.
    - face_texture: vertex texture of the 3D face, which excludes lighting effects.
    - face_color: vertex color of the 3D face, which takes lighting into account.
    - lm_68p: 68 2D facial landmarks derived from the reconstructed 3D face, aligned with cropped_img.
    - lm_5p: 5 detected landmarks aligned with cropped_img.
  - "xxx_mesh.obj": 3D face mesh in world coordinates (best viewed in MeshLab).
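The saved .mat files can be inspected with scipy. A small sketch that maps each field described above to its array shape (the output filename below is just an example):

```python
from scipy.io import loadmat

def summarize_recon(mat_path):
    """Map each R-Net output field in a demo .mat file to its array shape."""
    data = loadmat(mat_path)
    fields = ["cropped_img", "recon_img", "coeff", "face_shape",
              "face_texture", "face_color", "lm_68p", "lm_5p"]
    # loadmat also returns metadata keys (__header__, etc.); keep only
    # the documented output fields that are present.
    return {k: data[k].shape for k in fields if k in data}

# e.g. summarize_recon("./output/000002.mat")
```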
Training requirements
- Training is only supported on Linux. To train a new model from scratch, additional requirements are needed on top of those listed for testing.
- Facenet provided by Sandberg et al. In our paper, we use a network to extract perceptual face features. That network model cannot be publicly released. As an alternative, we recommend the Facenet from Sandberg et al. This repo uses version 20170512-110547 trained on MS-Celeb-1M. The training process has been tested with this model to ensure similar results.
- Resnet50-v1 pre-trained on ImageNet from Tensorflow Slim. We use the version resnet_v1_50_2016_08_28.tar.gz as an initialization of the face reconstruction network.
- 68-facial-landmark detector. We use 68 facial landmarks for loss calculation during training. To make the training process reproducible, we provide a lightweight detector that produces results comparable to the method of Bulat et al. The detector is trained on 300WLP, LFW, and LS3D-W.
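The detected landmarks feed the training-time landmark loss: a weighted mean squared 2D distance between landmarks projected from the reconstructed face and the detected ones. A minimal sketch; the per-landmark weighting scheme is an assumption here (the paper weights some points, such as those around the mouth and nose, more heavily):

```python
import numpy as np

def landmark_loss(pred, gt, weights=None):
    """Weighted mean squared 2D landmark distance over N points.

    pred, gt: (N, 2) arrays of projected and detected landmarks.
    weights: optional (N,) per-landmark weights (uniform by default;
    the actual weighting scheme is an assumption, not from this repo).
    """
    if weights is None:
        weights = np.ones(len(gt))
    sq_dist = np.sum((pred - gt) ** 2, axis=1)  # squared distance per landmark
    return np.sum(weights * sq_dist) / len(gt)
```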
Training preparation
- Download the pre-trained weights of Facenet provided by Sandberg et al., unzip it and put all files in ./weights/id_net.
- Download the pre-trained weights of Resnet_v1_50 provided by Tensorflow Slim, unzip it and put resnet_v1_50.ckpt in ./weights/resnet.
- Download the [68 landmark detector](https://drive.google.com/file/d/1KYFeTb963jg0F47sTiwqDdh