StableNormal

[SIGGRAPH Asia 2024 (Journal Track)] StableNormal: Reducing Diffusion Variance for Stable and Sharp Normal

Generate Convert Improve

Install / Use

/learn @Stable-X/StableNormal

About this skill

Quality Score

0/100

README

StableNormal: Reducing Diffusion Variance for Stable and Sharp Normal<br>

Chongjie Ye*, Lingteng Qiu*, Xiaodong Gu, Qi Zuo, Yushuang Wu, Zilong Dong, Liefeng Bo, Yuliang Xiu#, Xiaoguang Han#<br>

* Equal contribution <br> # Corresponding Author

<h3 align="center">SIGGRAPH Asia 2024 (Journal Track)</h3> <div align="center">

</div>

We propose StableNormal, which tailors the diffusion priors for monocular normal estimation. Unlike prior diffusion-based works, we focus on enhancing estimation stability by reducing the inherent stochasticity of diffusion models ( i.e. , Stable Diffusion). This enables “Stable-and-Sharp” normal estimation, which outperforms multiple baselines (try Compare), and improves various real-world applications (try Demo).

teaser

News

StableNormal-turbo (10 times faster) is now avaliable on ModelScope . We invite you to explore its features! :fire::fire::fire: (10.11, 2024 UTC)
StableNormal is accepted by SIGGRAPH Asia 2024. (Journal Track)) (09.11, 2024 UTC)
Release StableDelight :fire::fire::fire: (09.07, 2024 UTC)
Release StableNormal :fire::fire::fire: (08.27, 2024 UTC)

Installation:

Please run following commands to build package:

git clone https://github.com/Stable-X/StableNormal.git
cd StableNormal
pip install -r requirements.txt

or directly build package:

pip install git+https://github.com/Stable-X/StableNormal.git

Usage

To use the StableNormal pipeline, you can instantiate the model and apply it to an image as follows:

import torch
from PIL import Image

# Load an image
input_image = Image.open("path/to/your/image.jpg")

# Create predictor instance
predictor = torch.hub.load("Stable-X/StableNormal", "StableNormal", trust_repo=True)

# Apply the model to the image
normal_image = predictor(input_image)

# Save or display the result
normal_image.save("output/normal_map.png")

Additional Options:

If you need faster inference(10 times faster), use StableNormal_turbo:

predictor = torch.hub.load("Stable-X/StableNormal", "StableNormal_turbo", trust_repo=True)

If Hugging Face is not available from terminal, you could download the pretrained weights to weights dir:

predictor = torch.hub.load("Stable-X/StableNormal", "StableNormal", trust_repo=True, local_cache_dir='./weights')

Compute Metrics:

This section provides guidance on evaluating your normal predictor using the DIODE dataset.

Step 1: Prepare Your Results Folder

First, make sure you have generated a normal map and structured your results folder as shown below:

├── YOUR-FOLDER-NAME 
│   ├── scan_00183_00019_00183_indoors_000_010_gt.png
│   ├── scan_00183_00019_00183_indoors_000_010_init.png
│   ├── scan_00183_00019_00183_indoors_000_010_ref.png
│   ├── scan_00183_00019_00183_indoors_000_010_step0.png
│   ├── scan_00183_00019_00183_indoors_000_010_step1.png
│   ├── scan_00183_00019_00183_indoors_000_010_step2.png
│   ├── scan_00183_00019_00183_indoors_000_010_step3.png

Step 2: Compute Metric Values

Once your results folder is set up, you can compute the metrics for your normal predictions by running the following scripts:

# compute metrics
python ./stablenormal/metrics/compute_metric.py -i ${YOUR-FOLDER-NAME}

# compute variance
python ./stablenormal/metrics/compute_variance.py -i ${YOUR-FOLDER-NAME}

Replace ${YOUR-FOLDER-NAME}; with the actual name of your results folder. Following these steps will allow you to effectively evaluate your normal predictor's performance on the DIODE dataset.

Metrics

On DIODE-indoor

| | Mean Error | Median Error | <11.25 | <22.5 | <30 | | :----------------- | :--------: | :----------: | :--------: | :--------: | :--------: | | GeoWizard | 19.371 | 15.408 | 30.551 | 75.426 | 86.357 | | Marigold Normal | 16.671 | 12.084 | 45.776 | 82.076 | 89.879 | | GenPercept | 18.348 | 13.367 | 39.178 | 79.819 | 88.551 | | DSINE | 18.453 | 13.871 | 36.274 | 77.527 | 86.976 | | StableNormal-turbo | 16.748 | 13.573 | 35.806 | 84.585 | 91.335 | | StableNormal | 13.701 | 9.460 | 63.447 | 86.309 | 92.107 |

On IBims-1

| | Mean Error | Median Error | < 11.25 | < 22.5 | < 30 | | :----------------- | :--------: | :----------: | :--------: | :--------: | :--------: | | GeoWizard | 19.748 | 9.702 | 58.427 | 77.616 | 81.575 | | Marigold Normal | 18.463 | 8.442 | 64.727 | 79.559 | 83.199 | | GenPercept | 18.600 | 8.293 | 64.697 | 79.329 | 82.978 | | DSINE | 18.773 | 8.258 | 64.131 | 78.570 | 82.160 | | StableNormal-turbo | 17.433 | 8.145 | 65.683 | 80.909 | 84.527 | | StableNormal | 17.248 | 8.057 | 66.655 | 81.134 | 84.632 |

On Scannet

| | Mean Error | Median Error | < 11.25 | < 22.5 | < 30 | | :----------------- | :--------: | :----------: | :--------: | :--------: | :--------: | | GeoWizard | 21.439 | 13.390 | 37.080 | 71.653 | 79.712 | | Marigold Normal | 21.284 | 12.268 | 45.649 | 72.666 | 79.045 | | GenPercept | 20.652 | 10.502 | 53.017 | 74.470 | 80.364 | | DSINE | 18.610 | 9.885 | 56.132 | 76.944 | 82.606 | | StableNormal-turbo | 17.432 | 9.644 | 58.643 | 79.177 | 84.717 | | StableNormal | 18.098 | 10.097 | 56.007 | 78.776 | 84.115 |

On NYUv2

| | Mean Error | Median Error | < 11.25 | < 22.5 | < 30 | | ------------------ | :--------: | :----------: | :--------: | :--------: | :--------: | | GeoWizard | 20.363 | 11.898 | 46.954 | 73.787 | 80.804 | | Marigold Normal | 20.864 | 11.134 | 50.457 | 73.003 | 79.332 | | GenPercept | 20.896 | 11.516 | 50.712 | 73.037 | 79.216 | | DSINE | - | - | - | - | - | | StableNormal-turbo | 18.788 | 10.381 | 53.741 | 76.713 | 82.884 | | StableNormal | 19.707 | 10.527 | 53.042 | 75.889 | 81.723 |

Citation

@article{ye2024stablenormal,
  title={StableNormal: Reducing Diffusion Variance for Stable and Sharp Normal},
  author={Ye, Chongjie and Qiu, Lingteng and Gu, Xiaodong and Zuo, Qi and Wu, Yushuang and Dong, Zilong and Bo, Liefeng and Xiu, Yuliang and Han, Xiaoguang},
  journal={ACM Transactions on Graphics (TOG)},
  year={2024},
  publisher={ACM New York, NY, USA}
}

Related Skills

node-connect

345.4k

Diagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps

frontend-design

104.6k

Create distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.

openai-whisper-api

345.4k

Transcribe audio via OpenAI Audio Transcriptions API (Whisper).

qqbot-media

345.4k

QQBot 富媒体收发能力。使用 <qqmedia> 标签，系统根据文件扩展名自动识别类型（图片/语音/视频/文件）。