LikePhys

[ICLR 2026] LikePhys is a training-free method that evaluates intuitive physics in video diffusion models by distinguishing physically valid from physically impossible videos, using the denoising objective as an ELBO-based likelihood surrogate on a curated dataset of valid-invalid pairs.

LikePhys: Evaluating intuitive physics understanding in video diffusion models via likelihood preference

LikePhys is a training-free method for evaluating intuitive physics in video diffusion models. It tests whether a model can distinguish physically valid videos from physically impossible ones, using the denoising objective as an ELBO-based likelihood surrogate on a curated dataset of valid-invalid pairs. ICLR 2026.

[arXiv] [Project Page] [Dataset]
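
The likelihood-preference idea described above can be illustrated with a toy sketch. This is not the repository's implementation: the "model" is a hypothetical closed-form noise predictor for a data distribution concentrated on one valid trajectory, the variance-preserving noising step is an assumption, and the names `make_toy_predictor`, `denoising_loss`, and `prefers_valid` are invented for illustration. It only shows the principle that a lower average noise-prediction error is read as a higher surrogate likelihood, so a pair counts as correct when the valid video receives the lower loss.

```python
import math
import random


def make_toy_predictor(valid_video):
    """Exact noise predictor for a toy 'model' whose data distribution
    is a point mass on `valid_video` (illustrative assumption)."""
    def predict_noise(x_t, t):
        a, s = math.sqrt(1.0 - t), math.sqrt(t)
        # Invert x_t = a * x_0 + s * eps, believing that x_0 == valid_video.
        return [(x - a * v) / s for x, v in zip(x_t, valid_video)]
    return predict_noise


def denoising_loss(video, predict_noise, num_samples=64, seed=0):
    """Monte Carlo estimate of the mean squared noise-prediction error.
    Lower values are read as higher (surrogate) likelihood."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        t = rng.uniform(0.1, 0.9)  # stay away from t = 0, where the toy predictor divides by zero
        eps = [rng.gauss(0.0, 1.0) for _ in video]
        a, s = math.sqrt(1.0 - t), math.sqrt(t)
        x_t = [a * x + s * e for x, e in zip(video, eps)]  # noised video
        eps_hat = predict_noise(x_t, t)
        total += sum((e - h) ** 2 for e, h in zip(eps, eps_hat)) / len(video)
    return total / num_samples


def prefers_valid(valid_video, invalid_video, predict_noise):
    """True when the valid video gets the lower denoising loss."""
    # Both calls use the same seed, so both videos see identical timesteps and noise.
    return (denoising_loss(valid_video, predict_noise)
            < denoising_loss(invalid_video, predict_noise))


# Flattened toy "videos": height of a dropped ball over five frames.
valid = [1.0, 0.95, 0.8, 0.55, 0.2]    # keeps accelerating downward
invalid = [1.0, 0.95, 0.8, 0.95, 1.0]  # reverses direction in mid-air
predictor = make_toy_predictor(valid)
print(prefers_valid(valid, invalid, predictor))
```

With a real video diffusion model, `predict_noise` would be the model's denoiser and the preference would be aggregated over many valid-invalid pairs.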

Usage

Quick Start

  1. Set Up Environment:
# Clone repository
git clone https://github.com/YuanJianhao508/LikePhys.git
cd LikePhys

# Install dependencies
pip install torch torchvision diffusers accelerate transformers
pip install opencv-python pillow numpy matplotlib tqdm

# Download dataset from Hugging Face
# Option 1: Using git clone (recommended)
git clone https://huggingface.co/datasets/JianhaoDYDY/LikePhys-Benchmark data

# Option 2: Using huggingface-cli
pip install huggingface_hub
huggingface-cli download JianhaoDYDY/LikePhys-Benchmark --repo-type dataset --local-dir ./data
  2. Run Single Evaluation:
python evaluator.py --model animatediff --data ball_drop --seed 42 --guidance_scale
  3. Run Batch Evaluation:
bash run_eval.sh

Command Line Arguments

  • --model: Model to evaluate (e.g., animatediff, cogvideox, hunyuan_t2v, ltx, mochi)
  • --data: Physics scenario to test (e.g., ball_drop, ball_collision, pendulum)
  • --seed: Random seed for reproducibility
  • --guidance_scale: Enable classifier-free guidance (boolean flag; takes no value)
  • --tag_name: Custom tag for organizing experiment results
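
As a sketch of how the flags above might be wired up, the following uses Python's standard `argparse`. It is an assumption about the interface, not a copy of `evaluator.py`: the defaults, help strings, and the `build_parser` name are hypothetical. `--guidance_scale` is modeled as a boolean switch because the list describes it as a flag and the sample commands pass it without a value.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented flags."""
    parser = argparse.ArgumentParser(description="LikePhys-style evaluation (illustrative)")
    parser.add_argument("--model", required=True,
                        help="model to evaluate, e.g. animatediff")
    parser.add_argument("--data", required=True,
                        help="physics scenario, e.g. ball_drop")
    parser.add_argument("--seed", type=int, default=42,
                        help="random seed (this default is an assumption)")
    parser.add_argument("--guidance_scale", action="store_true",
                        help="enable classifier-free guidance")
    parser.add_argument("--tag_name", default="",
                        help="tag for organizing results (this default is an assumption)")
    return parser


# Parse an explicit argument list instead of sys.argv so the example is self-contained.
args = build_parser().parse_args(
    ["--model", "animatediff", "--data", "ball_drop", "--seed", "42", "--guidance_scale"]
)
print(args.model, args.data, args.seed, args.guidance_scale, repr(args.tag_name))
```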

Sample Scripts

Single Model Evaluation

# Evaluate a single model on one physics scenario
python evaluator.py \
    --model animatediff \
    --data ball_drop \
    --seed 42 \
    --guidance_scale \
    --tag_name "experiment_1"

Batch Evaluation

# Run comprehensive evaluation across all models and scenarios
bash run_eval.sh
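
The contents of `run_eval.sh` are not shown here. As a rough Python equivalent of a batch run, the sketch below builds one `evaluator.py` command per model/scenario pair from the documented flags. The model and scenario lists are short illustrative subsets, `build_commands` and the `batch` tag are hypothetical, and the commands are only printed as a dry run.

```python
import itertools
import shlex

MODELS = ["animatediff", "cogvideox", "ltx"]             # illustrative subset
SCENARIOS = ["ball_drop", "ball_collision", "pendulum"]  # illustrative subset


def build_commands(models, scenarios, seed=42, tag_name="batch"):
    """Return one evaluator.py argument list per (model, scenario) pair."""
    commands = []
    for model, scenario in itertools.product(models, scenarios):
        commands.append([
            "python", "evaluator.py",
            "--model", model,
            "--data", scenario,
            "--seed", str(seed),
            "--guidance_scale",
            "--tag_name", tag_name,
        ])
    return commands


commands = build_commands(MODELS, SCENARIOS)
for cmd in commands:
    print(shlex.join(cmd))  # dry run: print instead of executing
```

To actually launch the runs, each argument list could be passed to `subprocess.run(cmd, check=True)` from the repository root.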

Dataset

The dataset is hosted on Hugging Face and contains paired videos (physically plausible vs. implausible) across 12 physics scenarios.

Download from Hugging Face: https://huggingface.co/datasets/JianhaoDYDY/LikePhys-Benchmark

# Option 1: Using git clone (recommended for full dataset)
git clone https://huggingface.co/datasets/JianhaoDYDY/LikePhys-Benchmark data

# Option 2: Using huggingface-cli
pip install huggingface_hub
huggingface-cli download JianhaoDYDY/LikePhys-Benchmark --repo-type dataset --local-dir ./data
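
The same download can also be scripted from Python. This sketch uses `snapshot_download` from `huggingface_hub` and mirrors the CLI command above; the `download_benchmark` wrapper and its default arguments are hypothetical names chosen for illustration. The import is deferred so the function can be defined even before the package is installed.

```python
REPO_ID = "JianhaoDYDY/LikePhys-Benchmark"


def download_benchmark(local_dir="./data", repo_id=REPO_ID):
    """Download the dataset repository into `local_dir` and return the local path."""
    # Deferred import: requires `pip install huggingface_hub`.
    from huggingface_hub import snapshot_download
    return snapshot_download(repo_id=repo_id, repo_type="dataset", local_dir=local_dir)
```

Calling `download_benchmark()` needs network access and fetches the full dataset, so it is left for the reader to invoke.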

[Figure: LikePhys Dataset Overview]

Supported Models

  • AnimateDiff (animatediff)
  • AnimateDiff SDXL (animatediff_sdxl)
  • CogVideoX (cogvideox, cogvideox-5b)
  • Hunyuan Video (hunyuan_t2v)
  • LTX Video (ltx)
  • ModelScope (modelscope)
  • Wan Video (wan2.1-T2V-1.3b, wan2.1-T2V-14b)
  • ZeroScope (zeroscope)

Results Analysis

After evaluation, run the analysis script to check the results:

python read_exp_final.py

Citation

If you use LikePhys in your research, please cite:

@inproceedings{yuan2025likephys,
  title={LikePhys: Evaluating Intuitive Physics Understanding in Video Diffusion Models via Likelihood Preference},
  author={Yuan, Jianhao and Pizzati, Fabio and Pinto, Francesco and Kunze, Lars and Laptev, Ivan and Newman, Paul and Torr, Philip and De Martini, Daniele},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2026}
}
