Pixartdiffusion

An experiment training a diffusion model on 32x32 pixel art characters


pixel-art-diffusion

A small project that uses a diffusion model to generate 32x32 pixel art characters. I trained a model (available over here) on a set of 32x32 pixel art characters. The architecture is intended to be similar to the one from this paper, though it may not match it exactly.

Usage

The Colab (above) is the easiest way to use this! Otherwise...

This project uses PyTorch 1.11.0, NumPy 1.22.3, TorchVision 0.12.0, matplotlib 3.5.1 and tqdm.
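If you are setting up a local environment, a short check like the one below can confirm that the installed packages match the versions listed above. This is only a sketch: the pip distribution names (torch, numpy, torchvision, matplotlib, tqdm) are assumed, and the helper check_versions is hypothetical, not part of this repository.

```python
from importlib import metadata

# Versions listed in this README. tqdm has no pinned version, so any version passes.
# The pip distribution names used as keys are assumptions.
EXPECTED = {
    "torch": "1.11.0",
    "numpy": "1.22.3",
    "torchvision": "0.12.0",
    "matplotlib": "3.5.1",
    "tqdm": None,
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_versions(expected=EXPECTED, lookup=installed_version):
    """Return a list of human-readable problems (empty if everything matches)."""
    problems = []
    for name, wanted in expected.items():
        found = lookup(name)
        if found is None:
            problems.append(f"{name}: not installed")
        elif wanted is not None and found.split("+")[0] != wanted:
            # Local build tags such as "+cu113" are stripped before comparing.
            problems.append(f"{name}: found {found}, README lists {wanted}")
    return problems


if __name__ == "__main__":
    for line in check_versions() or ["all listed packages match"]:
        print(line)
```

Other versions may well work too; a mismatch here is a hint, not an error.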

To sample from the model, use sample.py:

sample.py ../models/AOS_AOF.pt 4 -o out.png -noise_mul 8

which will give you an image similar to

(image: example)

As you can see, the quality varies a lot. It works best if you use CLIP to pick out the good ones, with a caption like 'cool pixel art character' or something 😎. Here are the best 100 out of 1000 samples, according to CLIP:

(image: example)
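The CLIP filtering step is not part of the scripts described here, so the following is only a rough sketch of how such a ranking could be done. It assumes each sample has been saved as its own image file and that OpenAI's clip package is used (one possible choice; the README does not say which CLIP implementation was used). The helper names cosine, top_k and embed_with_clip are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(image_embs, text_emb, k):
    """Indices of the k images most similar to the caption, best first."""
    scores = [cosine(e, text_emb) for e in image_embs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def embed_with_clip(image_paths, caption, model_name="ViT-B/32"):
    """Embed images and a caption with OpenAI's clip package (an assumed choice)."""
    # Imported lazily so the ranking helpers above work without these packages.
    import torch
    import clip
    from PIL import Image

    model, preprocess = clip.load(model_name, device="cpu")
    images = torch.stack(
        [preprocess(Image.open(p).convert("RGB")) for p in image_paths]
    )
    tokens = clip.tokenize([caption])
    with torch.no_grad():
        image_embs = model.encode_image(images)
        text_emb = model.encode_text(tokens)
    return image_embs.tolist(), text_emb[0].tolist()
```

With 1000 sample files, something like `top_k(*embed_with_clip(paths, "cool pixel art character"), 100)` would give the indices of the 100 best-scoring samples.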

To train the model, use train.py. E.g. to train from scratch on your own data:

train.py ../../data/*.png -save_path model.pt

Or to load from a checkpoint:

train.py ../../data/*.png -load_path ../models/AOS_AOF.pt -save_path model.pt

I have not included the dataset I used, but you can find information on it in models/README.txt.

The code has some quirks, as it was written with the assumption that the dataset is quite small (so one epoch is assumed to take very little time). This is because my dataset was only about 2k images :)

(Also, I haven't tested this on images that aren't 32x32, so there might be issues with other sizes.)
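Since only 32x32 images have been tested, it may be worth checking your own dataset before training. The sketch below reads the width and height straight from each PNG header using only the standard library. It assumes the dataset is made of PNG files, as in the train.py commands above; png_size and wrong_sized are hypothetical helpers, not part of this repository.

```python
import struct
import sys

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(path):
    """Read (width, height) from a PNG file's IHDR chunk without decoding the image."""
    with open(path, "rb") as f:
        header = f.read(24)
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError(f"{path} does not look like a PNG file")
    # Width and height are big-endian unsigned 32-bit integers at bytes 16..23.
    return struct.unpack(">II", header[16:24])


def wrong_sized(paths, expected=(32, 32)):
    """Return (path, size) pairs for every PNG whose size differs from `expected`."""
    return [(p, size) for p in paths if (size := png_size(p)) != expected]


if __name__ == "__main__":
    # Print every file passed on the command line that is not 32x32.
    for path, (w, h) in wrong_sized(sys.argv[1:]):
        print(f"{path}: {w}x{h}")
```

Saved under a name of your choosing, it can be run on the same glob you would pass to train.py, and it prints nothing if every image is 32x32.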
