ProSpect
Official implementation of the paper "ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models" (SIGGRAPH Asia 2023)
ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models
Personalizing generative models offers a way to guide image generation with user-provided references. Current personalization methods can invert an object or concept into the textual conditioning space and compose new natural sentences for text-to-image diffusion models. However, representing and editing specific visual attributes like material, style, layout, etc. remains a challenge, leading to a lack of disentanglement and editability. To address this, we propose a novel approach that leverages the step-by-step generation process of diffusion models, which generate images from low- to high-frequency information, providing a new perspective on representing, generating, and editing images. We develop Prompt Spectrum Space P*, an expanded textual conditioning space, and a new image representation method called ProSpect. ProSpect represents an image as a collection of inverted textual token embeddings encoded from per-stage prompts, where each prompt corresponds to a specific generation stage (i.e., a group of consecutive steps) of the diffusion model. Experimental results demonstrate that P* and ProSpect offer stronger disentanglement and controllability compared to existing methods. We apply ProSpect in various personalized attribute-aware image generation applications, such as image/text-guided material/style/layout transfer/editing, achieving previously unattainable results with a single image input without fine-tuning the diffusion models.
For details, see the paper.
Getting Started
Prerequisites
For packages, see environment.yaml.

```shell
conda env create -f environment.yaml
conda activate ldm
```
Installation
Clone the repo:

```shell
git clone https://github.com/zyxElsa/ProSpect.git
```
Train

Download the pretrained Stable Diffusion model and save it to ./models/sd/sd-v1-4.ckpt.

Train ProSpect:

```shell
python main.py --base configs/stable-diffusion/v1-finetune.yaml \
       -t \
       --actual_resume ./models/sd/sd-v1-4.ckpt \
       -n <run_name> \
       --gpus 0, \
       --data_root /path/to/directory/with/images
```

See configs/stable-diffusion/v1-finetune.yaml for more options.
Test
To generate new images, run ProSpect.ipynb.
Instructions
```python
main(prompt='*',
     ddim_steps=50,
     strength=0.6,
     seed=42,
     height=512,
     width=768,
     prospect_words=[
         'a teddy * walking in times square',  # 10 generation ends
         'a teddy * walking in times square',  # 9
         'a teddy * walking in times square',  # 8
         'a teddy * walking in times square',  # 7
         'a teddy * walking in times square',  # 6
         'a teddy * walking in times square',  # 5
         'a teddy * walking in times square',  # 4
         'a teddy * walking in times square',  # 3
         'a teddy walking in times square',    # 2
         'a teddy walking in times square',    # 1 generation starts
     ],
     model=model,
     )
```
prompt: the text prompt injected into all stages. If prospect_words is not None, a '*' in prompt is replaced by prospect_words; otherwise '*' is replaced by the learned token embedding.

prospect_words: edit this list to change the prompts injected at different generation stages. A '*' within prospect_words is replaced by the learned token embedding.

For img2img, additionally pass a content_dir pointing to the image and a strength for the diffusion.
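The prospect_words list is ordered from the end of generation (top entry, stage 10) to the start (bottom entry, stage 1). As a rough sketch of how such a list maps onto denoising steps — a hypothetical helper for illustration only, not part of the repo's API (the actual mapping lives inside main):

```python
def stage_prompt(step, total_steps, prospect_words):
    """Pick the per-stage prompt for a given denoising step.

    Illustration only: the total_steps DDIM steps are split evenly into
    len(prospect_words) stages, and the list is ordered end-first
    ('# 10 generation ends' at index 0, '# 1 generation starts' at the
    last index), matching the notebook examples.
    """
    n_stages = len(prospect_words)
    # Which stage this step falls into, counted from the start of generation.
    stage_from_start = min(step * n_stages // total_steps, n_stages - 1)
    # Flip the index because the list is written end-first.
    return prospect_words[n_stages - 1 - stage_from_start]
```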
A more detailed example:
Reference Image:

Content-aware T2I generation
```python
main(prompt='*',
     ddim_steps=50,
     strength=0.6,
     seed=42,
     height=512,
     width=768,
     prospect_words=[
         'a teddy * walking in times square',  # 10 generation ends
         'a teddy * walking in times square',  # 9
         'a teddy * walking in times square',  # 8
         'a teddy * walking in times square',  # 7
         'a teddy * walking in times square',  # 6
         'a teddy * walking in times square',  # 5
         'a teddy * walking in times square',  # 4
         'a teddy * walking in times square',  # 3
         'a teddy walking in times square',    # 2
         'a teddy walking in times square',    # 1 generation starts
     ],
     model=model,
     )
```
with ProSpect:

Layout-aware T2I generation
```python
main(prompt='*',
     ddim_steps=50,
     strength=0.6,
     seed=41,
     height=512,
     width=512,
     prospect_words=[
         'a corgi sits on the table',    # 10 generation ends
         'a corgi sits on the table',    # 9
         'a corgi sits on the table',    # 8
         'a corgi sits on the table',    # 7
         'a corgi sits on the table',    # 6
         'a corgi sits on the table',    # 5
         'a corgi sits on the table',    # 4
         'a corgi sits on the table',    # 3
         'a corgi sits on the table',    # 2
         'a corgi sits on the table *',  # 1 generation starts
     ],
     model=model,
     )
```
with ProSpect:

without ProSpect:

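Rather than writing out ten near-identical strings, a per-stage list can also be built programmatically. A minimal sketch in plain Python (no project APIs assumed) reproducing the layout-aware list above, where the learned token '*' is injected only in the first generation stage:

```python
base = 'a corgi sits on the table'
n_stages = 10

# Inject the learned token '*' only at stage 1 (the start of generation),
# where the diffusion model lays down low-frequency layout information.
# Remember the list is ordered end-first, so stage 1 is the last entry.
prospect_words = [base] * (n_stages - 1) + [base + ' *']
```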
Material-aware T2I generation
```python
main(prompt='*',
     ddim_steps=50,
     strength=0.6,
     seed=42,
     height=512,
     width=768,
     prospect_words=[
         'a * dog on the table',  # 10 generation ends
         'a * dog on the table',  # 9
         'a
```