DiffusionDB
A large-scale text-to-image prompt gallery dataset based on Stable Diffusion
DiffusionDB <a href="https://huggingface.co/datasets/poloclub/diffusiondb"><picture><source media="(prefers-color-scheme: dark)" srcset="https://i.imgur.com/yGxUUlX.png"><img align="right" src="favicon.ico" height="40px"></picture></a>
<img width="100%" src="https://user-images.githubusercontent.com/15007159/201762588-f24db2b8-dbb2-4a94-947b-7de393fc3d33.gif">
DiffusionDB is the first large-scale text-to-image prompt dataset. It contains 14 million images generated by Stable Diffusion using prompts and hyperparameters specified by real users. The unprecedented scale and diversity of this human-actuated dataset provide exciting research opportunities in understanding the interplay between prompts and generative models, detecting deepfakes, and designing human-AI interaction tools to help users more easily use these models.
Get Started
DiffusionDB is available at 🤗 Hugging Face Datasets.
Two Subsets
DiffusionDB provides two subsets (DiffusionDB 2M and DiffusionDB Large) to support different needs.
|Subset|Num of Images|Num of Unique Prompts|Size|Image Directory|Metadata Table|
|:--|--:|--:|--:|--:|--:|
|DiffusionDB 2M|2M|1.5M|1.6TB|images/|metadata.parquet|
|DiffusionDB Large|14M|1.8M|6.5TB|diffusiondb-large-part-1/ diffusiondb-large-part-2/|metadata-large.parquet|
Key Differences
- The two subsets have a similar number of unique prompts, but DiffusionDB Large has many more images. DiffusionDB Large is a superset of DiffusionDB 2M.
- Images in DiffusionDB 2M are stored in PNG format; images in DiffusionDB Large use a lossless WebP format.
Dataset Structure
We use a modularized file structure to distribute DiffusionDB. The 2 million images in DiffusionDB 2M are split into 2,000 folders, where each folder contains 1,000 images and a JSON file that links these 1,000 images to their prompts and hyperparameters. Similarly, the 14 million images in DiffusionDB Large are split into 14,000 folders.
# DiffusionDB 2M
./
├── images
│ ├── part-000001
│ │ ├── 3bfcd9cf-26ea-4303-bbe1-b095853f5360.png
│ │ ├── 5f47c66c-51d4-4f2c-a872-a68518f44adb.png
│ │ ├── 66b428b9-55dc-4907-b116-55aaa887de30.png
│ │ ├── [...]
│ │ └── part-000001.json
│ ├── part-000002
│ ├── part-000003
│ ├── [...]
│ └── part-002000
└── metadata.parquet
# DiffusionDB Large
./
├── diffusiondb-large-part-1
│ ├── part-000001
│ │ ├── 0a8dc864-1616-4961-ac18-3fcdf76d3b08.webp
│ │ ├── 0a25cacb-5d91-4f27-b18a-bd423762f811.webp
│ │ ├── 0a52d584-4211-43a0-99ef-f5640ee2fc8c.webp
│ │ ├── [...]
│ │ └── part-000001.json
│ ├── part-000002
│ ├── part-000003
│ ├── [...]
│ └── part-010000
├── diffusiondb-large-part-2
│ ├── part-010001
│ │ ├── 0a68f671-3776-424c-91b6-c09a0dd6fc2d.webp
│ │ ├── 0a0756e9-1249-4fe2-a21a-12c43656c7a3.webp
│ │ ├── 0aa48f3d-f2d9-40a8-a800-c2c651ebba06.webp
│ │ ├── [...]
│ │ └── part-010001.json
│ ├── part-010002
│ ├── part-010003
│ ├── [...]
│ └── part-014000
└── metadata-large.parquet
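The layout above is regular enough that a folder path can be derived from a part number. Below is a minimal sketch of that mapping, assuming the part ranges shown in the two trees (1 to 2,000 for DiffusionDB 2M; 1 to 10,000 in diffusiondb-large-part-1 and 10,001 to 14,000 in diffusiondb-large-part-2) and that each JSON file carries its folder's name. The helper names `part_folder` and `part_json` and the subset labels `"2m"` and `"large"` are hypothetical and not part of the dataset.

```python
from pathlib import PurePosixPath


def part_folder(part_id: int, subset: str = "2m") -> PurePosixPath:
    """Return the relative folder of a part, following the trees above.

    `subset` is a label made up for this sketch: "2m" or "large".
    """
    name = f"part-{part_id:06d}"  # e.g. 1 -> "part-000001"
    if subset == "2m":
        if not 1 <= part_id <= 2000:
            raise ValueError("DiffusionDB 2M has parts 1 to 2000")
        return PurePosixPath("images") / name
    if subset == "large":
        if not 1 <= part_id <= 14000:
            raise ValueError("DiffusionDB Large has parts 1 to 14000")
        # Parts up to 10000 live in part-1, the remainder in part-2.
        top = (
            "diffusiondb-large-part-1"
            if part_id <= 10000
            else "diffusiondb-large-part-2"
        )
        return PurePosixPath(top) / name
    raise ValueError(f"unknown subset: {subset!r}")


def part_json(part_id: int, subset: str = "2m") -> PurePosixPath:
    """Return the path of the JSON file stored inside a part folder."""
    folder = part_folder(part_id, subset)
    return folder / f"{folder.name}.json"


print(part_folder(1))
print(part_json(10001, "large"))
```

The paths are relative to the dataset root, so they can be joined onto wherever a local copy happens to live.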
These sub-folders are named part-0xxxxx, and each image has a unique name generated by UUID Version 4. The JSON file in a sub-folder has the same name as the sub-folder. Each image is a PNG file (DiffusionDB 2M) or a lossless WebP file (DiffusionDB Large). The JSON file contains key-value pairs mapping image filenames to their prompts and hyperparameters. For example, below is the key-value pair for f3501e05-aef7-4225-a9e9-f516527408ac.png in part-000001.json.
{
  "f3501e05-aef7-4225-a9e9-f516527408ac.png": {
    "p": "geodesic landscape, john chamberlain, christopher balaskas, tadao ando, 4 k, ",
    "se": 38753269,
    "c": 12.0,
    "st": 50,
    "sa": "k_lms"
  }
}
The data fields are:
- key: Unique image name
- p: Prompt
- se: Random seed
- c: CFG Scale (guidance scale)
- st: Steps
- sa: Sampler
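To make the short keys easier to work with, here is a minimal sketch that parses a part JSON file and expands each entry into a flat record. The JSON text is the example entry from this section, embedded as a string so the snippet runs on its own. The long field names in `FIELD_NAMES` and the helper `expand_records` are assumptions for illustration only.

```python
import json

# The example entry from this section, embedded so the snippet is standalone.
PART_JSON_TEXT = """
{
  "f3501e05-aef7-4225-a9e9-f516527408ac.png": {
    "p": "geodesic landscape, john chamberlain, christopher balaskas, tadao ando, 4 k, ",
    "se": 38753269,
    "c": 12.0,
    "st": 50,
    "sa": "k_lms"
  }
}
"""

# Hypothetical long names for the short keys listed above.
FIELD_NAMES = {"p": "prompt", "se": "seed", "c": "cfg", "st": "step", "sa": "sampler"}


def expand_records(part):
    """Turn {image_name: {short_key: value}} into a list of flat dicts."""
    rows = []
    for image_name, params in part.items():
        row = {"image_name": image_name}
        for key, value in params.items():
            # Keys missing from FIELD_NAMES are kept under their original name.
            row[FIELD_NAMES.get(key, key)] = value
        rows.append(row)
    return rows


rows = expand_records(json.loads(PART_JSON_TEXT))
print(rows[0]["image_name"], rows[0]["seed"], rows[0]["sampler"])
```

With a real download, the same function could be applied to the result of `json.load` on a part's JSON file.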
Dataset Metadata
To help you easily access prompts and other attributes of images without downloading all the Zip files, we include two metadata tables, metadata.parquet and metadata-large.parquet, for DiffusionDB 2M and DiffusionDB Large, respectively.
The shape of metadata.parquet is (2000000, 13) and the shape of metadata-large.parquet is (14000000, 13). The two tables share the same schema, and each row represents an image. We store these tables in the Parquet format because Parquet is column-based: you can efficiently query individual columns (e.g., prompts) without reading the entire table.
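As a rough illustration of column-wise access, the sketch below reads only selected columns with pandas. It assumes pandas and a Parquet engine such as pyarrow are installed and that metadata.parquet has already been downloaded. The helper names `check_columns` and `load_columns` are hypothetical, and the column list is copied from the sample rows in this README. pandas is imported inside the loader so that the column check can be used without it.

```python
# The 13 column names, as they appear in the sample rows of metadata.parquet.
METADATA_COLUMNS = [
    "image_name", "prompt", "part_id", "seed", "step", "cfg", "sampler",
    "width", "height", "user_name", "timestamp", "image_nsfw", "prompt_nsfw",
]


def check_columns(columns):
    """Reject column names that are not in the metadata schema."""
    unknown = [c for c in columns if c not in METADATA_COLUMNS]
    if unknown:
        raise ValueError(f"unknown metadata columns: {unknown}")
    return list(columns)


def load_columns(parquet_path, columns):
    """Read only the requested columns instead of the whole table."""
    import pandas as pd  # third-party; also needs a Parquet engine installed

    return pd.read_parquet(parquet_path, columns=check_columns(columns))


# Usage, assuming metadata.parquet sits in the working directory:
# prompts = load_columns("metadata.parquet", ["image_name", "prompt"])
# print(prompts.head())
```

Restricting `columns` matters most for metadata-large.parquet, where loading all 13 columns for 14 million rows needs far more memory than loading the prompts alone.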
Below are three random rows from metadata.parquet.
| image_name | prompt | part_id | seed | step | cfg | sampler | width | height | user_name | timestamp | image_nsfw | prompt_nsfw |
|:--|:--|--:|--:|--:|--:|--:|--:|--:|:--|:--|--:|--:|
| 0c46f719-1679-4c64-9ba9-f181e0eae811.png | a small liquid sculpture, corvette, viscous, reflective, digital art | 1050 | 2026845913 | 50 | 7 | 8 | 512 | 512 | c2f288a2ba9df65c38386ffaaf7749106fed29311835b63d578405db9dbcafdb | 2022-08-11 09:05:00+00:00 | 0.0845108 | 0.00383462 |
| a00bdeaa-14eb-4f6c-a303-97732177eae9.png | human sculpture of lanky tall alien on a romantic date at italian restaurant with smiling woman, nice restaurant, photography, bokeh | 905 | 1183522603 | 50 | 10 | 8 | 512 | 768 | df778e253e6d32168eb22279a9776b3cde107cc82da05517dd6d114724918651 | 2022-08-19 17:55:00+00:00 | 0.692934 | 0.109437 |
| 6e5024ce-65ed-47f3-b296-edb2813e3c5b.png | portrait of barbaric spanish conquistador, symmetrical, by yoichi hatakenaka, studio ghibli and dan mumford | 286 | 1713292358 | 50 | 7 | 8 | 512 | 640 | 1c2e93cfb1430adbd956be9c690705fe295cbee7d9ac12de1953ce5e76d89906 | 2022-08-12 03:26:00+00:00 | 0.0773138 | 0.0249675 |
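The sampler column in these rows holds an integer code, not a name. A minimal sketch of decoding it is shown here, using the ID-to-name mapping listed for the sampler column in the Metadata Schema table. The helper `sampler_name` and the trimmed `sample_rows` list are illustrative only.

```python
# Mapping copied from the sampler row of the Metadata Schema table.
SAMPLER_NAMES = {
    1: "ddim",
    2: "plms",
    3: "k_euler",
    4: "k_euler_ancestral",
    5: "k_heun",
    6: "k_dpm_2",
    7: "k_dpm_2_ancestral",
    8: "k_lms",
    9: "others",
}


def sampler_name(sampler_id):
    """Translate the integer sampler code into its method name."""
    try:
        return SAMPLER_NAMES[sampler_id]
    except KeyError:
        raise ValueError(f"unknown sampler id: {sampler_id}") from None


# Image names and sampler codes taken from the three sample rows.
sample_rows = [
    {"image_name": "0c46f719-1679-4c64-9ba9-f181e0eae811.png", "sampler": 8},
    {"image_name": "a00bdeaa-14eb-4f6c-a303-97732177eae9.png", "sampler": 8},
    {"image_name": "6e5024ce-65ed-47f3-b296-edb2813e3c5b.png", "sampler": 8},
]

for row in sample_rows:
    print(row["image_name"], sampler_name(row["sampler"]))
```

All three sample rows use code 8, which the mapping resolves to k_lms.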
Metadata Schema
metadata.parquet and metadata-large.parquet share the same schema.
|Column|Type|Description|
|:---|:---|:---|
|image_name|string|Image UUID filename.|
|prompt|string|The text prompt used to generate this image.|
|part_id|uint16|Folder ID of this image.|
|seed|uint32| Random seed used to generate this image.|
|step|uint16| Step count (hyperparameter).|
|cfg|float32| Guidance scale (hyperparameter).|
|sampler|uint8| Sampler method (hyperparameter). Mapping: {1: "ddim", 2: "plms", 3: "k_euler", 4: "k_euler_ancestral", 5: "k_heun", 6: "k_dpm_2", 7: "k_dpm_2_ancestral", 8: "k_lms", 9: "others"}.|
|width|uint16|Image width.|
|height|uint16|Image height.|
|user_name|string|SHA256 hash of the unique Discord ID of the user who generated this image. For example, the hash for xiaohk#3146 is e285b7ef63be99e9107cecd79b280bde602f17e0ca8363cb7a0889b67f0b5ed0. "deleted_account" refers to users who have deleted their accounts. None means the image was deleted before we scraped it for the second time.|
