###<a href="http://www.scottreed.info/files/nips2016.pdf">Learning What and Where to Draw</a>
Scott Reed, Zeynep Akata, Santosh Mohan, Samuel Tenka, Bernt Schiele, Honglak Lee

This is the code for our NIPS 2016 paper on text- and location-controllable image synthesis using conditional GANs. Much of the code is adapted from reedscot/icml2016 and dcgan.torch.

<img src="images/bbox_network.jpg" width="900px" height="220px"/>

<img src="images/keypoint_network.jpg" width="900px" height="220px"/>

####Setup Instructions
You will need to install Torch, cuDNN, stnbhwd, and the display package.
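
As a rough pre-flight check for the requirements above, the sketch below reports which tools are not yet available. It assumes Torch exposes the `th` launcher, that packages are managed through `luarocks`, and that the rocks are named `cudnn`, `stnbhwd` and `display`; the `missing_tools` helper is hypothetical and not part of this repository.

```shell
#!/bin/sh
# Print each named command that is not found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumption: Torch provides `th` and its packages are installed via `luarocks`.
missing=$(missing_tools th luarocks)
if [ -n "$missing" ]; then
    echo "Not found on PATH: $missing"
else
    # Assumption: the rock names below match the packages listed above.
    for rock in cudnn stnbhwd display; do
        luarocks list 2>/dev/null | grep -qi "^$rock" || echo "Rock not installed: $rock"
    done
fi
```

If nothing is printed, the prerequisites appear to be in place; any reported name still needs to be installed by following that package's own instructions.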
####How to train a text to image model:
- Download the data including captions, location annotations and pretrained models.
- Download the birds and humans image data.
- Modify the `CONFIG` file to point to your data.
- Run one of the training scripts, e.g. `./scripts/train_cub_keypoints.sh`

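
The last two steps can be wrapped in a small guard so that training is not launched from the wrong directory or before the configuration exists. This is only a sketch: `run_training` is a hypothetical helper, and it assumes `CONFIG` lives in the repository root and that the training scripts are executable.

```shell
#!/bin/sh
# Launch a training script only if CONFIG and the script itself are present.
# Assumption: this is called from the repository root, where CONFIG is kept.
run_training() {
    script=$1
    if [ ! -f CONFIG ]; then
        echo "CONFIG not found; run this from the repository root" >&2
        return 1
    fi
    if [ ! -x "$script" ]; then
        echo "Training script missing or not executable: $script" >&2
        return 1
    fi
    # Hand control to the training script and return its exit status.
    "$script"
}
```

For example, `run_training ./scripts/train_cub_keypoints.sh` would start the keypoint-conditional birds model once `CONFIG` has been edited to point to your data.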
####How to generate samples:
- Run `./scripts/run_all_demos.sh`.
- HTML files will be generated with results like the following:

Moving the bird's position via bounding box:

<img src="images/cub_move_bbox.jpg" width="600px" height="300px"/>

Moving the bird's position via keypoints:

<img src="images/cub_move_kp.jpg" width="600px" height="300px"/>

Birds text to image with ground-truth keypoints:

<img src="images/cub_keypoints_given.jpg" width="600px" height="300px"/>

Birds text to image with generated keypoints:

<img src="images/cub_keypoints_gen.jpg" width="600px" height="300px"/>

Humans text to image with ground-truth keypoints:

<img src="images/mhp_kp_given.jpg" width="600px" height="300px"/>

Humans text to image with generated keypoints:

<img src="images/mhp_kp_gen.jpg" width="600px" height="300px"/>

####Citation
If you find this useful, please cite our work as follows:
@inproceedings{reed2016learning,
title={Learning What and Where to Draw},
author={Scott Reed and Zeynep Akata and Santosh Mohan and Samuel Tenka and Bernt Schiele and Honglak Lee},
booktitle={Advances in Neural Information Processing Systems},
year={2016}
}