Textvae
Theano code for experiments in the paper "A Hybrid Convolutional Variational Autoencoder for Text Generation."
Preparation
First, run makedata.sh. This downloads the PTB dataset, splits it, and preprocesses it.
PTB Experiments
Files prefixed with ''lm_'' contain the experiments on the PTB dataset. We provide scripts for training the non-VAE model, the baseline LSTM VAE, and our models, as well as a script for greedy sampling from a trained model. The ''defs'' subfolder contains the definitions of the grid searches we used to generate the data for the figures and tables in the paper. To run one search:
python -u nn/scripts/grid_search.py -grid defs/gridname.json
To train our model on samples 60 characters long with alpha=0.2, run:
python -u lm_vae_lstm.py -alpha 0.2 -sample_size 60
Twitter Experiments
Code for these experiments is in files starting with ''twitter_''. We do not release the dataset we used to train our model, but we provide both a training script and a pretrained model. To use the script on custom data, create a file ''data/tweets.txt'' containing one data sample per line. By default, the first 10k samples are used for validation and everything else for training, capped at ~1M samples. In addition, the script only uses tweets of up to 128 characters. This limit exists only for convenience when down- and upsampling. Training on tweets of up to 140 characters requires some care in handling the spatial dimension.
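The default handling of ''data/tweets.txt'' described above can be sketched in a few lines of Python. This is an illustration only, not the code from the ''twitter_'' scripts: the function names `split_tweets` and `even_halvings` are hypothetical, the exact sample cap (the README only says ~1M) is assumed to be 1,000,000 and to apply to the total, and the length filter is assumed to run before the split. The second helper assumes that down- and upsampling change the length by a factor of 2 per layer, which is one plausible reading of why 128 is more convenient than 140.

```python
from typing import Iterable, List, Tuple

MAX_CHARS = 128          # from the README: only tweets with up to 128 characters
VALID_SIZE = 10_000      # from the README: first 10k samples go to validation
MAX_SAMPLES = 1_000_000  # assumption: the README only says "~1M"


def split_tweets(
    lines: Iterable[str],
    max_chars: int = MAX_CHARS,
    valid_size: int = VALID_SIZE,
    max_samples: int = MAX_SAMPLES,
) -> Tuple[List[str], List[str]]:
    """Return (train, valid) lists built from one-sample-per-line input.

    Assumption: empty lines and over-long samples are dropped before
    the split, and the cap applies to the total number of kept samples.
    """
    kept: List[str] = []
    for line in lines:
        text = line.rstrip("\n")
        if text and len(text) <= max_chars:
            kept.append(text)
            if len(kept) >= max_samples:
                break
    return kept[valid_size:], kept[:valid_size]


def even_halvings(length: int) -> int:
    """Count how many times a length can be halved without a remainder.

    Assumes each downsampling step halves the length (hypothetical
    factor of 2); a remainder would need padding or cropping.
    """
    count = 0
    while length > 1 and length % 2 == 0:
        length //= 2
        count += 1
    return count
```

To try it on real data, pass an open file object, for example `split_tweets(open("data/tweets.txt", encoding="utf-8"))`. Under the factor-of-2 assumption, a length of 128 halves cleanly seven times, while 140 reaches an odd length (35) after two halvings, which is where the extra care with the spatial dimension would come in.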
License
MIT
