SkillAgentSearch skills...

Ubottu

Next Utterance Classification

Install / Use

/learn @npow/Ubottu
About this skill

Quality Score

0/100

Supported Platforms

Universal

README

Ubottu

This repository contains the source code for the models used in the following paper:

The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn Dialogue Systems arXiv:1506.08909.

Dependencies

  • Python 2.7
  • Theano bleeding-edge
  • Lasagne
  • Pyprind

Usage

Fetch the pickled data:

cd src
wget http://dataset.cs.mcgill.ca/ubuntu-corpus-1.0/ubuntu_blobs.tgz
tar zxvf blobs.tgz

To reproduce the results in the original paper, use the following incantations.

RNN:

python main.py --encoder rnn --batch_size=512 --hidden_size=50 --optimizer adam --lr 0.001 --fine_tune_W=True --fine_tune_M=True --input_dir dataset_1MM

LSTM:

python main.py --encoder lstm --batch_size=256 --hidden_size=300 --optimizer adam --lr 0.001 --fine_tune_W=True --fine_tune_M=True --input_dir dataset_1MM

TFIDF:

python tfidf.py
View on GitHub
GitHub Stars136
CategoryDevelopment
Updated4mo ago
Forks44

Languages

Python

Security Score

77/100

Audited on Nov 3, 2025

No findings