GRU4Rec
GRU4Rec is the original Theano implementation of the algorithm described in the paper "Session-based Recommendations with Recurrent Neural Networks" (published at ICLR 2016), with the extensions introduced in its follow-up, "Recurrent Neural Networks with Top-k Gains for Session-based Recommendations". The code is optimized for execution on the GPU.
Make sure to always use the latest version as baseline and cite both papers when you do so!
The code was optimized for fast execution on the GPU (up to 1500 mini-batches per second on a GTX 1080Ti). According to the Theano profiler, training spends 97.5% of the time on the GPU (0.5% on the CPU and 2% moving data between the two). Running on the CPU is not supported, but it is possible with some modifications to the code.
If you are afraid of using Theano, the following official reimplementations are also available.
NOTE: These have been validated against the original, but due to how more modern deep learning frameworks operate, they are 1.5-4x slower than this version. Other reimplementations might be available in the future, depending on the research community's interest level.
IMPORTANT! Avoid using unofficial reimplementations. We thoroughly examined 6 third-party reimplementations (PyTorch/TensorFlow, standalone/framework) in "The Effect of Third Party Implementations on Reproducibility", and all of them were flawed and/or missed important features, which resulted in up to 99% lower recommendation accuracy and up to 335 times longer training times. Other reimplementations we have found since then are no better.
You can train and evaluate the model on your own session data easily using run.py. Usage information below.
Scroll down for information on reproducing results on public datasets and hyperparameter tuning!
LICENSE: See license.txt for details. Main guidelines: for research and education purposes the code is and always will be free to use. Using the code or parts of it in commercial systems requires a license. If you've been using the code or any of its derivatives in a commercial system, contact me!
CONTENTS:
Requirements
Theano configuration
Usage
Execute experiments using run.py
Examples
Using GRU4Rec in code or the interpreter
Notes on sequence-aware and session-based models
Notes on parameter settings
Speed of training
Reproducing results on public datasets
Hyperparameter tuning
Executing on CPU
Major updates
Requirements
- python --> Use Python 3.6.3 or newer. The code was mostly tested on 3.6.3, 3.7.6 and 3.8.12, but was briefly tested on other versions as well. Python 2 is NOT supported.
- numpy --> 1.16.4 or newer.
- pandas --> 0.24.2 or newer.
- CUDA --> Needed for the GPU support of Theano. The latest CUDA version Theano was tested with (to the best of my knowledge) is 9.2. It works fine with more recent versions, e.g. 11.8.
- libgpuarray --> Required for the GPU support of Theano, use the latest version.
- theano --> 1.0.5 (last stable release) or newer (occasionally it is still updated with minor fixes). GPU support should be installed.
- optuna --> (optional) for hyperparameter optimization; the code was tested with 3.0.3.
IMPORTANT: cuDNN --> More recent versions produce a warning, but 8.2.1 still works for me. GRU4Rec doesn't rely heavily on the part of Theano that utilizes cuDNN. Unfortunately, cudnnReduceTensor in cuDNN v7 and newer is seriously bugged, which makes operators based on this function slow and occasionally even unstable (incorrect computations or segfaults) when cuDNN is used. Therefore it is best not to use cuDNN. If you already have it installed, you can easily configure Theano to exclude cuDNN based operators (see below).
*This bug is not related to Theano and can be reproduced from CUDA/C++. Unfortunately it hasn't been fixed for more than 6 years.
Theano configuration
This code was optimized for GPU execution. Executing the code will fail if you try to run it on a CPU (if you really want to mess with it, check out the relevant section of this readme). Therefore, the Theano configuration must be set up to use the GPU. If you use run.py for running experiments, the code sets this configuration for you. You might want to change some of the preset configuration (e.g. execute on a specific GPU instead of the one with the lowest ID). You can do this via the THEANO_FLAGS environment variable or by editing .theanorc_gru4rec.
If you don't use run.py, it is possible that the preset config won't have any effect (this happens if theano is imported before gru4rec, either directly or by another module). In this case, you must set your own config by either editing your .theanorc or setting the THEANO_FLAGS environment variable. Please refer to the documentation of Theano.
Important config parameters
- `device` --> must always be a CUDA capable GPU (e.g. `cuda0`)
- `floatX` --> must always be `float32`
- `mode` --> should be `FAST_RUN` for fast execution
- `optimizer_excluding` --> should be `local_dnn_reduction:local_cudnn_maxandargmax:local_dnn_argmax` to tell Theano not to use cuDNN based operators, because its `cudnnReduceTensor` function has been bugged since v7
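If you set the flags yourself from Python instead of a config file, a minimal sketch looks like the following (the GPU id `cuda0` is just an example; the flag values are the ones listed above). The crucial point is that the environment variable must be set before `theano` is imported anywhere in the process:

```python
import os

# Must run BEFORE any `import theano`, or the flags have no effect.
# GPU id (cuda0) is an example; pick the device you want to use.
os.environ['THEANO_FLAGS'] = (
    'device=cuda0,'
    'floatX=float32,'
    'mode=FAST_RUN,'
    'optimizer_excluding=local_dnn_reduction:local_cudnn_maxandargmax:local_dnn_argmax'
)
```

This mirrors what run.py does for you automatically; setting the same values in `.theanorc` achieves the same result.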
Usage
Execute experiments using run.py
run.py is an easy way to train, evaluate and save/load GRU4Rec models.
Execute with the -h argument to take a look at the parameters.
$ python run.py -h
Output:
usage: run.py [-h] [-ps PARAM_STRING] [-pf PARAM_PATH] [-l] [-s MODEL_PATH] [-t TEST_PATH [TEST_PATH ...]] [-m AT [AT ...]] [-e EVAL_TYPE] [-ss SS] [--sample_store_on_cpu] [-g GRFILE] [-d D] [-ik IK] [-sk SK] [-tk TK]
[-pm METRIC] [-lpm]
PATH
Train or load a GRU4Rec model & measure recall and MRR on the specified test set(s).
positional arguments:
PATH Path to the training data (TAB separated file (.tsv or .txt) or pickled pandas.DataFrame object (.pickle)) (if the --load_model parameter is NOT provided) or to the serialized model (if the
--load_model parameter is provided).
optional arguments:
-h, --help show this help message and exit
-ps PARAM_STRING, --parameter_string PARAM_STRING
Training parameters provided as a single parameter string. The format of the string is `param_name1=param_value1,param_name2=param_value2...`, e.g.: `loss=bpr-
max,layers=100,constrained_embedding=True`. Boolean training parameters should be either True or False; parameters that can take a list should use / as the separator (e.g. layers=200/200).
Mutually exclusive with the -pf (--parameter_file) and the -l (--load_model) arguments and one of the three must be provided.
-pf PARAM_PATH, --parameter_file PARAM_PATH
Alternatively, training parameters can be set using a config file specified in this argument. The config file must contain a single OrderedDict named `gru4rec_params`. The parameters must have
the appropriate type (e.g. layers = [100]). Mutually exclusive with the -ps (--parameter_string) and the -l (--load_model) arguments and one of the three must be provided.
-l, --load_model Load an already trained model instead of training a model. Mutually exclusive with the -ps (--parameter_string) and the -pf (--parameter_file) arguments and one of the three must be provided.
-s MODEL_PATH, --save_model MODEL_PATH
Save the trained model to the MODEL_PATH. (Default: don't save model)
-t TEST_PATH [TEST_PATH ...], --test TEST_PATH [TEST_PATH ...]
Path to the test data set(s) located at TEST_PATH. Multiple test sets can be provided (separate with spaces). (Default: don't evaluate the model)
-m AT [AT ...], --measure AT [AT ...]
Measure recall & MRR at the defined recommendation list length(s). Multiple values can be provided. (Default: 20)
-e EVAL_TYPE, --eval_type EVAL_TYPE
Sets how to handle if multiple items in the ranked list have the same prediction score (which is usually due to saturation or an error). See the documentation of evaluate_gpu() in evaluation.py
for further details. (Default: standard)
-ss SS, --sample_store_size SS
GRU4Rec uses a buffer for negative samples during training to maximize GPU utilization. This parameter sets the buffer length. Lower values require more frequent recomputation, higher values
use more (GPU) memory.
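To make the `-ps` parameter-string format above concrete, here is a hedged sketch of how such a string could be parsed into typed values (`parse_param_string` and `_convert` are hypothetical helpers for illustration, not the actual run.py code):

```python
def _convert(value):
    """Convert a single value to bool, int or float, else keep it as a string."""
    if value == 'True':
        return True
    if value == 'False':
        return False
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value

def parse_param_string(param_string):
    """Parse 'name1=value1,name2=value2' pairs; '/' separates list items."""
    params = {}
    for pair in param_string.split(','):
        name, value = pair.split('=', 1)
        if '/' in value:
            params[name] = [_convert(v) for v in value.split('/')]
        else:
            params[name] = _convert(value)
    return params

params = parse_param_string('loss=bpr-max,layers=200/200,constrained_embedding=True')
```

Note how the format rules from the help text show up here: booleans must be the literal words `True`/`False`, and list-valued parameters such as `layers=200/200` use `/` as the separator.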