SkillAgentSearch skills...

ComiRec

Source code and dataset for KDD 2020 paper "Controllable Multi-Interest Framework for Recommendation"

Install / Use

/learn @THUDM/ComiRec
About this skill

Quality Score

0/100

Supported Platforms

Universal

README

Controllable Multi-Interest Framework for Recommendation

Original implementation for paper Controllable Multi-Interest Framework for Recommendation.

Yukuo Cen, Jianwei Zhang, Xu Zou, Chang Zhou, Hongxia Yang, Jie Tang

Accepted to KDD 2020 ADS Track!

Prerequisites

  • Python 3
  • TensorFlow-GPU >= 1.8 (< 2.0)
  • Faiss-GPU

Getting Started

Installation

  • Install TensorFlow-GPU 1.x

  • Install Faiss-GPU based on the instructions here: https://github.com/facebookresearch/faiss/blob/master/INSTALL.md

  • Clone this repo git clone https://github.com/THUDM/ComiRec.

Dataset

  • Original links of datasets are:

    • http://jmcauley.ucsd.edu/data/amazon/index.html
    • https://tianchi.aliyun.com/dataset/dataDetail?dataId=649&userId=1
  • Two preprocessed datasets can be downloaded through:

    • Tsinghua Cloud: https://cloud.tsinghua.edu.cn/f/e5c4211255bc40cba828/?dl=1
    • Dropbox: https://www.dropbox.com/s/m41kahhhx0a5z0u/data.tar.gz?dl=1
  • You can also download the original datasets and preprocess them by yourself. You can run python preprocess/data.py {dataset_name} and python preprocess/category.py {dataset_name} to preprocess the datasets.

Training

Training on the existing datasets

You can use python src/train.py --dataset {dataset_name} --model_type {model_name} to train a specific model on a dataset. Other hyperparameters can be found in the code. (If you share the server with others or you want to use the specific GPU(s), you may need to set CUDA_VISIBLE_DEVICES.)

For example, you can use python src/train.py --dataset book --model_type ComiRec-SA to train ComiRec-SA model on Book dataset.

When training a ComiRec-DR model, you should set --learning_rate 0.005.

Training on your own datasets

If you want to train models on your own dataset, you should prepare the following three(or four) files:

  • train/valid/test file: Each line represents an interaction, which contains three numbers <user_id>,<item_id>,<time_stamp>.
  • category file (optional): Each line contains two numbers <item_id>,<cate_id> used for computing diversity..

Common Issues

<details> <summary> The computation of NDCG score. </summary> <br/> I'm so sorry that the computation of NDCG score in the original version (now in `paper` branch) is not consistent with the definition in the paper, as mentioned in the issue #6. I have updated the computation of NDCG score in the `master` branch according to the correct definition. For reproducing the NDCG scores reported in the paper, please use the `paper` branch. By the way, I personally recommend to use the reported results of recall and hit rate only. </details>

If you have ANY difficulties to get things working in the above steps, feel free to open an issue. You can expect a reply within 24 hours.

Acknowledgement

The structure of our code is based on MIMN.

Cite

Please cite our paper if you find this code useful for your research:

@inproceedings{cen2020controllable,
  title = {Controllable Multi-Interest Framework for Recommendation},
  author = {Cen, Yukuo and Zhang, Jianwei and Zou, Xu and Zhou, Chang and Yang, Hongxia and Tang, Jie},
  booktitle = {Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
  year = {2020},
  pages = {2942–2951},
  publisher = {ACM},
}
View on GitHub
GitHub Stars310
CategoryDevelopment
Updated8h ago
Forks78

Languages

Python

Security Score

85/100

Audited on Apr 1, 2026

No findings