RecAlgorithm

Implementations of mainstream ranking algorithms for recommender systems

Project Overview

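  • Implements the ranking algorithms most commonly used in recommender systems and evaluates them on a public dataset. Every algorithm has been run end to end through a full training pass, producing saved_model and checkpoint outputs for tf-serving deployment.
  • Uses the dataset from the WeChat Channels recommendation algorithm competition; see ./dataset/README.md for details.
  • To match common industry practice, uses the TensorFlow Estimator framework with data in TFRecord format.
  • Algorithms live under ./algrithm, one folder per algorithm, named with the commonly accepted uppercase algorithm name. The training entry point is the .py file in that folder named with the lowercase algorithm name; for example, din.py in the DIN folder is the entry point for training the DIN model. See the Example section at the end.
  • Each algorithm implements its own model_fn. The Keras high-level API is not used; static graphs are built with TensorFlow's low- and mid-level APIs only.
  • Hyperparameters can be passed to the training entry script as --parameter_name=parameter_value; their definitions are in the tf.app.flags section of each entry script.
  • Single-task models are evaluated on the read_comment label in the dataset; multi-task models are evaluated on three tasks: read_comment, like, and click_avatar.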
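
The folder and flag conventions described above can be sketched in a few lines of Python. This is only an illustration: the helper names `entry_script` and `build_train_command` are hypothetical and not part of the repository, and whether an entry script can be launched from the repository root (rather than from inside its own folder, as in the Example section) is an assumption.

```python
import shlex

ALGORITHM_ROOT = "./algrithm"  # directory name as written in this README


def entry_script(model_name: str) -> str:
    """Path implied by the convention: uppercase folder, lowercase .py entry file."""
    return f"{ALGORITHM_ROOT}/{model_name}/{model_name.lower()}.py"


def build_train_command(model_name: str, **hparams) -> list:
    """Build an argv list using --parameter_name=parameter_value flags."""
    flags = [f"--{name}={value}" for name, value in hparams.items()]
    return ["python", entry_script(model_name), *flags]


# use_softmax is the only flag name taken from this README's Example section.
command = build_train_command("DIN", use_softmax=True)
print(shlex.join(command))
```

Which flags an entry script actually accepts is defined in that script's tf.app.flags section, so any flag name other than use_softmax should be checked there first.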

Single-Task Models

| Model | Paper | *Best_read_comment_Auc |
|:------------:|:-----:|:----------------------:|
| FFM | [2016] Field-aware Factorization Machines for CTR Prediction | 0.8911285 |
| DeepCrossing | [2016] Deep Crossing - Web-Scale Modeling without Manually Crafted Combinatorial Features | 0.9185908 |
| PNN | [2016] Product-based neural networks for user response prediction | 0.9065931 |
| Wide & Deep | [2016] Wide & Deep Learning for Recommender Systems | 0.9133482 |
| DeepFM | [2017] DeepFM: A Factorization-Machine based Neural Network for CTR Prediction | 0.8529998 |
| DCN | [2017] Deep & Cross Network for Ad Click Predictions | 0.9183242 |
| AFM | [2017] Attentional Factorization Machines: Learning the Weight of Feature Interactions via Attention Networks | 0.9117872 |
| xDeepFM | [2018] xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems | 0.9152467 |
| FwFM | [2018] Field-weighted Factorization Machines for Click-Through Rate Prediction in Display Advertising | 0.9118794 |
| DIN | [2018] Deep Interest Network for Click-Through Rate Prediction | 0.9116896 |
| DIEN | [2018] Deep Interest Evolution Network for Click-Through Rate Prediction | - |
| FiBiNet | [2019] FiBiNET: Combining Feature Importance and Bilinear feature Interaction for Click-Through Rate Prediction | 0.9149044 |
| BST | [2019] Behavior sequence transformer for e-commerce recommendation in Alibaba | 0.9165866 |

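*Best_read_comment_Auc is the highest test-set AUC each model reached after its own hyperparameter tuning; per-model evaluation results are in result.md under each model's folder.<br>
*DIEN is not applicable to the WeChat Channels dataset, so only its static graph is implemented and it was not evaluated.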
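
For readers unfamiliar with the metric in the table, AUC can be read as the probability that a randomly chosen positive example is scored above a randomly chosen negative one. The following is a minimal pure-Python sketch of that definition, with ties counted as half. It is not necessarily how this repository computes AUC, and the function name `auc` is only illustrative.

```python
def auc(labels, scores):
    """ROC AUC via pairwise comparison of positive and negative scores.

    Runs in O(P * N) time, which is fine for a small illustration but not
    for a full test set.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Toy labels and scores, not taken from the dataset.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 3 of 4 pairs ordered correctly
```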

Multi-Task Models

| Model | Paper | *Best_read_comment_AUC | *Best_like_AUC | *Best_click_avatar_AUC |
|:-----:|:-----:|:----------------------:|:--------------:|:----------------------:|
| ESMM | [2018] Entire Space Multi-Task Model: An Effective Approach for Estimating Post-Click Conversion Rate | - | - | - |
| MMOE | [2018] Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts | 0.91860557 | 0.8126400 | 0.8139362 |
| PLE | [2020] Progressive Layered Extraction (PLE): A Novel Multi-Task Learning (MTL) Model for Personalized Recommendations | 0.91965175 | 0.8136461 | 0.8154559 |

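*Best_xx_AUC is the highest value across all hyperparameter combinations; the three AUCs in a row may not come from the same set of hyperparameters.<br>
*Because of ESMM's particular structure, it is not applicable to the WeChat Channels dataset, so only its static graph is implemented and it was not evaluated.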
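
The note about per-task maxima can be made concrete with a small sketch. The run records below are made-up placeholder values, not results from this repository, and the `learning_rate` hyperparameter name and the `best_per_task` helper are hypothetical. The point is only that each task's best AUC is selected independently, so the winning hyperparameters can differ by task.

```python
# Hypothetical runs: placeholder numbers for illustration only.
runs = [
    {"params": {"learning_rate": 0.001},
     "auc": {"read_comment": 0.90, "like": 0.80, "click_avatar": 0.79}},
    {"params": {"learning_rate": 0.01},
     "auc": {"read_comment": 0.89, "like": 0.81, "click_avatar": 0.80}},
]


def best_per_task(runs):
    """For each task, keep the highest AUC and the hyperparameters that produced it."""
    best = {}
    for run in runs:
        for task, value in run["auc"].items():
            if task not in best or value > best[task]["auc"]:
                best[task] = {"auc": value, "params": run["params"]}
    return best


for task, record in best_per_task(runs).items():
    print(task, record["auc"], record["params"])
```

In this toy data the best read_comment value comes from one run and the best like and click_avatar values from the other, which is the situation the footnote describes.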

Example

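# First run the following commands to make sure the TFRecord files have been generated
# cd ./dataset/wechat_algo_data1
# python DataGenerator.py && cd ..
cd ./DIN
# Hyperparameters can be customized at training time
python din.py --use_softmax=True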

To Do List

  • Add multi-task learning tricks: Uncertainty, GradNorm, PCGrad, etc.
  • Add AutoInt, FLEN, etc.
  • Refactor the feature engineering part, including config-driven inputs, with reference to https://github.com/Shicoder/Deep_Rec

Issues are welcome, or feel free to get in touch directly.

<img src="./docs/Wechat.jpeg" alt="pic" width="220" height="220">