Introduction
DeepRec is a high-performance recommendation deep learning framework based on TensorFlow 1.15, Intel-TensorFlow and NVIDIA-TensorFlow. It is hosted in incubation in LF AI & Data Foundation.
Background
Recommendation models have huge commercial value in areas such as retail, media, advertising, social networks and search engines. Unlike other kinds of models, recommendation models contain a large number of non-numeric features, such as ids, tags and text, which lead to huge parameter counts.
DeepRec has been developed since 2016 and supports core Alibaba businesses such as Taobao search, recommendation and advertising. It has accumulated a rich set of features on top of the basic framework and delivers excellent performance in recommendation model training and inference. Beyond Alibaba Group, dozens of companies have adopted DeepRec in their own business scenarios.
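The parameter blow-up from id features is easy to quantify with back-of-the-envelope arithmetic (illustrative numbers, not DeepRec measurements):

```python
# Size of the embedding table for a single sparse id feature.
num_ids = 1_000_000_000   # distinct ids (e.g. users or items)
dim = 64                  # embedding dimension
bytes_per_float32 = 4

table_bytes = num_ids * dim * bytes_per_float32
print(f"one table: {table_bytes / 2**30:.0f} GiB")  # 238 GiB
```

One feature column alone exceeds the memory of most single machines, which is why distributed parameter servers and multi-tier embedding storage matter for these models.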
Key Features
DeepRec provides super large-scale distributed training capability, supporting recommendation model training with trillions of samples and over ten trillion parameters. In-depth performance optimization has been conducted for recommendation models across both CPU and GPU platforms, and DeepRec includes a list of features that improve usability and performance in super-scale scenarios.
Embedding & Optimizer
- Embedding Variable.
- Dynamic Dimension Embedding Variable.
- Adaptive Embedding Variable.
- Multiple Hash Embedding Variable.
- Multi-tier Hybrid Embedding Storage.
- Group Embedding.
- AdamAsync Optimizer.
- AdagradDecay Optimizer.
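A static embedding variable must pre-allocate rows for a fixed vocabulary (and hash out-of-vocabulary ids into shared buckets), while DeepRec's Embedding Variable grows as new ids arrive. The idea can be sketched framework-free; this is a toy illustration of the concept, not the DeepRec API:

```python
import random

class ToyEmbeddingVariable:
    """Toy sketch of a dynamically sized embedding table: a row is
    created on first lookup of an id, so no vocabulary size is fixed
    up front and distinct ids never collide into shared buckets."""

    def __init__(self, dim, seed=0):
        self.dim = dim
        self.table = {}            # id -> embedding row
        self.rng = random.Random(seed)

    def lookup(self, ids):
        rows = []
        for i in ids:
            if i not in self.table:  # lazily create the row
                self.table[i] = [self.rng.gauss(0.0, 0.01)
                                 for _ in range(self.dim)]
            rows.append(self.table[i])
        return rows

ev = ToyEmbeddingVariable(dim=8)
ev.lookup([3, 10_000_000_007, 3])  # arbitrary 64-bit ids, no OOV bucket
print(len(ev.table))               # 2 distinct ids -> 2 rows
```

Features such as Multi-tier Hybrid Embedding Storage then decide where each row lives (HBM, DRAM, PMEM, SSD), and the dynamic-dimension and adaptive variants control how large each row is.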
Training
- Asynchronous Distributed Training Framework (Parameter Server), such as grpc+seastar, FuseRecv, StarServer etc.
- Synchronous Distributed Training Framework (Collective), such as HybridBackend, Sparse Operation Kits (SOK) etc.
- Runtime Optimization, such as Graph Aware Memory Allocator (GAMMA), Critical-path based Executor etc.
- Runtime Optimization (GPU): GPU Multi-Stream Engine, which supports multiple CUDA compute streams and CUDA Graph.
- Operator level optimization, such as BF16 mixed precision optimization, embedding operator optimization and EmbeddingVariable on PMEM and GPU, new hardware feature enabling, etc.
- Graph level optimization, such as AutoGraphFusion, SmartStage, AutoPipeline, Graph Template Engine, Sample-awared Graph Compression, MicroBatch etc.
- Compilation optimization, supporting BladeDISC, XLA, etc.
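BF16 mixed precision works because bfloat16 keeps float32's 8-bit exponent (so the dynamic range is unchanged) while truncating the mantissa to 7 bits. The rounding can be sketched in pure Python; real kernels usually round-to-nearest-even, but simple truncation is enough to show the precision loss:

```python
import struct

def to_bf16(x):
    """Truncate a float32 to bfloat16 precision by keeping only the
    top 16 bits (sign, 8 exponent bits, 7 mantissa bits)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

print(to_bf16(3.141592653589793))  # 3.140625: ~3 significant digits survive
```

Roughly three decimal digits of precision remain per value, which is typically acceptable for embedding gradients while halving memory traffic.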
Deploy and Serving
- Delta checkpoint loading and exporting.
- Super-scale recommendation model distributed serving.
- Multi-tier hybrid storage with multiple backends supported.
- Online deep learning with low latency.
- High performance inference framework SessionGroup (share-nothing), with multiple thread pools and multiple CUDA streams supported.
- Model Quantization.
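Model quantization shrinks checkpoints and speeds up serving by storing weights as 8-bit integers plus a scale factor. A minimal symmetric per-tensor int8 sketch (illustrative only, not DeepRec's actual quantization scheme):

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: map max |w| to 127."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

w = [0.5, -1.27, 0.02, 1.0]
q, s = quantize_int8(w)
print(q)                 # [50, -127, 2, 100]
print(dequantize(q, s))  # close to w, within one quantization step
```

Each weight drops from 4 bytes to 1, at the cost of an error bounded by half a quantization step (scale / 2 per value).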
Installation
Prepare for installation
CPU Platform
alideeprec/deeprec-build:deeprec-dev-cpu-py38-ubuntu20.04
GPU Platform
alideeprec/deeprec-build:deeprec-dev-gpu-py38-cu116-ubuntu20.04
How to Build
Configure
$ ./configure
Compile for CPU and GPU (default)
$ bazel build -c opt --config=opt //tensorflow/tools/pip_package:build_pip_package
Compile for CPU and GPU: ABI=0
$ bazel build --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" --host_cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" -c opt --config=opt //tensorflow/tools/pip_package:build_pip_package
Compile for CPU optimization: oneDNN + Unified Eigen Thread pool
$ bazel build -c opt --config=opt --config=mkl_threadpool //tensorflow/tools/pip_package:build_pip_package
Compile for CPU optimization and ABI=0
$ bazel build --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" --host_cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" -c opt --config=opt --config=mkl_threadpool //tensorflow/tools/pip_package:build_pip_package
Create whl package
$ ./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
Install whl package
$ pip3 install /tmp/tensorflow_pkg/tensorflow-1.15.5+${version}-cp38-cp38m-linux_x86_64.whl
Latest Release Images
Image for CPU
alideeprec/deeprec-release:deeprec2402-cpu-py38-ubuntu20.04
Image for GPU CUDA11.6
alideeprec/deeprec-release:deeprec2402-gpu-py38-cu116-ubuntu20.04
Continuous Build Status
Official Build
| Build Type | Status |
| ------------- | ------ |
| Linux CPU | |
| Linux GPU | |
| Linux CPU Serving | |
| Linux GPU Serving | |
Official Unit Tests
| Unit Test Type | Status |
| -------------- | ------ |
| Linux CPU C | |
| Linux CPU CC | |
| Linux CPU Contrib | |
| Linux CPU Core | |
| Linux CPU Examples | |
| Linux CPU Java | |
| Linux CPU JS | |
| Linux CPU Python | |
| Linux CPU Stream Executor | |
| Linux GPU C | |
| Linux GPU CC | |
| Linux GPU Contrib | |
| Linux GPU Core | |
| Linux GPU Examples | |
| Linux GPU Java | |
| Linux GPU JS | |
| Linux GPU Python | |
| Linux GPU Stream Executor | |
| Linux CPU Serving UT | |
| Linux GPU Serving UT | |
User Document
Chinese: https://deeprec.readthedocs.io/zh/latest/
English: https://deeprec.readthedocs.io/en/latest/
Contact Us
Join the Official Discussion Group on DingTalk
<img src="https://deeprec-dataset.oss-cn-beijing.aliyuncs.com/img/dingtalk_group.JPG" width="200">

Join the Official Discussion Group on WeChat
<img src="https://deeprec-dataset.oss-cn-beijing.aliyuncs.com/img/wechat_group.JPG" width="200">

License