
HugeCTR

HugeCTR is a high-efficiency GPU framework designed for training Click-Through-Rate (CTR) estimation models.


HugeCTR is a GPU-accelerated recommender framework designed for training and inference of large deep learning models.

Design Goals:

  • Fast: HugeCTR performs outstandingly in recommendation benchmarks including MLPerf.
  • Easy: Whether you are a data scientist or a machine learning practitioner, HugeCTR is easy to pick up thanks to extensive documentation, notebooks, and samples.
  • Domain Specific: HugeCTR provides the essentials so that you can efficiently deploy recommender models with very large embeddings.

NOTE: If you have any questions about using HugeCTR, please file an issue or join our Slack channel for more interactive discussions.

Core Features

HugeCTR supports a variety of features. To learn about our latest enhancements, refer to our release notes.

Getting Started

If you'd like to quickly train a model using the Python interface, do the following:

  1. Build the HugeCTR Docker image: Starting with version 25.03, HugeCTR provides only the Dockerfile source, so you need to build the image yourself. To build the hugectr image, use the Dockerfile located at tools/dockerfiles/Dockerfile.base with the following command:

    docker build --build-arg RELEASE=true -t hugectr:release -f tools/dockerfiles/Dockerfile.base .
    
    
  2. Start the container with your local host directory (/your/host/dir) mounted by running the following command:

    docker run --gpus=all --rm -it --cap-add SYS_NICE -v /your/host/dir:/your/container/dir -w /your/container/dir -u $(id -u):$(id -g) hugectr:release
    

    NOTE: The /your/host/dir directory is visible inside the container as /your/container/dir, which is also your starting directory.

    NOTE: HugeCTR uses NCCL to share data between ranks, and NCCL may require shared memory for IPC and pinned (page-locked) system memory resources. It is recommended that you increase these resources by adding the following options to the docker run command.

    --shm-size=1g --ulimit memlock=-1
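
To show where the two recommended options sit in the full command, here is a minimal Python sketch that assembles the `docker run` invocation from step 2 as an argument list and prints it. The helper name `build_docker_run` and its parameters are hypothetical and exist only for illustration. The flags are the ones shown above, written with Docker's double-dash spelling for `--shm-size` and `--ulimit`. The sketch assumes a Unix-like host, because it reads the current user and group IDs.

```python
import os
import shlex


def build_docker_run(host_dir, container_dir, image="hugectr:release",
                     shm_size="1g", memlock="-1"):
    """Build (but do not run) the docker run command from step 2.

    Hypothetical helper for illustration only. It adds the shared-memory
    and memlock options recommended for NCCL to the base command.
    """
    return [
        "docker", "run",
        "--gpus=all", "--rm", "-it",
        "--cap-add", "SYS_NICE",
        f"--shm-size={shm_size}",            # shared memory for NCCL IPC
        "--ulimit", f"memlock={memlock}",    # -1 lifts the pinned-memory limit
        "-v", f"{host_dir}:{container_dir}",  # mount the host directory
        "-w", container_dir,                  # start in the mounted directory
        "-u", f"{os.getuid()}:{os.getgid()}",  # same as $(id -u):$(id -g)
        image,
    ]


cmd = build_docker_run("/your/host/dir", "/your/container/dir")
print(shlex.join(cmd))
```

The printed line can be pasted into a shell as-is. Building the command as a list rather than a single string avoids quoting problems if a directory name contains spaces.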
    
  3. Write a simple Python script to generate a synthetic dataset:

    # dcn_parquet_generate.py
    import hugectr
    from hugectr.tools import DataGeneratorParams, DataGenerator
    data_generator_params = DataGeneratorParams(
      format = hugectr.DataReaderType_t.Parquet,
      label_dim = 1,
      dense_dim = 13,
      num_slot = 26,
      i64_input_key = False,
      source = "./dcn_parquet/file_list.txt",
      eval_source = "./dcn_parquet/file_list_test.txt",
      slot_size_array = [39884, 39043, 17289, 7420, 20263, 3, 7120, 1543, 39884, 39043, 17289, 7420, 
                         20263, 3, 7120, 1543, 63, 63, 39884, 39043, 17289, 7420, 20263, 3, 7120,
                         1543 ],
      dist_type = hugectr.Distribution_t.PowerLaw,
      power_law_type = hugectr.PowerLaw_t.Short)
    data_generator = DataGenerator(data_generator_params)
    data_generator.generate()
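
Because `num_slot` and `slot_size_array` have to agree, a quick check before generating data can catch a miscounted list. The sketch below is plain Python and does not call HugeCTR. The helper `check_slots` is hypothetical, and it assumes that each entry of `slot_size_array` is the number of distinct keys in the corresponding slot.

```python
# Values copied from dcn_parquet_generate.py above.
num_slot = 26
slot_size_array = [39884, 39043, 17289, 7420, 20263, 3, 7120, 1543, 39884, 39043, 17289, 7420,
                   20263, 3, 7120, 1543, 63, 63, 39884, 39043, 17289, 7420, 20263, 3, 7120,
                   1543]


def check_slots(num_slot, slot_size_array):
    """Hypothetical helper: confirm there is one positive size per slot.

    Returns the sum of all slot sizes, which (under the assumption stated
    above) is the total number of distinct keys across all slots.
    """
    if len(slot_size_array) != num_slot:
        raise ValueError(
            f"expected {num_slot} slot sizes, got {len(slot_size_array)}")
    if any(size <= 0 for size in slot_size_array):
        raise ValueError("every slot size must be a positive integer")
    return sum(slot_size_array)


total_keys = check_slots(num_slot, slot_size_array)
print(f"{num_slot} slots, {total_keys} keys in total")
```

The same list is repeated in the training script in step 5, so running this check on both copies is a cheap way to confirm they have not drifted apart.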
    
  4. Generate the Parquet dataset for your DCN model by running the following command:

    python dcn_parquet_generate.py
    

    NOTE: The generated dataset will reside in the folder ./dcn_parquet, which contains training and evaluation data.

  5. Write a simple Python script for training:

    # dcn_parquet_train.py
    import hugectr
    from mpi4py import MPI
    solver = hugectr.CreateSolver(max_eval_batches = 1280,
                                  batchsize_eval = 1024,
                                  batchsize = 1024,
                                  lr = 0.001,
                                  vvgpu = [[0]],
                                  repeat_dataset = True)
    reader = hugectr.DataReaderParams(data_reader_type = hugectr.DataReaderType_t.Parquet,
                                     source = ["./dcn_parquet/file_list.txt"],
                                     eval_source = "./dcn_parquet/file_list_test.txt",
                                     slot_size_array = [39884, 39043, 17289, 7420, 20263, 3, 7120, 1543, 39884, 39043, 17289, 7420, 
                                                       20263, 3, 7120, 1543, 63, 63, 39884, 39043, 17289, 7420, 20263, 3, 7120, 1543 ])
    optimizer = hugectr.CreateOptimizer(optimizer_type = hugectr.Optimizer_t.Adam,
                                        update_type = hugectr.Update_t.Global)
    model = hugectr.Model(solver, reader, optimizer)
    model.add(hugectr.Input(label_dim = 1, label_name = "label",
                            dense_dim = 13, dense_name = "dense",
                            data_reader_sparse_param_array =
                            [hugectr.DataReaderSparseParam("data1", 1, True, 26)]))
    model.add(hugectr.SparseEmbedding(embedding_type = hugectr.Embedding_t.DistributedSlotSparseEmbeddingHash,
                               workspace_size_per_gpu_in_mb = 75,
                               embedding_vec_size = 16,
                               combiner = "sum",
                               sparse_embedding_name = "sparse_embedding1",
                               bottom_name = "data1",
                               optimizer = optimizer))
    model.add(hugectr.DenseLayer(layer_type = hugectr.Layer_t.Reshape,
                               bottom_names = ["sparse_embedding1"],
                               top_names = ["reshape1"],
                               leading_dim=416))
    model.add(hugectr.DenseLayer(layer_type = hugectr.Layer_t.Concat,
                               bottom_names = ["reshape1", "dense"], top_names = ["concat1"]))
    model.add(hugectr.DenseLayer(layer_type = hugectr.Layer_t.MultiCross,
                               bottom_names = ["concat1"],
                               top_names = ["multicross1"],
                               num_layers=6))
    model.add(hugectr.DenseLayer(layer_type = hugectr.Layer_t.InnerProduct,
                               bottom_names = ["concat1"],
                               top_names = ["fc1"],
                               num_output=1024))
    model.add(hugectr.DenseLayer(layer_type = hugectr.Layer_t.ReLU,
                               bottom_names = ["fc1"],
                               top_names = ["relu1"]))
    model.add(hugectr.DenseLayer(layer_type = hugectr.Layer_t.Dropout,
                               bottom_names = ["relu1"],
                               top_names = ["dropout1"],
                               dropout_rate=0.5))
    model.add(hugectr.DenseLayer(layer_type = hugectr.Layer_t.Concat,
                               bottom_names = ["dropout1", "multicross1"],
                               top_names = ["concat2"]))
    model.add(hugectr.DenseLayer(layer_type = hugectr.Layer_t.InnerProduct,
                               bottom_names = ["concat2"],
                               top_names = ["fc2"],
                               num_output=1))
    model.add(hugectr.DenseLayer(layer_type = hugectr.Layer_t.BinaryCrossEntropyLoss,
                               bottom_names = ["fc2", "label"],
                               top_names = ["loss"]))
    model.compile()
    model.summary()
    model.graph_to_json(graph_config_file = "dcn.json")
    model.fit(max_iter = 5120, display = 200, eval_interval = 1000, snapshot = 5000, snapshot_prefix = "dcn")
    

    NOTE: Ensure that the paths to the synthetic datasets are correct with respect to this Python script. data_reader_type, check_type, label_dim, dense_dim, and data_reader_sparse_param_array should be consistent with the generated dataset.
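
The layer sizes in the training script are tied to the dataset parameters, so it can help to recompute them when you change the number of slots or the embedding size. The following sketch is plain arithmetic on the values from the script above and does not call HugeCTR. The variable names `concat1_dim` and `samples_seen` are introduced here for illustration. It assumes that the embedding layer emits one vector of `embedding_vec_size` per slot, which is what the `leading_dim` of 416 in the Reshape layer implies.

```python
# Values copied from dcn_parquet_train.py above.
num_slot = 26            # slots declared in DataReaderSparseParam
embedding_vec_size = 16  # embedding_vec_size in SparseEmbedding
dense_dim = 13           # dense_dim in Input
batchsize = 1024         # batchsize in CreateSolver
max_iter = 5120          # max_iter in model.fit

# Reshape flattens the per-slot embedding vectors into one row per sample.
leading_dim = num_slot * embedding_vec_size

# Concat joins the flattened embeddings with the dense features.
concat1_dim = leading_dim + dense_dim

# Samples consumed over the whole run (repeat_dataset = True allows this
# to exceed the dataset size).
samples_seen = batchsize * max_iter

print(f"leading_dim  = {leading_dim}")
print(f"concat1_dim  = {concat1_dim}")
print(f"samples_seen = {samples_seen}")
```

If you change `num_slot` or `embedding_vec_size`, the `leading_dim` argument of the Reshape layer has to be updated to the recomputed value.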

  6. Train the model by running the following command:
