CppNet

<div align="center">
  <img src="imgs/___log2.png" alt="CppNet Logo" width="400"/>
</div>
<p align="center">
  <b>CppNet</b> is a high-performance C++17 deep learning library for building and training neural networks from scratch.<br/>
  Built on <a href="https://eigen.tuxfamily.org">Eigen</a> for fast tensor operations, <a href="https://www.openmp.org/">OpenMP</a> for CPU parallelism, and <a href="https://developer.nvidia.com/cuda-zone">CUDA</a> for GPU acceleration.
</p>
<p align="center">
  <a href="#installation"><img src="https://img.shields.io/badge/C%2B%2B-17-blue.svg" alt="C++17"/></a>
  <a href="#installation"><img src="https://img.shields.io/badge/CMake-%E2%89%A53.18-blue.svg" alt="CMake"/></a>
  <a href="LICENSE"><img src="https://img.shields.io/badge/License-MIT-green.svg" alt="MIT License"/></a>
  <a href="#gpu-acceleration"><img src="https://img.shields.io/badge/CUDA-optional-yellowgreen.svg" alt="CUDA"/></a>
  <a href="https://loqmansamani.github.io/CppNet/"><img src="https://img.shields.io/badge/Docs-Website-58a6ff.svg" alt="Website"/></a>
</p>

Features

  • High Performance — Vectorized tensor operations via Eigen, multi-threaded with OpenMP, full CUDA GPU backend for all layers, activations, losses, and optimizers.
  • Rich Layer Library — Linear, Conv2D, MaxPool2D, RNN, LSTM, GRU, Multi-Head Attention, Dropout, BatchNorm, Embedding, Residual, GlobalPool, MeanPool1D, Flatten.
  • Multiple Backends — Per-layer compute backend selection: "cpu-eigen" (Eigen contractions), "cpu" (OpenMP loops), "gpu" (CUDA kernels).
  • Complete CUDA Coverage — 41 CUDA kernel files covering all layers, activations, losses, and optimizers for end-to-end GPU training.
  • Modular Architecture — Clean separation of layers, activations, losses, optimizers, metrics, regularizations, and utilities.
  • Training Utilities — DataLoader with batching & shuffling, learning rate schedulers, early stopping callbacks, gradient clipping, model serialization.
  • Visualization — Built-in TrainingLogger for tracking metrics and exporting training history to CSV.
  • Extensible — Abstract base classes for layers, losses, and optimizers make it straightforward to add custom components.
  • Single-Header Access — #include <CppNet/CppNet.hpp> brings in the entire library.

Installation

Prerequisites

| Dependency | Version | Required |
|:-----------|:--------|:---------|
| C++ compiler (GCC, Clang, MSVC) | C++17 support | Yes |
| CMake | ≥ 3.18 | Yes |
| Eigen3 | ≥ 3.3 | Yes |
| OpenMP | any | Optional (CPU parallelism) |
| CUDA Toolkit | any | Optional (GPU acceleration) |

Build from Source

git clone https://github.com/LoqmanSamani/CppNet.git
cd CppNet
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j$(nproc)

Install System-Wide

sudo make install

This installs headers to /usr/local/include/CppNet/ and the static library to /usr/local/lib/.

Use in Your CMake Project

find_package(CppNet REQUIRED)
target_link_libraries(your_target PRIVATE CppNet::CppNet)

Quick Start

A minimal binary classification example:

#include <CppNet/CppNet.hpp>
#include <iostream>

int main() {
    // Assumes X_train and Y_train are pre-loaded feature/label tensors (not shown)
    // Define layers
    CppNet::Layers::Linear layer1(30, 64, "fc1", true, true, "cpu-eigen", "xavier");
    CppNet::Layers::Linear layer2(64, 1,  "fc2", true, true, "cpu-eigen", "xavier");
    CppNet::Activations::ReLU relu("cpu-eigen");
    CppNet::Activations::Sigmoid sigmoid;

    // Loss & optimizer
    CppNet::Losses::BinaryCrossEntropy loss_fn("mean");
    CppNet::Optimizers::Adam optimizer;
    float lr = 0.001f;

    // Training loop
    for (int epoch = 0; epoch < 100; ++epoch) {
        auto h = relu.forward(layer1.forward(X_train));
        auto pred = sigmoid.forward(layer2.forward(h));

        float loss = loss_fn.forward(pred, Y_train);
        auto grad = loss_fn.backward(pred, Y_train);

        grad = layer2.backward(sigmoid.backward(grad));
        layer1.backward(relu.backward(grad));

        layer2.step(optimizer, lr);
        layer1.step(optimizer, lr);

        std::cout << "Epoch " << epoch << " — Loss: " << loss << std::endl;
    }
    return 0;
}

API Overview

Layers

All layers inherit from CppNet::Layers::Layer and implement forward(), backward(), step(), freeze(), unfreeze(), and print_layer_info().

| Layer | Description | Key Parameters |
|:------|:------------|:---------------|
| Linear | Fully connected layer | in_size, out_size, bias, device, weight_init |
| Conv2D | 2D convolution | in_channels, out_channels, kernel_size, stride, padding |
| MaxPool2D | 2D max pooling | kernel_size, stride |
| Flatten | Reshape to 2D | — |
| RNN | Vanilla recurrent layer | input_size, hidden_size |
| LSTM | Long Short-Term Memory | input_size, hidden_size |
| GRU | Gated Recurrent Unit | input_size, hidden_size |
| MultiHeadAttention | Scaled dot-product multi-head attention | embed_dim, num_heads |
| Dropout | Dropout regularization | drop_rate |
| BatchNorm | Batch normalization | num_features |
| Embedding | Embedding lookup table | vocab_size, embed_dim |
| Residual | Residual (skip) connection wrapper | — |
| GlobalPool | Global average/max pooling | — |
| MeanPool1D | Mean pooling over sequence dimension | — |

Activations

| Activation | Function |
|:-----------|:---------|
| ReLU | $\max(0, x)$ |
| LeakyReLU | $\max(\alpha x, x)$ |
| Sigmoid | $\sigma(x) = \frac{1}{1 + e^{-x}}$ |
| Tanh | $\tanh(x)$ |
| Softmax | $\frac{e^{x_i}}{\sum_j e^{x_j}}$ |

All activations support both 2D and 4D tensor inputs and run on all three backends (cpu-eigen, cpu, gpu).
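The table's definitions map directly to scalar C++. The following is a standalone sketch of the math on plain `std::vector` (not CppNet's tensor types or backend dispatch), including the usual max-subtraction trick for a numerically stable softmax:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Element-wise activations matching the definitions in the table above.
inline float relu(float x) { return std::max(0.0f, x); }
inline float leaky_relu(float x, float alpha = 0.01f) { return std::max(alpha * x, x); }
inline float sigmoid(float x) { return 1.0f / (1.0f + std::exp(-x)); }

// Softmax over a vector; subtracting the max before exponentiating
// avoids overflow without changing the result.
std::vector<float> softmax(const std::vector<float>& x) {
    float m = *std::max_element(x.begin(), x.end());
    std::vector<float> out(x.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < x.size(); ++i) {
        out[i] = std::exp(x[i] - m);
        sum += out[i];
    }
    for (float& v : out) v /= sum;
    return out;
}
```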

Losses

| Loss | Typical Use |
|:-----|:------------|
| MSE | Regression |
| MAE | Regression |
| Huber | Robust regression |
| BinaryCrossEntropy | Binary classification |
| CategoricalCrossEntropy | Multi-class classification |
| SoftmaxCrossEntropy | Multi-class (fused softmax + CE) |

All support configurable reduction modes ("mean", "sum") and CUDA GPU acceleration.
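To illustrate what the "mean" vs. "sum" reduction modes do, here is a minimal standalone binary cross-entropy over a flat batch. This is a sketch of the math, not CppNet's implementation; the epsilon clamp guarding against log(0) is an assumption for numerical safety:

```cpp
#include <algorithm>
#include <cmath>
#include <string>
#include <vector>

// BCE per element: -(y*log(p) + (1-y)*log(1-p)), then reduced over the batch.
float binary_cross_entropy(const std::vector<float>& pred,
                           const std::vector<float>& target,
                           const std::string& reduction = "mean") {
    const float eps = 1e-7f;  // clamp predictions away from 0 and 1
    float total = 0.0f;
    for (std::size_t i = 0; i < pred.size(); ++i) {
        float p = std::min(std::max(pred[i], eps), 1.0f - eps);
        total += -(target[i] * std::log(p)
                   + (1.0f - target[i]) * std::log(1.0f - p));
    }
    // "mean" averages over the batch; "sum" returns the raw total.
    return reduction == "mean" ? total / static_cast<float>(pred.size()) : total;
}
```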

Optimizers

| Optimizer | Description |
|:----------|:------------|
| SGD | Stochastic Gradient Descent |
| Adam | Adaptive Moment Estimation (default: $\beta_1=0.9$, $\beta_2=0.999$, $\epsilon=10^{-8}$) |
| Adagrad | Adaptive gradient accumulation |
| Momentum | SGD with momentum |
| RMSProp | Root Mean Square Propagation |

All optimizers have dedicated CUDA kernels for GPU-side weight updates.
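The Adam update with the defaults listed above ($\beta_1=0.9$, $\beta_2=0.999$, $\epsilon=10^{-8}$) can be sketched per parameter as follows. This is a standalone illustration of the algorithm, not CppNet's optimizer class:

```cpp
#include <cmath>

// Per-parameter Adam state: first/second moment estimates and step count.
struct AdamState {
    float m = 0.0f;  // first moment (mean of gradients)
    float v = 0.0f;  // second moment (mean of squared gradients)
    int t = 0;       // step counter for bias correction
};

// One Adam step: update biased moments, bias-correct, apply the update.
float adam_step(float w, float grad, AdamState& s, float lr = 0.001f,
                float beta1 = 0.9f, float beta2 = 0.999f, float eps = 1e-8f) {
    ++s.t;
    s.m = beta1 * s.m + (1.0f - beta1) * grad;
    s.v = beta2 * s.v + (1.0f - beta2) * grad * grad;
    float m_hat = s.m / (1.0f - std::pow(beta1, s.t));
    float v_hat = s.v / (1.0f - std::pow(beta2, s.t));
    return w - lr * m_hat / (std::sqrt(v_hat) + eps);
}
```

On the first step the bias correction cancels the moment decay exactly, so the update magnitude is approximately `lr` regardless of gradient scale.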

Metrics

CppNet::Metrics::accuracy(predictions, targets);
CppNet::Metrics::binary_accuracy(predictions, targets, 0.5);
CppNet::Metrics::precision(predictions, targets, 0.5);
CppNet::Metrics::recall(predictions, targets, 0.5);
CppNet::Metrics::f1_score(predictions, targets, 0.5);
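The thresholded binary metrics above reduce to counts of true/false positives and negatives. Below is a standalone sketch of precision, recall, and F1 from a prediction vector; the struct and function names here are hypothetical, independent of CppNet's `Metrics` API:

```cpp
#include <cmath>
#include <vector>

struct BinaryMetrics { float precision; float recall; float f1; };

// Threshold predictions, count TP/FP/FN, derive precision/recall/F1.
BinaryMetrics binary_metrics(const std::vector<float>& pred,
                             const std::vector<float>& target,
                             float threshold = 0.5f) {
    int tp = 0, fp = 0, fn = 0;
    for (std::size_t i = 0; i < pred.size(); ++i) {
        bool p = pred[i] >= threshold;
        bool t = target[i] >= 0.5f;
        if (p && t) ++tp;
        else if (p && !t) ++fp;
        else if (!p && t) ++fn;
    }
    float precision = (tp + fp) > 0 ? static_cast<float>(tp) / (tp + fp) : 0.0f;
    float recall    = (tp + fn) > 0 ? static_cast<float>(tp) / (tp + fn) : 0.0f;
    float f1 = (precision + recall) > 0.0f
             ? 2.0f * precision * recall / (precision + recall)
             : 0.0f;
    return {precision, recall, f1};
}
```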

Regularizations

CppNet::Regularizations::l1_penalty(weights, lambda);
CppNet::Regularizations::l2_penalty(weights, lambda);
CppNet::Regularizations::elastic_net_penalty(weights, lambda, l1_ratio);
// Corresponding gradient functions: l1_gradient, l2_gradient, elastic_net_gradient
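Elastic net blends the L1 and L2 penalties via `l1_ratio`. A standalone sketch under the common convention $\lambda\,(\text{l1\_ratio}\,\|w\|_1 + (1-\text{l1\_ratio})\,\|w\|_2^2)$ follows; CppNet's exact weighting and scaling may differ:

```cpp
#include <cmath>
#include <vector>

// Elastic net penalty: l1_ratio = 1 gives pure L1, l1_ratio = 0 pure L2.
float elastic_net_penalty(const std::vector<float>& w, float lambda, float l1_ratio) {
    float l1 = 0.0f, l2 = 0.0f;
    for (float wi : w) {
        l1 += std::fabs(wi);  // sum |w|
        l2 += wi * wi;        // sum w^2
    }
    return lambda * (l1_ratio * l1 + (1.0f - l1_ratio) * l2);
}
```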

Utilities

| Utility | Description |
|:--------|:------------|
| DataLoader | Batched iteration with shuffling. Supports range-based for loops. |
| Weight Init | Xavier (uniform/normal), He (uniform/normal), constant, custom. |
| Gradient Clipping | clip_by_value() and clip_by_norm(). |
| Serialization | save_model() / load_model() for full model persistence; tensor-level binary I/O. |
| LR Schedulers | StepLR, ExponentialLR, CosineAnnealingLR. |
| Callbacks | EarlyStopping with configurable patience, delta, and mode. |
| Elapsed Time | Training duration measurement. |

DataLoader example:

CppNet::Utils::DataLoader loader(X, Y, /*batch_size=*/32, /*shuffle=*/true);
for (auto& [x_batch, y_batch] : loader) {
    // forward / backward / step
}
loader.reset(); // re-shuffle for next epoch

Learning rate scheduler example:

CppNet::Schedulers::CosineAnnealingLR scheduler(/*initial_lr=*/0.01, /*T_max=*/100);
for (int epoch = 0; epoch < 100; ++epoch) {
    float lr = scheduler.step();
    // ... train with lr
}
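Cosine annealing follows the standard schedule $\eta_t = \eta_{\min} + \tfrac{1}{2}(\eta_0 - \eta_{\min})\,(1 + \cos(\pi t / T_{\max}))$. A standalone sketch of that formula (CppNet's `CosineAnnealingLR` tracks the step internally; the free function here is an assumption for illustration):

```cpp
#include <cmath>

// Cosine annealing: decays smoothly from initial_lr to min_lr over T_max steps.
float cosine_annealing_lr(float initial_lr, int t, int T_max, float min_lr = 0.0f) {
    const float pi = 3.14159265358979f;
    return min_lr + 0.5f * (initial_lr - min_lr)
                  * (1.0f + std::cos(pi * static_cast<float>(t) / T_max));
}
```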

Visualization

CppNet::Visualizations::TrainingLogger logger;
// Inside training loop:
logger.log("train_loss", loss);
logger.log("val_accuracy", val_acc);
logger.next_epoch();
// After training:
logger.print_epoch_summary();
logger.export_csv("training_history.csv");

Examples

The examples/ directory contains complete, self-contained deep learning programs that train on synthetic data — no downloads required. Each example generates its own dataset, trains a model, and reports final metrics.

| Example | Architecture | Dataset | Key Components | Result |
|:--------|:------------|:--------|:---------------|:-------|
| mlp_classification.cpp | Linear→ReLU→Linear→ReLU→Linear | 3-class spiral (600 samples, 2D) | ReLU, SoftmaxCrossEntropy, Adam | ~75% accuracy |
| cnn_image_classification.cpp | Conv2D→ReLU→MaxPool2D→Flatt
