HEonGPU
HEonGPU is a high-performance library that optimizes Fully Homomorphic Encryption (FHE) on GPUs. Leveraging GPU parallelism, it reduces computational load through concurrent execution. Its multi-stream architecture minimizes data transfer overhead, making it ideal for large-scale encrypted computations with reduced latency.
🚀 HEonGPU - A GPU Based Homomorphic Encryption Library
HEonGPU is a high-performance library designed to optimize Fully Homomorphic Encryption (FHE) operations on GPUs. By leveraging the parallel processing power of GPUs, it significantly reduces the computational load of FHE through concurrent execution of complex operations. Its multi-stream architecture enables efficient parallel processing and minimizes the overhead of data transfers between the CPU and GPU. These features make HEonGPU ideal for large-scale encrypted computations, offering reduced latency and improved performance.
The goal of HEonGPU is to provide:
- A high-performance framework for executing FHE schemes, specifically BFV, CKKS and TFHE, by leveraging the parallel processing capabilities of CUDA.
- A user-friendly C++ interface that requires no prior knowledge of GPU programming, with all CUDA kernels encapsulated in easy-to-use classes.
- An optimized multi-stream architecture that ensures efficient memory management and concurrent execution of encrypted computations on the GPU.
For more information about HEonGPU:
- https://eprint.iacr.org/2024/1543
- https://heongpu.readthedocs.io/
Current HEonGPU Capabilities and Schemes
<div align="center">

| Capability / Scheme            | HEonGPU   |
|:------------------------------:|:---------:|
| BFV                            | ✓         |
| CKKS                           | ✓         |
| BGV                            | Very Soon |
| TFHE                           | ✓         |
| CKKS Regular Bootstrapping     | ✓         |
| CKKS Slim Bootstrapping        | ✓         |
| CKKS Bit Bootstrapping         | ✓         |
| CKKS Gate Bootstrapping        | ✓         |
| TFHE Gate Bootstrapping        | ✓         |
| Multiparty Computation (MPC)   | ✓         |
| Collective Bootstrapping (MPC) | ✓         |

</div>

News
🚨 New Feature: Collective (Distributed) Bootstrapping
The HEonGPU library now delivers Collective Bootstrapping for both BFV and CKKS, drawing on the designs introduced by Mouchet et al. and Balle et al. A streamlined CUDA path merges share creation and re-encryption into a single launch, allowing deep multi-party workloads to keep running entirely on the GPU without pausing to reset noise.
🚨 New Scheme: TFHE (Torus Fully Homomorphic Encryption)
The HEonGPU library now includes support for the TFHE (Torus Fully Homomorphic Encryption) scheme with GPU acceleration. This enables efficient evaluation of Boolean circuits using fast gate bootstrapping and low-latency parallel execution on modern CUDA-enabled GPUs.
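To make "evaluation of Boolean circuits" concrete, here is a minimal plaintext sketch of an 8-bit ripple-carry adder built only from XOR, AND and OR gates. It is not HEonGPU code: the names `xor_gate`, `and_gate`, `or_gate`, `AddResult` and `add_u8` are hypothetical, and the gates operate on ordinary `bool` values. In a gate-bootstrapped setting each gate call would correspond to one homomorphic gate on encrypted bits, which is why the gate count matters.

```cpp
#include <cassert>
#include <cstdint>

// Plaintext stand-ins for gates (hypothetical names, not a library API).
// In a TFHE setting each call would be one homomorphic gate evaluation.
inline bool xor_gate(bool a, bool b) { return a != b; }
inline bool and_gate(bool a, bool b) { return a && b; }
inline bool or_gate(bool a, bool b) { return a || b; }

struct AddResult {
    std::uint8_t sum;   // low 8 bits of a + b
    bool carry_out;     // carry out of the most significant bit
    int gate_count;     // number of two-input gates evaluated
};

// 8-bit ripple-carry adder: one full adder (5 gates) per bit position.
AddResult add_u8(std::uint8_t a, std::uint8_t b) {
    std::uint8_t sum = 0;
    bool carry = false;
    int gates = 0;
    for (int i = 0; i < 8; ++i) {
        const bool x = ((a >> i) & 1u) != 0;
        const bool y = ((b >> i) & 1u) != 0;
        const bool p = xor_gate(x, y);      // propagate
        const bool s = xor_gate(p, carry);  // sum bit
        const bool g = and_gate(x, y);      // generate
        const bool t = and_gate(p, carry);
        carry = or_gate(g, t);              // carry into the next bit
        gates += 5;
        if (s) {
            sum = static_cast<std::uint8_t>(sum | (1u << i));
        }
    }
    assert(gates == 40);
    return {sum, carry, gates};
}
```

Calling `add_u8(200, 100)` wraps around modulo 256 and sets the carry flag. A ripple-carry design is the simplest choice; whether HEonGPU uses this or a shallower adder is not stated here.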
Currently, the implementation supports a fixed parameter set targeting 128-bit security. In upcoming releases, we plan to:
- Make the parameters fully configurable.
- Provide default parameter sets for 128-bit, 192-bit, and 256-bit security levels.
- Further optimize and accelerate the TFHE implementation with improved CUDA kernels and parallelism strategies.
- Introduce native support for homomorphic unsigned arithmetic via new types: `huint8`, `huint16`, `huint32`, `huint64`, `huint128`, and `huint256`.
<div align="center">

|                         | uint8 | uint16 | uint32 | uint64 | uint128 | uint256 |
|-------------------------|-------|--------|--------|--------|---------|---------|
| TFHE-rs                 | 31.53 | 31.54  | 31.55  | 32.03  | 33.74   | 58.32   |
| Literature              | 18.63 | 18.61  | 18.87  | 24.23  | 29.97   | 58.30   |
| HEonGPU                 | 12.72 | 12.75  | 13.60  | 15.88  | 23.10   | 38.24   |
| Speedup (vs TFHE-rs)    | 2.48× | 2.47×  | 2.32×  | 2.02×  | 1.46×   | 1.52×   |
| Speedup (vs Literature) | 1.46× | 1.46×  | 1.39×  | 1.53×  | 1.30×   | 1.52×   |

</div>

Table: Latency (ms) comparison between TFHE-rs, Literature, and HEonGPU for different bit sizes (on GPU), based on the STD128 parameter set.
All benchmarks were performed on an NVIDIA RTX 4090 GPU.
The last two rows show the speedup of HEonGPU with respect to TFHE-rs and Literature.
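The speedup rows follow directly from the latency rows: each entry is the baseline latency divided by the HEonGPU latency for the same bit size. A small sketch of that arithmetic, with the illustrative helper names `speedup` and `round2` (not part of HEonGPU):

```cpp
#include <cassert>
#include <cmath>

// Speedup of HEonGPU over a baseline: baseline latency / HEonGPU latency.
// Both latencies must use the same unit (milliseconds in the table).
double speedup(double baseline_ms, double heongpu_ms) {
    assert(heongpu_ms > 0.0);
    return baseline_ms / heongpu_ms;
}

// Round to two decimals, the precision used in the table.
double round2(double x) {
    return std::round(x * 100.0) / 100.0;
}
```

For example, `round2(speedup(31.53, 12.72))` reproduces the uint8 entry of the "Speedup (vs TFHE-rs)" row, and `round2(speedup(58.30, 38.24))` reproduces the uint256 entry of the "Speedup (vs Literature)" row.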
🚨 New Feature: Serialization Module
The new serializer module has been successfully integrated into HEonGPU. It provides high-performance serialization and deserialization of homomorphic encryption objects (Context, Secretkey, Publickey, Relinkey, Galoiskey, Plaintext, Ciphertext, etc.) in raw binary or optional Zlib-compressed formats. This enhancement enables blazing-fast disk I/O, seamless client-server transfers, and up to a 60% reduction in storage and bandwidth.
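As a rough illustration of what "raw binary" serialization means, the sketch below writes a length prefix followed by the coefficient bytes and reads them back. `ToyCiphertext`, `save` and `load` are hypothetical names invented for this example; they are not the HEonGPU serializer API, and the sketch omits the optional Zlib compression step.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical stand-in for an HE object: a flat array of 64-bit coefficients.
struct ToyCiphertext {
    std::vector<std::uint64_t> coeffs;
};

// Write the element count, then the raw coefficient bytes.
void save(const ToyCiphertext& ct, std::ostream& os) {
    const std::uint64_t n = ct.coeffs.size();
    os.write(reinterpret_cast<const char*>(&n), sizeof(n));
    os.write(reinterpret_cast<const char*>(ct.coeffs.data()),
             static_cast<std::streamsize>(n * sizeof(std::uint64_t)));
}

// Read back what save() wrote. Assumes writer and reader share endianness.
ToyCiphertext load(std::istream& is) {
    std::uint64_t n = 0;
    is.read(reinterpret_cast<char*>(&n), sizeof(n));
    assert(is.good() && "missing length prefix");
    ToyCiphertext ct;
    ct.coeffs.resize(n);
    is.read(reinterpret_cast<char*>(ct.coeffs.data()),
            static_cast<std::streamsize>(n * sizeof(std::uint64_t)));
    assert(is.good() && "truncated payload");
    return ct;
}
```

Round-tripping through a binary `std::stringstream` (or a file opened with `std::ios::binary`) should return identical coefficients. A production format would also need versioning and a fixed byte order, which this sketch leaves out.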
🚨 New Feature: RNGonGPU Integration
RNGonGPU has been successfully integrated into HEonGPU. RNGonGPU features a secure Deterministic Random Bit Generator (DRBG) designed according to the NIST Recommendation for Random Number Generation Using Deterministic Random Bit Generators (NIST SP 800-90A). This integration enhances GPU-based random number generation, ensuring both high performance and robust security.
🚨 New Application: Private Information Retrieval on GPU (PIRonGPU)
PIRonGPU is a high-performance library that enhances secure data retrieval through Private Information Retrieval (PIR) on GPUs. By modifying the SealPIR protocol with HEonGPU, it achieves rapid, confidential querying, offering an efficient and scalable solution for privacy-sensitive applications.
🚨 New Feature: Logic Operation and 3 More CKKS Bootstrapping Types
HEonGPU now provides comprehensive support for logic operations across both the BFV and CKKS encryption schemes. In addition, the latest update introduces three new CKKS Bootstrapping types: Bit Bootstrapping and Gate Bootstrapping, which operate on binary data, and Slim Bootstrapping, a method that is significantly more efficient than Regular Bootstrapping. These enhancements broaden HEonGPU’s functionality and improve its performance in managing noise and enabling efficient, secure computations on GPU platforms.
The Logic Operations supported:
- NOT, AND, NAND, OR, NOR, XOR, XNOR
3 More CKKS Bootstrapping Types:
- Slim Bootstrapping (supports only Real Numbers)
- Bit Bootstrapping (supports only Binary Numbers)
- Gate Bootstrapping (supports only Binary Numbers)
  - AND, NAND, OR, NOR, XOR, XNOR
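One common way to express these gates in an arithmetic HE scheme is as low-degree polynomials over values restricted to {0, 1}, so that each gate reduces to additions and multiplications. The plaintext sketch below shows those polynomials; the function names are illustrative, and whether HEonGPU uses exactly these formulas internally is an assumption rather than something stated here.

```cpp
#include <cassert>

// Guard used by every gate: inputs must be exactly 0 or 1.
inline void check_bit(int v) { assert(v == 0 || v == 1); }

// Each gate is a polynomial in a and b, valid only for a, b in {0, 1}.
int gate_not(int a)         { check_bit(a); return 1 - a; }
int gate_and(int a, int b)  { check_bit(a); check_bit(b); return a * b; }
int gate_nand(int a, int b) { return 1 - gate_and(a, b); }
int gate_or(int a, int b)   { check_bit(a); check_bit(b); return a + b - a * b; }
int gate_nor(int a, int b)  { return 1 - gate_or(a, b); }
int gate_xor(int a, int b)  { check_bit(a); check_bit(b); return a + b - 2 * a * b; }
int gate_xnor(int a, int b) { return 1 - gate_xor(a, b); }
```

Every two-input gate here costs one multiplication, which is why a scheme with approximate arithmetic needs a binary-aware bootstrapping step to keep results close to 0 or 1 over many gates.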
Execution times of the HEonGPU Bootstrapping Operations (on RTX 4090)
<div align="center">

| Bootstrapping Type  | N    | Slot Count | LKM | Remaining Level | Total Time | Amortized Time |
|:-------------------:|:----:|:----------:|:---:|:---------------:|:----------:|:--------------:|
| Slim Bootstrapping  | 2^16 | 2^15       | ON  | 0 Level         | 99.12 ms   | 3.02 µs        |
|                     | 2^16 | 2^15       | ON  | 2 Level         | 114.13 ms  | 3.48 µs        |
|                     | 2^16 | 2^15       | ON  | 4 Level         | 164.20 ms  | 5.01 µs        |
| Bit Bootstrapping   | 2^15 | 2^14       | OFF | 0 Level         | 33.74 ms   | 2.06 µs        |
|                     | 2^15 | 2^14       | OFF | 2 Level         | 39.36 ms   | 2.40 µs        |
|                     | 2^15 | 2^14       | OFF | 4 Level         | 46.54 ms   | 2.84 µs        |
|                     | 2^15 | 2^14       | OFF | 6 Level         | 55.66 ms   | 3.40 µs        |
|                     | 2^16 | 2^15       | OFF | 0 Level         | 86.69 ms   | 2.73 µs        |
|                     | 2^16 | 2^15       | OFF | 2 Level         | 100.72 ms  | 3.07 µs        |
|                     | 2^16 | 2^15       | OFF | 4 Level         | 115.88 ms  | 3.53 µs        |
| Gate Bootstrapping* | 2^15 | 2^14       | OFF | 0 Level         | 27.03 ms   | 1.64 µs        |
|                     | 2^16 | 2^15       | OFF | 0 Level         | 70.73 ms   | 2.16 µs        |

</div>

LKM: Less Key Mode is a bootstrapping optimization in HEonGPU. It reduces the required number of Galois keys by 30% at the cost of 15–20% performance, which is useful when GPU memory is insufficient.

*: For all gates
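The Amortized Time column appears to be the total time divided by the slot count, converted to microseconds; most rows match that relation to within rounding (for example, 99.12 ms over 2^15 slots is about 3.02 µs). A minimal sketch of the conversion, using the illustrative helper name `amortized_us`:

```cpp
#include <cassert>
#include <cstdint>

// Amortized time per slot in microseconds: total latency spread over all slots.
double amortized_us(double total_ms, std::uint64_t slot_count) {
    assert(slot_count > 0);
    return total_ms * 1000.0 / static_cast<double>(slot_count);
}
```

With `amortized_us(99.12, 1ULL << 15)` this yields roughly 3.02, and with `amortized_us(33.74, 1ULL << 14)` roughly 2.06, matching the first Slim and Bit Bootstrapping rows.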
🚨 New Feature: CKKS Regular Bootstrapping
HEonGPU now includes support for CKKS Regular Bootstrapping, enabling efficient evaluation of deep computational circuits with high precision and security. On an NVIDIA RTX 4090, it performs CKKS Regular Bootstrapping for N=65536 in under 170 ms.
🚨 New Feature: Multiparty Computation (MPC) Support
HEonGPU now includes support for Multiparty Computation (MPC) protocols, providing a secure and collaborative framework for encrypted computations. By incorporating Multiparty Homomorphic Encryption (MHE) capabilities, the library enables distributed computations with threshold encryption models such as N-out-of-N. The implementation is fully optimized for GPU environments, delivering minimal latency and maximum performance in collaborative settings.
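The N-out-of-N model rests on additive sharing: the joint secret is the sum of all parties' shares, so every party must take part in decryption. The toy sketch below shows only that additive structure on a single integer modulo a small prime. The names `Q`, `share` and `reconstruct` are hypothetical, `std::mt19937_64` is used for reproducibility and is not a cryptographic generator, and in multiparty HE each party typically samples its own share so the joint key is never assembled in one place.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Toy modulus; real schemes share polynomial keys over much larger rings.
constexpr std::uint64_t Q = 65537;

// Split `secret` into n additive shares modulo Q.
// The first n-1 shares are random; the last makes the total equal the secret.
std::vector<std::uint64_t> share(std::uint64_t secret, std::size_t n,
                                 std::mt19937_64& rng) {
    assert(n >= 1 && secret < Q);
    std::uniform_int_distribution<std::uint64_t> dist(0, Q - 1);
    std::vector<std::uint64_t> shares(n);
    std::uint64_t sum = 0;
    for (std::size_t i = 0; i + 1 < n; ++i) {
        shares[i] = dist(rng);
        sum = (sum + shares[i]) % Q;
    }
    shares[n - 1] = (secret + Q - sum) % Q;
    return shares;
}

// Summing all n shares modulo Q recovers the secret; any strict subset
// is missing at least one uniformly random term.
std::uint64_t reconstruct(const std::vector<std::uint64_t>& shares) {
    std::uint64_t sum = 0;
    for (std::uint64_t s : shares) {
        sum = (sum + s) % Q;
    }
    return sum;
}
```

Splitting a value into five shares and passing all five to `reconstruct` returns the original value, which is the property an N-out-of-N threshold scheme relies on.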
Installation
Requirements
- CMake >= 3.30.4
- GCC
- GMP
- CUDA Toolkit >= 11.4
- OpenSSL >= 1.1.0
- [
