oneAPI Deep Neural Network Library (oneDNN)

oneAPI Deep Neural Network Library (oneDNN) is an open-source, cross-platform performance library of basic building blocks for deep learning applications. The oneDNN project is part of the UXL Foundation and implements the oneAPI specification for the oneDNN component.

The library is optimized for Intel 64/AMD64 architecture based processors, Arm(R) 64-bit Architecture (AArch64)-based processors, and Intel Graphics. oneDNN has experimental support for the following architectures: NVIDIA* GPU, AMD* GPU, OpenPOWER* Power ISA (PPC64), IBMz* (s390x), and RISC-V.

oneDNN is intended for deep learning applications and framework developers interested in improving application performance on CPUs and GPUs.

Deep learning practitioners should use one of the applications enabled with oneDNN.

Documentation

  • oneDNN Developer Guide and Reference explains the programming model, supported functionality, implementation details, and includes annotated examples.
  • API Reference provides a comprehensive reference of the library API.
  • Release Notes explain the new features, performance optimizations, and improvements implemented in each version of oneDNN.

System Requirements

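oneDNN supports platforms based on the following architectures:

  • Intel 64 or AMD64
  • Arm 64-bit Architecture (AArch64)
  • OpenPOWER Power ISA (PPC64)
  • IBMz (s390x)
  • RISC-V 64-bit (RV64)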

WARNING

Power ISA (PPC64), IBMz (s390x), and RISC-V (RV64) support is experimental with limited testing validation.

The library is optimized for the following CPUs:

  • Intel 64/AMD64 architecture
    • Intel Atom(R) processor (at least Intel SSE4.1 support is required)
    • Intel Core(TM) processor (at least Intel SSE4.1 support is required)
    • Intel Xeon(R) processor E3, E5, and E7 family (formerly Sandy Bridge, Ivy Bridge, Haswell, and Broadwell)
    • Intel Xeon Scalable processor (formerly Skylake, Cascade Lake, Cooper Lake, Ice Lake, Sapphire Rapids, and Emerald Rapids)
    • Intel Xeon CPU Max Series (formerly Sapphire Rapids HBM)
    • Intel Core Ultra processors (formerly Meteor Lake, Arrow Lake, Lunar Lake, and Panther Lake)
    • Intel Xeon 6 processors (formerly Sierra Forest and Granite Rapids)
    • future Intel Core processor with Intel AVX10.2 instruction set support (code name Nova Lake)
    • future Intel Xeon processor with Intel AVX10.2 instruction set support (code name Diamond Rapids)
  • AArch64 architecture
    • Arm Neoverse(TM) N1 and V1 processors

On a CPU based on Intel 64 or on AMD64 architecture, oneDNN detects the instruction set architecture (ISA) at runtime and uses just-in-time (JIT) code generation to deploy the code optimized for the latest supported ISA. Future ISAs may have initial support in the library disabled by default and require the use of run-time controls to enable them. See CPU dispatcher control for more details.
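The dispatch rule described above can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the `IsaLevel` enumeration, the `default_cap` constant, and the `select_isa` function are hypothetical names and do not correspond to oneDNN's API or internals. The sketch only shows the idea of using the newest ISA the CPU reports, limited by a cap that a run-time control may raise.

```cpp
#include <algorithm>

// Hypothetical ISA levels, ordered from oldest to newest. Illustrative names
// only; this is not oneDNN's internal enumeration.
enum class IsaLevel { sse41, avx2, avx512, avx10_2 };

// Newest ISA that is usable without an explicit opt-in (an assumption made
// for this sketch, not a statement about the library's defaults).
constexpr IsaLevel default_cap = IsaLevel::avx512;

// Returns the ISA to generate code for: the newest level the CPU reports,
// limited by the cap. Passing a higher cap models a run-time control that
// opts in to an ISA whose support is disabled by default.
inline IsaLevel select_isa(IsaLevel detected, IsaLevel cap = default_cap) {
    return std::min(detected, cap);
}
```

With these assumed values, a CPU that reports `avx10_2` is dispatched to `avx512` unless the caller raises the cap, while a CPU that reports `avx2` is never dispatched above `avx2`. The library's real controls are the ones documented under CPU dispatcher control.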

WARNING

On macOS, applications that use oneDNN may need to request special entitlements if they use the hardened runtime. See the Linking Guide for more details.

The library is optimized for the following GPUs:

  • Intel discrete GPUs:
    • Intel Iris Xe MAX Graphics (formerly DG1)
    • Intel Arc(TM) A-Series Graphics (formerly Alchemist)
    • Intel Data Center GPU Flex Series (formerly Arctic Sound)
    • Intel Data Center GPU Max Series (formerly Ponte Vecchio)
    • Intel Arc B-Series Graphics and Intel Arc Pro B-Series Graphics (formerly Battlemage)
  • Intel Graphics integrated with:
    • 11th-14th Generation Intel Core Processors
    • Intel Graphics for Intel Core Ultra Series 1 processors (formerly Meteor Lake)
    • Intel Graphics for Intel Core Ultra Series 2 processors (formerly Arrow Lake and Lunar Lake)
    • Intel Graphics for Intel Core Ultra Series 3 processors (formerly Panther Lake)

Requirements for Building from Source

oneDNN supports systems meeting the following requirements:

  • Operating system with Intel 64/AMD64, AArch64, PPC64, or s390x architecture support
  • C++ compiler with C++11 standard support
  • CMake 3.13 or later

The following tools are required to build oneDNN documentation:

Configurations of CPU and GPU engines may introduce additional build time dependencies.

CPU Engine

oneDNN CPU engine is used to execute primitives on Intel 64/AMD64 based processors, 64-bit Arm Architecture (AArch64) processors, 64-bit Power ISA (PPC64) processors, IBMz (s390x), and compatible devices.

The CPU engine is built by default but can be disabled at build time by setting ONEDNN_CPU_RUNTIME to NONE. In this case, the GPU engine must be enabled. The CPU engine can be configured to use the OpenMP, TBB, or SYCL runtime. The following additional requirements apply:

Some implementations rely on OpenMP 4.0 SIMD extensions. For the best performance results on Intel Architecture Processors we recommend using the Intel C++ Compiler.

On a CPU based on Arm AArch64 architecture, the oneDNN CPU engine can be built with Arm Compute Library (ACL) integration. ACL is an open-source library for machine learning applications that provides AArch64-optimized implementations of core functions. This functionality currently requires ACL to be downloaded and built separately. See the Build from Source section of the Developer Guide for details. The minimum supported version of ACL is 52.4.0.
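A script that prepares a separately built ACL may want to confirm the version before configuring oneDNN. The following is a minimal sketch of such a check, assuming the version is available as a `major.minor.patch` string. The helper names `parse_version` and `meets_minimum` are hypothetical and are not part of oneDNN or ACL.

```cpp
#include <array>
#include <sstream>
#include <string>

// Parses "major.minor.patch" into three integers. Missing components are
// left at zero, so "52.4" is treated as 52.4.0.
inline std::array<int, 3> parse_version(const std::string &text) {
    std::array<int, 3> parts = {{0, 0, 0}};
    std::istringstream in(text);
    char dot = '.';
    in >> parts[0] >> dot >> parts[1] >> dot >> parts[2];
    return parts;
}

// Compares versions component by component (numerically, not as strings),
// so 52.10.0 is correctly treated as newer than 52.4.0.
inline bool meets_minimum(const std::string &found, const std::string &minimum) {
    return parse_version(found) >= parse_version(minimum);
}
```

For example, `meets_minimum("52.10.0", "52.4.0")` is true, whereas a plain string comparison would get this case wrong because the character '1' sorts before '4'.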

GPU Engine

The oneDNN GPU engine is used to execute primitives on various accelerators, including Intel integrated and discrete GPUs, NVIDIA GPUs, AMD GPUs, and other devices that support the SYCL programming language. The GPU engine is disabled in the default build configuration and can be enabled by setting the ONEDNN_GPU_RUNTIME build option to a value other than NONE. The target accelerator vendor must be selected at build time using the ONEDNN_GPU_VENDOR build option.
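The relationship between the CPU and GPU build options can be summarized in a small sketch. The option names ONEDNN_CPU_RUNTIME, ONEDNN_GPU_RUNTIME, and ONEDNN_GPU_VENDOR come from the text above. The `BuildConfig` struct, the `cmake_flags` function, and the sample values in the comments (such as `OMP`, `OCL`, or `INTEL`) are assumptions made for illustration and should be checked against the build options documentation.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical container for the three build-time choices discussed above.
struct BuildConfig {
    std::string cpu_runtime; // value for ONEDNN_CPU_RUNTIME, e.g. "OMP" or "NONE"
    std::string gpu_runtime; // value for ONEDNN_GPU_RUNTIME, e.g. "NONE" or "SYCL"
    std::string gpu_vendor;  // value for ONEDNN_GPU_VENDOR, e.g. "INTEL"
};

// Builds the CMake option string for a configuration and rejects the one
// combination the text rules out: CPU engine disabled with no GPU engine.
inline std::string cmake_flags(const BuildConfig &cfg) {
    if (cfg.cpu_runtime == "NONE" && cfg.gpu_runtime == "NONE")
        throw std::invalid_argument(
                "disabling the CPU engine requires an enabled GPU engine");

    std::string flags = "-DONEDNN_CPU_RUNTIME=" + cfg.cpu_runtime
            + " -DONEDNN_GPU_RUNTIME=" + cfg.gpu_runtime;
    // The vendor only matters when a GPU runtime is selected.
    if (cfg.gpu_runtime != "NONE")
        flags += " -DONEDNN_GPU_VENDOR=" + cfg.gpu_vendor;
    return flags;
}
```

Under these assumptions, a GPU-only SYCL build would produce `-DONEDNN_CPU_RUNTIME=NONE -DONEDNN_GPU_RUNTIME=SYCL -DONEDNN_GPU_VENDOR=INTEL`, and the resulting string could be passed to `cmake` when configuring the build.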

WARNING

Linux resets the GPU when a kernel's run time exceeds several seconds. The user can prevent this behavior by disabling hangcheck for the Intel GPU driver. Windows has a built-in timeout detection and recovery mechanism that results in similar behavior. The user can prevent this behavior by increasing the TdrDelay value.

The following additional requirements apply for Intel integrated and discrete GPUs:

  • With OpenCL(TM) runtime:
    • OpenCL SDK (with OpenCL 1.2 support)
    • Intel Graphics Driver with support for OpenCL C 2.0, Intel subgroups support, and USM extensions support
  • With SYCL runtime:
    • Intel oneAPI DPC++/C++ Compiler
    • OpenCL SDK (with OpenCL 3.0 support)
    • oneAPI Level Zero
    • Intel Graphics Driver with support for OpenCL C 2.0, Intel subgroup
