Hwloc.jl
A Julia API for hwloc
Portable Hardware Locality (Hwloc)
Hwloc.jl is a high-level wrapper of the hwloc library. It examines the current machine's hardware topology (memories, caches, cores, etc.) and provides Julia functions to visualize and access this information conveniently.
Taken from the hwloc website:
The Portable Hardware Locality (hwloc) software package provides a portable abstraction (across OS, versions, architectures, ...) of the hierarchical topology of modern architectures, including NUMA memory nodes, sockets, shared caches, cores and simultaneous multithreading. It also gathers various system attributes such as cache and memory information as well as the locality of I/O devices such as network interfaces, InfiniBand HCAs or GPUs.
hwloc primarily aims at helping applications with gathering information about increasingly complex parallel computing platforms so as to exploit them accordingly and efficiently.
Usage
Perhaps the most important function is Hwloc.topology(), which
displays a tree structure describing the system topology. This
roughly corresponds to the output of the lstopo program (non-GUI version).
On my laptop this gives the following output:
julia> using Hwloc
julia> topology()
Machine (31.05 GB)
    Package L#0 P#0 (31.05 GB)
        NUMANode (31.05 GB)
        L3 (12.0 MB)
            L2 (1.25 MB) + L1 (48.0 kB) + Core L#0 P#0
                PU L#0 P#0
                PU L#1 P#4
            L2 (1.25 MB) + L1 (48.0 kB) + Core L#1 P#1
                PU L#2 P#1
                PU L#3 P#5
            L2 (1.25 MB) + L1 (48.0 kB) + Core L#2 P#2
                PU L#4 P#2
                PU L#5 P#6
            L2 (1.25 MB) + L1 (48.0 kB) + Core L#3 P#3
                PU L#6 P#3
                PU L#7 P#7
    HostBridge
        PCI 00:02.0 (VGA)
            GPU "renderD128"
            GPU "card0"
        PCIBridge
            PCI 01:00.0 (NVMExp)
                Block(Disk) "nvme0n1"
        PCIBridge
            PCI 72:00.0 (Network)
                Net "wlp114s0"
        PCIBridge
            PCI 73:00.0 (Other)
                Block "mmcblk0"
Often, one is only interested in a summary of this topology.
The function topology_info() provides such a compact description, which is loosely similar to the output of the hwloc-info command-line application.
julia> topology_info()
Machine: 1 (31.05 GB)
 Package: 1 (31.05 GB)
  NUMANode: 1 (31.05 GB)
   L3Cache: 1 (12.0 MB)
    L2Cache: 4 (1.25 MB)
     L1Cache: 4 (48.0 kB)
      Core: 4
       PU: 8
 Bridge: 6
  PCI_Device: 22
   OS_Device: 13
If you prefer a more verbose graphical visualization, consider using topology_graphical():
julia> topology_graphical()
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Machine (31GB total) │
│ │
│ ┌────────────────────────────────────────────────────────────────────┐ ├┤╶─┬─────┬─────────────┐ │
│ │ Package L#0 │ │ │ PCI 00:02.0 │ │
│ │ │ │ └─────────────┘ │
│ │ ┌────────────────────────────────────────────────────────────────┐ │ │ │
│ │ │ NUMANode L#0 P#0 (31GB) │ │ ├─────┼┤╶───────┬───────────────────┐ │
│ │ └────────────────────────────────────────────────────────────────┘ │ │3.9 3.9 │ PCI 01:00.0 │ │
│ │ │ │ │ │ │
│ │ ┌────────────────────────────────────────────────────────────────┐ │ │ │ ┌───────────────┐ │ │
│ │ │ L3 (12MB) │ │ │ │ │ Block nvme0n1 │ │ │
│ │ └────────────────────────────────────────────────────────────────┘ │ │ │ │ │ │ │
│ │ │ │ │ │ 953 GB │ │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │ │ └───────────────┘ │ │
│ │ │ L2 (1280KB) │ │ L2 (1280KB) │ │ L2 (1280KB) │ │ L2 (1280KB) │ │ │ └───────────────────┘ │
│ │ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘ │ │ │
│ │ │ ├─────┼┤╶───────┬──────────────────┐ │
│ │ ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────┐ │ │0.6 0.6 │ PCI 72:00.0 │ │
│ │ │ L1d (48KB) │ │ L1d (48KB) │ │ L1d (48KB) │ │ L1d (48KB) │ │ │ │ │ │
│ │ └────────────┘ └────────────┘ └────────────┘ └────────────┘ │ │ │ ┌──────────────┐ │ │
│ │ │ │ │ │ Net wlp114s0 │ │ │
│ │ ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────┐ │ │ │ └──────────────┘ │ │
│ │ │ L1i (32KB) │ │ L1i (32KB) │ │ L1i (32KB) │ │ L1i (32KB) │ │ │ └──────────────────┘ │
│ │ └────────────┘ └────────────┘ └────────────┘ └────────────┘ │ │ │
│ │ │ └─────┼┤╶───────┬───────────────┐ │
│ │ ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────┐ │ 1.0 │ Block mmcblk0 │ │
│ │ │ Core L#0 │ │ Core L#1 │ │ Core L#2 │ │ Core L#3 │ │ │ │ │
│ │ │ │ │ │ │ │ │ │ │ │ 238 GB │ │
│ │ │ ┌────────┐ │ │ ┌────────┐ │ │ ┌────────┐ │ │ ┌────────┐ │ │ └───────────────┘ │
│ │ │ │ PU L#0 │ │ │ │ PU L#2 │ │ │ │ PU L#4 │ │ │ │ PU L#6 │ │ │ │
│ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │
│ │ │ │ P#0 │ │ │ │ P#1 │ │ │ │ P#2 │ │ │ │ P#3 │ │ │ │
│ │ │ └────────┘ │ │ └────────┘ │ │ └────────┘ │ │ └────────┘ │ │ │
│ │ │ ┌────────┐ │ │ ┌────────┐ │ │ ┌────────┐ │ │ ┌────────┐ │ │ │
│ │ │ │ PU L#1 │ │ │ │ PU L#3 │ │ │ │ PU L#5 │ │ │ │ PU L#7 │ │ │ │
│ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │
│ │ │ │ P#4 │ │ │ │ P#5 │ │ │ │ P#6 │ │ │ │ P#7 │ │ │ │
│ │ │ └────────┘ │ │ └────────┘ │ │ └────────┘ │ │ └────────┘ │ │ │
│ │ └────────────┘ └────────────┘ └────────────┘ └────────────┘ │ │
│ └────────────────────────────────────────────────────────────────────┘ │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(Note that as of now this may produce colorful output on some systems.)
Obtaining particular information:
Number of cores, NUMA nodes, and sockets
Hwloc exports a few convenience functions for obtaining particularly important information,
such as the number of physical and virtual cores (i.e. processing units), NUMA nodes, and sockets / packages:
julia> num_physical_cores()
6
julia> num_virtual_cores()
12
julia> num_numa_nodes()
1
julia> num_packages()
1
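As a rough illustration of how these counts can be combined, the following sketch checks whether the machine exposes more processing units than physical cores, which usually indicates simultaneous multithreading. The variable names are illustrative only, and the ratio is meaningful only under the assumption that all cores are of the same kind.

```julia
using Hwloc

# Both functions are exported by Hwloc.jl, as shown above.
ncores = num_physical_cores()   # physical cores
npus   = num_virtual_cores()    # processing units (hardware threads)

# Illustrative derived values, not part of the Hwloc.jl API.
smt_present   = npus > ncores
pus_per_core  = npus / ncores   # average; assumes homogeneous cores

println("physical cores:   ", ncores)
println("processing units: ", npus)
println("SMT present:      ", smt_present)
println("PUs per core:     ", pus_per_core)
```

A common use of such a check is to decide whether to start one worker per physical core rather than one per processing unit.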
One may also use getinfo() to programmatically access some of the output of topology_info():
julia> getinfo()
Dict{Symbol, Int64} with 11 entries:
:Package => 1
:PU => 8
:OS_Device => 13
:Core => 4
:L3Cache => 1
:Machine => 1
:PCI_Device => 22
:L2Cache => 4
:NUMANode => 1
:Bridge => 6
:L1Cache => 4
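Since getinfo() returns an ordinary dictionary, the counts can be read with standard Dict operations. The sketch below assumes only the return type shown above; using get with a default is a defensive choice in case a given key is not present on some machine.

```julia
using Hwloc

info = getinfo()   # Dict{Symbol, Int64}, as in the output above

# Look up individual counts; fall back to 0 if a key is missing.
ncores = get(info, :Core, 0)
nl3    = get(info, :L3Cache, 0)
println("cores: ", ncores, ", L3 caches: ", nl3)

# Print all entries in a stable (alphabetical) order.
for (kind, count) in sort(collect(info); by = first)
    println(rpad(string(kind), 12), count)
end
```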
Cache properties
Assuming that multiple caches of the same level (e.g. L1) have identical properties, one can use the convenience functions cachesize() and cachelinesize() to obtain the relevant sizes in bytes:
julia> cachesize()
(L1 = 32768, L2 = 262144, L3 = 12582912)
julia> cachelinesize()
(L1 = 64, L2 = 64, L3 = 64)
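One way these values might be used is to estimate how many elements of a given type fit into a cache or a cache line. This is a minimal sketch that relies only on the named-tuple fields shown above; the derived variable names are illustrative.

```julia
using Hwloc

cs = cachesize()       # sizes in bytes, fields L1, L2, L3
cl = cachelinesize()   # cache line sizes in bytes, fields L1, L2, L3

# Illustrative estimates: how many Float64 values fit in the L1 cache
# and in a single L1 cache line.
nfloat_l1   = cs.L1 ÷ sizeof(Float64)
nfloat_line = cl.L1 ÷ sizeof(Float64)

println("Float64 values per L1 cache: ", nfloat_l1)
println("Float64 values per L1 line:  ", nfloat_line)
```

Such estimates can serve as a starting point for choosing block sizes in cache-aware loops, although the best value normally has to be confirmed by benchmarking.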
Otherwise, the following more specific functions return the size of each individual cache at a given level:
julia> @show Hwloc.l1cache_sizes();
@show Hwloc.l2cache_sizes();
@show Hwloc.l3cache_sizes();
Hwloc.l1cache_sizes() = [32768, 32768, 32768, 32768, 32768, 32768]
Hwloc.l2cache_sizes() = [262144, 262144, 262144, 262144, 262144, 262144]
Hwloc.l3cache_sizes() = [12582912]
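To check whether the assumption of identical caches per level actually holds on a given machine, one could compare the entries of these vectors. The helper is_uniform below is hypothetical and not part of Hwloc.jl; it treats an empty vector (no cache at that level) as uniform.

```julia
using Hwloc

# Hypothetical helper (not provided by Hwloc.jl): true if all entries are equal.
is_uniform(sizes) = isempty(sizes) || all(==(first(sizes)), sizes)

levels = (
    ("L1", Hwloc.l1cache_sizes()),
    ("L2", Hwloc.l2cache_sizes()),
    ("L3", Hwloc.l3cache_sizes()),
)

for (level, sizes) in levels
    println(level, ": ", length(sizes), " cache(s), uniform = ", is_uniform(sizes))
end
```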
Different kinds of CPU cores
Some systems have CPU cores of different kinds, e.g. efficiency and performance cores. With Hwloc.jl, you can query the number of different kinds and the count of CPU cores for each kind. For example, on a Mac mini M1 (4 efficiency and 4 performance cores):
julia> using Hwloc
julia> num_cpu
