Overlaybd
Overlaybd: a block based remote image format. The storage backend of containerd/accelerated-container-image.
Overlaybd (overlay block device) is a novel layered block-level image format designed for containers and secure containers, and also applicable to virtual machines. It is an open-source implementation of the paper "DADI: Block-Level Image Service for Agile and Elastic Application Deployment", USENIX ATC'20.
<img src="https://github.com/containerd/overlaybd/blob/main/docs/assets/Scaling_up.jpg" width="400px">Scaling up Without Slowing Down: Accelerating Pod Start Time. KubeCon+CloudNativeCon Europe 2024
Overlaybd is based on PhotonLibOS, which is a high-efficiency LibOS framework.
Overlaybd has 2 core components:
- Overlaybd is a block-device based image format, providing a merged view of a sequence of block-based layers as a virtual block device. The LBA lookup algorithm employs a linearized B+ tree and AVX-512 to optimize performance, accelerating lookups by up to 10X (see Lookup Performance).
- Zfile is a compressed file format that supports seekable online decompression.
This repository is an implementation of overlaybd based on TCMU.
Overlaybd can be used as the storage backend of Accelerated Container Image, a remote container image solution that fetches image data on demand instead of downloading and unpacking the whole image before the container starts.
Thanks to the universality of block devices, overlaybd is also a widely applicable image format for most runtimes, including qemu/kvm and any other runtime supporting the block or SCSI API.
Overlaybd is a non-core sub-project of containerd.
Setup
System Requirements
Overlaybd provides virtual block devices through TCMU, so the TCMU kernel module is required. TCMU is implemented in the Linux kernel and supported by most Linux distributions.
Check and load the target_core_user module.
modprobe target_core_user
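Before loading the module, it can help to see whether it is already present. The following is a minimal sketch, assuming the module is built as a loadable module and therefore listed in /proc/modules (a TCMU driver compiled into the kernel would not appear there). The function name `tcmu_loaded` and its optional file argument are hypothetical helpers for illustration, not part of overlaybd.

```shell
#!/bin/sh
# Sketch: report whether target_core_user appears in a module listing.
# The optional argument (a /proc/modules-style file) is a hypothetical
# convenience for testing; it is not part of overlaybd.
tcmu_loaded() {
    modules_file="${1:-/proc/modules}"
    # Each line of /proc/modules starts with the module name and a space.
    grep -q '^target_core_user ' "$modules_file" 2>/dev/null
}

if tcmu_loaded; then
    echo "target_core_user is loaded"
else
    echo "target_core_user is not listed; try: sudo modprobe target_core_user"
fi
```

Loading the module still requires root privileges, so the sketch only prints a hint instead of calling modprobe itself.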
Install From RPM/DEB
You may download our RPM/DEB packages from Release and install them.
The binaries are installed to /opt/overlaybd/bin/.
Run /opt/overlaybd/bin/overlaybd-tcmu; its log is stored in /var/log/overlaybd.log.
It is better to run overlaybd-tcmu as a service so that it can be restarted after unexpected crashes.
Build From Source
Requirements
To build overlaybd from source code, the following dependencies are required:
- CMake >= 3.14
- gcc/g++ >= 7
- Libaio, libcurl, libnl3, glib2 and openssl runtime and development libraries.
  - CentOS 7/Fedora:
    sudo yum install libaio-devel libcurl-devel openssl-devel libnl3-devel libzstd-static e2fsprogs-devel
  - CentOS 8:
    sudo yum install libaio-devel libcurl-devel openssl-devel libnl3-devel libzstd-devel e2fsprogs-devel
  - Debian/Ubuntu:
    sudo apt install libcurl4-openssl-dev libssl-dev libaio-dev libnl-3-dev libnl-genl-3-dev libgflags-dev libzstd-dev libext2fs-dev pkg-config automake libtool # libgtest-dev // for test
  - Mariner/AzureLinux:
    sudo yum install libaio-devel libcurl-devel openssl-devel libnl3-devel e2fsprogs-devel glibc-devel libzstd-devel binutils ca-certificates-microsoft build-essential
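The version minimums above (CMake >= 3.14, gcc/g++ >= 7) can be checked before starting a build. The following is a minimal sketch, assuming `sort -V` (version sort, as in GNU coreutils) is available. `version_ge` and `check_tool` are hypothetical helper names introduced here for illustration.

```shell
#!/bin/sh
# Sketch: compare installed tool versions against the documented minimums.
# version_ge and check_tool are hypothetical helpers, not part of overlaybd.

# version_ge A B: succeed when version A >= version B.
# Relies on `sort -V` (version sort) to order the two strings.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_tool NAME FOUND_VERSION MIN_VERSION: print a one-line verdict.
check_tool() {
    if version_ge "$2" "$3"; then
        echo "$1 $2: ok (>= $3)"
    else
        echo "$1 $2: too old (need >= $3)"
    fi
}

# Only query tools that are actually installed.
if command -v cmake >/dev/null 2>&1; then
    # The first line of `cmake --version` looks like "cmake version 3.22.1".
    check_tool cmake "$(cmake --version | head -n 1 | awk '{print $3}')" 3.14
fi
if command -v gcc >/dev/null 2>&1; then
    check_tool gcc "$(gcc -dumpversion)" 7
fi
```

A plain string comparison would rank 3.9 above 3.14, which is why the sketch uses a version-aware sort.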
Build
You need git to check out the source code:
git clone https://github.com/containerd/overlaybd.git
cd overlaybd
git submodule update --init
The whole project is managed by CMake. Binaries and resource files will be installed to /opt/overlaybd/.
mkdir build
cd build
cmake .. # -DCMAKE_BUILD_TYPE=Debug -DCMAKE_EXPORT_COMPILE_COMMANDS=true -DBUILD_TESTING=true
make -j
sudo make install
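After installation, a quick check that the daemon binary is in place can catch a failed or partial install. This is a minimal sketch, assuming a source build uses the same /opt/overlaybd/bin/ layout that the RPM/DEB packages use. The function `overlaybd_installed` and its optional prefix argument are hypothetical.

```shell
#!/bin/sh
# Sketch: verify that overlaybd-tcmu exists and is executable.
# overlaybd_installed is a hypothetical helper; the bin/ layout for a
# source build is an assumption based on the packaged install path.
overlaybd_installed() {
    prefix="${1:-/opt/overlaybd}"
    [ -f "$prefix/bin/overlaybd-tcmu" ] && [ -x "$prefix/bin/overlaybd-tcmu" ]
}

if overlaybd_installed; then
    echo "found /opt/overlaybd/bin/overlaybd-tcmu"
else
    echo "overlaybd-tcmu not found under /opt/overlaybd/bin"
fi
```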
Some versions of libcurl and openssl have incompatible API changes. To build known-compatible versions of libcurl and openssl from source and link them into the executables as static libraries, use the option below.
Note that building libcurl and openssl depends on autoconf, automake and libtool.
cmake -D BUILD_CURL_FROM_SOURCE=1 ..
To use the original libext2fs instead of our customized libext2fs:
cmake -D ORIGIN_EXT2FS=1 ..
For more information about ORIGIN_EXT2FS, see USERSPACE_CONVERTOR.
To use DSA hardware to accelerate CRC calculation:
cmake -D ENABLE_DSA=1 ..
To use AVX-512 to accelerate CRC calculation:
cmake -D ENABLE_ISAL=1 ..
To use QAT to accelerate compression/decompression:
cmake -D ENABLE_QAT=1 ..
For more information, see overlaybd/src/overlaybd/zfile/README.md.
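Since ENABLE_ISAL relies on AVX-512, it may be worth confirming that the build host advertises AVX-512 before enabling it. The following is a rough sketch that looks for the `avx512f` (foundation) flag in /proc/cpuinfo. Which AVX-512 subsets the option actually needs is not stated here, so treating `avx512f` as the indicator is an assumption. The function name `has_avx512f` and its optional file argument are hypothetical.

```shell
#!/bin/sh
# Sketch: detect the avx512f CPU flag from a /proc/cpuinfo-style file.
# has_avx512f is a hypothetical helper; checking only avx512f is an
# assumption, not a documented requirement of ENABLE_ISAL.
has_avx512f() {
    cpuinfo_file="${1:-/proc/cpuinfo}"
    # Split the "flags" lines into one token per line, then match exactly.
    grep '^flags' "$cpuinfo_file" 2>/dev/null | tr ' \t' '\n\n' | grep -qx 'avx512f'
}

if has_avx512f; then
    echo "avx512f present: cmake -D ENABLE_ISAL=1 .. is worth trying"
else
    echo "avx512f not reported by this CPU"
fi
```

Matching whole tokens avoids false positives from longer flag names such as `avx512f_extra`, which a plain substring search would accept.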
Finally, set up a systemd service for the overlaybd-tcmu backstore.
sudo systemctl enable /opt/overlaybd/overlaybd-tcmu.service
sudo systemctl start overlaybd-tcmu
Configuration
overlaybd config
The default configuration file overlaybd.json is installed to /etc/overlaybd/.
{
    "logConfig": {
        "logLevel": 1,
        "logPath": "/var/log/overlaybd.log"
    },
    "cacheConfig": {
        "cacheType": "file",
        "cacheDir": "/opt/overlaybd/registry_cache",
        "cacheSizeGB": 4
    },
    "gzipCacheConfig": {
        "enable": true,
        "cacheDir": "/opt/overlaybd/gzip_cache",
        "cacheSizeGB": 4
    },
    "credentialConfig": {
        "mode": "file",
        "path": "/opt/overlaybd/cred.json"
    },
    "ioEngine": 0,
    "download": {
        "enable": true,
        "delay": 600,
        "delayExtra": 30,
        "maxMBps": 100
    },
    "p2pConfig": {
        "enable": false,
        "address": "localhost:19145/dadip2p"
    },
    "exporterConfig": {
        "enable": false,
        "uriPrefix": "/metrics",
        "port": 9863,
        "updateInterval": 60000000
    },
    "enableAudit": true,
    "auditPath": "/var/log/overlaybd-audit.log",
    "serviceConfig": {
        "enable": false,
        "address": "http://127.0.0.1:9862"
    }
}
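Several fields in this file are numeric codes or timing values that are easy to misread. The sketch below maps logLevel and ioEngine to the names documented in the field table, and estimates when background downloading starts from download.delay and download.delayExtra. All three function names are hypothetical, and the assumption that the random extra delay falls between 0 and delayExtra seconds is mine, not something the field descriptions state.

```shell
#!/bin/sh
# Sketch: interpret a few overlaybd.json values. All helpers are hypothetical.

# log_level_name N: name for logConfig.logLevel (0 DEBUG .. 3 ERROR).
log_level_name() {
    case "$1" in
        0) echo DEBUG ;;
        1) echo INFO ;;
        2) echo WARN ;;
        3) echo ERROR ;;
        *) echo UNKNOWN; return 1 ;;
    esac
}

# io_engine_name N: name for ioEngine (0 psync, 1 libaio, 2 posix aio).
io_engine_name() {
    case "$1" in
        0) echo psync ;;
        1) echo libaio ;;
        2) echo "posix aio" ;;
        *) echo unknown; return 1 ;;
    esac
}

# download_window DELAY DELAY_EXTRA: print the earliest and latest start,
# in seconds after device launch, ASSUMING the extra delay is in [0, DELAY_EXTRA].
download_window() {
    echo "$1 $(( $1 + $2 ))"
}

# Values taken from the sample configuration above.
echo "logLevel 1 means $(log_level_name 1)"
echo "ioEngine 0 means $(io_engine_name 0)"
echo "download starts in window (s): $(download_window 600 30)"
```

With the sample values, the log level is INFO, local files are opened with psync, and under the stated assumption the background download begins between 600 and 630 seconds after the device is launched.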
| Field | Description |
|---------------------|-------------------------------------------------------------------------------------------------------|
| logConfig.logLevel | The log level for log file, 0 - DEBUG, 1 - INFO, 2 - WARN, 3 - ERROR |
| logConfig.logPath | The path for log file, /var/log/overlaybd.log is the default value. |
| logConfig.logSizeMB | The size limit for log file, in MB, 10 is default (10 MB). |
| logConfig.logRotateNum | The rotate number for log file, 3 is default. |
| ioEngine | IO engine used to open local files: 0 - psync, 1 - libaio, 2 - posix aio. |
| cacheConfig.cacheType | Cache type used; file, ocf and download are supported. |
| cacheConfig.cacheDir | The cache directory for remote image data. |
| cacheConfig.cacheSizeGB | The max size of cache, in GB. |
| cacheConfig.refillSize | The refill size from source, in bytes. 262144 is default (256 KB). |
| gzipCacheConfig.enable | Whether decompressed gzip file cache is enabled or not. |
| gzipCacheConfig.cacheDir | The cache directory for decompressed gzip data. |
| gzipCacheConfig.cacheSizeGB | The max size of cache, in GB. |
| gzipCacheConfig.refillSize | The refill size from source, in bytes. 262144 is default (256 KB). |
| credentialFilePath(legacy) | The credential used for fetching images on registry. /opt/overlaybd/cred.json is the default value. |
| credentialConfig.mode | Authentication mode for lazy-loading. <br> - file means reading the credential from credentialConfig.path. <br> - http means sending an HTTP request to credentialConfig.path |
| credentialConfig.path | Credential file path or URL, depending on mode. |
| download.enable | Whether background downloading is enabled or not. |
| download.delay | The number of seconds to wait before starting the download task after the overlaybd device is launched. |
| download.delayExtra | A random extra delay added to delay, so that too many tasks do not start at the same time. |
| download.maxMBps | The speed limit in MB/s for a downloading task. |
| download.blockSize | The download block size from source, in bytes. 262144 is default (256 KB). |
| p2pConfig.enable | Whether p2p proxy is enabled or not. |
| p2pConfig.address | The proxy for p2p download, the format is localhost:<P2PConfig.Port>/<P2PConfig.APIKey>, depending on dadip2p.yaml |
| exporterConfig.enable | Whether to create a server that exposes Prometheus metrics. |
| exporterConfig.uriPrefix | URI prefix for exported metrics. |
| exporterConfig.port | port for http server to
