TriCache
A User-Transparent Block Cache Enabling High-Performance Out-of-Core Processing with In-Memory Programs
TriCache Artifact Evaluation
Hardware & Software Recommendations
- CPU: 2x AMD EPYC 7742 CPUs with hyper-threading
- Memory: >= 512GB
- Testing Storage: 8x Intel P4618 DC SSDs
- Data & System Storage: Any PCI-e attached SSD without MD RAIDs
- OS: Debian 11.1 with Linux kernel 5.10
System Setup
Clone TriCache
git clone -b osdi-ae https://github.com/thu-pacman/TriCache.git
cd TriCache
# add to ~/.bashrc
export TRICACHE_ROOT=$HOME/TriCache
git submodule update --init --recursive
cd ..
Install dependency
# Tested under Debian 11.1
# Boost: 1.74.0
# Thrift: 0.13.0
# TBB: 2020.3
sudo apt install vim-nox tmux rsync wget htop numactl time \
xfsprogs psmisc
sudo apt install build-essential cmake git pkg-config libnuma-dev \
libboost-all-dev libaio-dev libhwloc-dev libatlas-base-dev \
zlib1g-dev thrift-compiler libthrift-dev libtbb-dev \
libgflags-dev openjdk-11-jdk maven gdisk kexec-tools \
python3 python3-matplotlib python3-numpy
# Install clang-13 from LLVM source
sudo apt install lsb-release software-properties-common
wget https://apt.llvm.org/llvm.sh
chmod +x llvm.sh
sudo ./llvm.sh 13
sudo apt install libomp-13-dev
# Build & install SPDK with TriCache's patch
git clone -b v21.04 https://github.com/spdk/spdk.git
cd spdk
git submodule update --init
git apply $TRICACHE_ROOT/deps/spdk.patch
sudo ./scripts/pkgdep.sh
./configure --prefix=$HOME/spdk-install --with-shared
make -j
make install -j
cd ..
# add to ~/.bashrc
export \
PKG_CONFIG_PATH=$HOME/spdk-install/lib/pkgconfig:$PKG_CONFIG_PATH
export LD_LIBRARY_PATH=$HOME/spdk-install/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$HOME/spdk/dpdk/build/lib:$LD_LIBRARY_PATH
export SPDK_ROOT=$HOME/spdk
# Test the SPDK examples (spdk_nvme_identify, spdk_nvme_perf) to verify the installation
# Install 4.14 kernel for FastMap
$TRICACHE_ROOT/install_4.14_kernel.sh
OS configurations
# Making sudo password-free
# add to /etc/sudoers
%sudo ALL=(ALL) NOPASSWD:ALL
# VM configuration for SPDK and for tuning in-memory swapping performance
# add following lines to /etc/sysctl.conf
vm.max_map_count=262144
vm.dirty_ratio=99
vm.dirty_background_ratio=99
vm.dirty_expire_centisecs=360000
vm.dirty_writeback_centisecs=360000
vm.swappiness=1
# ulimit configuration for SPDK
# add following lines to /etc/security/limits.conf
* hard memlock unlimited
* soft memlock unlimited
* soft nofile 1048576
# disable transparent_hugepage for SPDK
# edit /etc/default/grub
# add transparent_hugepage=never to GRUB_CMDLINE_LINUX
sudo update-grub
# Support key-based ssh to localhost
ssh-keygen
ssh-copy-id localhost
# test
ssh localhost
# reboot
sudo reboot
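After rebooting, the sysctl and limits settings above can be sanity-checked. A minimal sketch (`check_keys` is a hypothetical helper, not part of TriCache; it only checks that the `key=value` lines are present in a sysctl-style file):

```shell
# check_keys FILE KEY...  -- print every KEY that has no "KEY=value" line
# in FILE; return non-zero if any key is missing.
check_keys() {
  local file="$1" key missing=0
  shift
  for key in "$@"; do
    grep -q "^${key}=" "$file" || { echo "missing: $key"; missing=1; }
  done
  return "$missing"
}

# Example:
# check_keys /etc/sysctl.conf vm.max_map_count vm.dirty_ratio vm.swappiness
```

Note that this only checks the configuration file; `sysctl -n <key>` shows the value the running kernel actually uses.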
Build TriCache
cd $TRICACHE_ROOT
scripts/build.sh
Configure TriCache
Edit $TRICACHE_ROOT/scripts/config.sh to configure SSDs for TriCache.
- SSDArray lists the disk IDs of the testing disks. ls -lh /dev/disk/by-id will list all disk IDs and their block IDs (nvmeXnY).
- SSDPCIe lists the PCIe addresses of the testing disks. cat /sys/block/nvmeXnY/device/address will show the PCIe address.
- CACHE_16_SERVER_CONFIG lists the disks and cores used by TriCache. Each item is formed like server-core-id,disk-pci-address,nsid,disk-offset.
- CACHE_16_SERVER_CORES lists the cores used by TriCache.
- CACHE_32_SERVER_CONFIG and CACHE_32_SERVER_CORES are the 32-server configurations.
- All the above disks will be formatted multiple times. YOU WILL LOSE THEIR DATA!
- All the above disks will be heavily written. THEY MAY BE BROKEN!
- Guide for selecting cores:
  - Distribute the servers as evenly as possible among multiple NUMA nodes.
  - Use hyper-threading to bind servers to as few physical cores as possible while avoiding over-subscription of the servers.
  - For each server, bind it to the SSDs closest to it.
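As an illustration only, a hypothetical fragment following the item format described above (all disk IDs, PCIe addresses, core IDs, and the array syntax are made-up placeholders; consult the shipped config.sh for the exact variable names and syntax):

```shell
# Hypothetical values -- replace with your machine's disks and cores.
# Find disk IDs:        ls -lh /dev/disk/by-id
# Find PCIe addresses:  cat /sys/block/nvme0n1/device/address
# Find NUMA locality:   cat /sys/block/nvme0n1/device/numa_node
SSDArray=(nvme-INTEL_SSDPE2KE064T8_EXAMPLE0 nvme-INTEL_SSDPE2KE064T8_EXAMPLE1)
SSDPCIe=(0000:21:00.0 0000:41:00.0)
# Each item: server-core-id,disk-pci-address,nsid,disk-offset
CACHE_16_SERVER_CONFIG=(0,0000:21:00.0,1,0 64,0000:41:00.0,1,0)
CACHE_16_SERVER_CORES=(0 64)
```

Here core 64 is assumed to be the hyper-thread sibling of core 0 on a hypothetical 2x64-core machine, following the core-selection guide above.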
Test TriCache
# setup SPDK, with 4GB DMA memory
$TRICACHE_ROOT/scripts/setup_spdk.sh 4
# "hello world" testing
$TRICACHE_ROOT/scripts/test_build.sh
# it should exit without error
# Reference log:
# Init 000 at 000 CPU, 00 Node
# Init 001 at 128 CPU, 00 Node
# Init 002 at 016 CPU, 01 Node
# Init 003 at 144 CPU, 01 Node
# Init 006 at 048 CPU, 03 Node
# Init 007 at 176 CPU, 03 Node
# Init 005 at 160 CPU, 02 Node
# Init 004 at 032 CPU, 02 Node
# Init 013 at 224 CPU, 06 Node
# Init 012 at 096 CPU, 06 Node
# Init 014 at 112 CPU, 07 Node
# Init 015 at 240 CPU, 07 Node
# Init 008 at 064 CPU, 04 Node
# Init 009 at 192 CPU, 04 Node
# Init 010 at 080 CPU, 05 Node
# Init 011 at 208 CPU, 05 Node
# Deconstructing SPDK Env
# reset devices from SPDK
$TRICACHE_ROOT/scripts/reset_spdk.sh
Generating and Preprocessing Datasets
- The experiments about graph processing use the uk-2014 dataset, which is available here.
- The experiments about key-value stores use the mixgraph (prefix-dist) workload and the db_bench tool from RocksDB.
- The experiments about big-data analytics use Terasort datasets generated by teragen from Hadoop.
- The experiments about graph databases use the LDBC SNB interactive benchmark generated by LDBC SNB Datagen.
The preprocessed datasets to reproduce the experimental results are available at: https://r2.grep.top/TriCacheAEData.tar.zst or https://pacman.cs.tsinghua.edu.cn/public/TriCacheAEData.tar.zst, (MD5: 1560459277ffec508ff515dfbf931686).
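Before extracting, it is worth verifying the download against the MD5 above. A minimal sketch (`verify_md5` is a hypothetical helper; the filename is assumed from the download URL):

```shell
# verify_md5 FILE EXPECTED_MD5 -- succeed iff FILE's MD5 digest matches.
# md5sum -c reads "digest  filename" lines from stdin; --quiet suppresses
# the per-file "OK" output.
verify_md5() {
  echo "$2  $1" | md5sum -c --quiet -
}

# Usage with the hash above:
# verify_md5 TriCacheAEData.tar.zst 1560459277ffec508ff515dfbf931686
```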
# Unzip the datasets and put them into /mnt/data/TriCache
# /mnt/data/TriCache
# ├── flashgraph
# ├── ligra
# ├── livegraph
# ├── rocksdb
# └── terasort
sudo mkdir -p /mnt/data/TriCache
sudo tar --zstd -xf TriCacheAEData.tar.zst -C /mnt/data/TriCache
Reproduce Evaluations
- All the configured disks will be formatted multiple times. YOU WILL LOSE THEIR DATA!
- MAKE SURE there is no MD RAID like /dev/md*. MD RAIDS WILL BE BROKEN!
- /mnt/data/TriCache/temp, /mnt/raid, and /mnt/ssd[0-15] will be used as temporary directories; please MAKE SURE no important data is stored in these directories.
- Please execute the scripts in tmux or screen.
- We recommend plotting the figures by following the later Plot Figures section after running each part of the experiments.
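The MD RAID check can be scripted. A sketch only (`has_md_raid` is a hypothetical helper; it simply looks for md* device nodes in a directory, so pass /dev on a real machine):

```shell
# has_md_raid DEVDIR -- succeed (exit 0) iff DEVDIR contains md* entries.
# ls fails when the glob matches nothing, so its status is the answer.
has_md_raid() {
  ls "$1"/md* > /dev/null 2>&1
}

# Usage:
# has_md_raid /dev && echo "WARNING: MD RAID devices present -- stop here!"
```

`cat /proc/mdstat` gives the same information in more detail on systems with the md driver loaded.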
Make an empty directory for storing logs, such as $HOME/results, and cd into it.
mkdir -p $HOME/results
cd $HOME/results
"Quick & Small" Experiments
These experiments cover the important cases in our evaluations. We believe their results can support our claims within a short period of time.
# Run "Quick & Small" Experiments
# Taking about 7 hours
$TRICACHE_ROOT/scripts/run_all_small.sh
"One-click" Script for All Experiments
It combines all experiments (except FastMap and long-running parts) in one script. The following sections will describe how to run each experiment separately.
# Run "one-click" script
# Taking about 60 hours
$TRICACHE_ROOT/scripts/run_all.sh
Graph Processing (Section 4.1)
All-in-one
It combines all experiments (except long-running experiments) in one script.
# Taking about 11 hours 30 minutes
# It will process PageRank, WCC, BFS for uk-2014 dataset with
# Ligra (TriCache), Ligra (Swapping), and FlashGraph.
# We move 64GB, 32GB, and 16GB with Ligra (Swapping)
# to Long-running Experiments
# Logs are placed in results_ligra_cache, results_ligra_swap,
# and results_flashgraph
$TRICACHE_ROOT/scripts/run_all.sh graph_processing
One-by-one
# Ligra (TriCache)
# Taking about 4 hours 30 minutes
# Logs are placed in results_ligra_cache
$TRICACHE_ROOT/scripts/run_all.sh graph_processing ligra_cache
# Ligra (Swapping)
# Taking about 5 hours 30 minutes
# Logs are placed in results_ligra_swap
$TRICACHE_ROOT/scripts/run_all.sh graph_processing ligra_swap
# FlashGraph
# Taking about 1 hour 30 minutes
# Logs are placed in results_flashgraph
$TRICACHE_ROOT/scripts/run_all.sh graph_processing flashgraph
Long-running Experiments
This forces the long-running experiments to execute. They will take more than 16 hours, so PLEASE skip them on a first pass.
# Ligra (Swapping) Long-running Parts
# Logs are placed in results_ligra_swap
$TRICACHE_ROOT/scripts/run_all.sh graph_processing ligra_swap_slow
Key-Value Stores (RocksDB: Section 4.2)
All-in-one
# Taking about 5 hours
# It will execute mixgraph workload with db_bench of RocksDB
# PlainTable (TriCache), BlockBasedTable and PlainTable (mmap).
# We reduce the number of requests to limit the total execution time
# of the long-running cases with BlockBasedTable and PlainTable (mmap)
# Logs are placed in results_rocksdb_cache, results_rocksdb_block,
# and results_rocksdb_mmap
$TRICACHE_ROOT/scripts/run_all.sh rocksdb
One-by-one
# PlainTable (TriCache)
# Taking about 2 hours
# Logs are placed in results_rocksdb_cache
$TRICACHE_ROOT/scripts/run_all.sh rocksdb rocksdb_cache
# BlockBasedTable
# Taking about 1 hour
# Logs are placed in results_rocksdb_block
$TRICACHE_ROOT/scripts/run_all.sh rocksdb rocksdb_block
# PlainTable (mmap)
# Taking about 2 hours
# Logs are placed in results_rocksdb_mmap
$TRICACHE_ROOT/scripts/run_all.sh rocksdb rocksdb_mmap
Big-Data Analytics (Terasort: Section 4.3)
All-in-one
# Taking about 14 hours
# It will execute 150GB and 400GB Terasort with
# ShuffleSort (TriCache), GNUSort (TriCache), and Spark,
# and will execute 150GB dataset Terasort with
# ShuffleSort (Swapping), GNUSort (Swapping).
# We move the 400GB with ShuffleSort/GNUSort (Swapping)
# to Long-running Experiments
# Logs are placed in results_terasort_cache, results_terasort_swap,
# and results_terasort_spark
$TRICACHE_ROOT/scripts/run_all.sh terasort
One-by-one
# ShuffleSort (TriCache) and GNUSort (TriCache)
# Taking about 3 hours
# Logs are placed in results_terasort_cache
