TriCache
A User-Transparent Block Cache Enabling High-Performance Out-of-Core Processing with In-Memory Programs
TriCache Artifact Evaluation
Hardware & Software Recommendations
- CPU: 2x AMD EPYC 7742 CPUs with hyper-threading
- Memory: >= 512GB
- Testing Storage: 8x Intel P4618 DC SSDs
- Data & System Storage: Any PCI-e attached SSD without MD RAIDs
- OS: Debian 11.1 with Linux kernel 5.10
System Setup
Clone TriCache
git clone -b osdi-ae https://github.com/thu-pacman/TriCache.git
cd TriCache
# add to ~/.bashrc
export TRICACHE_ROOT=$HOME/TriCache
git submodule update --init --recursive
cd ..
Install dependency
# Tested under Debian 11.1
# Boost: 1.74.0
# Thrift: 0.13.0
# TBB: 2020.3
sudo apt install vim-nox tmux rsync wget htop numactl time \
xfsprogs psmisc
sudo apt install build-essential cmake git pkg-config libnuma-dev \
libboost-all-dev libaio-dev libhwloc-dev libatlas-base-dev \
zlib1g-dev thrift-compiler libthrift-dev libtbb-dev \
libgflags-dev openjdk-11-jdk maven gdisk kexec-tools \
python3 python3-matplotlib python3-numpy
# Install clang-13 from LLVM source
sudo apt install lsb-release software-properties-common
wget https://apt.llvm.org/llvm.sh
chmod +x llvm.sh
sudo ./llvm.sh 13
sudo apt install libomp-13-dev
# Build & install SPDK with TriCache's patch
git clone -b v21.04 https://github.com/spdk/spdk.git
cd spdk
git submodule update --init
git apply $TRICACHE_ROOT/deps/spdk.patch
sudo ./scripts/pkgdep.sh
./configure --prefix=$HOME/spdk-install --with-shared
make -j
make install -j
cd ..
# add to ~/.bashrc
export \
PKG_CONFIG_PATH=$HOME/spdk-install/lib/pkgconfig:$PKG_CONFIG_PATH
export LD_LIBRARY_PATH=$HOME/spdk-install/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$HOME/spdk/dpdk/build/lib:$LD_LIBRARY_PATH
export SPDK_ROOT=$HOME/spdk
# Test the SPDK examples (spdk_nvme_identify, spdk_nvme_perf) to verify the installation
# Install 4.14 kernel for FastMap
$TRICACHE_ROOT/install_4.14_kernel.sh
OS configurations
# Making sudo password-free
# add to /etc/sudoers
%sudo ALL=(ALL) NOPASSWD:ALL
# VM configuration for SPDK and for tuning in-memory swapping performance
# add following lines to /etc/sysctl.conf
vm.max_map_count=262144
vm.dirty_ratio=99
vm.dirty_background_ratio=99
vm.dirty_expire_centisecs=360000
vm.dirty_writeback_centisecs=360000
vm.swappiness=1
# ulimit configuration for SPDK
# add following lines to /etc/security/limits.conf
* hard memlock unlimited
* soft memlock unlimited
* soft nofile 1048576
# disable transparent_hugepage for SPDK
# edit /etc/default/grub
# add transparent_hugepage=never to GRUB_CMDLINE_LINUX
sudo update-grub
# Support key-based ssh to localhost
ssh-keygen
ssh-copy-id localhost
# test
ssh localhost
# reboot
sudo reboot
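After rebooting, the sysctl and limits settings above can be sanity-checked. A minimal sketch (`check_keys` is a hypothetical helper, not part of TriCache; it only checks that the `key=value` lines are present in a sysctl-style file):

```shell
# check_keys FILE KEY...  -- print every KEY that has no "KEY=value" line
# in FILE; return non-zero if any key is missing.
check_keys() {
  local file="$1" key missing=0
  shift
  for key in "$@"; do
    grep -q "^${key}=" "$file" || { echo "missing: $key"; missing=1; }
  done
  return "$missing"
}

# Example:
# check_keys /etc/sysctl.conf vm.max_map_count vm.dirty_ratio vm.swappiness
```

Note that this only checks the configuration file; `sysctl -n <key>` shows the value the running kernel actually uses.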
Build TriCache
cd $TRICACHE_ROOT
scripts/build.sh
Configure TriCache
Edit $TRICACHE_ROOT/scripts/config.sh to configure SSDs for TriCache.
- SSDArray lists the disk IDs of the testing disks. ls -lh /dev/disk/by-id will list all disk IDs and their block IDs (nvmeXnY).
- SSDPCIe lists the PCIe addresses of the testing disks. cat /sys/block/nvmeXnY/device/address will show the PCIe address.
- CACHE_16_SERVER_CONFIG lists the disks and cores used by TriCache. Each item is formed like server-core-id,disk-pci-address,nsid,disk-offset.
- CACHE_16_SERVER_CORES lists the cores used by TriCache.
- CACHE_32_SERVER_CONFIG and CACHE_32_SERVER_CORES are the 32-server configurations.
- All the above disks will be formatted multiple times. YOU WILL LOSE THEIR DATA!
- All the above disks will be heavily written. THEY MAY BE BROKEN!
- Guide for selecting cores:
  - Distribute the servers as evenly as possible among multiple NUMA nodes.
  - Use hyper-threading to bind servers to as few physical cores as possible while avoiding over-subscription of the servers.
  - For each server, bind it to the SSDs closest to it.
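As an illustration only, a hypothetical fragment following the item format described above (all disk IDs, PCIe addresses, core IDs, and the array syntax are made-up placeholders; consult the shipped config.sh for the exact variable names and syntax):

```shell
# Hypothetical values -- replace with your machine's disks and cores.
# Find disk IDs:        ls -lh /dev/disk/by-id
# Find PCIe addresses:  cat /sys/block/nvme0n1/device/address
# Find NUMA locality:   cat /sys/block/nvme0n1/device/numa_node
SSDArray=(nvme-INTEL_SSDPE2KE064T8_EXAMPLE0 nvme-INTEL_SSDPE2KE064T8_EXAMPLE1)
SSDPCIe=(0000:21:00.0 0000:41:00.0)
# Each item: server-core-id,disk-pci-address,nsid,disk-offset
CACHE_16_SERVER_CONFIG=(0,0000:21:00.0,1,0 64,0000:41:00.0,1,0)
CACHE_16_SERVER_CORES=(0 64)
```

Here core 64 is assumed to be the hyper-thread sibling of core 0 on a hypothetical 2x64-core machine, following the core-selection guide above.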
Test TriCache
# setup SPDK, with 4GB DMA memory
$TRICACHE_ROOT/scripts/setup_spdk.sh 4
# "hello world" testing
$TRICACHE_ROOT/scripts/test_build.sh
# it should exit without error
# Reference log:
# Init 000 at 000 CPU, 00 Node
# Init 001 at 128 CPU, 00 Node
# Init 002 at 016 CPU, 01 Node
# Init 003 at 144 CPU, 01 Node
# Init 006 at 048 CPU, 03 Node
# Init 007 at 176 CPU, 03 Node
# Init 005 at 160 CPU, 02 Node
# Init 004 at 032 CPU, 02 Node
# Init 013 at 224 CPU, 06 Node
# Init 012 at 096 CPU, 06 Node
# Init 014 at 112 CPU, 07 Node
# Init 015 at 240 CPU, 07 Node
# Init 008 at 064 CPU, 04 Node
# Init 009 at 192 CPU, 04 Node
# Init 010 at 080 CPU, 05 Node
# Init 011 at 208 CPU, 05 Node
# Deconstructing SPDK Env
# reset devices from SPDK
$TRICACHE_ROOT/scripts/reset_spdk.sh
Generating and Preprocessing Datasets
- The experiments about graph processing use the uk-2014 dataset, which is available here.
- The experiments about key-value stores use the mixgraph (prefix-dist) workload and the db_bench tool from RocksDB.
- The experiments about big-data analytics use Terasort datasets generated by teragen from Hadoop.
- The experiments about graph databases use the LDBC SNB interactive benchmark generated by LDBC SNB Datagen.
The preprocessed datasets to reproduce the experimental results are available at: https://r2.grep.top/TriCacheAEData.tar.zst or https://pacman.cs.tsinghua.edu.cn/public/TriCacheAEData.tar.zst, (MD5: 1560459277ffec508ff515dfbf931686).
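Before extracting, it is worth verifying the download against the MD5 above. A minimal sketch (`verify_md5` is a hypothetical helper; the filename is assumed from the download URL):

```shell
# verify_md5 FILE EXPECTED_MD5 -- succeed iff FILE's MD5 digest matches.
# md5sum -c reads "digest  filename" lines from stdin; --quiet suppresses
# the per-file "OK" output.
verify_md5() {
  echo "$2  $1" | md5sum -c --quiet -
}

# Usage with the hash above:
# verify_md5 TriCacheAEData.tar.zst 1560459277ffec508ff515dfbf931686
```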
# Unzip the datasets and put them into /mnt/data/TriCache
# /mnt/data/TriCache
# ├── flashgraph
# ├── ligra
# ├── livegraph
# ├── rocksdb
# └── terasort
sudo mkdir -p /mnt/data/TriCache
sudo tar --zstd -xf TriCacheAEData.tar.zst -C /mnt/data/TriCache
Reproduce Evaluations
- All the configured disks will be formatted multiple times. YOU WILL LOSE THEIR DATA!
- MAKE SURE there is no MD RAID like /dev/md*. MD RAIDS WILL BE BROKEN!
- /mnt/data/TriCache/temp, /mnt/raid, and /mnt/ssd[0-15] will be used as temporary directories; please MAKE SURE no important data is stored in these directories.
- Please execute the scripts in tmux or screen.
- We recommend plotting the figures by following the later Plot Figures section after running each part of the experiments.
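The MD RAID check can be scripted. A sketch only (`has_md_raid` is a hypothetical helper; it simply looks for md* device nodes in a directory, so pass /dev on a real machine):

```shell
# has_md_raid DEVDIR -- succeed (exit 0) iff DEVDIR contains md* entries.
# ls fails when the glob matches nothing, so its status is the answer.
has_md_raid() {
  ls "$1"/md* > /dev/null 2>&1
}

# Usage:
# has_md_raid /dev && echo "WARNING: MD RAID devices present -- stop here!"
```

`cat /proc/mdstat` gives the same information in more detail on systems with the md driver loaded.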
Make an empty directory for storing logs, such as $HOME/results, and cd into it.
mkdir -p $HOME/results
cd $HOME/results
"Quick & Small" Experiments
These experiments cover the important cases in our evaluations. We believe their results can support our claims within a short period of time.
# Run "Quick & Small" Experiments
# Taking about 7 hours
$TRICACHE_ROOT/scripts/run_all_small.sh
"One-click" Script for All Experiments
It combines all experiments (except FastMap and long-running parts) in one script. The following sections will describe how to run each experiment separately.
# Run "one-click" script
# Taking about 60 hours
$TRICACHE_ROOT/scripts/run_all.sh
Graph Processing (Section 4.1)
All-in-one
It combines all experiments (except long-running experiments) in one script.
# Taking about 11 hours 30 minutes
# It will process PageRank, WCC, BFS for uk-2014 dataset with
# Ligra (TriCache), Ligra (Swapping), and FlashGraph.
# We move 64GB, 32GB, and 16GB with Ligra (Swapping)
# to Long-running Experiments
# Logs are placed in results_ligra_cache, results_ligra_swap,
# and results_flashgraph
$TRICACHE_ROOT/scripts/run_all.sh graph_processing
One-by-one
# Ligra (TriCache)
# Taking about 4 hours 30 minutes
# Logs are placed in results_ligra_cache
$TRICACHE_ROOT/scripts/run_all.sh graph_processing ligra_cache
# Ligra (Swapping)
# Taking about 5 hours 30 minutes
# Logs are placed in results_ligra_swap
$TRICACHE_ROOT/scripts/run_all.sh graph_processing ligra_swap
# FlashGraph
# Taking about 1 hour 30 minutes
# Logs are placed in results_flashgraph
$TRICACHE_ROOT/scripts/run_all.sh graph_processing flashgraph
Long-running Experiments
This forces the long-running experiments to execute. They will take more than 16 hours, so PLEASE skip them on a first pass.
# Ligra (Swapping) Long-running Parts
# Logs are placed in results_ligra_swap
$TRICACHE_ROOT/scripts/run_all.sh graph_processing ligra_swap_slow
Key-Value Stores (RocksDB: Section 4.2)
All-in-one
# Taking about 5 hours
# It will execute mixgraph workload with db_bench of RocksDB
# PlainTable (TriCache), BlockBasedTable and PlainTable (mmap).
# We reduce the number of requests to limit the total execution time
# of the long-running cases with BlockBasedTable and PlainTable (mmap)
# Logs are placed in results_rocksdb_cache, results_rocksdb_block,
# and results_rocksdb_mmap
$TRICACHE_ROOT/scripts/run_all.sh rocksdb
One-by-one
# PlainTable (TriCache)
# Taking about 2 hours
# Logs are placed in results_rocksdb_cache
$TRICACHE_ROOT/scripts/run_all.sh rocksdb rocksdb_cache
# BlockBasedTable
# Taking about 1 hour
# Logs are placed in results_rocksdb_block
$TRICACHE_ROOT/scripts/run_all.sh rocksdb rocksdb_block
# PlainTable (mmap)
# Taking about 2 hours
# Logs are placed in results_rocksdb_mmap
$TRICACHE_ROOT/scripts/run_all.sh rocksdb rocksdb_mmap
Big-Data Analytics (Terasort: Section 4.3)
All-in-one
# Taking about 14 hours
# It will execute 150GB and 400GB Terasort with
# ShuffleSort (TriCache), GNUSort (TriCache), and Spark,
# and will execute 150GB dataset Terasort with
# ShuffleSort (Swapping), GNUSort (Swapping).
# We move the 400GB with ShuffleSort/GNUSort (Swapping)
# to Long-running Experiments
# Logs are placed in results_terasort_cache, results_terasort_swap,
# and results_terasort_spark
$TRICACHE_ROOT/scripts/run_all.sh terasort
One-by-one
# ShuffleSort (TriCache) and GNUSort (TriCache)
# Taking about 3 hours
# Logs are placed in results_terasort_cache
