Pagecache
🧩 PageCache - A high-performance, production-grade implementation of a Linux kernel-inspired page cache and buffered I/O subsystem in modern C++. This is a systems engineering prototype demonstrating core operating system memory management and I/O scheduling concepts in user space.
Install / Use
/learn @abhishekprajapatt/PagecacheREADME
PageCache: A User-Space OS Page Cache & I/O Subsystem
A high-performance, production-grade implementation of a Linux kernel-inspired page cache and buffered I/O subsystem in modern C++. This is a systems engineering prototype demonstrating core operating system memory management and I/O scheduling concepts in user space.
What is PageCache?
PageCache implements the memory management layer that sits between application I/O requests and physical disk storage. It transparently caches file contents in memory, eliminating redundant disk reads, managing memory pressure through intelligent eviction policies, and coordinating writes through a background writeback engine.
The system is inspired by the Linux kernel's page cache subsystem and demonstrates how production operating systems efficiently manage file-backed memory at scale.
Architecture Overview
Core Components
Page Cache - Fixed 4KB pages with reference counting, state management (Clean/Dirty/Locked), and LRU/CLOCK eviction policies. Handles cache hits/misses and automatic loading from disk.
File & Inode Layer - POSIX-style file abstraction with per-file page indexing. Multiple open file handles share the same cached pages transparently.
Buffered I/O Paths - Distinct read and write paths that integrate with the page cache. Reads serve from cache when available; writes update cached pages and mark them dirty.
Writeback Engine - Background thread that flushes dirty pages to disk based on thresholds and time. Implements fsync semantics for data consistency.
Page Eviction - Pluggable eviction strategies (LRU, CLOCK algorithm). Clean pages evicted first; dirty pages flushed before reclamation.
Readahead - Detects sequential access patterns and prefetches pages asynchronously, reducing latency for predictable workloads.
Metrics & Monitoring - Atomic counters track cache hits/misses, I/O throughput, eviction rates, and writeback activity.
Thread Pool - Concurrent I/O scheduler with thread-safe work queues for parallel page loading and writeback operations.
Key Features
- Reference Counting & Locking - Pages track open references and can be pinned in memory. Prevents premature eviction of in-use pages.
- Fine-Grained Concurrency - Per-file mutexes and atomic operations minimize lock contention.
- Pluggable Eviction - Choose between LRU and CLOCK algorithms at runtime.
- Dirty Tracking - Efficient dirty bit management with threshold-based flushing.
- Sequential Prefetch - Automatically detects sequential access and prefetches upcoming pages.
- POSIX Semantics - Familiar open/close/read/write/seek/sync interface.
Build & Run
Prerequisites
- GCC 7+ or Clang 5+ (C++17 support)
- CMake 3.16+
- POSIX-compliant system (Linux, macOS, BSD)
- pthread development libraries
Build
mkdir build && cd build
cmake ..
make -j$(nproc)
Run Tests
cd build
./test_page_cache
./test_eviction
Run Benchmarks
./build/benchmark
Benchmark Results
Typical performance on modern hardware (Intel Xeon, 8 cores):
Benchmark Throughput Latency Hit Ratio
======================== ============== =========== ===========
Sequential Read 2450.35 MB/s 4.089 us 99.200 %
Random Read 185.67 MB/s 43.752 us 78.900 %
Mixed Read/Write 875.42 MB/s 11.411 us 85.600 %
Results demonstrate:
- Sequential reads approach hardware limits when cache-resident
- Random workloads show reasonable hit ratios with proper cache sizing
- Mixed workloads balance reads and writes efficiently
Code Quality
- No external dependencies (standard library only)
- Clean abstractions - Page, File, Inode, PageCache separate concerns clearly
- Minimal comments - Code structure and naming convey intent
- Thread-safe - Uses mutexes, atomics, and lock-free patterns appropriately
- Production naming - Clear, professional variable and function names
Architecture Diagram
User Application
|
v
UserAPI (PageCacheSystem singleton)
|
+---+---+---+----+
| | | | |
v v v v v
File Readahead Writeback Metrics
| | | |
+------+---------+--------+
|
v
PageCache
|
+----+-----+
| | |
v v v
LRU CLOCK Eviction
|
v
Page Storage
|
v
Disk I/O
Limitations & Future Work
Current Limitations:
- No persistent write caching beyond single process
- Eviction policies do not account for page size variations
- Readahead is sequential-only; no adaptive window sizing
- No compression or deduplication
- Memory accounting is basic (no NUMA awareness)
Future Enhancements:
- Adaptive readahead with ML prediction
- 2Q and ARC eviction policies
- Page compression for memory-constrained environments
- NUMA-aware memory placement
- Persistent journal for crash recovery
- Network I/O support
Contributing
This is a reference implementation intended for education and systems understanding. Contributions that improve clarity, add new eviction strategies, or demonstrate performance optimizations are welcome.
License
MIT License - see LICENSE file
Citation
If you find this implementation useful for education or research, please cite:
PageCache: A User-Space OS Page Cache & I/O Subsystem
GitHub: https://github.com/abhishekprajapatt/pagecache
2025
Acknowledgments
Design inspired by:
- Linux kernel page cache (mm/page_cache.c)
- Andrew S. Tanenbaum's "Modern Operating Systems"
- FreeBSD UVM memory management
- Production systems engineering best practices
A serious systems engineering project worth starring. ⭐
Related Skills
node-connect
347.9kDiagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps
frontend-design
108.7kCreate distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.
openai-whisper-api
347.9kTranscribe audio via OpenAI Audio Transcriptions API (Whisper).
qqbot-media
347.9kQQBot 富媒体收发能力。使用 <qqmedia> 标签,系统根据文件扩展名自动识别类型(图片/语音/视频/文件)。
