SkillAgentSearch skills...

Pagecache

🧩 PageCache - A high-performance, production-grade implementation of a Linux kernel-inspired page cache and buffered I/O subsystem in modern C++. This is a systems engineering prototype demonstrating core operating system memory management and I/O scheduling concepts in user space.

Install / Use

/learn @abhishekprajapatt/Pagecache
About this skill

Quality Score

0/100

Supported Platforms

Universal

README

PageCache: A User-Space OS Page Cache & I/O Subsystem

A high-performance, production-grade implementation of a Linux kernel-inspired page cache and buffered I/O subsystem in modern C++. This is a systems engineering prototype demonstrating core operating system memory management and I/O scheduling concepts in user space.

What is PageCache?

PageCache implements the memory management layer that sits between application I/O requests and physical disk storage. It transparently caches file contents in memory, eliminating redundant disk reads, managing memory pressure through intelligent eviction policies, and coordinating writes through a background writeback engine.

The system is inspired by the Linux kernel's page cache subsystem and demonstrates how production operating systems efficiently manage file-backed memory at scale.

Architecture Overview

Core Components

Page Cache - Fixed 4KB pages with reference counting, state management (Clean/Dirty/Locked), and LRU/CLOCK eviction policies. Handles cache hits/misses and automatic loading from disk.

File & Inode Layer - POSIX-style file abstraction with per-file page indexing. Multiple open file handles share the same cached pages transparently.

Buffered I/O Paths - Distinct read and write paths that integrate with the page cache. Reads serve from cache when available; writes update cached pages and mark them dirty.

Writeback Engine - Background thread that flushes dirty pages to disk based on thresholds and time. Implements fsync semantics for data consistency.

Page Eviction - Pluggable eviction strategies (LRU, CLOCK algorithm). Clean pages evicted first; dirty pages flushed before reclamation.

Readahead - Detects sequential access patterns and prefetches pages asynchronously, reducing latency for predictable workloads.

Metrics & Monitoring - Atomic counters track cache hits/misses, I/O throughput, eviction rates, and writeback activity.

Thread Pool - Concurrent I/O scheduler with thread-safe work queues for parallel page loading and writeback operations.

Key Features

  • Reference Counting & Locking - Pages track open references and can be pinned in memory. Prevents premature eviction of in-use pages.
  • Fine-Grained Concurrency - Per-file mutexes and atomic operations minimize lock contention.
  • Pluggable Eviction - Choose between LRU and CLOCK algorithms at runtime.
  • Dirty Tracking - Efficient dirty bit management with threshold-based flushing.
  • Sequential Prefetch - Automatically detects sequential access and prefetches upcoming pages.
  • POSIX Semantics - Familiar open/close/read/write/seek/sync interface.

Build & Run

Prerequisites

  • GCC 7+ or Clang 5+ (C++17 support)
  • CMake 3.16+
  • POSIX-compliant system (Linux, macOS, BSD)
  • pthread development libraries

Build

mkdir build && cd build
cmake ..
make -j$(nproc)

Run Tests

cd build
./test_page_cache
./test_eviction

Run Benchmarks

./build/benchmark

Benchmark Results

Typical performance on modern hardware (Intel Xeon, 8 cores):

Benchmark                 Throughput      Latency      Hit Ratio
========================  ==============  ===========  ===========
Sequential Read           2450.35 MB/s    4.089 us     99.200 %
Random Read                185.67 MB/s   43.752 us     78.900 %
Mixed Read/Write           875.42 MB/s   11.411 us     85.600 %

Results demonstrate:

  • Sequential reads approach hardware limits when cache-resident
  • Random workloads show reasonable hit ratios with proper cache sizing
  • Mixed workloads balance reads and writes efficiently

Code Quality

  • No external dependencies (standard library only)
  • Clean abstractions - Page, File, Inode, PageCache separate concerns clearly
  • Minimal comments - Code structure and naming convey intent
  • Thread-safe - Uses mutexes, atomics, and lock-free patterns appropriately
  • Production naming - Clear, professional variable and function names

Architecture Diagram

User Application
        |
        v
    UserAPI (PageCacheSystem singleton)
        |
    +---+---+---+----+
    |   |   |   |    |
    v   v   v   v    v
  File  Readahead Writeback Metrics
    |      |         |        |
    +------+---------+--------+
           |
           v
       PageCache
           |
      +----+-----+
      |   |     |
      v   v     v
   LRU CLOCK  Eviction
      |
      v
   Page Storage
      |
      v
   Disk I/O

Limitations & Future Work

Current Limitations:

  • No persistent write caching beyond single process
  • Eviction policies do not account for page size variations
  • Readahead is sequential-only; no adaptive window sizing
  • No compression or deduplication
  • Memory accounting is basic (no NUMA awareness)

Future Enhancements:

  • Adaptive readahead with ML prediction
  • 2Q and ARC eviction policies
  • Page compression for memory-constrained environments
  • NUMA-aware memory placement
  • Persistent journal for crash recovery
  • Network I/O support

Contributing

This is a reference implementation intended for education and systems understanding. Contributions that improve clarity, add new eviction strategies, or demonstrate performance optimizations are welcome.

License

MIT License - see LICENSE file

Citation

If you find this implementation useful for education or research, please cite:

PageCache: A User-Space OS Page Cache & I/O Subsystem
GitHub: https://github.com/abhishekprajapatt/pagecache
2025

Acknowledgments

Design inspired by:

  • Linux kernel page cache (mm/page_cache.c)
  • Andrew S. Tanenbaum's "Modern Operating Systems"
  • FreeBSD UVM memory management
  • Production systems engineering best practices

A serious systems engineering project worth starring.

Related Skills

View on GitHub
GitHub Stars23
CategoryDevelopment
Updated2mo ago
Forks0

Languages

C++

Security Score

95/100

Audited on Jan 29, 2026

No findings