Caissa Chess Engine

Overview
Caissa is a strong, UCI-compatible chess engine written from scratch in C++, in development since early 2021. It features a custom neural-network evaluation system trained on over 17 billion self-play positions and is rated 3600+ Elo on major chess engine rating lists, placing it around the top ten.
The engine is optimized for:
- Regular Chess - Standard chess rules
- FRC (Fischer Random Chess) - Chess960 variant
- DFRC (Double Fischer Random Chess) - Extended FRC variant
Table of Contents
- Playing Strength
- Features
- Quick Start
- Compilation
- Architecture Variants
- Custom Commands
- UCI Options
- History & Originality
- Project Structure
- License
Playing Strength
Caissa consistently ranks among the top chess engines on major rating lists:
CCRL (Computer Chess Rating Lists)
| List | Rating | Rank | Version | Notes |
|------|--------|------|---------|-------|
| CCRL 40/2 FRC | 4022 | #6 | 1.23 | Fischer Random Chess |
| CCRL Chess324 | 3770 | #6 | 1.23 | Chess324 variant |
| CCRL 40/15 | 3622 | #9 | 1.23 | 4 CPU |
| CCRL Blitz | 3755 | #10 | 1.22 | 8 CPU |
SPCC (Stefan Pohl Computer Chess)
| List | Rating | Rank | Version |
|------|--------|------|---------|
| SPCC UHO-Top15 | 3697 | #10 | Caissa 1.24 avx512 |
IpMan Chess
| List | Rating | Rank | Version | Architecture |
|------|--------|------|---------|--------------|
| 10+1 (R9-7945HX) | 3542 | #16 | 1.24 | AVX-512 |
| 10+1 (i9-7980XE) | 3526 | #14 | 1.21 | AVX-512 |
| 10+1 (i9-13700H) | 3544 | #17 | 1.22 | AVX2-BMI2 |
CEGT (Chess Engines Grand Tournament)
| List | Rating | Rank | Version |
|------|--------|------|---------|
| CEGT 40/20 | 3576 | #8 | 1.24 |
| CEGT 40/4 | 3614 | #8 | 1.22 |
| CEGT 5+3 | 3618 | #5 | 1.22 |
Note: The rankings above may be outdated.
Features
General
- ✅ UCI Protocol - Full Universal Chess Interface support
- ✅ Neural Network Evaluation - Custom NNUE-style evaluation
- ✅ Endgame Tablebases - Syzygy and Gaviota support
- ✅ Chess960 Support - Fischer Random Chess (FRC) and Double FRC
Search Algorithm
- ✅ Negamax with alpha-beta pruning
- ✅ Iterative Deepening with aspiration windows
- ✅ Principal Variation Search (PVS)
- ✅ Quiescence Search for tactical positions
- ✅ Transposition Table with large pages support
- ✅ Multi-PV Search - Analyze multiple lines simultaneously
- ✅ Multithreaded Search - Parallel search with shared TT
- ✅ Late Move Reductions (LMR)
- ✅ Null-Move Pruning
- ✅ Singular Extensions
- ✅ Correction History - Pawn and non-pawn correction tables improve static eval accuracy
- ✅ Cuckoo Hashing for fast repetition detection
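The foundation of the list above, negamax with alpha-beta pruning, can be sketched on a toy game tree. This is an illustration only, not Caissa's actual search, which layers iterative deepening, PVS, a transposition table, LMR, and the other techniques listed on top of real move generation:

```cpp
#include <algorithm>
#include <vector>

// A toy game-tree node: leaves carry a static evaluation from the
// perspective of the side to move at that node.
struct Node {
    int eval = 0;
    std::vector<Node> children;
};

// Negamax with alpha-beta pruning. Returns the score of `node` from the
// side-to-move's perspective.
int Negamax(const Node& node, int depth, int alpha, int beta) {
    if (depth == 0 || node.children.empty())
        return node.eval;

    int best = -1000000;
    for (const Node& child : node.children) {
        // Negate: a score that is good for the opponent is bad for us.
        best = std::max(best, -Negamax(child, depth - 1, -beta, -alpha));
        alpha = std::max(alpha, best);
        if (alpha >= beta)
            break; // beta cutoff: the opponent will avoid this line anyway
    }
    return best;
}
```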
Neural Network Evaluation
- Architecture: (32×768→1024)×2→1 — dual-perspective (one accumulator per king), 32 king buckets, 768 features per perspective (12 piece types × 64 squares)
- Incremental Updates - Efficiently updated first layer
- Vectorized Code - Manual SIMD optimization for:
  - AVX-512 (fastest)
  - AVX2
  - SSE2
  - ARM NEON
- Activation: Clipped-ReLU
- Variants: 8 variants of last layer weights (piece count dependent)
- Features: Absolute piece coordinates with horizontal symmetry, 32 king buckets
- Special Endgame Routines - Enhanced endgame evaluation
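The Clipped-ReLU activation simply clamps each accumulator value into a fixed range so inference can stay in small integers. A scalar sketch follows, assuming the common NNUE quantization range of 0..127; Caissa's exact scaling is not confirmed here, and the engine's real code vectorizes this loop with the SIMD intrinsics listed above:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Clipped ReLU: clamp the pre-activation into [0, 127] (assumed range).
inline int16_t ClippedReLU(int16_t x) {
    return std::clamp<int16_t>(x, 0, 127);
}

// Apply the activation to a whole accumulator. Engines implement this with
// AVX-512/AVX2/SSE2/NEON intrinsics; compilers can often auto-vectorize
// this scalar form as well.
void ActivateAccumulator(const int16_t* in, int16_t* out, size_t count) {
    for (size_t i = 0; i < count; ++i)
        out[i] = ClippedReLU(in[i]);
}
```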
Neural Network Trainer
- Custom CPU-based Trainer using Adam algorithm
- Highly Optimized - Exploits AVX instructions, multithreading, and network sparsity
- Self-Play Training - Trained on 17+ billion positions from self-generated games
- Progressive Training - Older games purged, networks trained on latest engine versions
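For reference, one Adam update step over a parameter vector can be sketched as below. The hyperparameter defaults are the common ones from the Adam paper, not Caissa's; the actual trainer is vectorized, multithreaded, and sparsity-aware:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Per-parameter Adam optimizer state.
struct AdamState {
    std::vector<double> m, v; // first and second moment estimates
    int t = 0;                // timestep, for bias correction
};

// One Adam step: params -= lr * mHat / (sqrt(vHat) + eps).
void AdamStep(std::vector<double>& params, const std::vector<double>& grad,
              AdamState& s, double lr = 0.001, double beta1 = 0.9,
              double beta2 = 0.999, double eps = 1e-8) {
    if (s.m.empty()) {
        s.m.assign(params.size(), 0.0);
        s.v.assign(params.size(), 0.0);
    }
    ++s.t;
    for (size_t i = 0; i < params.size(); ++i) {
        s.m[i] = beta1 * s.m[i] + (1.0 - beta1) * grad[i];
        s.v[i] = beta2 * s.v[i] + (1.0 - beta2) * grad[i] * grad[i];
        // Bias-corrected moment estimates.
        const double mHat = s.m[i] / (1.0 - std::pow(beta1, s.t));
        const double vHat = s.v[i] / (1.0 - std::pow(beta2, s.t));
        params[i] -= lr * mHat / (std::sqrt(vHat) + eps);
    }
}
```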
Performance Optimizations
- Magic Bitboards - Efficient move generation
- Large Pages - Transposition table uses large pages for better performance
- Node Caching - Evaluation result caching
- Accumulator Caching - Neural network accumulator caching
- NUMA Support - Memory allocation and thread pinning respect NUMA topology on multi-socket systems (Linux, requires libnuma)
- Ultra-Fast - Outstanding performance at ultra-short time controls (sub-second games)
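Magic bitboards speed up sliding-piece attack generation with precomputed perfect-hash lookup tables; a full implementation is lengthy, but the underlying bitboard technique is easy to show for a non-slider. The sketch below is not taken from Caissa's source; it computes knight attacks with shifts and file masks, assuming the little-endian rank-file mapping (bit 0 = a1, bit 63 = h8):

```cpp
#include <cstdint>

// File masks used to discard moves that would wrap around the board edge.
constexpr uint64_t kNotFileA  = 0xFEFEFEFEFEFEFEFEULL; // clears file A
constexpr uint64_t kNotFileH  = 0x7F7F7F7F7F7F7F7FULL; // clears file H
constexpr uint64_t kNotFileAB = 0xFCFCFCFCFCFCFCFCULL; // clears files A, B
constexpr uint64_t kNotFileGH = 0x3F3F3F3F3F3F3F3FULL; // clears files G, H

// All squares attacked by any knight on `knights`, computed branch-free:
// each shift is one of the eight knight offsets, masked against wraparound.
constexpr uint64_t KnightAttacks(uint64_t knights) {
    return ((knights << 17) & kNotFileA)  | ((knights << 15) & kNotFileH)  |
           ((knights << 10) & kNotFileAB) | ((knights <<  6) & kNotFileGH) |
           ((knights >> 17) & kNotFileH)  | ((knights >> 15) & kNotFileA)  |
           ((knights >> 10) & kNotFileGH) | ((knights >>  6) & kNotFileAB);
}
```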
Quick Start
Using Pre-built Binaries
1. Download the appropriate executable from the Releases page
2. Choose the version matching your CPU:
   - AVX-512: Latest Intel Xeon/AMD EPYC (fastest)
   - BMI2: Most modern CPUs (recommended)
   - AVX2: Older CPUs with AVX2 support
   - POPCNT: Older CPUs with SSE4.2
   - Legacy: Very old x64 CPUs
3. Run the engine with any UCI-compatible chess GUI
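When run from a terminal instead of a GUI, the engine can be driven by hand. A minimal exchange looks like the following (lines starting with `>` are typed; the exact `id` and `info` output shown is illustrative, while the `uciok`/`readyok`/`bestmove` handshake is mandated by the UCI protocol):

```
> uci
id name Caissa
uciok
> isready
readyok
> position startpos moves e2e4
> go depth 12
info depth 12 score cp 30 pv e7e5 ...
bestmove e7e5
```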
Running from Source
See the Compilation section below for detailed build instructions.
Compilation
Prerequisites
- C++ Compiler with C++20 support:
- GCC 10+ or Clang 12+ (Linux)
- Visual Studio 2022 (Windows)
- CMake 3.15 or later
- Make (Linux) or Visual Studio (Windows)
Linux
Using Makefile (Quick Build)
```shell
cd src
make -j$(nproc)
```
Note: This compiles the default AVX2/BMI2 version.
Using CMake (Recommended)
```shell
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Final ..
make -j$(nproc)
```
Build Configurations:
- `Final` - Production build, no asserts, maximum optimizations
- `Release` - Development build with asserts, optimizations enabled
- `Debug` - Development build with asserts, optimizations disabled
Architecture Selection:
To build for a specific architecture, set the `TARGET_ARCH` variable:
```shell
# AVX-512 (requires AVX-512 support)
cmake -DTARGET_ARCH=x64-avx512 -DCMAKE_BUILD_TYPE=Final ..

# BMI2 (recommended for modern CPUs)
cmake -DTARGET_ARCH=x64-bmi2 -DCMAKE_BUILD_TYPE=Final ..

# AVX2
cmake -DTARGET_ARCH=x64-avx2 -DCMAKE_BUILD_TYPE=Final ..

# SSE4-POPCNT
cmake -DTARGET_ARCH=x64-sse4-popcnt -DCMAKE_BUILD_TYPE=Final ..

# Legacy (fallback)
cmake -DTARGET_ARCH=x64-legacy -DCMAKE_BUILD_TYPE=Final ..
```
Windows
1. Run `GenerateVisualStudioSolution.bat` to generate the Visual Studio solution
2. Open `build_<arch>/caissa.sln` in Visual Studio 2022
3. Select the desired configuration (Debug/Release/Final)
4. Build the solution (Ctrl+Shift+B)
Note: Visual Studio 2022 is the only tested version; building through Visual Studio's built-in CMake support has not been tested.
ARM / AArch64
CMake supports two ARM targets via TARGET_ARCH:
```shell
mkdir build && cd build

# Generic AArch64 (no NEON intrinsics)
cmake -DTARGET_ARCH=aarch64 -DCMAKE_BUILD_TYPE=Final ..

# AArch64 with NEON SIMD (recommended on modern ARM hardware)
cmake -DTARGET_ARCH=aarch64-neon -DCMAKE_BUILD_TYPE=Final ..

make -j$(nproc)
```
Post-Compilation
After compilation, copy the appropriate neural network file from `data/neuralNets/` to:
- Linux: `build/bin/`
- Windows: `build\bin\x64\<Configuration>\`
Architecture Variants
| Variant | CPU Requirements | Performance | Recommended For |
|---------|-----------------|-------------|-----------------|
| AVX-512 | AVX-512 instruction set | Fastest | Latest Intel Xeon, AMD EPYC |
| BMI2 | AVX2 + BMI2 | Fast | Most modern CPUs (2015+) |
| AVX2 | AVX2 instruction set | Fast | Intel Haswell, AMD Ryzen |
| POPCNT | SSE4.2 + POPCNT | Moderate | Older CPUs (2008-2014) |
| Legacy | x64 only | Slowest | Very old x64 CPUs |
Tip: If unsure, try BMI2 first. It's supported by most modern CPUs and offers excellent performance.
Custom Commands
In addition to the standard UCI protocol, the engine supports these non-standard commands useful for development and debugging:
| Command | Description |
|---------|-------------|
| bench [depth] | Run a benchmark / smoke test |
| perft [depth] | Count legal moves to a given depth (move generation test) |
| eval | Display evaluation of the current position |
| print | Pretty-print the current board |
| scoremoves | Show move ordering scores for the current position |
| threats | Show threat information for the current position |
| ttinfo | Print transposition table statistics |
| ttprobe | Probe the transposition table for the current position |
| tbprobe | Probe tablebases for the current position |
| cacheprobe | Probe the node cache for the current position |
| printparams | Print all tunable search/eval parameters (only
