Caissa Chess Engine

Overview
Caissa is a strong, UCI-compatible chess engine written from scratch in C++, in development since early 2021. It features a custom neural-network evaluation system trained on over 17 billion self-play positions and is rated 3600+ Elo on major chess engine rating lists, placing it around the top ten.
The engine is optimized for:
- Regular Chess - Standard chess rules
- FRC (Fischer Random Chess) - Chess960 variant
- DFRC (Double Fischer Random Chess) - Extended FRC variant
Table of Contents
- Playing Strength
- Features
- Quick Start
- Compilation
- Architecture Variants
- Custom Commands
- UCI Options
- History & Originality
- Project Structure
- License
Playing Strength
Caissa consistently ranks among the top chess engines on major rating lists:
CCRL (Computer Chess Rating Lists)
| List | Rating | Rank | Version | Notes |
|------|--------|------|---------|-------|
| CCRL 40/2 FRC | 4022 | #6 | 1.23 | Fischer Random Chess |
| CCRL Chess324 | 3770 | #6 | 1.23 | Chess324 variant |
| CCRL 40/15 | 3622 | #9 | 1.23 | 4 CPU |
| CCRL Blitz | 3755 | #10 | 1.22 | 8 CPU |
SPCC (Stefan Pohl Computer Chess)
| List | Rating | Rank | Version |
|------|--------|------|---------|
| SPCC UHO-Top15 | 3697 | #10 | Caissa 1.24 avx512 |
IpMan Chess
| List | Rating | Rank | Version | Architecture |
|------|--------|------|---------|--------------|
| 10+1 (R9-7945HX) | 3542 | #16 | 1.24 | AVX-512 |
| 10+1 (i9-7980XE) | 3526 | #14 | 1.21 | AVX-512 |
| 10+1 (i9-13700H) | 3544 | #17 | 1.22 | AVX2-BMI2 |
CEGT (Chess Engines Grand Tournament)
| List | Rating | Rank | Version |
|------|--------|------|---------|
| CEGT 40/20 | 3576 | #8 | 1.24 |
| CEGT 40/4 | 3614 | #8 | 1.22 |
| CEGT 5+3 | 3618 | #5 | 1.22 |
Note: The rankings above may be outdated.
Features
General
- ✅ UCI Protocol - Full Universal Chess Interface support
- ✅ Neural Network Evaluation - Custom NNUE-style evaluation
- ✅ Endgame Tablebases - Syzygy and Gaviota support
- ✅ Chess960 Support - Fischer Random Chess (FRC) and Double FRC
Search Algorithm
- ✅ Negamax with alpha-beta pruning
- ✅ Iterative Deepening with aspiration windows
- ✅ Principal Variation Search (PVS)
- ✅ Quiescence Search for tactical positions
- ✅ Transposition Table with large pages support
- ✅ Multi-PV Search - Analyze multiple lines simultaneously
- ✅ Multithreaded Search - Parallel search with shared TT
- ✅ Late Move Reductions (LMR)
- ✅ Null-Move Pruning
- ✅ Singular Extensions
- ✅ Correction History - Pawn and non-pawn correction tables improve static eval accuracy
- ✅ Cuckoo Hashing for fast repetition detection
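The foundation of the list above, negamax with alpha-beta pruning, can be sketched on a toy game tree. This is an illustration only, not Caissa's actual search, which layers iterative deepening, PVS, a transposition table, LMR, and the other techniques listed on top of real move generation:

```cpp
#include <algorithm>
#include <vector>

// A toy game-tree node: leaves carry a static evaluation from the
// perspective of the side to move at that node.
struct Node {
    int eval = 0;
    std::vector<Node> children;
};

// Negamax with alpha-beta pruning. Returns the score of `node` from the
// side-to-move's perspective.
int Negamax(const Node& node, int depth, int alpha, int beta) {
    if (depth == 0 || node.children.empty())
        return node.eval;

    int best = -1000000;
    for (const Node& child : node.children) {
        // Negate: a score that is good for the opponent is bad for us.
        best = std::max(best, -Negamax(child, depth - 1, -beta, -alpha));
        alpha = std::max(alpha, best);
        if (alpha >= beta)
            break; // beta cutoff: the opponent will avoid this line anyway
    }
    return best;
}
```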
Neural Network Evaluation
- Architecture: (32×768→1024)×2→1 — dual-perspective (one accumulator per king), 32 king buckets, 768 features per perspective (12 piece types × 64 squares)
- Incremental Updates - Efficiently updated first layer
- Vectorized Code - Manual SIMD optimization for:
  - AVX-512 (fastest)
  - AVX2
  - SSE2
  - ARM NEON
- Activation: Clipped-ReLU
- Variants: 8 variants of last layer weights (piece count dependent)
- Features: Absolute piece coordinates with horizontal symmetry, 32 king buckets
- Special Endgame Routines - Enhanced endgame evaluation
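The Clipped-ReLU activation simply clamps each accumulator value into a fixed range so inference can stay in small integers. A scalar sketch follows, assuming the common NNUE quantization range of 0..127; Caissa's exact scaling is not confirmed here, and the engine's real code vectorizes this loop with the SIMD intrinsics listed above:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Clipped ReLU: clamp the pre-activation into [0, 127] (assumed range).
inline int16_t ClippedReLU(int16_t x) {
    return std::clamp<int16_t>(x, 0, 127);
}

// Apply the activation to a whole accumulator. Engines implement this with
// AVX-512/AVX2/SSE2/NEON intrinsics; compilers can often auto-vectorize
// this scalar form as well.
void ActivateAccumulator(const int16_t* in, int16_t* out, size_t count) {
    for (size_t i = 0; i < count; ++i)
        out[i] = ClippedReLU(in[i]);
}
```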
Neural Network Trainer
- Custom CPU-based Trainer using Adam algorithm
- Highly Optimized - Exploits AVX instructions, multithreading, and network sparsity
- Self-Play Training - Trained on 17+ billion positions from self-generated games
- Progressive Training - Older games purged, networks trained on latest engine versions
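For reference, one Adam update step over a parameter vector can be sketched as below. The hyperparameter defaults are the common ones from the Adam paper, not Caissa's; the actual trainer is vectorized, multithreaded, and sparsity-aware:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Per-parameter Adam optimizer state.
struct AdamState {
    std::vector<double> m, v; // first and second moment estimates
    int t = 0;                // timestep, for bias correction
};

// One Adam step: params -= lr * mHat / (sqrt(vHat) + eps).
void AdamStep(std::vector<double>& params, const std::vector<double>& grad,
              AdamState& s, double lr = 0.001, double beta1 = 0.9,
              double beta2 = 0.999, double eps = 1e-8) {
    if (s.m.empty()) {
        s.m.assign(params.size(), 0.0);
        s.v.assign(params.size(), 0.0);
    }
    ++s.t;
    for (size_t i = 0; i < params.size(); ++i) {
        s.m[i] = beta1 * s.m[i] + (1.0 - beta1) * grad[i];
        s.v[i] = beta2 * s.v[i] + (1.0 - beta2) * grad[i] * grad[i];
        // Bias-corrected moment estimates.
        const double mHat = s.m[i] / (1.0 - std::pow(beta1, s.t));
        const double vHat = s.v[i] / (1.0 - std::pow(beta2, s.t));
        params[i] -= lr * mHat / (std::sqrt(vHat) + eps);
    }
}
```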
Performance Optimizations
- Magic Bitboards - Efficient move generation
- Large Pages - Transposition table uses large pages for better performance
- Node Caching - Evaluation result caching
- Accumulator Caching - Neural network accumulator caching
- NUMA Support - Memory allocation and thread pinning respect NUMA topology on multi-socket systems (Linux, requires libnuma)
- Ultra-Fast - Outstanding performance at ultra-short time controls (sub-second games)
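Magic bitboards speed up sliding-piece attack generation with precomputed perfect-hash lookup tables; a full implementation is lengthy, but the underlying bitboard technique is easy to show for a non-slider. The sketch below is not taken from Caissa's source; it computes knight attacks with shifts and file masks, assuming the little-endian rank-file mapping (bit 0 = a1, bit 63 = h8):

```cpp
#include <cstdint>

// File masks used to discard moves that would wrap around the board edge.
constexpr uint64_t kNotFileA  = 0xFEFEFEFEFEFEFEFEULL; // clears file A
constexpr uint64_t kNotFileH  = 0x7F7F7F7F7F7F7F7FULL; // clears file H
constexpr uint64_t kNotFileAB = 0xFCFCFCFCFCFCFCFCULL; // clears files A, B
constexpr uint64_t kNotFileGH = 0x3F3F3F3F3F3F3F3FULL; // clears files G, H

// All squares attacked by any knight on `knights`, computed branch-free:
// each shift is one of the eight knight offsets, masked against wraparound.
constexpr uint64_t KnightAttacks(uint64_t knights) {
    return ((knights << 17) & kNotFileA)  | ((knights << 15) & kNotFileH)  |
           ((knights << 10) & kNotFileAB) | ((knights <<  6) & kNotFileGH) |
           ((knights >> 17) & kNotFileH)  | ((knights >> 15) & kNotFileA)  |
           ((knights >> 10) & kNotFileGH) | ((knights >>  6) & kNotFileAB);
}
```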
Quick Start
Using Pre-built Binaries
1. Download the appropriate executable from the Releases page
2. Choose the version matching your CPU:
   - AVX-512: Latest Intel Xeon/AMD EPYC (fastest)
   - BMI2: Most modern CPUs (recommended)
   - AVX2: Older CPUs with AVX2 support
   - POPCNT: Older CPUs with SSE4.2
   - Legacy: Very old x64 CPUs
3. Run the engine with any UCI-compatible chess GUI
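When run from a terminal instead of a GUI, the engine can be driven by hand. A minimal exchange looks like the following (lines starting with `>` are typed; the exact `id` and `info` output shown is illustrative, while the `uciok`/`readyok`/`bestmove` handshake is mandated by the UCI protocol):

```
> uci
id name Caissa
uciok
> isready
readyok
> position startpos moves e2e4
> go depth 12
info depth 12 score cp 30 pv e7e5 ...
bestmove e7e5
```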
Running from Source
See the Compilation section below for detailed build instructions.
Compilation
Prerequisites
- C++ Compiler with C++20 support:
- GCC 10+ or Clang 12+ (Linux)
- Visual Studio 2022 (Windows)
- CMake 3.15 or later
- Make (Linux) or Visual Studio (Windows)
Linux
Using Makefile (Quick Build)
```shell
cd src
make -j$(nproc)
```
Note: This compiles the default AVX2/BMI2 version.
Using CMake (Recommended)
```shell
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Final ..
make -j$(nproc)
```
Build Configurations:
- `Final` - Production build, no asserts, maximum optimizations
- `Release` - Development build with asserts, optimizations enabled
- `Debug` - Development build with asserts, optimizations disabled
Architecture Selection:
To build for a specific architecture, set the `TARGET_ARCH` variable:
```shell
# AVX-512 (requires AVX-512 support)
cmake -DTARGET_ARCH=x64-avx512 -DCMAKE_BUILD_TYPE=Final ..

# BMI2 (recommended for modern CPUs)
cmake -DTARGET_ARCH=x64-bmi2 -DCMAKE_BUILD_TYPE=Final ..

# AVX2
cmake -DTARGET_ARCH=x64-avx2 -DCMAKE_BUILD_TYPE=Final ..

# SSE4-POPCNT
cmake -DTARGET_ARCH=x64-sse4-popcnt -DCMAKE_BUILD_TYPE=Final ..

# Legacy (fallback)
cmake -DTARGET_ARCH=x64-legacy -DCMAKE_BUILD_TYPE=Final ..
```
Windows
1. Run `GenerateVisualStudioSolution.bat` to generate the Visual Studio solution
2. Open `build_<arch>/caissa.sln` in Visual Studio 2022
3. Select the desired configuration (Debug/Release/Final)
4. Build the solution (Ctrl+Shift+B)
Note: Visual Studio 2022 is the only tested version; building through Visual Studio's built-in CMake support has not been tested.
ARM / AArch64
CMake supports two ARM targets via TARGET_ARCH:
```shell
mkdir build && cd build

# Generic AArch64 (no NEON intrinsics)
cmake -DTARGET_ARCH=aarch64 -DCMAKE_BUILD_TYPE=Final ..

# AArch64 with NEON SIMD (recommended on modern ARM hardware)
cmake -DTARGET_ARCH=aarch64-neon -DCMAKE_BUILD_TYPE=Final ..

make -j$(nproc)
```
Post-Compilation
After compilation, copy the appropriate neural network file from `data/neuralNets/` to:
- Linux: `build/bin/`
- Windows: `build\bin\x64\<Configuration>\`
Architecture Variants
| Variant | CPU Requirements | Performance | Recommended For |
|---------|-----------------|-------------|-----------------|
| AVX-512 | AVX-512 instruction set | Fastest | Latest Intel Xeon, AMD EPYC |
| BMI2 | AVX2 + BMI2 | Fast | Most modern CPUs (2015+) |
| AVX2 | AVX2 instruction set | Fast | Intel Haswell, AMD Ryzen |
| POPCNT | SSE4.2 + POPCNT | Moderate | Older CPUs (2008-2014) |
| Legacy | x64 only | Slowest | Very old x64 CPUs |
Tip: If unsure, try BMI2 first. It's supported by most modern CPUs and offers excellent performance.
Custom Commands
In addition to the standard UCI protocol, the engine supports these non-standard commands useful for development and debugging:
| Command | Description |
|---------|-------------|
| bench [depth] | Run a benchmark / smoke test |
| perft [depth] | Count legal moves to a given depth (move generation test) |
| eval | Display evaluation of the current position |
| print | Pretty-print the current board |
| scoremoves | Show move ordering scores for the current position |
| threats | Show threat information for the current position |
| ttinfo | Print transposition table statistics |
| ttprobe | Probe the transposition table for the current position |
| tbprobe | Probe tablebases for the current position |
| cacheprobe | Probe the node cache for the current position |
| printparams | Print all tunable search/eval parameters (only
