Sqlfuse
Another SQL fuzzing tool, like SQLSmith
Install / Use
/learn @cyw0ng95/SqlfuseREADME
SQLfuse
Note: This project has been developed with assistance from AI coding assistants, including GitHub Copilot and Claude. While AI tools have helped with code generation and documentation, all implementations have been reviewed and tested.
A high-performance SQL query generator and fuzzer for testing database systems. SQLfuse generates syntactically valid, semantically interesting SQL statements to discover bugs, edge cases, and performance issues in both SQLite-compatible and analytical database implementations.
Overview
SQLfuse is a Go implementation of the SQLsmith approach to database testing through randomized query generation. Unlike traditional fuzzing that generates random bytes, SQLfuse produces valid SQL statements that exercise diverse database features while respecting the constraints and capabilities of different database flavors.
Key Features
- Multi-Flavor Support: Generates SQL compatible with different database implementations (Turso LibSQL, go-sqlite3, DuckDB)
- Intelligent Generation: Uses schema awareness to produce meaningful queries with valid table/column references
- Comprehensive Coverage: Supports diverse SQL features including CTEs, window functions, subqueries, and complex expressions
- Flavor-Aware: Adapts generated SQL to match the capabilities and constraints of the target database
- Parallel Execution: Multi-worker architecture for high-throughput fuzzing
- Web Interface: Vue.js-based frontend for monitoring and controlling fuzzing jobs
- Modular Architecture: Clean separation between generators, executors, and statement builders
- Impedance Matching: Automatically blacklists problematic statement types based on error rates (inspired by original SQLsmith)
- Statistics Tracking: Comprehensive metrics on generation/execution rates, error patterns, and AST complexity
- Depth-Based Generation: Probabilistic recursion control for varied query complexity
Architecture
SQLfuse uses a modular architecture with clear separation of concerns:
┌─────────────────────────────────────────────────────────────┐
│ SQLfuse │
├─────────────────────────────────────────────────────────────┤
│ │
│ Executors ──▶ Generators ──▶ Statement Builders │
│ (Turso, go-sqlite3, DuckDB, HTTP API) │
│ │
│ Dialects ──▶ Frontend │
│ (Feature detection, SQL validation) (Vue.js web UI) │
│ │
└─────────────────────────────────────────────────────────────┘
Core Components:
- Executors: Database-specific implementations that execute generated SQL against target databases
- Generators: Flavor-aware SQL generators using composition-based design with
BaseGenerator - Statement Builders: Database-agnostic SQL construction using factory and builder patterns
- Dialects: FlavorConfig implementations defining database-specific feature support
- Frontend: Vue.js interface for job management and monitoring
Key Design Patterns:
- Flavor-Based Polymorphism: Dialect configurations adapt SQL to database capabilities
- Composition over Inheritance: Generators embed
BaseGeneratorfor shared functionality - Workspace Isolation: Go workspaces keep executor binaries focused (8.9MB to 156MB)
For detailed architecture documentation, see:
- ARCHITECTURE.md: Generator architecture and patterns
- DESIGN_PATTERNS.md: Factory, strategy, and builder patterns
- WORKSPACE.md: Go workspace structure and dependency isolation
Quick Start
Prerequisites
- Go 1.24.9 or later
- C compiler (for go-sqlite3 CGo bindings)
- Optional: Node.js 24+ and pnpm (for web frontend)
Building
# Set required environment variable
export SQLSMITH_GO_CONTAINER_TYPE=test
# Build all executors and server
bash build.sh
# Outputs:
# - output/turso_embedded_executor
# - output/go_sqlite3_embedded_executor
# - output/duckdb_embedded_executor
# - output/server
Running Executors
Turso LibSQL Executor:
# In-memory fuzzing with default schema
./output/turso_embedded_executor --seed 42 --queries 100 --workers 4
# File-based database with custom schema
./output/turso_embedded_executor \
--dsn "file:./test.db" \
--init-sql "./assets/turso/init.sql" \
--queries 1000 \
--verbose
go-sqlite3 Executor:
# Full SQLite3 fuzzing with verbose output
./output/go_sqlite3_embedded_executor \
--dsn "./sqlite3.db" \
--seed 12345 \
--queries 500 \
--workers 8 \
--verbose
DuckDB Executor:
# Analytical database fuzzing with DuckDB
./output/duckdb_embedded_executor \
--dsn "" \
--seed 42 \
--queries 1000 \
--workers 4 \
--verbose
HTTP Server:
# Start the API server
./output/server
# Server listens on :8080 by default
# Configure via config/server.json
Common Flags
| Flag | Description | Default |
|------|-------------|---------|
| --dsn | Database connection string | :memory: |
| --seed | Random seed (0 = random) | 0 |
| --queries | Queries per worker | 10 |
| --workers | Concurrent workers | 1 |
| --verbose | Show executed SQL | false |
| --init-sql | Schema initialization file | [flavor-specific] |
SQL Generation Strategy
Randomness & Reproducibility
SQLfuse uses a Linear Congruential Generator (LCG) for deterministic randomness:
- Seeded: All generation is reproducible given the same seed
- Token-Based: Tracks PRNG consumption for profiling
- Stateless: Each query generation is independent
Statement Selection
Statements are selected based on weighted probabilities defined per flavor:
// Example weights for Turso flavor
weights := map[StmtType]uint64{
StmtSelectBasic: 100,
StmtSelectJoin: 50,
StmtSelectSubquery: 30,
StmtInsert: 20,
StmtUpdate: 10,
StmtPragma: 5,
}
Higher weights = more frequent generation, allowing targeted stress testing of specific features.
Schema-Aware Generation
When a database connection is provided, generators query the schema to produce realistic queries:
-- Generator queries schema
SELECT name FROM sqlite_master WHERE type='table';
PRAGMA table_info(users);
-- Then generates queries like
SELECT u.name, a.balance
FROM users u
INNER JOIN accounts a ON u.id = a.user_id
WHERE a.balance > 100.5;
Flavor-Specific Features
Turso LibSQL (18 PRAGMAs, conservative features):
PRAGMA journal_mode = WAL; -- Only WAL supported
PRAGMA synchronous = FULL; -- Only OFF/FULL
PRAGMA table_info; -- No table name parameter
go-sqlite3 (47 PRAGMAs, full SQLite3):
PRAGMA journal_mode = DELETE; -- All modes available
PRAGMA synchronous = NORMAL; -- All modes available
PRAGMA table_info(users); -- Parameterized
PRAGMA foreign_keys = ON; -- Extended pragmas
PRAGMA auto_vacuum = INCREMENTAL; -- Storage management
DuckDB (Analytical database with extensive SQL features):
-- Rich analytical functions
SELECT *, ROW_NUMBER() OVER (PARTITION BY category ORDER BY price DESC)
FROM products;
-- Advanced window functions with FILTER
SELECT AVG(price) FILTER (WHERE in_stock) OVER (PARTITION BY category)
FROM products;
Project Structure
sqlfuse/
├── cmd/
│ ├── executors/
│ │ ├── turso_embedded/ # Turso LibSQL executor
│ │ ├── go_sqlite3_embedded/ # go-sqlite3 executor
│ │ └── duckdb_embedded/ # DuckDB executor
│ └── server/ # HTTP API server
│
├── internal/
│ ├── common/ # Logger, LCG utilities
│ ├── executors/ # Executor framework
│ ├── generators/ # SQL generators
│ │ ├── dialects/ # Flavor configurations
│ │ ├── base_generator.go # Common generator logic
│ │ ├── turso.go # Turso-specific generator
│ │ ├── go_sqlite3.go # go-sqlite3 generator
│ │ └── duckdb.go # DuckDB generator
│ └── stmts/
│ ├── stmts/ # Statement builders
│ └── types/ # SQL type generators
│
├── assets/ # Database schemas & configs
│ ├── turso/init.sql
│ ├── go_sqlite3/init.sql
│ └── duckdb/init.sql
│
├── config/ # Server & executor configs
├── docs/ # Architecture documentation
├── view/ # Vue.js web frontend
└── go.work # Go workspace definition
Development
Running Tests
# Run all tests with coverage
export SQLSMITH_GO_CONTAINER_TYPE=test
bash build.sh --test
# Coverage report: .cache/coverage.html
Adding a New Database Flavor
To add support for a new database:
- Create dialect configuration in
internal/generators/dialects/mydb.go - Implement generator in
internal/generators/mydb.go - Create executor in
cmd/executors/mydb_embedded/main.go - Add to workspace in
go.work
For detailed step-by-step instructions with code examples, see:
- ARCHITECTURE.md: Generator implementation guide
- DIALECTS_README.md: Dialect configuration guide
- WORKSPACE.md: Workspace setup guide
Use Cases
1. Bug Discovery
Find crashes, assertion failures, and incorrect results:
# High-volume fuzzing to find edge cases
./output/go_sqlite3_embedded_executor \
--queries 10000 \
--workers 16 \
--seed 0
