Simdjson
Parsing gigabytes of JSON per second : used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks
simdjson : Parsing gigabytes of JSON per second
<img src="images/official_logo/logo_noir/SVG/logo_simdjson_noir.svg" width="40%" style="float: right">

JSON is everywhere on the Internet. Servers spend a lot of time parsing it. We need a fresh approach. The simdjson library uses commonly available SIMD instructions and microparallel algorithms to parse JSON 4x faster than RapidJSON and 25x faster than JSON for Modern C++.
- Fast: Over 4x faster than commonly used production-grade JSON parsers.
- Record Breaking Features: Minify JSON at 6 GB/s, validate UTF-8 at 13 GB/s, NDJSON at 3.5 GB/s.
- Easy: First-class, easy to use and carefully documented APIs.
- Strict: Full JSON and UTF-8 validation, lossless parsing. Performance with no compromises.
- Automatic: Selects a CPU-tailored parser at runtime. No configuration needed.
- Reliable: From memory allocation to error handling, simdjson's design avoids surprises.
- Peer Reviewed: Our research appears in venues like VLDB Journal, Software: Practice and Experience.
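To make the minification feature above concrete, here is a minimal scalar sketch of what minifying JSON means: removing insignificant whitespace outside of strings. It is not simdjson's implementation, which relies on SIMD instructions; the function name `minify_json` is hypothetical, and the sketch assumes the input is already valid JSON.

```cpp
#include <string>

// Scalar reference minifier (illustrative only, not simdjson's code).
// Assumes the input is valid JSON; it performs no validation.
std::string minify_json(const std::string& input) {
    std::string out;
    out.reserve(input.size());
    bool in_string = false;  // true between an opening and a closing quote
    bool escaped = false;    // true right after a backslash inside a string
    for (char c : input) {
        if (in_string) {
            out.push_back(c);  // bytes inside strings are copied verbatim
            if (escaped) {
                escaped = false;    // this byte was escaped; it cannot close the string
            } else if (c == '\\') {
                escaped = true;     // the next byte is escaped
            } else if (c == '"') {
                in_string = false;  // unescaped quote closes the string
            }
        } else if (c == '"') {
            in_string = true;
            out.push_back(c);
        } else if (c != ' ' && c != '\t' && c != '\n' && c != '\r') {
            out.push_back(c);  // keep every non-whitespace byte
        }
    }
    return out;
}
```

For example, `minify_json("{ \"a\" : [1, 2] }")` yields `{"a":[1,2]}`. The byte-at-a-time loop with several branches per byte is the kind of code that a SIMD approach aims to replace.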
This library is part of the Awesome Modern C++ list.
Table of Contents
- Real-world usage
- Quick Start
- Documentation
- Godbolt
- Performance results
- Packages
- Bindings and Ports of simdjson
- About simdjson
- Funding
- Contributing to simdjson
- License
Real-world usage
- Node.js
- ClickHouse
- Meta Velox
- Google Pax
- milvus
- QuestDB
- Clang Build Analyzer
- Shopify HeapProfiler
- StarRocks
- Microsoft FishStore
- Intel PCM
- WatermelonDB
- Apache Doris
- Dgraph
- UJRPC
- fastgltf
- vast
- ada-url
- fastgron
- WasmEdge
- RonDB
- GreptimeDB
- mamba
- Ladybird Browser
If you are planning to use simdjson in a product, please work from one of our releases.
Quick Start
The simdjson library is easily consumable with a single .h and .cpp file.
- Prerequisites: g++ (version 7 or better) or clang++ (version 6 or better), and a 64-bit system with a command-line shell (e.g., Linux, macOS, FreeBSD). We also support programming environments like Visual Studio and Xcode, but different steps are needed. Users of clang++ may need to specify the C++ version (e.g., c++ -std=c++17) since clang++ tends to default to C++98.
- Pull simdjson.h and simdjson.cpp into a directory, along with the sample file twitter.json. You can download them with the wget utility:
  wget https://raw.githubusercontent.com/simdjson/simdjson/master/singleheader/simdjson.h https://raw.githubusercontent.com/simdjson/simdjson/master/singleheader/simdjson.cpp https://raw.githubusercontent.com/simdjson/simdjson/master/jsonexamples/twitter.json
- Create quickstart.cpp:
#include <iostream>
#include "simdjson.h"
using namespace simdjson;
int main(void) {
    ondemand::parser parser;
    padded_string json = padded_string::load("twitter.json");
    ondemand::document tweets = parser.iterate(json);
    std::cout << uint64_t(tweets["search_metadata"]["count"]) << " results." << std::endl;
}
- c++ -o quickstart quickstart.cpp simdjson.cpp
- ./quickstart
100 results.
Documentation
Usage documentation is available:
- Basics is an overview of how to use simdjson and its APIs.
- Builder is an overview of how to efficiently write JSON strings using simdjson.
- Performance shows some more advanced scenarios and how to tune for them.
- Implementation Selection describes runtime CPU detection and how you can work with it.
- API contains the automatically generated API documentation.
- Compile-Time Parsing presents our compile-time parsing function (C++26 only).
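The runtime CPU detection mentioned above can be pictured with a small dispatch sketch. This is an illustration of the general technique only, not simdjson's code or API: the kernel names, `cpu_has_avx2`, and `select_kernel` are hypothetical, and the "AVX2" kernel is a placeholder that reuses the scalar logic so that the example stays portable. It assumes GCC or Clang on x86 for the detection builtin and falls back to the portable kernel elsewhere.

```cpp
#include <cstddef>
#include <string>

// Every kernel has the same signature so that one can be chosen at runtime.
using kernel_fn = std::size_t (*)(const std::string&);

// Portable kernel: counts JSON whitespace bytes one at a time.
std::size_t count_whitespace_fallback(const std::string& s) {
    std::size_t n = 0;
    for (char c : s) n += (c == ' ' || c == '\t' || c == '\n' || c == '\r');
    return n;
}

// Placeholder for a vectorized kernel; it reuses the scalar logic here.
std::size_t count_whitespace_avx2_placeholder(const std::string& s) {
    return count_whitespace_fallback(s);
}

// Hypothetical helper: asks the CPU at runtime whether AVX2 is available.
bool cpu_has_avx2() {
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") != 0;
#else
    return false;  // unknown platform: stay on the portable kernel
#endif
}

// Chooses a kernel once, on first use, and caches the choice.
kernel_fn select_kernel() {
    static const kernel_fn chosen =
        cpu_has_avx2() ? count_whitespace_avx2_placeholder : count_whitespace_fallback;
    return chosen;
}

// Public entry point: callers never need to know which kernel runs.
std::size_t count_whitespace(const std::string& s) { return select_kernel()(s); }
```

Calling `count_whitespace("{ \"a\" : 1 }\n")` returns 5 on any machine. The point of the design is that the selection happens once and user code stays the same regardless of the CPU.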
Godbolt
Some users may want to browse code along with the compiled assembly. You may want to check out the following lists of examples:
- C++26 reflection example
- simdjson examples with errors handled through exceptions
- simdjson examples with errors without exceptions
Performance results
The simdjson library uses three-quarters fewer instructions than the state-of-the-art parser RapidJSON. To our knowledge, simdjson is the first fully-validating JSON parser to run at gigabytes per second (GB/s) on commodity processors. It can parse millions of JSON documents per second on a single core.
The following figure represents parsing speed in GB/s for parsing various files on an Intel Skylake processor (3.4 GHz) using the GNU GCC 10 compiler (with the -O3 flag). We compare against the best and fastest C++ libraries on benchmarks that load and process the data. The simdjson library offers full Unicode (UTF-8) validation and exact number parsing.
<img src="doc/rome.png" width="60%">

The simdjson library offers high speed whether it processes tiny files (e.g., 300 bytes) or larger files (e.g., 3MB). The following plot presents parsing speed for synthetic files over various sizes generated with a script on a 3.4 GHz Skylake processor (GNU GCC 9, -O3).

<img src="doc/growing.png" width="60%">

All our experiments are reproducible.
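For readers who want to compute a GB/s figure on their own workload, the following is a minimal timing sketch. It is not the project's benchmark harness: the helper names `gigabytes_per_second` and `best_throughput` are hypothetical, one gigabyte is taken to mean 10^9 bytes, and reporting the fastest of several runs is an assumption about methodology rather than a description of how the figures above were produced.

```cpp
#include <chrono>
#include <cstddef>
#include <limits>

// Converts a byte count and an elapsed time in seconds to GB/s
// (assuming 1 GB = 10^9 bytes).
double gigabytes_per_second(std::size_t bytes, double seconds) {
    return static_cast<double>(bytes) / 1e9 / seconds;
}

// Runs `work` several times and reports the throughput of the fastest run.
// `bytes` is the input size processed by a single call to `work`.
template <typename Work>
double best_throughput(std::size_t bytes, int repetitions, Work work) {
    double best_seconds = std::numeric_limits<double>::infinity();
    for (int i = 0; i < repetitions; i++) {
        auto start = std::chrono::steady_clock::now();
        work();
        auto stop = std::chrono::steady_clock::now();
        double seconds = std::chrono::duration<double>(stop - start).count();
        if (seconds < best_seconds) best_seconds = seconds;
    }
    return gigabytes_per_second(bytes, best_seconds);
}
```

A typical use would pass a lambda that parses a preloaded buffer, for instance `best_throughput(json.size(), 10, [&] { /* parse json here */ })`. Loading the file outside the timed region keeps disk access out of the measurement.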
For NDJSON files, we can exceed 3 GB/s with our multithreaded parsing functions.
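NDJSON stores one JSON document per line, which is what makes it possible to process documents independently. The sketch below shows only the record-splitting idea in plain C++; it does not use simdjson's own streaming functions, and the name `split_ndjson` is hypothetical. It assumes that documents contain no raw newline characters, which holds for NDJSON because JSON strings must escape newlines.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Splits an NDJSON buffer into one string per document (illustrative only).
// Empty lines are skipped and a trailing '\r' (CRLF input) is dropped.
std::vector<std::string> split_ndjson(const std::string& buffer) {
    std::vector<std::string> documents;
    std::size_t start = 0;
    while (start < buffer.size()) {
        std::size_t end = buffer.find('\n', start);
        if (end == std::string::npos) end = buffer.size();  // last line, no newline
        std::size_t len = end - start;
        if (len > 0 && buffer[end - 1] == '\r') len--;       // tolerate CRLF
        if (len > 0) documents.push_back(buffer.substr(start, len));
        start = end + 1;
    }
    return documents;
}
```

Because each returned string is a complete document, the pieces could be handed to separate worker threads. Copying every line, as done here for simplicity, would be avoided in a performance-sensitive implementation.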
Packages
Bindings and Ports of simdjson
We distinguish between "bindings" (which just wrap the C++ code) and a port to another programming language (which reimplements everything).
- ZippyJSON: Swift bindings for the simdjson project.
- libpy_simdjson: high-speed Python bindings for simdjson using libpy.
- pysimdjson: Python bindings for the simdjson project.
- cysimdjson: high-speed Python bindings for the simdjson project.
- simdjson-rs: Rust port.
- simdjson-rust: Rust wrapper (bindings).
- SimdJsonSharp: C# version for .NET Core (bindings and full port).
- simdjson_nodejs: Node.js bindings for the simdjson project.
- simdjson_php: PHP bindings for the simdjson project.
- simdjson_ruby: Ruby bindings for the simdjson project.
- fast_jsonparser: Ruby bindings for the simdjson project.
- simdjson-go: Go port using Golang assembly.
- rcppsimdjson: R bindings.
- simdjson_erlang: Erlang bindings.
- simdjsone: Erlang bindings.
- lua-simdjson: Lua bindings.
- hermes-json: Haskell bindings.
- zimdjson: Zig port.
- simdjzon: Zig port.
- JSON-Simd: Raku bindings.
- JSON::SIMD: Perl bindings; fully-featured JSON module that uses simdjson for decoding.
- gemmaJSON: Nim JSON parser based on simdjson bindings.
- simdjson-java: Java port.
- mruby-fast-json: mruby binding with high API coverage.
About simdjson
The simdjson library takes advantage of modern microarchitectures, parallelizing with SIMD vector instructions, reducing branch misprediction, and reducing data dependency to take advanta
