Rsonpath
Blazing fast JSONPath query engine written in Rust.
rsonpath – SIMD-powered JSONPath 🚀
Experimental JSONPath engine for querying massive streamed datasets.
The rsonpath crate provides a JSONPath parser and a query execution engine rq,
which utilizes SIMD instructions to provide massive throughput improvements over conventional engines.
Benchmarks of rsonpath against a reference no-SIMD engine on the
Pison dataset. NOTE: Scale is logarithmic!
Usage
To run a JSONPath query on a file, execute:
rq '$..a.b' ./file.json
If the file is omitted, the engine reads standard input. JSON can also be passed inline:
$ rq '$..a.b' --json '{"c":{"a":{"b":42}}}'
42
For details, consult rq --help or the rsonbook.
Results
The result of running a query is a sequence of matched values, delimited by newlines.
Alternatively, passing --result count returns only the number of matches, which might be much faster.
For other result modes consult the --help usage page.
Installation
See Releases for precompiled binaries for all first-class support targets.
cargo
The easiest way to install is via cargo.
$ cargo install rsonpath
...
Native CPU optimizations
If maximum speed is paramount, you should install rsonpath with native CPU instructions support.
This will result in a binary that is not portable and might work incorrectly on any other machine,
but will squeeze out every last bit of throughput.
To do this, run the following cargo install variant:
$ RUSTFLAGS="-C target-cpu=native" cargo install rsonpath
...
Check out the relevant chapter in the rsonbook.
Query language
The project is actively developed and currently supports only a subset of the JSONPath query language. A query is a sequence of segments, each containing one or more selectors.
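To make the "sequence of segments, each containing selectors" structure concrete, here is a minimal sketch of a query data model. The types `Query`, `Segment` and `Selector` are hypothetical names chosen for illustration and are not rsonpath's actual API; only single-selector segments are modelled, and member names are assumed to contain no characters that need escaping.

```rust
use std::fmt;

// Hypothetical illustration types, not the types used by the rsonpath crate.
enum Selector {
    Name(String),
    Wildcard,
    Index(u64),
}

enum Segment {
    Child(Selector),      // [<selector>]
    Descendant(Selector), // ..[<selector>]
}

struct Query(Vec<Segment>);

impl fmt::Display for Selector {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Assumes the name needs no escaping inside single quotes.
            Selector::Name(name) => write!(f, "['{}']", name),
            Selector::Wildcard => write!(f, "[*]"),
            Selector::Index(index) => write!(f, "[{}]", index),
        }
    }
}

impl fmt::Display for Segment {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Segment::Child(selector) => write!(f, "{}", selector),
            Segment::Descendant(selector) => write!(f, "..{}", selector),
        }
    }
}

impl fmt::Display for Query {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Every query starts at the root selector `$`.
        write!(f, "$")?;
        for segment in &self.0 {
            write!(f, "{}", segment)?;
        }
        Ok(())
    }
}

fn main() {
    // Bracket-notation equivalent of the query `$..a.b`.
    let query = Query(vec![
        Segment::Descendant(Selector::Name("a".to_string())),
        Segment::Child(Selector::Name("b".to_string())),
    ]);
    println!("{}", query);
}
```

The sketch renders every selector in bracket notation, so the shorthand query `$..a.b` appears as `$..['a']['b']`.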
Supported segments
| Segment | Syntax | Supported | Since | Tracking Issue |
|--------------------------------|----------------------------------|-----------|--------|---------------:|
| Child segment (single) | [<selector>] | ✔️ | v0.1.0 | |
| Child segment (multiple) | [<selector1>,...,<selectorN>] | ❌ | | |
| Descendant segment (single) | ..[<selector>] | ✔️ | v0.1.0 | |
| Descendant segment (multiple) | ..[<selector1>,...,<selectorN>]| ❌ | | |
Supported selectors
| Selector | Syntax | Supported | Since | Tracking Issue |
|------------------------------------------|----------------------------------|-----------|--------|---------------:|
| Root | $ | ✔️ | v0.1.0 | |
| Name | .<member>, [<member>] | ✔️ | v0.1.0 | |
| Wildcard | .*, ..*, [*] | ✔️ | v0.4.0 | |
| Index (array index) | [<index>] | ✔️ | v0.5.0 | |
| Index (array index from end) | [-<index>] | ❌ | | |
| Array slice (forward, positive bounds) | [<start>:<end>:<step>] | ✔️ | v0.9.0 | #152 |
| Array slice (forward, arbitrary bounds) | [<start>:<end>:<step>] | ❌ | | |
| Array slice (backward, arbitrary bounds) | [<start>:<end>:-<step>] | ❌ | | |
| Filters – existential tests | [?<path>] | ❌ | | #154 |
| Filters – const atom comparisons | [?<path> <binop> <atom>] | ❌ | | #156 |
| Filters – logical expressions | &&, \|\|, ! | ❌ | | |
| Filters – nesting | [?<expr>[?<expr>]...] | ❌ | | |
| Filters – arbitrary comparisons | [?<path> <binop> <path>] | ❌ | | |
| Filters – function extensions | [?func(<path>)] | ❌ | | |
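As an illustration of the supported array slice form (forward, positive bounds), the sketch below computes which array indices a slice `[<start>:<end>:<step>]` selects under the standard JSONPath slice semantics. The function `forward_slice_indices` is a hypothetical helper, not part of rsonpath, and treating an omitted `<end>` as "until the end of the array" is an assumption of this sketch.

```rust
/// Hypothetical helper: indices selected by a forward slice with
/// non-negative bounds on an array of length `len`.
fn forward_slice_indices(
    len: usize,
    start: usize,
    end: Option<usize>,
    step: usize,
) -> Vec<usize> {
    // A step of zero selects nothing.
    if step == 0 {
        return Vec::new();
    }
    // The upper bound is exclusive and never exceeds the array length.
    let upper = end.map_or(len, |e| e.min(len));
    // An empty range (start >= upper) yields no indices.
    (start..upper).step_by(step).collect()
}

fn main() {
    // [1:8:3] on a 10-element array.
    println!("{:?}", forward_slice_indices(10, 1, Some(8), 3));
    // [0::2] on a 5-element array.
    println!("{:?}", forward_slice_indices(5, 0, None, 2));
}
```

Negative bounds and negative steps, which the table lists as unsupported, are deliberately left out of the sketch.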
Supported platforms
The crate is continuously built and tested for all Tier 1 Rust targets. Pre-built binaries are also available for some Tier 2 targets, but without testing. Currently, these are MUSL targets; if you require other binaries, create an issue. SIMD is available on x86 and ARM (64-bit) platforms.
| Target triple              | nosimd build | SIMD support | Continuous testing | Tracking issues |
|:---------------------------|:-------------|:-------------|:-------------------|----------------:|
| aarch64-apple-darwin       | ✔️ | ✔️ | ✔️ | |
| aarch64-pc-windows-msvc    | ✔️ | ✔️ | ✔️ | |
| aarch64-unknown-linux-gnu  | ✔️ | ✔️ | ✔️ | |
| i686-pc-windows-msvc       | ✔️ | ✔️ | ✔️ | |
| i686-unknown-linux-gnu     | ✔️ | ✔️ | ✔️ | |
| x86_64-pc-windows-gnu      | ✔️ | ✔️ | ✔️ | |
| x86_64-pc-windows-msvc     | ✔️ | ✔️ | ✔️ | |
| x86_64-unknown-linux-gnu   | ✔️ | ✔️ | ✔️ | |
| aarch64-unknown-linux-musl | ✔️ | ✔️ | ❌ | |
| i686-unknown-linux-musl    | ✔️ | ✔️ | ❌ | |
| x86_64-unknown-linux-musl  | ✔️ | ✔️ | ❌ | |
SIMD support
SIMD support is enabled on a module-by-module basis. Generally, any CPU released in the past decade supports AVX2, which enables all available optimizations. On ARM, we support NEON.
Older CPUs with SSE2 or higher get partial support. You can check exactly what is enabled
with rq --version; look at the SIMD support field:
$ rq --version
rq 0.9.1
Commit SHA: c024e1bab89610455537b77aed249d2a05a81ed6
Features: default,simd
Opt level: 3
Target triple: x86_64-unknown-linux-gnu
Codegen flags: link-arg=-fuse-ld=lld
SIMD support: avx2;fast_quotes;fast_popcnt
The fast_quotes capability depends on the pclmulqdq instruction (on x86) or the aes feature (ARM),
and fast_popcnt on the popcnt instruction (always available on ARM).
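If you want to see which of these CPU features your own machine reports, the standard library's runtime feature detection macros can be used. The sketch below is an assumption-laden illustration, not how rq itself decides what to enable: the function `detected_capabilities` is hypothetical, and the labels it prints merely imitate the SIMD support field shown above, using the instruction-to-capability mapping described in the text.

```rust
/// Hypothetical helper: maps CPU features detected at runtime to the
/// capability names used in the `SIMD support` field above.
fn detected_capabilities() -> Vec<&'static str> {
    #[allow(unused_mut)]
    let mut caps: Vec<&'static str> = Vec::new();

    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::arch::is_x86_feature_detected!("avx2") {
            caps.push("avx2");
        } else if std::arch::is_x86_feature_detected!("sse2") {
            caps.push("sse2");
        }
        // fast_quotes depends on pclmulqdq on x86.
        if std::arch::is_x86_feature_detected!("pclmulqdq") {
            caps.push("fast_quotes");
        }
        // fast_popcnt depends on popcnt on x86.
        if std::arch::is_x86_feature_detected!("popcnt") {
            caps.push("fast_popcnt");
        }
    }

    #[cfg(target_arch = "aarch64")]
    {
        if std::arch::is_aarch64_feature_detected!("neon") {
            caps.push("neon");
        }
        // fast_quotes depends on the aes feature on ARM.
        if std::arch::is_aarch64_feature_detected!("aes") {
            caps.push("fast_quotes");
        }
        // Per the text above, popcnt is always available on ARM.
        caps.push("fast_popcnt");
    }

    caps
}

fn main() {
    let caps = detected_capabilities();
    if caps.is_empty() {
        println!("no SIMD capabilities detected");
    } else {
        println!("{}", caps.join(";"));
    }
}
```

The output depends on the machine running it; the authoritative answer for an installed binary remains `rq --version`.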
Caveats and limitations
JSONPath
Not all selectors are supported, see the support table above.
Duplicate keys
The engine assumes that every object in the input JSON has no duplicate keys. Behavior on duplicate keys is not guaranteed to be stable, but currently the engine will simply match the first such key.
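The currently observed "first such key wins" behavior can be pictured as a linear scan over an object's members in document order. This is only a sketch of the observable semantics, not of the engine's implementation; `first_match` is a hypothetical helper, and the object is represented as a plain slice of raw key/value pairs.

```rust
/// Hypothetical helper: returns the value of the first member whose
/// name equals `key`, scanning members in document order.
fn first_match<'a>(members: &[(&str, &'a str)], key: &str) -> Option<&'a str> {
    members
        .iter()
        .find(|(name, _)| *name == key)
        .map(|(_, value)| *value)
}

fn main() {
    // Members of {"key":"value","key":"other value"} in document order.
    let members = [("key", "\"value\""), ("key", "\"other value\"")];
    match first_match(&members, "key") {
        Some(value) => println!("{}", value),
        None => println!("no match"),
    }
}
```

Since the behavior is not guaranteed to be stable, code consuming the results should not rely on which duplicate is returned.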
$ rq '$.key' --json '{"key":"value","key":"other value"}'
"value"
Unicode
The engine does not parse unicode escape sequences in member names.
This means that a key "a" is different from a key "\u0061", even though semantically they represent the same string.
This is actually as-designed with respect to the current JSONPath spec.
Parsing unicode sequences is costly, so the support for this was postponed
in favour of high performance. This is tracked as #117.
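The difference between comparing raw member names and comparing decoded ones can be sketched as follows. The decoder `decode_basic_unicode_escapes` is a hypothetical, deliberately simplified helper: it handles only four-digit `\uXXXX` escapes, leaves surrogate pairs untouched, and does not account for escaped backslashes. It is not how rsonpath works; it only shows the extra step that byte-wise matching skips.

```rust
/// Hypothetical, simplified decoder for `\uXXXX` escapes in a raw
/// member name. Unsupported or malformed escapes are copied verbatim.
fn decode_basic_unicode_escapes(raw: &str) -> String {
    let mut out = String::new();
    let mut rest = raw;
    while let Some(pos) = rest.find("\\u") {
        out.push_str(&rest[..pos]);
        let decoded = rest
            .get(pos + 2..pos + 6)
            .filter(|hex| hex.chars().all(|c| c.is_ascii_hexdigit()))
            .and_then(|hex| u32::from_str_radix(hex, 16).ok())
            .and_then(char::from_u32);
        match decoded {
            Some(c) => {
                out.push(c);
                rest = &rest[pos + 6..];
            }
            None => {
                // Keep the escape as-is and continue after the `\u`.
                out.push_str("\\u");
                rest = &rest[pos + 2..];
            }
        }
    }
    out.push_str(rest);
    out
}

fn main() {
    let plain = "a";
    let escaped = r"\u0061"; // the same letter, written as an escape

    // Byte-wise comparison of the raw names: they differ.
    println!("raw names equal: {}", plain.as_bytes() == escaped.as_bytes());
    // Comparison after decoding: they agree.
    println!(
        "decoded names equal: {}",
        plain == decode_basic_unicode_escapes(escaped)
    );
}
```

Decoding requires inspecting every backslash in every candidate key, which gives a sense of why the cost is significant for an engine built around raw throughput.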
Contributing
The gist is: fork, implement, make a PR back here. More details are in the CONTRIBUTING doc.
Build & test
The dev workflow utilizes just.
Use the included Justfile. It will automatically install Rust for you using the rustup tool if it detects there is no Cargo in your environment.
