Sparser
Sparser: Raw Filtering for Faster Analytics over Raw Data
This code base implements Sparser, raw filtering for faster analytics over raw data. Sparser can parse JSON, Avro, and Parquet data up to 22x faster than the state of the art. For more details, check out our paper published at VLDB 2018.
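To make the idea of raw filtering concrete, here is a minimal sketch in C. It is not Sparser's API: the names raw_filter_may_match, toy_verify_lang_en, and count_matches are hypothetical and exist only for illustration. The sketch assumes a cheap substring search over the unparsed bytes is allowed to produce false positives but never false negatives, so only records that pass the search are handed to the (expensive) parser, which makes the final decision. The toy verifier stands in for a real JSON parser such as rapidjson.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical raw filter: looks for a substring in the unparsed record.
 * It may say "maybe" for records that do not really match (false positive),
 * but it never rejects a record that contains the needle. */
bool raw_filter_may_match(const char *record, const char *needle) {
    return strstr(record, needle) != NULL;
}

/* Stand-in for a full parse plus predicate check. A real system would
 * parse the record and inspect the "lang" field; this toy version only
 * looks for the exact key/value text, which is an assumption made to keep
 * the sketch free of external dependencies. */
bool toy_verify_lang_en(const char *record) {
    return strstr(record, "\"lang\":\"en\"") != NULL;
}

typedef bool (*verify_fn)(const char *record);

/* Counts records accepted by verify, calling verify only on records that
 * pass the raw filter. *parsed reports how many records reached verify. */
size_t count_matches(const char *const *records, size_t n,
                     const char *needle, verify_fn verify, size_t *parsed) {
    size_t matches = 0;
    *parsed = 0;
    for (size_t i = 0; i < n; i++) {
        if (!raw_filter_may_match(records[i], needle)) {
            continue; /* rejected on raw bytes, never parsed */
        }
        (*parsed)++;
        if (verify(records[i])) {
            matches++;
        }
    }
    return matches;
}
```

With the needle "en", a record such as {"lang":"fr","text":"encore"} passes the raw filter and is then rejected by the verifier, while a record that contains no "en" at all is skipped without being parsed. The saving comes from the skipped records, so a filter of this kind helps most when the predicate is selective.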
See the demo-repl directory for a brief example. To run it:
# update rapidjson submodule
git submodule init
git submodule update
cd demo-repl
make
./bench /path/to/large/file.json
Then enter 1 at the Sparser> prompt.
Sparser itself is just a header file and only depends on standard C libraries available on most systems.
