Zsv

zsv+lib: tabular data swiss-army knife CLI + world's fastest (simd) CSV parser

zsv+lib: the world's fastest (simd) CSV parser, with an extensible CLI

Playground (without sheet viewer command): https://liquidaty.github.io/zsv

zsv+lib is the world's fastest CSV parser library and extensible command-line utility. It achieves high performance using SIMD operations, efficient memory use and other optimization techniques, and can also parse generic-delimited and fixed-width formats, as well as multi-row-span headers.

While zsv is written in C, it can be used from other languages such as Ruby. See below for more details.

CLI

The ZSV CLI can be compiled to virtually any target, including WebAssembly, and offers a variety of commands including select, count, direct CSV sql, flatten, serialize, 2json conversion, 2db sqlite3 conversion, stack, pretty, 2tsv, compare, paste, overwrite, check and more.

The ZSV CLI also includes sheet, an in-console interactive grid viewer that offers basic navigation, filtering, and pivot tables with drill-down, and that supports custom extensions.

Installation

brew (macOS, Linux):
- brew install zsv
winget (Windows):
- winget.exe install zsv
npm (parser only), nuget, yum, apt, choco and more
- See INSTALL.md
Download
- Pre-built binaries and packages for macOS, Windows, Linux and BSD can be downloaded from the Releases page.
Build
- See BUILD.md to build from source

Language Bindings & Wrappers

Binding contributions are welcome!

Note: These projects are maintained independently. Please file issues related to specific bindings in their respective repositories.

Playground

An online playground is also available (without the sheet feature, due to browser limitations).

If you like zsv+lib, do not forget to give it a star! 🌟

Performance

Summary

We compared a number of CSV parsers on speed; memory was also tracked for informational purposes. The finalists were zsv, xan, polars, xsv/qsv and duckdb.

Benchmarks use four input profiles: unquoted, sparsely quoted, standard quoted, and non-4180-compliant quoted.

Overall, zsv and xan were the clear top performers in both speed and memory:

- count: zsv is fastest across all input types
- select: zsv and xan are the fastest; xan is faster on unquoted and sparsely quoted input, and zsv is faster on standard quoted and non-4180-compliant input
- non-4180-compliant data: zsv is fastest across the board (xan and polars are N/A for this input category)

Benchmarks

See benchmarks

Detailed benchmark tests have been run on macOS (arm64) and Linux (x86-64). We would expect similar performance on Windows and other Linux flavors.

Contributions of benchmark results for other OS/architecture combinations are welcome; please open an issue!

Fast parser

zsv includes a SIMD-accelerated fast parser (--parser fast) that uses branchless prefix-XOR carry propagation for quote state tracking, available on aarch64 (NEON), x86-64 (AVX2), and x86-64 (SSE2), including Windows (mingw64). wasm (compiled via emscripten) support will be added next.

The fast parser is designed only for input that uses quoting as defined in RFC 4180 (it does not impose RFC 4180's other restrictions, such as CRLF line endings). Like polars and xan, it does not correctly handle non-standard quoting such as unescaped quotes in unquoted fields (e.g. 12" monitor or say "hello" world). For such data, use the default compat parser, which handles all real-world CSV the same way spreadsheet programs do.
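
The quote-tracking idea described in this section can be illustrated with a minimal sketch in plain Python. This is not zsv's implementation: the names char_mask, prefix_xor and split_positions are hypothetical, the 64-byte block size is an assumption, and a real SIMD parser would build the bitmasks with vector compares rather than a byte loop. The sketch shows how a prefix XOR over a quote bitmask marks the bytes that lie inside quotes, how a carry hands the quote state to the next block, and why a stray quote in an unquoted field hides the delimiter that follows it.

```python
MASK64 = (1 << 64) - 1  # assumption: input is processed in 64-byte blocks


def char_mask(block: bytes, ch: int) -> int:
    """Bit i is set when block[i] == ch (stand-in for a SIMD compare)."""
    mask = 0
    for i, b in enumerate(block):
        if b == ch:
            mask |= 1 << i
    return mask


def prefix_xor(bits: int) -> int:
    """Bit i of the result is the XOR of bits 0..i of the input."""
    for shift in (1, 2, 4, 8, 16, 32):
        bits ^= (bits << shift) & MASK64
    return bits


def split_positions(data: bytes) -> list[int]:
    """Offsets of commas that are not inside a quoted region."""
    positions = []
    carry = 0  # all ones when the previous block ended inside quotes
    for off in range(0, len(data), 64):
        block = data[off:off + 64]
        # From an opening quote up to (not including) the closing quote
        # the bits are 1; XOR with carry flips the block if it started
        # inside a quoted field.
        inside = prefix_xor(char_mask(block, ord('"'))) ^ carry
        commas = char_mask(block, ord(",")) & ~inside & MASK64
        positions.extend(off + i for i in range(len(block)) if (commas >> i) & 1)
        # Branch-free carry: replicate the top bit across the whole word.
        carry = (-(inside >> 63)) & MASK64
    return positions


print(split_positions(b'a,"b,c",d'))      # [1, 7]
print(split_positions(b'12" monitor,x'))  # []
```

With standard quoting the sketch reports only the delimiters outside quotes. For 12" monitor,x it reports none, because the lone quote flips the state for the rest of the input; this is the kind of data the compat parser is meant for.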

Parallel parsing

Either the fast or compat parser can be combined with --parallel for multi-threaded parsing:

# Single-threaded
zsv count data.csv               # any CSV input
zsv count --parser fast data.csv # only for CSV input using standard quoting

# Multi-threaded parser (uses all available cores)
zsv select --parallel data.csv -- 1 2 3               # any CSV input
zsv select --parser fast --parallel data.csv -- 1 2 3 # only for CSV input using standard quoting

Which "CSV"

"CSV" is an ambiguous term. By default, this library uses the same definition as Excel, though both the library and the app have options to change that default. A more accurate description would be "UTF-8 delimited data parser": it requires UTF-8 input, and its options let you customize the delimiter and whether quoting is allowed.

In addition, zsv provides a row-level (as well as cell-level) API and produces "normalized" CSV output (e.g. input of this"iscell1,"thisis,"cell2 becomes "this""iscell1","thisis,cell2"). Each of these three objectives (Excel compatibility, row-level API and normalized output) has a measurable performance impact. Conversely, much faster parsing speeds are achievable, as a number of other CSV parsers demonstrate, if any of these requirements (especially Excel compatibility) is dropped.
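
As a rough illustration of the output side of normalization, the sketch below re-quotes cells that have already been parsed. It is not zsv code: quote_cell and normalize_row are hypothetical names, and the rule used here (quote a cell only when it contains the delimiter, a quote or a line break, and double any embedded quotes) is an assumption that happens to reproduce the example above.

```python
def quote_cell(value: str, delimiter: str = ",") -> str:
    """Return the cell quoted only when needed, with embedded quotes doubled."""
    needs_quotes = any(ch in value for ch in (delimiter, '"', "\r", "\n"))
    if not needs_quotes:
        return value
    return '"' + value.replace('"', '""') + '"'


def normalize_row(cells: list[str], delimiter: str = ",") -> str:
    """Join already-parsed cell values into one normalized output row."""
    return delimiter.join(quote_cell(cell, delimiter) for cell in cells)


# The two cell values that the example input above parses to.
print(normalize_row(['this"iscell1', "thisis,cell2"]))
# "this""iscell1","thisis,cell2"
```

Cells that contain none of the special characters pass through unchanged, so normalize_row(["a", "b"]) gives a,b.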

Examples of input that does not comply with RFC 4180

The following is a list of all input patterns that are non-compliant with RFC 4180, and how zsv (by default) parses each. It is believed to be comprehensive; please log an issue if you think any pattern is missing:

The above behavior can be altered with various optional flags:

- Header rows can be treated differently if options are used to skip rows and/or use a multi-row header span; see the documentation for further detail.
- Quote support can be turned off, to treat quotes just like any other non-delimiter character.
- The cell delimiter can be a character other than comma.
- The row delimiter can be specified as CRLF only, in which case a standalone CR or LF is simply part of the cell value, even without quoting.
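
To make the effect of the last three options concrete, here is a small Python sketch. It is not zsv code and does not show zsv's flag syntax: parse_unquoted is a hypothetical helper that models quote support turned off, a custom cell delimiter, and (by default) CRLF-only row delimiters.

```python
def parse_unquoted(text: str, delimiter: str = ",", crlf_only: bool = True) -> list[list[str]]:
    """Split text into rows and cells, treating quotes as ordinary characters."""
    if crlf_only:
        # Only CRLF ends a row; a lone CR or LF stays inside the cell value.
        rows = text.split("\r\n")
    else:
        # Any of CRLF, CR or LF ends a row.
        rows = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    if rows and rows[-1] == "":
        rows.pop()  # drop the empty piece after a trailing row delimiter
    return [row.split(delimiter) for row in rows]


rows = parse_unquoted('a|"b\nc"|d\r\ne|f\r\n', delimiter="|")
print(rows)  # [['a', '"b\nc"', 'd'], ['e', 'f']]
```

In the first row the quotes and the bare LF are kept as part of the second cell, since quoting is off and only CRLF separates rows.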

Built-in and extensible features

zsv is an extensible CSV utility, which uses zsvlib, for tasks such as slicing and dicing, querying with SQL, combining, serializing, flattening, converting between CSV/JSON/sqlite3 and more.

zsv is streamlined for easy development of custom dynamic extensions.

zsvlib and zsv are written in C, but since zsvlib is a library, and zsv extensions are just shared libraries, you can extend zsv with your own code in any programming language, so long as it has been compiled into a shared library that implements the expected interface.

Key highlights

- Available as BOTH a library and an application (coming soon: standalone zsvutil library for common helper functions such as csv writer)
- Open-source, permissively licensed
- Handles real-world CSV the same way that spreadsheet programs do (including edge cases). Gracefully handles (and can "clean") real-world data that may be "dirty".
- Runs on macOS (tested on clang/gcc), Linux (gcc), Windows (mingw), BSD (gcc-only) and in-browser (emscripten/wasm)
- High perfo

Source: liquidaty/zsv on GitHub