Sep - the World's Fastest .NET CSV Parser
<img src="https://raw.githubusercontent.com/nietras/Sep/icon-v2/Icon.png" alt="Icon" align="right" width="128" hspace="2" vspace="2" />
Modern, minimal, fast, zero allocation, reading and writing of separated values
(csv, tsv etc.). Cross-platform, trimmable and AOT/NativeAOT compatible.
Featuring an opinionated API design and pragmatic implementation targeted at
machine learning use cases.
⭐ Please star this project if you like it. ⭐
🌃 Modern - utilizes features such as Span<T>, Generic Math (ISpanParsable<T>/ISpanFormattable), ref struct, ArrayPool<T> and similar from .NET 7+ and C# 11+ for a modern and highly efficient implementation.
🔎 Minimal - a succinct yet expressive API with few options and no hidden changes to input or output. What you read/write is what you get. E.g. by default there is no "automatic" escaping/unescaping of quotes or trimming of spaces. To enable this see SepReaderOptions and Unescaping and Trimming. See SepWriterOptions for Escaping.
🚀 Fast - blazing fast with both architecture specific and cross-platform SIMD vectorized parsing incl. 64/128/256/512-bit paths e.g. AVX2, AVX-512 (.NET 8.0+), NEON. Uses csFastFloat for fast parsing of floating points. See detailed benchmarks for cross-platform results.
🌪️ Multi-threaded - unparalleled speed with highly efficient parallel CSV parsing that is up to 35x faster than CsvHelper, see ParallelEnumerate and benchmarks.
🌀 Async support - efficient ValueTask based async/await support. Requires C# 13.0+ and for .NET 9.0+ includes SepReader implementing IAsyncEnumerable<>. See Async Support for details.
🗑️ Zero allocation - intelligent and efficient memory management allowing for zero allocations after warmup incl. supporting use cases of reading or writing arrays of values (e.g. features) easily without repeated allocations.
✅ Thorough tests - great code coverage and focus on edge case testing incl. randomized fuzz testing.
🌐 Cross-platform - works on any platform, any architecture supported by .NET. 100% managed and written in beautiful modern C#.
✂️ Trimmable and AOT/NativeAOT compatible - no problematic reflection or dynamic code generation. Hence, fully trimmable and Ahead-of-Time compatible. A simple console tester program can be published as an executable of just a few MBs. 💾
🗣️ Opinionated and pragmatic - conforms to the essentials of RFC-4180, but takes an opinionated and pragmatic approach towards this especially with regards to quoting and line ends. See section RFC-4180.
Example | Naming and Terminology | API | Limitations and Constraints | Comparison Benchmarks | Example Catalogue | RFC-4180 | FAQ | Public API Reference
Example
var text = """
A;B;C;D;E;F
Sep;🚀;1;1.2;0.1;0.5
CSV;✅;2;2.2;0.2;1.5
""";
using var reader = Sep.Reader().FromText(text); // Infers separator 'Sep' from header
using var writer = reader.Spec.Writer().ToText(); // Writer defined from reader 'Spec'
// Use .FromFile(...)/ToFile(...) for files
var idx = reader.Header.IndexOf("B");
var nms = new[] { "E", "F" };
foreach (var readRow in reader)           // Read one row at a time
{
    var a = readRow["A"].Span;            // Column as ReadOnlySpan<char>
    var b = readRow[idx].ToString();      // Column to string (might be pooled)
    var c = readRow["C"].Parse<int>();    // Parse any T : ISpanParsable<T>
    var d = readRow["D"].Parse<float>();  // Parse float/double fast via csFastFloat
    var s = readRow[nms].Parse<double>(); // Parse multiple columns as Span<T>
                                          // - Sep handles array allocation and reuse
    foreach (ref var v in s) { v *= 10; }

    using var writeRow = writer.NewRow(); // Start new row. Row written on Dispose.
    writeRow["A"].Set(a);                 // Set by ReadOnlySpan<char>
    writeRow["B"].Set(b);                 // Set by string
    writeRow["C"].Set($"{c * 2}");        // Set via InterpolatedStringHandler, no allocs
    writeRow["D"].Format(d / 2);          // Format any T : ISpanFormattable
    writeRow[nms].Format(s);              // Format multiple columns directly
    // Columns are added on first access as ordered, header written when first row written
}
var expected = """
A;B;C;D;E;F
Sep;🚀;2;0.6;1;5
CSV;✅;4;1.1;2;15

"""; // Empty line at end is for line ending,
     // which is always written.
Assert.AreEqual(expected, writer.ToString());
// Above example code is for demonstration purposes only.
// Short names and repeated constants are only for demonstration.
For more examples, incl. how to write and read objects (e.g. records) with
escape/unescape support, see Example Catalogue.
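As a minimal sketch of the "no hidden changes" default and the opt-in unescaping mentioned above, the following reads the same text twice: once with default options and once with `Unescape` enabled via `SepReaderOptions`. The input text and variable names are made up for illustration, and it assumes the Sep NuGet package is referenced.

```csharp
using System;
using nietras.SeparatedValues;

// Illustrative input: both columns of the data row are quoted.
var text = """
           Name;Comment
           "Sep";"fast; really"
           """;

// Default options: quotes are kept as part of the column text.
var rawName = "";
using (var rawReader = Sep.Reader().FromText(text))
{
    foreach (var row in rawReader)
    {
        rawName = row["Name"].ToString();
    }
}

// Opt in to unescaping: surrounding quotes are removed from the column text.
var unescapedName = "";
var unescapedComment = "";
using (var reader = Sep.Reader(o => o with { Unescape = true }).FromText(text))
{
    foreach (var row in reader)
    {
        unescapedName = row["Name"].ToString();
        unescapedComment = row["Comment"].ToString();
    }
}

Console.WriteLine(rawName);          // Quotes still present
Console.WriteLine(unescapedName);    // Quotes removed
Console.WriteLine(unescapedComment); // Separator inside quotes stays in the column
```

Keeping unescaping opt-in means the default path never rewrites column text, which is in line with the "what you read/write is what you get" design described earlier.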
Naming and Terminology
Sep uses naming and terminology that is not based on RFC-4180, but
is more tailored to usage in machine learning or similar. Additionally, Sep
takes a pragmatic approach towards names by using short names and abbreviations
where it makes sense and there should be no ambiguity given the context. For
example, Sep is used for Separator and Col for Column to keep code succinct.
|Term | Description |
|-----|-------------|
|Sep | Short for separator, also called delimiter. E.g. comma (,) is the separator for the separated values in a csv-file. |
|Header | Optional first row defining names of columns. |
|Row | A row is a collection of col(umn)s, which may span multiple lines. Also called record. |
|Col | Short for column, also called field. |
|Line | Horizontal set of characters until a line ending; \r\n, \r, \n. |
|Index | 0-based, that is, RowIndex will be 0 for the first row (or the header if present). |
|Number | 1-based, that is, LineNumber will be 1 for the first line (as in Notepad). Given that a row may span multiple lines, a row can have a From line number and a ToExcl line number, matching the C# range indexing syntax [LineNumberFrom..LineNumberToExcl]. |
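The difference between Index and Number can be sketched with a row whose quoted column spans two lines. This is an illustrative example only: the input text is made up, and it assumes the row exposes `RowIndex`, `LineNumberFrom` and `LineNumberToExcl` properties matching the names used in the table.

```csharp
using System;
using System.Collections.Generic;
using nietras.SeparatedValues;

// Illustrative input: the first data row contains a quoted column
// with an embedded line ending, so that row spans two lines.
var text = """
           A;B
           1;"spans
           two lines"
           2;x
           """;

var rows = new List<(int RowIndex, int From, int ToExcl)>();
using (var reader = Sep.Reader().FromText(text))
{
    foreach (var row in reader)
    {
        // Assumed property names, taken from the terminology table.
        rows.Add((row.RowIndex, row.LineNumberFrom, row.LineNumberToExcl));
    }
}

foreach (var (rowIndex, from, toExcl) in rows)
{
    Console.WriteLine($"RowIndex={rowIndex} Lines=[{from}..{toExcl}]");
}
```

Going by the definitions in the table, the header would be row index 0 on line 1, so the first data row would be expected to report row index 1 with a line range covering two lines, and the second data row a range covering one line.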
Application Programming Interface (API)
Besides being the succinct name of the library, Sep is both the main entry
point to using the library and the container for a validated separator. That is,
Sep is basically defined as:
public readonly record struct Sep(char Separator);
The separator char is validated upon construction and is guaranteed to be
within a limited range and not to be a char such as " (quote) or similar. This
can be seen in src/Sep/Sep.cs. The separator is also constrained
for internal optimizations, so not every char can be used as a separator.
⚠ Note that all types are within the namespace nietras.SeparatedValues and not
Sep since it is problematic to have a type and a namespace with the same name.
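A minimal sketch of constructing a separator and of the validation described above. It assumes the Sep NuGet package is referenced and that an invalid separator is rejected with an `ArgumentException` (or a derived type); the exact exception type is an assumption and is not confirmed by the text here.

```csharp
using System;
using nietras.SeparatedValues; // Namespace differs from the type name 'Sep'

// A valid separator is wrapped in the 'Sep' record struct.
var sep = new Sep(';');
Console.WriteLine(sep.Separator); // ;

// A quote is not accepted as separator according to the description above.
var rejected = false;
try
{
    _ = new Sep('"');
}
catch (ArgumentException e) // Assumption: exact exception type may differ
{
    rejected = true;
    Console.WriteLine($"Rejected: {e.Message}");
}
```

Validating once in the constructor means any `Sep` value passed around later can be treated as valid without re-checking.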
To get started you can use Sep as the static entry point to building either a
reader or writer. That is, for SepReader:
using var reader = Sep.Reader().FromFile("titanic.csv");
where .Reader() is a convenience method corresponding to:
usin
