ParsecClone
A tiny subset clone of parsec/fparsec combinators that supports string and binary parsing and is extensible in which stream sources it can use.
This is an FParsec subset clone that works on generalized stream classes. This means you can use combinators on binary streams, strings, or any other custom stream classes you want. Included in the project are a sample CSV parser and a sample MP4 header binary parser.
Table of contents
- Installation
- Target audience for documentation
- When to use and known limitations
- Types and notation
- Generic operators
- String operators
- Binary operators
- Bit parsers
- Bit parsing order
- Computation Expression Syntax
- A note on FParsec vs ParsecClone regarding strings
- Instantiating user states
- Debugging
- Dealing with value restrictions
- A CSV parser example (string)
- An MP4 parser example (binary)
- Bit parsing example
- Improving binary performance
Installation
Install ParsecClone via NuGet
Install-Package ParsecClone
This will install the ParsecClone F# library.
Included in the main ParsecClone.Combinator dll are:
- The general operators, in ParsecClone.CombinatorBase
- The string handling, in ParsecClone.StringCombinator
- And the binary operators, in ParsecClone.BinaryCombinator
Target Audience
The documentation below is intended for people who are familiar with combinator libraries. If you are not familiar with FParsec style combinators and notation, you may want to run through their tutorials and explanations first.
While the following documentation is not as robust as theirs, ParsecClone operators are very similar. Once you are familiar with FParsec operators and operator styles, the following documentation should be enough to get you on your way.
When to use and known limitations
ParsecClone is well suited for binary parsing that works on stream sources (memory streams, file streams, etc). Not only can you do byte-level parsing, but also bit-level parsing. Performance of ParsecClone is close to native. In my tests it was only about 2x slower than hand-written C++. Performance even exceeded C++ if you ran the parser multiple times (since the JIT had already run)!
ParsecClone can also parse strings, but doesn't work on string streams. One of the reasons is that to use regular expressions you need unlimited lookahead into your stream. With a stream you'd end up having to read the whole stream in anyway! Since FParsec works on streams, I chose not to duplicate that functionality.
If you have strings you can buffer into memory, ParsecClone will work great (so smaller files that you can read all in one go).
More importantly, ParsecClone is great for adding new stream sources to extend its capabilities. To do so just implement the IStreamP interface and hook into the matcher function in the base combinator library.
A few other caveats. Currently ParsecClone's string parsing doesn't do any memoization, so you are stuck reparsing data. However, by default the binary parser does memoize using a custom cache. You can disable this by passing in a None to the cache instantiator.
Types and notation
ParsecClone uses 3 main types for all of its combinators.
type State<'StateType, 'ConsumeType, 'UserState> = IStreamP<'StateType, 'ConsumeType, 'UserState>
type Reply<'Return, 'StateType, 'ConsumeType, 'UserState> = 'Return option * State<'StateType, 'ConsumeType, 'UserState>
type Parser<'Return, 'StateType, 'ConsumeType, 'UserState> = State<'StateType, 'ConsumeType, 'UserState> -> Reply<'Return, 'StateType, 'ConsumeType, 'UserState>
Since the types are kind of nasty, in the following operator examples I will use a shorthand notation of
Parser<'Return> implies Parser<'Return,_,_,_>
If other type information is needed in the signature I'll use the full parser type signature.
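To make the shape of these three types concrete, here is a minimal sketch using simplified stand-ins. It assumes a plain string cursor in place of IStreamP, drops the 'StateType, 'ConsumeType and 'UserState parameters, and introduces a hypothetical pchar helper. None of this is the library's own code.

```fsharp
// Simplified stand-ins for State, Reply and Parser (assumption: a string plus a
// cursor replaces IStreamP, and the extra type parameters are dropped).
type State = { Input: string; Pos: int }
type Reply<'Return> = 'Return option * State
type Parser<'Return> = State -> Reply<'Return>

// Hypothetical helper: consume one character if it equals `c`.
let pchar (c: char) : Parser<char> =
    fun state ->
        if state.Pos < state.Input.Length && state.Input.[state.Pos] = c then
            (Some c, { state with Pos = state.Pos + 1 })
        else
            (None, state)

// A parser is just a function from a state to an optional result plus a new state.
let matched = pchar 'a' { Input = "abc"; Pos = 0 }    // (Some 'a', cursor at 1)
let unmatched = pchar 'x' { Input = "abc"; Pos = 0 }  // (None, cursor still at 0)
```

A failed parse is represented by None in the first slot of the reply, which is why every combinator in the sections below can decide what to do by matching on that option.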
Generic Operators
Included operators are
val (>>=) : Parser<'a> -> ('a -> Parser<'b>) -> Parser<'b>
Combiner with function callback
val (>>=?) : Parser<'a> -> ('a -> Parser<'b>) -> Parser<'b>
Combiner with function callback and backtracking
val (>>.) : Parser<'a> -> Parser<'b> -> Parser<'b>
Use result of second combinator
val (.>>) : Parser<'a> -> Parser<'b> -> Parser<'a>
Use result of first combinator
val preturn: 'a -> Parser<'a>
Return a value as a combinator
val pzero: unit -> Parser<'a>
Defines a zero (for use with folds and other starting states). Result is (None, state)
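As a rough illustration of how >>=, >>., .>> and preturn relate to each other, the sketch below rebuilds them over a simplified string-cursor state. The State record and the pchar helper are assumptions made for this example, and the library's real implementations may differ.

```fsharp
// Simplified stand-ins (assumption: not the library's IStreamP-based types).
type State = { Input: string; Pos: int }
type Parser<'a> = State -> 'a option * State

// Return a value without consuming input.
let preturn (value: 'a) : Parser<'a> = fun state -> Some value, state

// Bind: run `parser`, and on success feed its result to `next`.
let (>>=) (parser: Parser<'a>) (next: 'a -> Parser<'b>) : Parser<'b> =
    fun state ->
        match parser state with
        | Some value, rest -> next value rest
        | None, rest -> None, rest

// Both sequencing operators fall out of bind.
let (>>.) (first: Parser<'a>) (second: Parser<'b>) : Parser<'b> =
    first >>= fun _ -> second
let (.>>) (first: Parser<'a>) (second: Parser<'b>) : Parser<'a> =
    first >>= fun a -> second >>= fun _ -> preturn a

// Hypothetical helper: consume one character if it equals `c`.
let pchar (c: char) : Parser<char> =
    fun state ->
        if state.Pos < state.Input.Length && state.Input.[state.Pos] = c then
            (Some c, { state with Pos = state.Pos + 1 })
        else
            (None, state)

let start = { Input = "ab"; Pos = 0 }
let keepSecond = (pchar 'a' >>. pchar 'b') start  // (Some 'b', cursor at 2)
let keepFirst = (pchar 'a' .>> pchar 'b') start   // (Some 'a', cursor at 2)
```

Both parsers consume the same two characters. The only difference is which result survives.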
val (|>>) : 'a -> ('a -> 'b) -> Parser<'b>
Pipe value into union or constructor
val (|>>%) : 'a -> Parser<'a>
Pipe to zero argument discriminated union
val (<|>) : Parser<'a> -> Parser<'b> -> Parser<'c>
Takes two parsers, and returns a new parser. The result is either the result of the first parser (if it succeeds) or the result of the second parser, as long as the or'd parsers don't modify the underlying state.
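The behavior described for <|> can be sketched over a simplified string-cursor state. This is a model only: it assumes a failed parser returns its input state untouched, it gives both alternatives the same result type, and the State record and pchar helper are hypothetical.

```fsharp
// Simplified stand-ins (assumption: not the library's IStreamP-based types).
type State = { Input: string; Pos: int }
type Parser<'a> = State -> 'a option * State

// Hypothetical helper: consume one character if it equals `c`.
let pchar (c: char) : Parser<char> =
    fun state ->
        if state.Pos < state.Input.Length && state.Input.[state.Pos] = c then
            (Some c, { state with Pos = state.Pos + 1 })
        else
            (None, state)

// Or: keep the first parser's reply if it succeeded, else try the second
// parser from the same starting state.
let (<|>) (first: Parser<'a>) (second: Parser<'a>) : Parser<'a> =
    fun state ->
        match first state with
        | (Some _, _) as reply -> reply
        | None, _ -> second state

let aOrB = pchar 'a' <|> pchar 'b'
let firstWins = aOrB { Input = "a"; Pos = 0 }   // (Some 'a', cursor at 1)
let secondWins = aOrB { Input = "b"; Pos = 0 }  // (Some 'b', cursor at 1)
let neither = aOrB { Input = "c"; Pos = 0 }     // (None, cursor at 0)
```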
val (.<?>>.) : Parser<'a> -> Parser<'a list> -> Parser<'a list>
Takes a single parser and a list parser, and applies both parsers as options.
If the first parser succeeds and the second parser fails, returns a list of the result of the first parser (Some('a)::[]).
If the first parser succeeds and the second parser succeeds, returns a cons list of both results (Some('a)::Some('a) list). This operator does not backtrack but will not fail if the second parser doesn't succeed (since it's wrapped as an opt).
If the first parser fails, this parser fails.
val (.<<?>.) : Parser<'a list> -> Parser<'a> -> Parser<'a list>
The same as .<?>>. except with the arguments inverted. The list parser is first and the single parser is second.
If the first parser fails, this parser fails.
val (>>--): Parser<'a> -> (unit -> 'a) -> Parser<'a>
This operator lets you capture the actual invocation result of a parser. For example, say you want to time how long a parser takes. You can create a time function like this:
let time identifier func =
    // Record the start time, run the delayed parser, then report the elapsed time
    let start = System.DateTime.Now
    let value = func()
    printfn "%s Took %s" identifier ((System.DateTime.Now - start).ToString())
    value
And time an operator like
let newParser = parserImpl >>-- time "parserImpl"
Internally the right hand function is delayed and not executed till we actually call the parser:
let (>>--) parser wrapper =
    fun state ->
        wrapper (fun () -> parser state)
val (>>|.): Parser<'a> -> ('a -> 'b) -> Parser<'b>
Takes a parser and a transformer, applies the result of the parser to the transformer and returns a new parser that returns the transformed result.
val many: Parser<'a> -> Parser<'a list>
Repeats a parser zero or more times, until the parser fails to match or the end of stream is encountered.
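A minimal sketch of the zero-or-more loop, written over a simplified string-cursor state. The State record and pchar helper are assumptions for illustration. The progress guard is this sketch's own addition to avoid looping forever on a parser that succeeds without consuming anything. It is not a claim about the library's behavior.

```fsharp
// Simplified stand-ins (assumption: not the library's IStreamP-based types).
type State = { Input: string; Pos: int }
type Parser<'a> = State -> 'a option * State

// Hypothetical helper: consume one character if it equals `c`.
let pchar (c: char) : Parser<char> =
    fun state ->
        if state.Pos < state.Input.Length && state.Input.[state.Pos] = c then
            (Some c, { state with Pos = state.Pos + 1 })
        else
            (None, state)

// Apply `parser` repeatedly, collecting results until it fails or stops advancing.
let many (parser: Parser<'a>) : Parser<'a list> =
    fun state ->
        let rec loop acc current =
            match parser current with
            | Some value, next when next.Pos > current.Pos -> loop (value :: acc) next
            | _ -> Some (List.rev acc), current
        loop [] state

let twoAs = many (pchar 'a') { Input = "aab"; Pos = 0 }  // (Some ['a'; 'a'], cursor at 2)
let noAs = many (pchar 'a') { Input = "b"; Pos = 0 }     // (Some [], cursor at 0)
```

Zero matches still counts as success in this model, which is the difference from many1.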
val matcher: (State<_, 'ConsumeType, _> -> 'a -> int option) -> 'a -> Parser<'ConsumeType>
Generic match: applies the predicate and, on a match, executes the state modifier to return the result
val anyOf: ('a -> Parser<'a>) -> 'a list -> Parser<'a>
Takes a function that maps the list into a bunch of parsers and or's each result together with the <|> combinator. For example: anyOf matchStr ["f";"o";"i";"g";"h";"t";"e";"r";"s";" "]
val choice: Parser<'a> list -> Parser<'a>
Takes a list of parsers and or's them together with <|>
val attempt: Parser<'a> -> Parser<'a>
If no match occurs or an exception happens, backtracks to the beginning of the state of the parser
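To show what backtracking buys you, the sketch below models attempt over a simplified string-cursor state. The State record, the pchar helper, and the choice to have >>. report the partially advanced state on failure are all assumptions made so the effect is visible. The library's internals may work differently.

```fsharp
// Simplified stand-ins (assumption: not the library's IStreamP-based types).
type State = { Input: string; Pos: int }
type Parser<'a> = State -> 'a option * State

// Hypothetical helper: consume one character if it equals `c`.
let pchar (c: char) : Parser<char> =
    fun state ->
        if state.Pos < state.Input.Length && state.Input.[state.Pos] = c then
            (Some c, { state with Pos = state.Pos + 1 })
        else
            (None, state)

// Sequence two parsers; on failure this model reports the state reached so far.
let (>>.) (first: Parser<'a>) (second: Parser<'b>) : Parser<'b> =
    fun state ->
        match first state with
        | Some _, rest -> second rest
        | None, rest -> None, rest

// On failure or exception, hand back the state the parser started from.
let attempt (parser: Parser<'a>) : Parser<'a> =
    fun state ->
        try
            match parser state with
            | (Some _, _) as reply -> reply
            | None, _ -> None, state
        with _ ->
            None, state

let input = { Input = "ac"; Pos = 0 }
let withoutAttempt = (pchar 'a' >>. pchar 'b') input       // (None, cursor at 1)
let withAttempt = attempt (pchar 'a' >>. pchar 'b') input  // (None, cursor at 0)
```

Without attempt, the 'a' stays consumed after the failure. With it, the next alternative starts from the original position.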
val takeTill: ('a -> bool) -> Parser<'a> -> Parser<'a list>
Takes a predicate and a parser and consumes until the predicate is true. Then backtracks one element
val takeWhile: ('a -> bool) -> Parser<'a> -> Parser<'a list>
Takes a predicate and a parser and consumes until the predicate is false. Then backtracks one element
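The "consume, then back up one element" behavior can be modeled as below. This is a sketch over a simplified string-cursor state with a hypothetical anyChar helper. Returning an empty list when nothing matches is an assumption of the sketch, not something the source states.

```fsharp
// Simplified stand-ins (assumption: not the library's IStreamP-based types).
type State = { Input: string; Pos: int }
type Parser<'a> = State -> 'a option * State

// Hypothetical helper: consume any single character, failing at end of input.
let anyChar : Parser<char> =
    fun state ->
        if state.Pos < state.Input.Length then
            (Some state.Input.[state.Pos], { state with Pos = state.Pos + 1 })
        else
            (None, state)

// Keep results while `predicate` holds; the first rejected element is not
// consumed because the loop returns the state from before it was read.
let takeWhile (predicate: 'a -> bool) (parser: Parser<'a>) : Parser<'a list> =
    fun state ->
        let rec loop acc current =
            match parser current with
            | Some value, next when predicate value -> loop (value :: acc) next
            | _ -> Some (List.rev acc), current
        loop [] state

let digits = takeWhile System.Char.IsDigit anyChar { Input = "123ab"; Pos = 0 }
// digits = (Some ['1'; '2'; '3'], cursor at 3), so 'a' is still unread
```

takeTill would be the same loop with the predicate negated.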
val manyN: int -> Parser<'a> -> Parser<'a list>
Takes a number and tries to consume N parsers. If it doesn't consume exactly N it will fail. Aliased by exactly.
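A sketch of the exactly-N rule over a simplified string-cursor state. The State record and pchar helper are hypothetical, and restoring the original state on failure is an assumption of this model rather than documented library behavior.

```fsharp
// Simplified stand-ins (assumption: not the library's IStreamP-based types).
type State = { Input: string; Pos: int }
type Parser<'a> = State -> 'a option * State

// Hypothetical helper: consume one character if it equals `c`.
let pchar (c: char) : Parser<char> =
    fun state ->
        if state.Pos < state.Input.Length && state.Input.[state.Pos] = c then
            (Some c, { state with Pos = state.Pos + 1 })
        else
            (None, state)

// Run `parser` exactly `count` times; any failure fails the whole parser.
let manyN (count: int) (parser: Parser<'a>) : Parser<'a list> =
    fun state ->
        let rec loop remaining acc current =
            if remaining = 0 then
                (Some (List.rev acc), current)
            else
                match parser current with
                | Some value, next -> loop (remaining - 1) (value :: acc) next
                | None, _ -> None, state
        loop count [] state

let exactlyTwo = manyN 2 (pchar 'a') { Input = "aaa"; Pos = 0 }  // (Some ['a'; 'a'], cursor at 2)
let tooFew = manyN 3 (pchar 'a') { Input = "aab"; Pos = 0 }      // (None, cursor at 0)
```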
val many1: Parser<'a> -> Parser<'a list>
Repeats a parser one or more times (fails if no match found)
val lookahead: Parser<'a> -> Parser<'a>
Returns a result and a new parser, but backtracks the state
val manyTill: Parser<'a> -> Parser<'b> -> Parser<'a list>
Takes a parser and an end parser, and repeats the fir
