ParsecClone

A tiny subset clone of the Parsec/FParsec combinators that supports string and binary parsing and is extensible in the stream sources it can consume.


This is an FParsec subset clone that works on generalized stream classes. This means you can use combinators on binary streams, strings, or any other custom stream classes you want. Included in the project are a sample CSV parser and a sample MP4 header binary parser.

Installation

Install ParsecClone via NuGet:

Install-Package ParsecClone

This will install the ParsecClone F# library.

The main ParsecClone.Combinator DLL includes:

  • The general operators, in ParsecClone.CombinatorBase
  • The string handling, in ParsecClone.StringCombinator
  • The binary operators, in ParsecClone.BinaryCombinator

Target Audience

The documentation below is intended for people who are familiar with combinator libraries. If you are not familiar with FParsec-style combinators and notation, you may want to run through the FParsec tutorials and explanations first.

While the following documentation is not as thorough as FParsec's, ParsecClone operators are very similar. Once you are familiar with FParsec operators and operator styles, the following documentation should be enough to get you on your way.

When to use and known limitations

ParsecClone is well suited for binary parsing over stream sources (memory streams, file streams, and so on). It supports bit-level parsing as well as byte-level parsing. Performance is close to native: in my tests it was only about 2x slower than hand-written C++, and it even exceeded C++ when the parser ran multiple times (since the JIT had already run).

ParsecClone can also parse strings, but it doesn't work on string streams. One reason is that regular expressions need unlimited lookahead into the stream, so with a stream you'd end up reading the whole thing in anyway. Since FParsec already works on streams, I chose not to duplicate that functionality.

If you have strings you can buffer into memory, ParsecClone will work great (so smaller files that you can read all in one go).

More importantly, ParsecClone is great for adding new stream sources to extend its capabilities. To do so just implement the IStreamP interface and hook into the matcher function in the base combinator library.

A few other caveats: ParsecClone's string parsing currently doesn't do any memoization, so you are stuck reparsing data. The binary parser, however, memoizes by default using a custom cache. You can disable this by passing None to the cache instantiator.

Types and notation

ParsecClone uses three main types for all of its combinators:

type State<'StateType, 'ConsumeType, 'UserState> = IStreamP<'StateType, 'ConsumeType, 'UserState>

type Reply<'Return, 'StateType, 'ConsumeType, 'UserState> = 'Return option * State<'StateType, 'ConsumeType, 'UserState>

type Parser<'Return, 'StateType, 'ConsumeType, 'UserState> = State<'StateType, 'ConsumeType, 'UserState> -> Reply<'Return, 'StateType, 'ConsumeType, 'UserState>

Since the types are kind of nasty, the operator examples below use a shorthand notation:

Parser<'Return> implies Parser<'Return,_,_,_>

If other type information is needed in the signature I'll use the full parser type signature.
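To make the shape of these types concrete, here is a minimal sketch. It assumes a plain string cursor in place of the library's IStreamP and drops the extra type parameters; `satisfy` and `digit` are hypothetical names used only for illustration, not ParsecClone functions.

```fsharp
// Assumption: a plain string cursor stands in for the library's IStreamP.
type State = { Input : string; Pos : int }

// Same shape as the Reply and Parser aliases above, minus the extra type parameters.
type Reply<'Return> = 'Return option * State
type Parser<'Return> = State -> Reply<'Return>

// Hypothetical primitive: consume one character when the predicate accepts it.
let satisfy (pred : char -> bool) : Parser<char> =
    fun state ->
        if state.Pos < state.Input.Length then
            let c = state.Input.[state.Pos]
            if pred c then (Some c, { state with Pos = state.Pos + 1 })
            else (None, state)
        else
            (None, state)

// A parser is just a function, so running it means applying it to a starting state.
let digit = satisfy (fun c -> System.Char.IsDigit c)
let result, rest = digit { Input = "7x"; Pos = 0 }
```

A reply of `(Some value, state)` signals success and `(None, state)` signals failure, which is the convention the operators below build on.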

Generic Operators

Included operators are

val (>>=) : Parser<'a> -> ('a -> Parser<'b>) -> Parser<'b>

Combines a parser with a callback: the parser's result is passed to the function, which returns the next parser to run.


val (>>=?) : Parser<'a> -> ('a -> Parser<'b>) -> Parser<'b>

Same as >>=, but with backtracking.


val (>>.) : Parser<'a> -> Parser<'b> -> Parser<'b>

Use result of second combinator


val (.>>) : Parser<'a> -> Parser<'b> -> Parser<'a>

Use result of first combinator


val preturn: 'a -> Parser<'a>

Return a value as a combinator


val pzero: unit -> Parser<'a>

Defines a zero (for use with folds and other starting states). Result is (None, state)
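The sequencing operators above can be sketched over a simplified parser type. This is an illustration under stated assumptions, not the library's source: the string cursor `State` and the `pchar` primitive are hypothetical, and this `>>=` does not backtrack on failure.

```fsharp
// Assumption: simplified state and parser types, not the library's IStreamP.
type State = { Input : string; Pos : int }
type Parser<'a> = State -> 'a option * State

// Hypothetical primitive: match a single expected character.
let pchar (c : char) : Parser<char> =
    fun state ->
        if state.Pos < state.Input.Length && state.Input.[state.Pos] = c then
            (Some c, { state with Pos = state.Pos + 1 })
        else
            (None, state)

// Wrap a plain value in a parser that always succeeds and consumes nothing.
let preturn (value : 'a) : Parser<'a> =
    fun state -> (Some value, state)

// Run p; on success feed its result to f and run the parser f returns.
let (>>=) (p : Parser<'a>) (f : 'a -> Parser<'b>) : Parser<'b> =
    fun state ->
        match p state with
        | Some value, next -> f value next
        | None, next -> (None, next)

// Keep only the second result.
let (>>.) (p1 : Parser<'a>) (p2 : Parser<'b>) : Parser<'b> =
    p1 >>= fun _ -> p2

// Keep only the first result.
let (.>>) (p1 : Parser<'a>) (p2 : Parser<'b>) : Parser<'a> =
    p1 >>= fun a -> p2 >>= fun _ -> preturn a

let start = { Input = "ab"; Pos = 0 }
let keepSecond, afterSecond = (pchar 'a' >>. pchar 'b') start
let keepFirst, afterFirst = (pchar 'a' .>> pchar 'b') start
```

Both combined parsers consume the same input; they differ only in which result they keep.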


val (|>>) : 'a -> ('a -> 'b) -> Parser<'b>

Pipe value into union or constructor


val (|>>%) : 'a -> Parser<'a>

Pipe to zero argument discriminated union


val (<|>) : Parser<'a> -> Parser<'b> -> Parser<'c>

Takes two parsers, and returns a new parser. The result is either the result of the first parser (if it succeeds) or the result of the second parser, as long as the or'd parsers don't modify the underlying state.
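One way to read the state condition is that the second parser only gets a turn when the first one failed without consuming input. The sketch below encodes that reading over a simplified string cursor. It is an assumption-laden illustration: `State` and `pchar` are hypothetical, and both parsers share one result type here for simplicity.

```fsharp
// Assumption: simplified state and parser types, not the library's IStreamP.
type State = { Input : string; Pos : int }
type Parser<'a> = State -> 'a option * State

// Hypothetical primitive: match a single expected character.
let pchar (c : char) : Parser<char> =
    fun state ->
        if state.Pos < state.Input.Length && state.Input.[state.Pos] = c then
            (Some c, { state with Pos = state.Pos + 1 })
        else
            (None, state)

// Try p1; fall back to p2 only if p1 failed and left the cursor where it started.
let (<|>) (p1 : Parser<'a>) (p2 : Parser<'a>) : Parser<'a> =
    fun state ->
        match p1 state with
        | Some value, next -> (Some value, next)
        | None, next when next.Pos = state.Pos -> p2 state
        | None, next -> (None, next)

let aOrB = pchar 'a' <|> pchar 'b'
let picked, afterPick = aOrB { Input = "b"; Pos = 0 }
```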


val (.<?>>.) : Parser<'a> -> Parser<'a list> -> Parser<'a list>

Takes a single parser and a list parser, and applies both parsers as options.

If the first parser succeeds and the second parser fails, returns a list of the result of the first parser (Some('a)::[]).

If the first parser succeeds and the second parser succeeds, returns a cons list of both results (Some('a)::Some('a) list). This operator does not backtrack but will not fail if the first parser doesn't succeed (since it's wrapped as an opt).

If the first parser fails, this parser fails.


val (.<<?>.) : Parser<'a list> -> Parser<'a> -> Parser<'a list>

The same as .<?>>. except with the arguments inverted. The list parser is first and the single parser is second.

If the first parser fails, this parser fails.


val (>>--): Parser<'a> -> (unit -> 'a) -> Parser<'a>

This operator lets you capture the actual invocation result of a parser. For example, say you want to time how long a parser takes. You can create a time function like this:

let time identifier func =
    let start = System.DateTime.Now
    let value = func()
    printfn "%s Took %s" identifier ((System.DateTime.Now - start).ToString())
    value

And time an operator like

let newParser = parserImpl >>-- time "parserImpl"

Internally the right hand function is delayed and not executed till we actually call the parser:

let (>>--) parser wrapper = 
        fun state -> 
            wrapper (fun () -> parser state)

val (>>|.): Parser<'a> -> ('a -> 'b) -> Parser<'b>

Takes a parser and a transformer, applies the result of the parser to the transformer and returns a new parser that returns the transformed result.


val many: Parser<'a> -> Parser<'a list>

Repeats a parser zero or more times, until the parser fails to match or the end of stream is encountered.
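A repetition combinator of this kind can be sketched as a loop that collects results until the inner parser fails. This is a minimal illustration over a hypothetical string cursor, not the library's implementation. The guard against a success that consumes nothing is an added assumption to keep the loop from running forever.

```fsharp
// Assumption: simplified state and parser types, not the library's IStreamP.
type State = { Input : string; Pos : int }
type Parser<'a> = State -> 'a option * State

// Hypothetical primitive: match a single expected character.
let pchar (c : char) : Parser<char> =
    fun state ->
        if state.Pos < state.Input.Length && state.Input.[state.Pos] = c then
            (Some c, { state with Pos = state.Pos + 1 })
        else
            (None, state)

// Apply p repeatedly, collecting results until it fails or stops consuming input.
let many (p : Parser<'a>) : Parser<'a list> =
    fun state ->
        let rec loop acc current =
            match p current with
            | Some value, next when next.Pos > current.Pos -> loop (value :: acc) next
            | _ -> (Some (List.rev acc), current)
        loop [] state

let letters, afterLetters = many (pchar 'a') { Input = "aab"; Pos = 0 }
```

Because zero matches is still a success, `many` in this sketch never returns `None`; an empty list signals that nothing matched.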


val matcher: (State<_, 'ConsumeType, _>  -> 'a -> int option) -> 'a -> Parser<'ConsumeType>

Generic match on a predicate; on a match, executes the state modifier to return the result.


val anyOf: ('a -> Parser<'a>) -> 'a list -> Parser<'a> 

Takes a function that maps the list into a bunch of parsers and or's each result together with the <|> combinator. For example: anyOf matchStr ["f";"o";"i";"g";"h";"t";"e";"r";"s";" "]


val choice: Parser<'a> list -> Parser<'a>

Takes a list of parsers and or's them together with <|>


val attempt: Parser<'a> -> Parser<'a>

If no match occurs or an exception happens, backtracks to the beginning of the state of the parser
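The backtracking behavior can be sketched as follows. With an immutable cursor, backtracking amounts to handing back the original state; a real stream would presumably need to seek instead. `State`, `pchar`, and `ab` are hypothetical names for illustration only.

```fsharp
// Assumption: simplified state and parser types, not the library's IStreamP.
type State = { Input : string; Pos : int }
type Parser<'a> = State -> 'a option * State

// Hypothetical primitive: match a single expected character.
let pchar (c : char) : Parser<char> =
    fun state ->
        if state.Pos < state.Input.Length && state.Input.[state.Pos] = c then
            (Some c, { state with Pos = state.Pos + 1 })
        else
            (None, state)

// Hypothetical parser: consumes 'a', then requires 'b'.
// On failure after the 'a', the cursor stays advanced.
let ab : Parser<string> =
    fun state ->
        match pchar 'a' state with
        | None, s -> (None, s)
        | Some _, s1 ->
            match pchar 'b' s1 with
            | None, s -> (None, s)
            | Some _, s2 -> (Some "ab", s2)

// On failure or exception, report failure from the state the parser started in.
let attempt (p : Parser<'a>) : Parser<'a> =
    fun state ->
        try
            match p state with
            | Some value, next -> (Some value, next)
            | None, _ -> (None, state)
        with _ -> (None, state)

let input = { Input = "ac"; Pos = 0 }
let _, withoutAttempt = ab input
let _, withAttempt = attempt ab input
```

Here `withoutAttempt` is left one character in, while `withAttempt` is back at the start, so a following alternative can retry from the same position.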


val takeTill: ('a -> bool) -> Parser<'a> -> Parser<'a list>

Takes a predicate and a parser and consumes until the predicate is true. Then backtracks one element


val takeWhile: ('a -> bool) -> Parser<'a> -> Parser<'a list>

Takes a predicate and a parser and consumes until the predicate is false. Then backtracks one element


val manyN: int -> Parser<'a> -> Parser<'a list>

Takes a number N and tries to apply the parser N times. If it doesn't match exactly N times, it fails. Aliased by exactly.


val many1: Parser<'a> -> Parser<'a list>

Repeats a parser one or more times (fails if no match found)


val lookahead: Parser<'a> -> Parser<'a>

Returns a result and a new parser, but backtracks the state
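Lookahead can be sketched as running the parser and then discarding the state it produced. This is a minimal illustration over a hypothetical string cursor (`State` and `pchar` are not library names), not the library's implementation.

```fsharp
// Assumption: simplified state and parser types, not the library's IStreamP.
type State = { Input : string; Pos : int }
type Parser<'a> = State -> 'a option * State

// Hypothetical primitive: match a single expected character.
let pchar (c : char) : Parser<char> =
    fun state ->
        if state.Pos < state.Input.Length && state.Input.[state.Pos] = c then
            (Some c, { state with Pos = state.Pos + 1 })
        else
            (None, state)

// Run p for its result only; always hand back the state we started with.
let lookahead (p : Parser<'a>) : Parser<'a> =
    fun state ->
        let result, _ = p state
        (result, state)

let peeked, afterPeek = lookahead (pchar 'a') { Input = "ab"; Pos = 0 }
```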


val manyTill: Parser<'a> -> Parser<'b> -> Parser<'a list>

Takes a parser and an end parser, and repeats the fir
