SkillAgentSearch skills...

Opal

Self-contained monadic parser combinators for OCaml

Install / Use

/learn @pyrocat101/Opal
About this skill

Quality Score

0/100

Supported Platforms

Universal

README

Opal: Monadic Parser Combinators for OCaml

Opal is a minimum collection of useful parsers and combinators (~150 loc of OCaml) that makes writing parsers easier. It is designed to be small, self-contained, pure-functional, and only includes most essential parsers, so that one could include single file in the project or just embed it in other OCaml source code files.

I find myself writing lots of recursive-descent parsers from scratch in OCaml when I was solving Hackerrank FP challenges. That's why I wrote opal: to include it in the source code and build parsers on top of parser combinators easily.

Example

Trivial arithmetic calculator:

open Opal

let parens = between (exactly '(') (exactly ')')
let integer = many1 digit => implode % int_of_string
let add = exactly '+' >> return ( + )
let sub = exactly '-' >> return ( - )
let mul = exactly '*' >> return ( * )
let div = exactly '/' >> return ( / )

let rec expr input = chainl1 term (add <|> sub) input
and term input = chainl1 factor (mul <|> div) input
and factor input = (parens expr <|> integer) input

let () =
  let input = LazyStream.of_channel stdin in
  match parse expr input with
  | Some ans -> Printf.printf "%d\n" ans
  | None -> print_endline "ERROR!"

For non-trivial examples, see Hackerrank challenge solutions using opal in examples/.

Documentation

The expressiveness of parser combinators are attributed to higher-order functions and the extensive use of currying. However, due to lack of do syntax, the bind operation of monad would not be as succinct as that in Haskell.

A parser monad is either None (indicates failure), or Some pair of result and unconsumed input, where the result is a user-defined value. The input is a lazy stream of arbitrary token type. A parser is a function that accepts an input and returns a parser monad. Although most parsers in opal is polymorphic over token type and result type, some useful parsers only accepts char as input token type.

Since combinators in opal are roughly based on Haskell's Parsec. The following documentation is somehow a rip-off of Parsec's doc.

Lazy Stream

type 'a LazyStream t

Polymorphic lazy stream type.

val LazyStream.of_stream : 'a Stream.t -> 'a LazyStream.t

Build a lazy stream from stream.

val LazyStream.of_function : (unit -> 'a) -> 'a LazyStream.t

Build a lazy stream from a function f. The elements in the stream is populated by calling f ().

val LazyStream.of_string : string -> char LazyStream.t

Build a char lazy stream from string.

val LazyStream.of_channel : in_channel -> char LazyStream.t

Build a char lazy stream from a input channel.

Utilities

val implode : char list -> bytes

Implode character list into a string. Useful when used with many.

val explode : bytes -> char list

Explode a string into a character list.

val ( % ) : ('a -> 'b) -> ('b -> 'c) -> 'a -> 'c

Infix operator for left-to-right function composition. (f % g % h) x is equivalent to h (g (f x)).

val parse : ('token, 'result) parser -> 'token LazyStream -> 'result option

parse parser input parses input with parser, and returns Some result if succeed, or None on failure.

Primitives

type 'token input = 'token LazyStream.t

type ('token, 'result) monad = ('result * 'token input) option

type ('token, 'result) parser = 'token input -> ('token, 'result) monad

A parser is a function that accepts an input and returns either None on failure, or Some (result, input'), where result is user-defined value and input' is unconsumed input after parsing.

val return : 'result -> 'token input -> ('token, 'result) monad

Accepts a value and an input, and returns a monad.

val ( >>= ) : ('t, 'r) parser -> ('r -> ('t, 'r) monad) -> ('t, 'r) parser

x >>= f returns a new parser that if parser x succeeds, applies function f on monad produced by x, and produces a new monad (a.k.a. bind).

val ( let* ) : ('t, 'r) parser -> ('r -> ('t, 'r) monad) -> ('t, 'r) parser

This operator is the same as >>= but using the let notation. It is usefull to avoid ugly sequences of bindings. For exemple, p >>= fun x -> f x can be rewritten let* x = p in f x. Combined with the return function, we can define complex parsers :

let tuple_parser =
  let* x = digit in
  let* _ = exactly ',' in
  let* y = digit in
  return (x, y)

val ( <|> ) : ('t, 'r) parser -> ('t, 'r) parser -> ('t, 'r) parser

Choice combinator. The parser p <|> q first applies p. If it succeeds, the value of p is returned. If p fails, parser q is tried.

val mzero : 'a -> ('t, 'r) monad

A parser that always fails.

val any : ('t, 'r) parser

The parser succeeds for any token in the input. Consumes a token and returns it.

val satisfy : ('t -> bool) -> ('t, 'r) parser

The parser satisfy test succeeds for any token for which the supplied function test returns true. Returns the token that is actually parsed.

val eof : 'a -> ('t, 'a) parser

The parser eof x succeeds if the input is exhausted. Returns value x.

Derived

val ( => ) : ('t, 'r) parser -> ('r -> 'a) -> ('t, 'a) parser

Map combinator. x => f parses x. If it succeeds, returns the value of x applied with f.

val ( >> ) : ('t, 'a) parser -> ('t, 'b) parser -> ('t, 'b) parser

Ignore-left combinator. x >> y parses x and then y. Returns the value returned by y.

val ( << ) : ('t, 'a) parser -> ('t, 'b) parser -> ('t, 'a) parser

Ignore-right combinator. x >> y parses x and then y. Returns the value returned by x.

val ( <~> ) : ('t, 'r) parser -> ('t, 'r list) parser -> 't input -> ('t, 'r list) monad

Cons combinator. x <~> y parses x and then y. Returns the value of x prepended to the value of y (a list).

let ident = letter <~> many alpha_num

val choice : ('t, 'r) parser list -> ('t, 'r) parser

choice ps tries to apply the parsers in the list ps in order, until one of them succeeds. Returns the value of the succeeding parser.

val count : int -> ('t, 'r) parser -> 't input -> ('t, 'r list) monad

count n parses n occurrences of p. If n is smaller or equal to zero, the parser equals to return []. Returns a list of n values returned by p.

between : ('t, 'a) parser -> ('t, 'b) parser -> ('t, 'r) parser -> ('t, 'r) parser

between open close p parses open, followed by p and close. Returns the value returned by p.

let braces = between (exactly '{') (exactly '}')

val option : 'r -> ('t, 'r) parser -> ('t, 'r) parser

option default p tries to apply parser p. If p fails, it returns the value default, otherwise the value returned by p.

let priority = option 0 (digit => String.make 1 % int_of_string)

val optional : 'r -> ('t, 'r) parser -> ('t, unit) parser

optional p tries to apply parser p. It will parse p or nothing. It only fails if p fails. Discard the result of p.

val skip_many : ('t, 'r) parser -> ('t, unit) parser

skip_many p applies p zero or more times, skipping its result.

let spaces = skip_many space

val skip_many1 : ('t, 'r) parser -> ('t, unit) parser

skip_many1 p applies p one or more times, skipping its result.

val many : ('t, 'r) parser -> 't input -> ('t, 'r list) monad

many p applies the parser p zero or more times. Returns a list of returned values of p.

val many1 : ('t, 'r) parser -> 't input -> ('t, 'r list) monad

many1 p applies the parser p one or more times. Returns a list of returned values of p.

val sep_by : ('t, 'r) parser -> ('t, 'a) parser -> 't input -> ('t, 'r list) monad

sep_by p sep parses zero or more occurrences of p, separated by sep. Returns a list of values returned by p.

let comma_sep p = sep_by p (token ",")

val sep_by1 : ('t, 'r) parser -> ('t, 'a) parser -> 't input -> ('t, 'r list) monad

sep_by1 p sep parses one or more occurrences of p, separated by sep. Returns a list of values returned by p.

val end_by: ('t, 'r) parser -> ('t, 'a) parser -> ('t, 'r) parser

end_by p sep parses zero or more ocurrences of p, separated and ended by sep. Returns a list of values returned by p.

let statements = end_by statement (token ";")

val end_by1: ('t, 'r) parser -> ('t, 'a) parser -> ('t, 'r) parser

end_by1 p sep parses one or more ocurrences of p, separated and ended by sep. Returns a list of values returned by p.

val chainl : ('t, 'r) parser -> ('t, 'r -> 'r -> 'r) parser -> 'r -> ('t, 'r) parser

chainl p op default parses zero or more occurrences of p, separated by op. Returns a value obtained by a left associative application of all functions by op to the values returned by p. If there are zero occurences of p, the value default is returned.

val chainl1 : ('t, 'r) parser -> ('t, 'r -> 'r -> 'r) parser -> ('t, 'r) parser

chainl1 p op parses one or more occurrences of p, separate by op. Returns a value obtained by a left associative application of all functions returned by op to the values returned by p. This parser can be used to eliminate left recursion which typically occurs in expression grammars. See the arithmetic caculator example above.

val chainr : ('t, 'r) parser -> ('t, 'r -> 'r -> 'r) parser -> 'r -> ('t, 'r) parser

chainr p op default parses zero or more occurrences of p, separated by op. Returns a value obtained by right associative application of all functions returned by op to the values returned by p. If there are no occurrences of p, the value x is returned.

val chainr1 : ('t, 'r) parser -> ('t, 'r -> 'r -> 'r) parser -> ('t, 'r) parser

chainr p op parses one or more occurrences of p, separated

Related Skills

View on GitHub
GitHub Stars153
CategoryDevelopment
Updated26d ago
Forks14

Languages

OCaml

Security Score

100/100

Audited on Mar 10, 2026

No findings