Herbert
Clojure library defining a schema for edn values
We turn our backs on confusion and seek the beginning. -- Sevrin
Note: Clojure 1.9 will introduce a new core library, known as clojure.spec, which makes Herbert obsolete.
A schema language for edn (Clojure data).
The extensible data notation (edn) defines a useful subset of Clojure data types. As described on edn-format.org:
edn is a system for the conveyance of values. It is not a type system, and has no schemas.
The explicit lack of schemas in edn stands in marked contrast to many serialization libraries which use an interface definition language. The edn values essentially speak for themselves, without the need for a separate description or layer of interpretation. That is not to say that schemas aren't potentially useful; they're just not part of the definition of the edn format.
The goal of the Herbert project is to provide a convenient schema language for defining edn data structures that can be used for documentation and validation. The schema patterns are represented as edn values.
Leiningen
Herbert is available from Clojars. Add the following dependency to your project.clj:
Usage
The main namespace is miner.herbert. The conforms? predicate takes a schema pattern and
a value to test. It returns true if the value conforms to the schema pattern, false
otherwise.
The conform function is used to build a test function. Given a schema, it returns a function of
one argument that will execute a match against the schema pattern and return a map of bindings if
successful or nil for a failed match. If you need to know how the schema bindings matched a value
or you want to test against a schema multiple times, you should use conform to define a test
function.
Quick example:
(require '[miner.herbert :as h])
(h/conforms? '{:a int :b [sym+] :c str} '{:a 42 :b [foo bar baz] :c "foo"})
;=> true
;; For better performance, create a test function with `h/conform`.
(def my-test (h/conform '{:a (:= A int) :b [sym+] :c str}))
(my-test '{:a 42 :b [foo bar baz] :c "foo"})
;=> {A 42}
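As a further sketch of the reuse case described above, a function built once with h/conform can be applied to many values. The schema, the binding name ID, and the sample values below are made up for illustration; only h/conform and its "bindings map or nil" result come from the description above.

```clojure
(require '[miner.herbert :as h])

;; Build the test function once. The schema is a hypothetical example
;; that binds the value under :id to the name ID.
(def check-user (h/conform '{:id (:= ID int) :name str}))

;; Hypothetical input values: the second has a string where an int is required.
(def users [{:id 1 :name "ann"}
            {:id "2" :name "bob"}
            {:id 3 :name "cy"}])

;; A failed match returns nil, so `keep` drops the failures and
;; collects the binding maps of the values that conform.
(keep check-user users)
```

Under these assumptions, only the first and third values should contribute a bindings map to the result.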
Test.Check integration
The property function takes a predicate and a schema as arguments and returns a
test.check property suitable for generative testing. (test.check also has a defspec
macro for use with clojure.test.) If you just want the generator for a schema, call
generator. The sample function is similar to the test.check version but takes a schema
instead of a generator.
(require '[miner.herbert.generators :as hg])
(require '[clojure.test.check :as tc])
;; trivial example
(tc/quick-check 100 (hg/property integer? 'int))
;; confirm the types of the values
(tc/quick-check 100 (hg/property (fn [m] (and (integer? (:int m)) (string? (:str m))))
                                 '{:int int :str str :kw kw}))
;; only care about the 42 in the right place
(tc/quick-check 100 (hg/property (fn [m] (== (get-in m [:v 2 :int]) 42))
                                 '{:v (vec kw kw {:int 42} kw) :str str}))
;; samples from a schema generator
(clojure.test.check.generators/sample (hg/generator '[int*]))
;=> (() (9223372036854775807) [9223372036854775807] () [] (1 1) () [-7] (4) [-5])
;; generate samples directly from a schema (notice the "hg" namespace)
(hg/sample '[int*])
;=> (() [-1 0] () (9223372036854775807) [9223372036854775807] [] () () [0 -5] [9223372036854775807] (7 9223372036854775807) [] (12 -11) (-9223372036854775808) [-12 9223372036854775807] [-10 9223372036854775807] (-11) [2] (-11) [-7])
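The defspec macro mentioned above can wrap a Herbert property so that it runs as an ordinary clojure.test test. The following is a minimal sketch, assuming test.check's clojure.test.check.clojure-test namespace is on the classpath; the test name and the predicate are illustrative and not part of Herbert.

```clojure
(require '[clojure.test :refer [run-tests]]
         '[clojure.test.check.clojure-test :refer [defspec]]
         '[miner.herbert.generators :as hg])

;; defspec defines a clojure.test test that checks the property 100 times.
;; The schema [int*] generates sequentials of ints, so the predicate
;; checks both the container and its elements.
(defspec int-seqs-contain-only-integers 100
  (hg/property (fn [v] (and (sequential? v) (every? integer? v)))
               '[int*]))

;; Run it like any other clojure.test test in the current namespace.
(run-tests)
```

Defining the property with defspec, rather than calling quick-check directly, lets the generative test run alongside the rest of a clojure.test suite.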
Notation for Schema Patterns
- Literal constants match themselves:
  nil, true, false, numbers, "strings", :keywords
- Empty literal collections match themselves:
  [], (), {}
- A simple schema pattern is named by a symbol:
  - int - integer
  - float - floating-point
  - str - string
  - kw - keyword
  - sym - symbol
  - vec - vector
  - list - list or cons (actually anything that satisfies clojure.core/seq?)
  - seq - any sequential (including vectors)
  - map - map
  - char - character
  - bool - boolean
  - any - anything
- A few additional schema patterns for numeric sub-types:
  - num - any number
  - pos - positive number
  - neg - negative number
  - zero - zero number
  - even - even integer
  - odd - odd integer
- A quantified schema pattern: adding a *, + or ? at the end of a symbol for zero-or-more, one-or-more, or zero-or-one (optional):
  int*, str+, sym?
- A quoted expression matches itself without any other interpretation:
  'foo? matches the symbol foo? literally.
- A compound schema pattern: using and, or and not
  (or sym+ nil) -- one or more symbols or nil
  (or (vec int*) (list kw+)) -- either a vector of ints or a list of one or more keywords
- A quantified schema pattern: a list beginning with *, + or ? as the first element.
  (* kw sym) -- zero or more cycles of keywords and symbols
- A named schema expression is written as a list with the first element being the := operator, followed by a (non-reserved) symbol as the binding name, and the rest of the list being a schema pattern. The names of predicates and special operators (like and, or, etc.) are not allowed as binding names. The name may be used as a parameter to other schema patterns. Also, the name may be used in the pattern expression to create a recursive pattern.
  (:= N int 1 10) -- matches 1 to 10 (inclusive)
  (:= A (or :a [:b A])) -- matches [:b [:b [:b :a]]]
- A bound symbol matches an element equal to the value that the name was bound to previously.
  [(:= N int) N N] -- matches [3 3 3]
- A literal vector [in square brackets] matches any sequential (not just a vector) with the contained pattern.
  [(* kw sym)] -- matches (:a foo :b bar :c baz) and [:a foo]
- A literal map in {curly braces} matches any map with the given literal keys and values matching the corresponding schemas. Optional keywords are written with a ? suffix such as :kw?. (Use a quote mark to match a literal keyword ending with ?. ':k? matches :k? literally without any special interpretation of the ? suffix.) For convenience, an optional keyword schema implicitly allows nil for the corresponding value. An empty literal map {} matches exactly the empty map. Use map to match any map.
  {:a int :b sym :c? [int*]} -- matches {:a 10 :b foo :c [1 2 3]} and {:a 1 :b bar}
  {:x? sym ':k? int} -- matches {:k? 10} but not {:k 10} because the keyword was quoted.
- A literal map in {curly braces} may also contain a single pair of patterns with a non-literal key pattern. All keys and values in the map must match the corresponding patterns. This kind of pattern is useful for matching "functional" maps.
  {kw int} -- matches {:a 10 :b 20}, but not {:a 1 :b "bar"}
- A literal #{set} with multiple schema patterns denotes the required elements, but does not exclude others. A single element might match multiple patterns. A set with a quantified schema pattern defines the requirement on all elements.
  #{int :a :b} -- matches #{:a :b :c 10}, but not #{:a 10}
  #{int+} -- matches #{1 3 5}, but not #{1 :a 3}
- Numeric schema patterns, such as int, even, odd, float, or num, may take optional parameters in a list following the pattern name. Numerics take a low and a high parameter. The value must be between the low and the high (inclusive) for it to match. If only one parameter is given, it defines the high, and the low defaults to 0 in that case. If neither is given, there is no restriction on the high or low values. Quantified numeric patterns apply the high and low to all the matched elements.
  (int 1 10) -- matches 4, but not 12
- String, symbol and keyword schema patterns (such as str, sym and kw) may take an optional regex argument, specified as a string (for edn compatibility) or a Clojure regular expression (like #"[Rr]ege?x"). In that case, the pr-str of the element must match the regex.
  (kw ":user/.*") -- matches :user/foo
- An inlined schema pattern: a list starting with & as the first element refers to multiple elements in order (as opposed to being within a collection). It can be useful for adding when tests where an extra element would not normally be allowed.
  {:a (:= N int) :b (& (:= F float) (> N F))} -- matches {:a 4 :b 3.14}
- The map schema predicate matches a map. It takes the same arguments as the {curly brace} literal map schema. With no arguments, (map) matches any map, same as map. Use {} to match the empty map.
  (map :a int :b sym :c? [int*]) -- matches {:a 10 :b foo :c [1 2 3]} and {:a 1 :b bar}
- The list schema predicate matches a list or cons. It can take multiple optional arguments to specify the schemas for the ordered elements of the list.
  (list sym (* kw int)) -- matches (foo :a 42 :b 52 :c 22)
- The vec schema predicate matches a vector. It can take multiple optional arguments to specify the schemas for the ordered elements of the vector.
  (vec int (* sym int)) -- matches [4 foo 42 bar 52]
- The seq schema predicate matches any sequential (vector or list). It's basically the same as using the [square bracket] notation.
  (seq kw int sym) -- matches

