Evalfilter
A bytecode-based virtual machine to implement scripting/filtering support in your golang project.
- eval-filter
- Sample Usage
- Standalone Use
- Benchmarking
- Fuzz Testing
- API Stability
- See Also
- Github Setup
eval-filter
The evalfilter package provides an embeddable evaluation-engine, which allows simple logic which might otherwise be hardwired into your golang application to be delegated to (user-written) script(s).
There is no shortage of embeddable languages available to the golang world. This library is intended to be something that is:
- Simple to embed.
- Simple to use, as there are only three methods you need to call:
  - New
  - Prepare
  - Then either Execute(object) or Run(object) depending upon what kind of return value you would like.
- Simple to understand.
- As fast as it can be, without being too magical.
The scripting language is C-like, and is generally intended to allow you to filter objects. You might call the same script upon multiple objects, and the script will return either true or false to denote whether your application should take some action against that particular object.
It certainly is possible for you to handle arbitrary return-values from the script(s) you execute, and indeed the script itself could call back into your application to carry out tasks, via the addition of new primitives implemented and exported by your host application, which would make the return value almost irrelevant.
If you go down that route then this repository contains a general-purpose scripting-language, which can be used to execute user-supplied scripts.
My Google GMail message labeller uses evalfilter in such a standalone manner, executing a script for each new/unread email by default. The script can then add labels to messages based upon their sender, recipients, subject, etc. The notion of filtering doesn't make sense there; it just wants to execute flexible operations on messages.
However the ideal use-case, for which this was designed, is that your application receives objects of some kind, perhaps as a result of incoming webhook submissions, network events, or similar, and you wish to decide how to handle those objects in a flexible fashion.
Implementation
In terms of implementation the script to be executed is split into tokens by the lexer, then those tokens are parsed into an abstract-syntax-tree. Once the AST exists it is walked by the compiler and a series of bytecode instructions are generated.
Once the bytecode has been generated it can be executed multiple times. There is no state which needs to be maintained, which makes actually executing the script (i.e. running the bytecode) a fast process.
At execution-time the bytecode which was generated is interpreted by a naive stack-based virtual machine, with some runtime support to provide the built-in functions, as well as supporting the addition of host-specific functions.
The bytecode itself is documented briefly in BYTECODE.md. You should not need to understand it to use the library; it is only of interest if you're debugging a misbehaving script.
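To make the compile-once, run-many idea concrete, here is a minimal sketch of a stack-based interpreter in Go. It is not the package's real virtual machine: the opcodes, the Instr type, and the Event structure are all invented for illustration, and the program is compiled by hand rather than by a lexer and parser.

```go
package main

import "fmt"

// Opcode identifies one instruction of this toy VM.
// These opcodes are hypothetical; BYTECODE.md describes the real set.
type Opcode int

const (
	OpPush      Opcode = iota // push the instruction's operand
	OpLoadCount               // push the Count field of the current object
	OpGreater                 // pop b, pop a, push 1 if a > b, else 0
	OpReturn                  // report the top of the stack and stop
)

// Instr is a single bytecode instruction with an optional operand.
type Instr struct {
	Op  Opcode
	Arg int
}

// Event is a hypothetical host object that the filter is run against.
type Event struct {
	Count int
}

// run interprets prog against one object. All state lives in the local
// stack, so the same program can be reused for any number of objects.
func run(prog []Instr, ev Event) (bool, error) {
	var stack []int
	for _, in := range prog {
		switch in.Op {
		case OpPush:
			stack = append(stack, in.Arg)
		case OpLoadCount:
			stack = append(stack, ev.Count)
		case OpGreater:
			if len(stack) < 2 {
				return false, fmt.Errorf("stack underflow")
			}
			a, b := stack[len(stack)-2], stack[len(stack)-1]
			stack = stack[:len(stack)-2]
			if a > b {
				stack = append(stack, 1)
			} else {
				stack = append(stack, 0)
			}
		case OpReturn:
			if len(stack) < 1 {
				return false, fmt.Errorf("stack underflow")
			}
			return stack[len(stack)-1] != 0, nil
		default:
			return false, fmt.Errorf("unknown opcode %d", in.Op)
		}
	}
	return false, fmt.Errorf("program ended without a return")
}

func main() {
	// Hand-compiled form of the script: return ( Count > 10 );
	prog := []Instr{{Op: OpLoadCount}, {Op: OpPush, Arg: 10}, {Op: OpGreater}, {Op: OpReturn}}

	// The program is built once and executed against several objects.
	for _, ev := range []Event{{Count: 3}, {Count: 42}} {
		ok, err := run(prog, ev)
		fmt.Println(ev.Count, ok, err)
	}
}
```

Because run keeps its stack in a local variable, nothing carries over between executions, which is the property that makes repeated execution cheap.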
Scripting Facilities
Types
The scripting-language this package presents supports the basic types you'd expect:
- Arrays.
- Floating-point numbers.
- Hashes.
- Integers.
- Regular expressions.
- Strings.
- Time / Date values.
  - i.e. We can use reflection to handle time.Time values in any structure/map we're operating upon.
The types are supported both in the language itself, and in the reflection-layer which is used to allow the script access to fields in the Golang object/map you supply to it.
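As a rough illustration of what such a reflection layer has to do, the sketch below looks up a named field in either a struct or a map and recognises time.Time values. The Message type and the lookup and hourOf helpers are hypothetical names for this example only; the package's own reflection code may be organised quite differently.

```go
package main

import (
	"fmt"
	"reflect"
	"time"
)

// Message is a hypothetical host structure used only for this example.
type Message struct {
	Author string
	Count  int
	Sent   time.Time
}

// sample is a fixed timestamp so the example is reproducible.
var sample = time.Date(2023, time.March, 4, 9, 30, 0, 0, time.UTC)

// lookup returns the named exported field of a struct (or pointer to a
// struct), or the named entry of a map whose keys are strings.
func lookup(obj interface{}, name string) (interface{}, bool) {
	v := reflect.ValueOf(obj)
	for v.Kind() == reflect.Ptr || v.Kind() == reflect.Interface {
		if v.IsNil() {
			return nil, false
		}
		v = v.Elem()
	}
	switch v.Kind() {
	case reflect.Struct:
		f := v.FieldByName(name)
		if !f.IsValid() || !f.CanInterface() {
			return nil, false
		}
		return f.Interface(), true
	case reflect.Map:
		if v.Type().Key().Kind() != reflect.String {
			return nil, false
		}
		e := v.MapIndex(reflect.ValueOf(name).Convert(v.Type().Key()))
		if !e.IsValid() {
			return nil, false
		}
		return e.Interface(), true
	}
	return nil, false
}

// hourOf reports the hour of a looked-up value if it is a time.Time.
func hourOf(val interface{}) (int, bool) {
	t, ok := val.(time.Time)
	if !ok {
		return 0, false
	}
	return t.Hour(), true
}

func main() {
	msg := Message{Author: "Steve", Count: 3, Sent: sample}
	asMap := map[string]interface{}{"Author": "Steve", "Sent": sample}

	// The same lookup works for a struct, a pointer to it, and a map.
	for _, obj := range []interface{}{msg, &msg, asMap} {
		if val, ok := lookup(obj, "Sent"); ok {
			if h, isTime := hourOf(val); isTime {
				fmt.Println("Sent hour:", h)
			}
		}
	}
}
```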
Built-In Functions
These are the built-in functions which are always available, though your users can write their own functions within the language (see functions).
You can also easily add new primitives to the engine, by defining a function in your golang application and exporting it to the scripting-environment. For example the print function to generate output from your script is just a simple function implemented in Golang and exported to the environment. (This is true of all the built-in functions, which are registered by default.)
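The idea of exporting host functions can be pictured as a table mapping script-visible names to Go functions. The following is a sketch under that assumption only: the Builtin signature, the Registry type, and the shout function are hypothetical and are not the package's actual API.

```go
package main

import (
	"fmt"
	"strings"
)

// Builtin is the shape of a host function in this sketch: it receives
// the evaluated arguments and returns a single value. This signature is
// an assumption made for illustration.
type Builtin func(args []interface{}) interface{}

// Registry maps script-visible names to their Go implementations.
type Registry map[string]Builtin

// Register exposes fn to scripts under the given name.
func (r Registry) Register(name string, fn Builtin) {
	r[name] = fn
}

// Call invokes a registered function, much as an interpreter would
// when it encounters a function call in a script.
func (r Registry) Call(name string, args ...interface{}) (interface{}, error) {
	fn, ok := r[name]
	if !ok {
		return nil, fmt.Errorf("unknown function %q", name)
	}
	return fn(args), nil
}

// NewRegistry returns a registry with one default function installed,
// mirroring the way built-ins are registered by default.
func NewRegistry() Registry {
	r := Registry{}
	r.Register("upper", func(args []interface{}) interface{} {
		if len(args) != 1 {
			return nil
		}
		s, ok := args[0].(string)
		if !ok {
			return nil
		}
		return strings.ToUpper(s)
	})
	return r
}

func main() {
	r := NewRegistry()

	// A host-specific primitive, added alongside the defaults.
	r.Register("shout", func(args []interface{}) interface{} {
		return fmt.Sprint(args...) + "!"
	})

	for _, name := range []string{"upper", "shout", "missing"} {
		out, err := r.Call(name, "hello")
		fmt.Println(name, out, err)
	}
}
```

Treating built-ins and host functions identically keeps the interpreter simple: a call is always a name lookup followed by an ordinary Go function call.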
- between(value, min, max);
  - Return true if the specified value is within the specified range (inclusive, so between(1, 1, 10); will return true).
- float(value)
  - Tries to convert the value to a floating-point number, returns Null on failure.
  - e.g. float("3.13").
- getenv(value)
  - Return the value of the named environment variable, or "" if not found.
- int(value)
  - Tries to convert the value to an integer, returns Null on failure.
  - e.g. int("3").
- join(array, delimiter)
  - Return a string consisting of the array elements joined by the given string.
- keys
  - Returns the available keys in the specified hash, in sorted order.
- len(field | value)
  - Returns the length of the given value, or the contents of the given field.
  - For arrays it returns the number of elements, as you'd expect.
- lower(field | value)
  - Return the lower-case version of the given input.
- max(a, b)
  - Return the larger number of the two parameters.
- min(a, b)
  - Return the smaller number of the two parameters.
- panic() / panic("Your message here");
  - These will deliberately stop execution, and return a message to the caller.
- print(field|value [, fieldN|valueN] )
  - Print the given values.
- printf("Format string ..", arg1, arg2 .. argN);
  - Print the given values, with the specified golang format string.
  - For example printf("%s %d %t\n", "Steve", 9 / 3 , ! false );
- replace(input, /regexp/, value)
  - Replace the matches of the given regexp in the input-value with value.
- reverse(["Surname", "Forename"]);
  - Sorts the given array in reverse.
  - Add true as the second argument to ignore case.
- sort(["Surname", "Forename"]);
  - Sorts the given array.
  - Add true as the second argument to ignore case.
- split("string", "value");
  - Splits a string into an array, by the given substring.
- sprintf("Format string ..", arg1, arg2 .. argN);
  - Format the given values, using the specified golang format string.
- string( )
  - Converts a value to a string, e.g. string(3/3.4).
- trim(field | string)
  - Returns the given string, or the contents of the given field, with leading/trailing whitespace removed.
- type(field | value)
  - Returns the type of the given field, as a string.
  - For example string, integer, float, array, boolean, or null.
- upper(field | value)
  - Return the upper-case version of the given input.
- hour(field|value), minute(field|value), seconds(field|value)
  - Allow converting a time to HH:MM:SS.
- day(field|value), month(field|value), year(field|value)
  - Allow converting a time to DD/MM/YYYY.
- weekday(field|value)
  - Allow converting a time to "Saturday", "Sunday", etc.
- now() and time()
  - Both return the current time.
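For readers more familiar with Go than with the scripting language, the sketch below shows plain Go equivalents of the inclusive between check and of the time helpers listed above. It assumes the helpers correspond to the matching time.Time accessors, which is a guess about the implementation rather than something documented here; the clock, date, and between names are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// sample is a fixed timestamp so the example is reproducible.
var sample = time.Date(2022, time.January, 1, 8, 5, 9, 0, time.UTC)

// clock joins the hour, minute, and second components as HH:MM:SS.
func clock(t time.Time) string {
	return fmt.Sprintf("%02d:%02d:%02d", t.Hour(), t.Minute(), t.Second())
}

// date joins the day, month, and year components as DD/MM/YYYY.
func date(t time.Time) string {
	return fmt.Sprintf("%02d/%02d/%04d", t.Day(), int(t.Month()), t.Year())
}

// between is an inclusive range check, so both bounds count as inside.
func between(v, lo, hi float64) bool {
	return v >= lo && v <= hi
}

func main() {
	fmt.Println(clock(sample))      // 08:05:09
	fmt.Println(date(sample))       // 01/01/2022
	fmt.Println(sample.Weekday())   // Saturday
	fmt.Println(between(1, 1, 10))  // true
	fmt.Println(between(11, 1, 10)) // false
}
```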
Conditionals
As you'd expect, the conditional facilities are fairly conventional:
- Perform comparisons of strings and numbers:
  - equality:
    - if ( Message == "test" ) { return true; }
  - inequality:
    - if ( Count != 3 ) { return true; }
  - size (<, <=, >, >=):
    - if ( Count >= 10 ) { return false; }
    - if ( Hour >= 8 && Hour <= 17 ) { return false; }
  - String matching against a regular expression:
    - if ( Content ~= /needle/ )
    - if ( Content ~= /needle/i )
      - With case insensitivity
  - Does not match a regular expression:
    - if ( Content !~ /some text we don't want/ )
  - Test if an array contains a value:
    - return ( Name in [ "Alice", "Bob", "Chris" ] );
- Ternary expressions are also supported - but nesting them is a syntax error!
  - a = Title ? Title : Subject;
  - return( result == 3 ? "Three" : "Four!" );
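The regular-expression and membership tests above have straightforward counterparts in Go's standard library. The sketch below assumes that the /i suffix corresponds to Go's (?i) flag; that mapping, along with the compileLiteral and contains helpers, is an illustration rather than a description of how the package compiles its patterns.

```go
package main

import (
	"fmt"
	"regexp"
)

// compileLiteral builds a Go regexp from the body of a /pattern/
// literal. When ignoreCase is set, as with a trailing "i", the
// pattern is prefixed with Go's case-insensitive flag.
func compileLiteral(pattern string, ignoreCase bool) (*regexp.Regexp, error) {
	if ignoreCase {
		pattern = "(?i)" + pattern
	}
	return regexp.Compile(pattern)
}

// contains reports whether needle is one of items, like the "in" test.
func contains(needle string, items []string) bool {
	for _, item := range items {
		if item == needle {
			return true
		}
	}
	return false
}

func main() {
	re, err := compileLiteral("needle", true)
	if err != nil {
		panic(err)
	}
	content := "A NEEDLE in a haystack"

	fmt.Println(re.MatchString(content))  // Content ~= /needle/i
	fmt.Println(!re.MatchString(content)) // Content !~ /needle/i
	fmt.Println(contains("Bob", []string{"Alice", "Bob", "Chris"}))
}
```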
Loops
Our script implements a golang-style loop, using either for or while as the keyword:
count = 0;
while ( count < 10 ) {
    print( "Count: ", count, "\n" );
    count++;
}
You could use either statement to iterate over an array's contents, but that would be a little inefficient:
items = [ "Some", "Content", "Here" ];
i = 0;
for ( i < len(items) ) {
    print( items[i], "\n" );
    i++;
}
A more efficient and readable approach is to iterate over arrays, and the characters inside a string, via foreach. You can receive both the index and the item at each step of the iteration lik
