Function manipulation toolbox

The foolbox package implements functionality for static analysis of R functions and for manipulating functions by rewriting the components they consist of. The package was written to collect similar functionality from the pmatch and tailr packages, which both have functions for rewriting other functions, but it is a general framework for static analysis and function rewriting.

The functionality centres on depth-first traversals of expression trees, typically the body of functions. For example, if you have the function f

f <- function(x) {
  y <- 2 * x
  x + y
}

then its body is an expression, a call to the function { with arguments y <- 2 * x and x + y:

expr <- body(f)
expr[[1]]
#> `{`
expr[[2]]
#> y <- 2 * x
expr[[3]]
#> x + y

The first statement inside f’s body is another call, this time to the <- function, and this call takes two arguments, the symbol y and the expression 2 * x which is yet another call, to *, with the atomic 2 and symbol x as arguments.
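To make the nesting concrete, here is a small base R snippet (no foolbox needed) that drills into that first statement. The variable names `assignment`, `target` and `value` are only chosen for this illustration.

```r
f <- function(x) {
  y <- 2 * x
  x + y
}

assignment <- body(f)[[2]]    # the call y <- 2 * x
target     <- assignment[[2]] # first argument: the symbol y
value      <- assignment[[3]] # second argument: the call 2 * x

is.call(assignment)    # TRUE, a call to `<-`
is.symbol(target)      # TRUE
is.call(value)         # TRUE, a call to `*`
is.numeric(value[[2]]) # TRUE, the atomic 2
is.symbol(value[[3]])  # TRUE, the symbol x
```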

With foolbox you can traverse such expression structures and rewrite them based on callbacks. You can define callbacks for the four base cases of expressions (atomic, pairlist, symbol and primitive), for call expressions, which are the recursive case, and a callback, called topdown, that is invoked before the traversal recurses into a call object.
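The idea of a callback-driven, depth-first rewrite can be sketched in a few lines of base R. This is not how foolbox is implemented; `walk_expr`, `on_call`, `on_symbol` and `rename_x` are hypothetical names made up for this sketch, and it ignores details such as empty arguments (as in `x[1, ]`) that a real traversal has to handle.

```r
# Hypothetical helper, not part of foolbox.
# Calls are the recursive case: their arguments are rewritten first, then
# `on_call` sees the rebuilt call. Symbols go to `on_symbol`. Everything
# else (atomic values, pairlists) is returned unchanged.
walk_expr <- function(expr, on_call = identity, on_symbol = identity) {
  if (is.call(expr)) {
    # skip element 1, the function being called, and recurse into arguments
    for (i in seq_along(expr)[-1]) {
      expr[[i]] <- walk_expr(expr[[i]], on_call, on_symbol)
    }
    on_call(expr)
  } else if (is.symbol(expr)) {
    on_symbol(expr)
  } else {
    expr
  }
}

# Example callback: rename the symbol x to z, leave other symbols alone.
rename_x <- function(sym) if (identical(sym, quote(x))) quote(z) else sym

walk_expr(quote(y <- 2 * x), on_symbol = rename_x)
#> y <- 2 * z
```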

You specify how you want to transform an expression by composing a set of callbacks for a transformation and you apply several transformations by specifying a pipeline of these.

Say, for example, you have functions

f <- function(x) 2 * x
g <- function(y) f(y)

and you want to replace the function call f(y) in the body of g with the function body, 2 * x. You can do this by installing a callback for calls to f and then rewriting g with it:

callbacks <- rewrite_callbacks() %>% 
    add_call_callback(f, function(expr, ...) quote(2 * x))

g %>% rewrite() %>% rewrite_with(callbacks)
#> function (y) 
#> 2 * x

Here, I’ve constructed the callbacks first, but a more natural approach might be to provide them inside the pipeline like this:

g %>% rewrite() %>% rewrite_with(
    rewrite_callbacks() %>% add_call_callback(f, function(expr, ...) quote(2 * x))
)
#> function (y) 
#> 2 * x

At least, that is what I find myself doing as I am experimenting with foolbox.

If you have transformations you apply on more than one function, you can of course save them

subst_f <- . %>% rewrite() %>% rewrite_with(
    rewrite_callbacks() %>% add_call_callback(f, function(expr, ...) quote(2 * x))
)

and apply them later

g %>% subst_f
#> function (y) 
#> 2 * x

If you have such saved transformations, you can also use them as part of a function definition

h <- rewrites[subst_f] < function(x) f(x) + 2 * f(x)
h
#> function (x) 
#> 2 * x + 2 * (2 * x)

You can also put the full definition of a transformation in this syntax, but it is less readable.

The documentation is currently a bit sparse. All functions are documented, but I haven’t written documentation for the overall design. That is on its way. For now, check the examples below.

Installation

# install.packages("devtools")
devtools::install_github("mailund/foolbox")

Examples

Below are a few examples of things I thought up after writing foolbox. They are serendipitous discoveries that I didn’t design the package for, but that are easy to implement using it. I haven’t taken the ideas very far; it is possible to do much more with them, but that would make the examples harder to follow.

Invariants

Say you want to add an invariant to a variable in a function. Whenever you assign to that variable, you want to make sure the invariant is TRUE. We can insert invariant checking into an existing function using foolbox.

(In this example, I do not include a test at the beginning of the function, which I probably should. Doing so requires checking whether the variable is an argument or not. With foolbox you can easily do this, but I keep it simple.)

What you want to do is install a callback that is called on all assignments, i.e. calls to <- or =. That callback checks if the assignment is to the variable of interest, and if it is, it replaces the assignment with a block that contains the assignment followed by a check.
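The core of such a callback can be tried out in base R before any rewriting machinery is involved. The helper name `check_after_assign` is made up for this sketch, and it uses `bquote()` to build the replacement expression rather than rlang.

```r
# Hypothetical helper: if `expr` assigns to `var`, wrap it in a block that
# also checks `predicate`; any other expression is returned unchanged.
check_after_assign <- function(expr, var, predicate) {
  is_assignment <- is.call(expr) &&
    (identical(expr[[1]], quote(`<-`)) || identical(expr[[1]], quote(`=`)))
  if (is_assignment && identical(expr[[2]], var)) {
    bquote({
      .(expr)
      stopifnot(.(predicate))
    })
  } else {
    expr
  }
}

check_after_assign(quote(a <- x + y), quote(a), quote(a > 0))
#> {
#>     a <- x + y
#>     stopifnot(a > 0)
#> }
check_after_assign(quote(b <- x + y), quote(a), quote(a > 0))
#> b <- x + y
```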

We can write the following function for this. It expects var to be a symbol and predicate to be an expression. It creates a callback for assignments, and it then rewrites using that.

set_invariant <- function(fn, var, predicate) {
    
    var <- rlang::enexpr(var)
    stopifnot(rlang::is_symbol(var))
    
    predicate <- rlang::enexpr(predicate)
    
    set_invariant_callback <- function(expr, ...) {
        if (expr[[2]] == var) {
            rlang::expr({
                !!var <- !!expr[[3]]
                stopifnot(!!predicate)
            })
        } else {
            expr    
        }
    }
    
    fn %>% rewrite() %>% rewrite_with(
        rewrite_callbacks() %>%
        add_call_callback(`<-`, set_invariant_callback) %>%
        add_call_callback(`=`, set_invariant_callback)
    )
}

As an example, we can require that a is always positive in the function below.

f <- function(x, y) {
    a <- x + y
    2 * a^2 + a
}

f %>% set_invariant(a, a > 0)
#> function (x, y) 
#> {
#>     {
#>         a <- x + y
#>         stopifnot(a > 0)
#>     }
#>     2 * a^2 + a
#> }
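
To see what the inserted check does at run time, here is the rewritten function typed in by hand, so the snippet runs without foolbox, and called with arguments that satisfy and then violate the invariant. The names `f_checked` and `result` are only used for this illustration.

```r
f_checked <- function(x, y) {
    {
        a <- x + y
        stopifnot(a > 0)
    }
    2 * a^2 + a
}

f_checked(1, 2)
#> [1] 21

# A non-positive sum makes stopifnot() signal an error, caught here
result <- tryCatch(f_checked(-1, -2), error = function(e) "invariant violated")
result
#> [1] "invariant violated"
```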

However, what happens if we have nested functions?

f <- function(x, y) {
    a <- x + y
    g <- function(a) {
        a <- a + 42
        a
    }
    h <- function(x) {
        a <<- -x
        a^2
    }
    x <- h(y)
    2 * a^2 + g(-x)
}

f %>% set_invariant(a, a > 0)
#> function (x, y) 
#> {
#>     {
#>         a <- x + y
#>         stopifnot(a > 0)
#>     }
#>     g <- function(a) {
#>         {
#>             a <- a + 42
#>             stopifnot(a > 0)
#>         }
#>         a
#>     }
#>     h <- function(x) {
#>         a <<- -x
#>         a^2
#>     }
#>     x <- h(y)
#>     2 * a^2 + g(-x)
#> }

Here, we probably don’t want to add the invariant inside the nested functions. An assignment there isn’t an assignment to the variable in the scope of f, after all. But we do want to capture <<- assignments.

(For the <<- assignment, it is a bit tricky to see if it is to the a in the scope of f, in general. It depends on whether that has been assigned to in f before we call the nested function, and in the full generality of functions, we cannot determine this before we run f. I am going to assume that all <<- assignments are to the scope of f for the rest of this example).
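
The dependence on what has already been assigned can be seen in plain R. In this sketch, `outer_with_a` and `outer_without_a` are made-up names; the only difference between them is whether `a` exists in the outer function before the inner function runs.

```r
outer_with_a <- function() {
  a <- 1                       # a exists here before inner() is called
  inner <- function() a <<- 2  # so <<- updates this a
  inner()
  a
}

outer_without_a <- function() {
  inner <- function() a <<- 2  # no a here: <<- keeps searching outwards
  inner()                      # and ends up assigning in an outer scope
  exists("a", inherits = FALSE)  # is there a local a after the call?
}

outer_with_a()
#> [1] 2
outer_without_a()
#> [1] FALSE
```

In the second case the assignment lands outside the function (in the global environment if no enclosing scope defines `a`), which is why a static rewrite cannot in general know which variable a <<- assignment targets.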

We can add <<- as a function to call our callback on, and we can use a topdown callback to pass information on whether we are in a nested function down the recursion.

set_invariant <- function(fn, var, predicate) {
    
    var <- rlang::enexpr(var)
    stopifnot(rlang::is_symbol(var))
    
    predicate <- rlang::enexpr(predicate)
    
    set_invariant_callback <- function(expr, topdown, ...) {
        if (expr[[2]] == var) {
            if ( (expr[[1]] == "<-" || expr[[1]] == "=") &&  !(topdown$nested) ) {
                return(rlang::expr({
                    !!var <- !!expr[[3]]
                    stopifnot(!!predicate)
                }))
            }
            if (expr[[1]] == '<<-') {
                return(rlang::expr({
                    !!var <<- !!expr[[3]]
                    stopifnot(!!predicate)
                }))
            }
        }
        # if we don't return earlier, we keep the expression
        expr    
    }
    nested_functions_callback <- function(expr, skip, topdown, ...) {
        topdown$nested <- TRUE
        topdown
    }
    
    fn %>% rewrite() %>% rewrite_with(
        rewrite_callbacks() %>%
        add_call_callback(`<-`, set_invariant_callback) %>%
        add_call_callback(`=`, set_invariant_callback) %>%
        add_call_callback(`<<-`, set_invariant_callback) %>%
        add_topdown_callback(`function`, nested_functions_callback),
        topdown = list(nested = FALSE)
    )
}
