Tangent

Source-to-Source Debuggable Derivatives in Pure Python

Generate Convert Improve

Install / Use

/learn @google/Tangent

About this skill

Quality Score

0/100

README

Tangent

Tangent is a new, free, and open-source Python library for automatic differentiation.

Existing libraries implement automatic differentiation by tracing a program's execution (at runtime, like PyTorch) or by staging out a dynamic data-flow graph and then differentiating the graph (ahead-of-time, like TensorFlow). In contrast, Tangent performs ahead-of-time autodiff on the Python source code itself, and produces Python source code as its output. Tangent fills a unique location in the space of machine learning tools.

Autodiff Tool Space

As a result, you can finally read your automatic derivative code just like the rest of your program. Tangent is useful to researchers and students who not only want to write their models in Python, but also read and debug automatically-generated derivative code without sacrificing speed and flexibility.

Tangent works on a large and growing subset of Python, provides extra autodiff features other Python ML libraries don't have, has reasonable performance, and is compatible with TensorFlow and NumPy.

This project is an experimental release, and is under active development. As we continue to build Tangent, and respond to feedback from the community, there might be API changes.

Usage

Note: An interactive notebook with all the code in this page can be found here.

Tangent has a one-function API:

import tangent
df = tangent.grad(f)

If you want to print out derivatives at the time Tangent generates the derivative function:

import tangent
df = tangent.grad(f, verbose=1)

Here's Tangent in action in the IPython console.

Live Derivatives with Tangent

Installing and running

Installation

The easiest way to install Tangent is to use pip.

pip install tangent

We'll have a conda package soon.

Automatic Differentiation

Under the hood, tangent.grad grabs the source code of the Python function you pass it (using inspect.getsource, which is available in the Python standard library), converts the source code into an abstract syntax tree (AST) using ast.parse (also built into the Python standard library), and walks the syntax tree in reverse order.

Tangent has a library of recipes for the derivatives of basic arithmetic (+,-,/,**,*), pieces of syntax (ast.For, ast.If, ast.While) and TensorFlow Eager functions (tf.reduce_sum, tf.exp, tf.matmul, ... ). For each piece of syntax it encounters (for example, c = a + b is a single AST node ast.Assign), tangent.grad looks up the matching backward-pass recipe, and adds it to the end of the derivative function. This reverse-order processing gives the technique its name: reverse-mode automatic differentiation.

TF Eager

Tangent supports differentiating functions that use TensorFlow Eager functions that are composed together.

def f(W,x):
  h1 = tf.matmul(x,W)
  h2 = tf.tanh(h1)
  out = tf.reduce_sum(h2)
  return out

dfdW = tangent.grad(f)

SCT on TF Eager

Subroutines

When model code becomes long, using subroutines makes code more readable and reusable. Tangent handles taking derivatives of models that have user-defined functions.

SCT on Subroutines

Control Flow

Tangent has recipes for auto-generating derivatives for code that contains if statements and loops:

SCT on Conditionals

You'll notice above that we have to modify the user's code to keep track of information that we will need in the backward pass. For instance, we need to save which branch of an if-statement was followed in the forward pass, so that we run the correct branch in the backward pass. We save this information from the forward pass by pushing it onto a stack, which we then pop off in the backward pass. This is an important data structure in ahead-of-time autodiff.

For loops require a little more bookkeeping. Tangent has to save the number of iterations of the loop on the stack. Also, loops usually overwrite the values of variables inside the loop body. In order to generate a correct derivative, Tangent has to keep track of all of the overwritten values, and restore them in the backward pass in the correct order.

SCT on Loops

Custom Gradients

Tangent uses Python's built-in machinery to introspect and transform the abstract syntax tree (AST) of parsed source code at runtime. For each piece of supported Python syntax, we have implemented a rule indicating how to rewrite an AST node into its backward pass equivalent, or "adjoint". We have defined adjoints for function calls to NumPy and TF Eager methods, as well as larger pieces of syntax, such as if-statements and for-loops. The adjoints are stored in function definitions that serve as "templates", or code macros. Another alternative, which we found too cumbersome, would be to use a templating engine like Mustache and store adjoints as plain strings. Our templates also use a special syntax d[x] to refer to the derivative of a variable x.

While differentiating a function, if Tangent encounters a function call, it first checks if it has a gradient registered for that function. If not, it tries to get the function source, and generate a derivative ahead-of-time. But, it's easy to register your own gradients. Here's a toy example of defining the gradient of x^3.

import tangent
from tangent.grads import adjoint

def cube(x):
  return x * x * x
  
# Register the gradient of cube with Tangent
# NOTE! This is not a runnable function, but instead is a code template.
# Tangent will replace the names of the variables `result` and `x` with whatever
# is used in your containing function.
@adjoint(cube)
def dcube(result, x):
  d[x] = d[result] * 3 * x * x
  
def f(val):
    cubed_val = cube(val)
    return cubed_val

print(tangent.grad(f,verbose=1))

Should output something like:

def dfdval(val, bcubed_val=1.0):
    # Grad of: cubed_val = cube(val)
    bval = bcubed_val * 3 * (val * val) # <<<< this is our inlined gradient
    return bval

The signature for the custom gradient of some function

result = orig_function(arg1,arg2)

@adjoint(orig_function)
def grad_orig_function(result, arg1, arg2):
  d[arg1] = d[result]*...
  d[arg2] = d[result]*...

The first argument to the template is always the result of the function call, followed by the function arguments, in order. Tangent captures the variable names of the result and arguments, and then will use them to unquote the gradient template at the appropriate place in the backward pass.

Check out an example gradient definition of a NumPy function and of a TF eager function. Also, see the docstring in grads.py for more info.

Debugging

Because Tangent auto-generates derivative code you can read, you can also easily debug your backward pass. For instance, your NN might be outputting NaNs during training, and you want to find out where the NaNs are being generated in your model. Just insert a breakpoint (e.g., pdb.set_trace()) at the end of your forward pass.

SCT for Debugging

For large models, setting a breakpoint at the beginning of the backward pass and stepping through dozens of lines might be cumbersome. Instead, you might want the breakpoint to be placed later in the derivative calculation. Tangent lets you insert code directly into any location in the backward pass. First, run from tangent import insert_grad_of, then add a with insert_grad_of block containing the code you'd like to insert into the backward pass.


from tangent import insert_grad_of
def f(x):
  ...
  with insert_grad_of(x) as dx:
    print("dc/dx = %2.2f" % dx)
    pdb.set_trace()
  ...

Ad Hoc Gradient Code

Derivative Surgery

You can use the insert_grad_of feature to do more than debugging and logging. Some NN architectures benefit from tricks that directly manipulate the backward pass. For example, recurrent neural networks (RNNs) suffer from the "exploding gradient" problem, where gradients grow exponentially. This prevents the model from training properly. A typical solution is to force the derivatives inside of an RNN to not exceed a certain value by directly clipping them. We can implement this with insert_grad_of.


def f(params, x):
  h = x
  for i in range(5):
    with insert_grad_of(h) as g:
      g = tf.clip_by_value(g, -1, 1)
    h = rnn(params, h)
  return h

dfdparams = tangent.grad(f)

You can perform other backward-pass tricks with insert_grad_of, such as stop gradients (use a break in the inlined code to stop a for loop), or synthetic gradients (replace a derivative with a prediction from a neural network). This feature lets Tangent users easily debug their models, or quickly try out derivative tweaks in the backward pass.

Forward Mode

Reverse-mode autodiff, or backpropagation, generates efficient der

Related Skills

YC-Killer

2.7k

A library of enterprise-grade AI agents designed to democratize artificial intelligence and provide free, open-source alternatives to overvalued Y Combinator startups. If you are excited about democratizing AI access & AI agents, please star ⭐️ this repository and use the link in the readme to join our open source AI research team.

best-practices-researcher

The most comprehensive Claude Code skills registry | Web Search: https://skills-registry-web.vercel.app

groundhog

399

Groundhog's primary purpose is to teach people how Cursor and all these other coding agents work under the hood. If you understand how these coding assistants work from first principles, then you can drive these tools harder (or perhaps make your own!).

last30days-skill

18.7k

AI agent skill that researches any topic across Reddit, X, YouTube, HN, Polymarket, and the web - then synthesizes a grounded summary