# NARMAX
Machine-learning library for symbolic fitting: the unknown system or function is described via NARMAX algebraic expressions, i.e., linear combinations of arbitrary non-linear terms provided by the user (such as 0.2x² + 0.7sin(x) or x[k-1]·y[k-4]²).
## Putting the "MAX" back into NARMAX
<div align="justify">The library doesn't just raise exceptions; it raises eyebrows and, occasionally, the bar for what is assumed to be possible...
Anyway, this GPU-accelerated Python package contains the machine-learning algorithms described in my two papers "Arborescent Orthogonal Least Squares Regression (AOrLSR)" and "Dictionary Morphing Orthogonal Least Squares Regression (DMOrLSR)" (coming soon), both based on my "Recursive Forward Orthogonal Least Squares Regression (rFOrLSR)", to fit "Non-Linear Auto-Regressive Moving-Average eXogenous input systems (NARMAX)". So, now that we have covered all the fancy acronyms, we can get into some explanations.
Otherwise, jump straight into the library's examples / tutorials.
Note 1 (unfinished library): The library currently implements only the arborescence part (see below) and is thus not finished; the dictionary-morphing part is still missing. Consequently, only static regressors can currently be fitted: the dictionary terms must be pre-defined and are not adapted to the system by the regression. I am also researching further improvements and even more advanced algorithms, which will all be included in progressive library updates and presented in papers.
Note 2 (GitHub's poor LaTeX): GitHub's LaTeX engine is unreliable, so please forgive that certain expressions (especially sums and underlines) are not rendered properly or not at all. All $x$ and $\chi$ in matrix equations are of course vectors and should be underlined (check the readme.md if in doubt).
Note 3 (rFOrLSR): You might ask yourself: "how am I even supposed to pronounce rFOrLSR?"
Imagine you're a French pirate trying to pronounce "Airforce". Being French, you'll ignore the last letter of the word, making it "rFOrLS", and being a pirate, you'll say "ARRRRRRRforce", which fully suffices.
## NARMAX... who?
- N for Non-linear: Any non-linearity (functions or piecewise definitions) or non-linear combination method (products of terms, etc.) applied to any of the following types of terms.
- AR for Auto-Regressive: Any type of feeding the system output $y$ back into the system, via feedback or recursion. Thus, anything containing temporal terms (with $y[k-j]$ terms such as $\tanh(0.5y[k-1] + 0.3y[k-2]y[k-3])$ or $y^a[k-1]x^b[k-3]$) or spatial $y$ terms ($y_1y_2y_3$) or any spatio-temporal combination.
- MA for Moving-Average: Any type and distribution of internal noise terms $e[k]$, which account for fitting error or any variables not present in the system input. Can also represent internal random states spanning multiple time steps or spatial distributions.
- X for eXogenous input: Any type of system-input terms $x$, which can be temporal ($x^n[k-j]$ or $e^{x[k]y[k-2]}$), spatial ($x_1x_2x_3$, etc.) or spatio-temporal (a mixture of both).
A NARMAX system thus contains any arbitrary mixture of the above terms. The provided library examples (those in the AOrLSR paper) demonstrate the fitting of the dynamic (= temporal) systems below. These have no special properties other than being stable for an input of amplitude 1 (or potentially more) and being the most ridiculously non-linear systems I could come up with at the time, to demonstrate the fitting power and flexibility of the algorithms.
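To make the term taxonomy concrete, the snippet below simulates a small hypothetical NARX system. The system, its coefficients and the amplitude-1 excitation are all invented for this sketch (it is not one of the paper's examples), chosen so the recursion stays bounded:

```python
import numpy as np

def simulate_toy_narx(x: np.ndarray) -> np.ndarray:
    """Simulate an invented toy NARX system mixing AR, X and N terms."""
    y = np.zeros_like(x)
    for k in range(2, len(x)):
        y[k] = (0.3 * y[k - 1]              # AR: linear output feedback
                + 0.5 * x[k]                # X: current input
                + 0.2 * x[k - 1] ** 2       # N + X: non-linear input term
                - 0.1 * np.tanh(y[k - 2]))  # N + AR: non-linear feedback
    return y

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 500)  # amplitude-1 excitation
y = simulate_toy_narx(x)
```

Since every term is bounded and the output feedback gain is below 1, the output stays bounded for amplitude-1 inputs, mirroring the stability property claimed for the paper's example systems.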
### 1. Linear-in-the-parameter Example
$y[k] = 0.2x[k] + 0.3x^3[k-1] + 0.7|x[k-2]x^2[k-1]| +0.5e^{x[k-3]x[k-2]} - \cos(y[k-1]x[k-2]) -0.4|x[k-1]y^2[k-2]| - y^3[k-3]$
This is essentially a monomial expansion of IIR terms ($y[k-j]$ and $x[k-j]$) also passed through common non-linearities such as abs, cos and exp, yielding a heavily non-linear NARX system.
Tutorial for this example
### 2. Rational Example
$y[k]=\frac{0.6|x[k]|-0.35x^3[k]-0.3x[k-1]y[k-2]+0.1|y[k-1]|}{1-0.4|x[k]|+0.3|x[k-1]x[k]|-0.2x^3[k-1]+ 0.3y[k-1]x[k-2]}$
This demonstrates that (for NARX systems) rational non-linear models can be fitted by linearizing the terms: $y[k]=\frac{A}{1+B} \iff y[k](1+B)=A \iff y[k]=A-y[k]B$, where $A$ and $B$ are linear-in-the-parameter systems such as system 1 in the above example.
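A minimal sketch of this linearization with plain NumPy least squares — the rational system below and its coefficients are invented for illustration and are much simpler than the paper's example:

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.uniform(-1.0, 1.0, 300)

# Invented stable rational system:
# y[k] = (0.5*x[k] + 0.2*x[k-1]) / (1 + 0.3*x[k]**2)
y = np.zeros_like(x)
for i in range(1, len(x)):
    y[i] = (0.5 * x[i] + 0.2 * x[i - 1]) / (1 + 0.3 * x[i] ** 2)

# Linearized form y[k] = A - y[k]*B:
# y[k] = 0.5*x[k] + 0.2*x[k-1] - 0.3*(y[k]*x[k]**2),
# so the denominator terms become ordinary regressors multiplied by y[k].
k = np.arange(1, len(x))
D = np.column_stack([x[k], x[k - 1], y[k] * x[k] ** 2])
theta, *_ = np.linalg.lstsq(D, y[k], rcond=None)
print(theta)  # recovers approximately [0.5, 0.2, -0.3]
```

The key point is that after linearization, the rational model is linear in its parameters again, so any least-squares machinery (here `np.linalg.lstsq`) applies unchanged.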
Tutorial for this example
### 3. Expansion-in-an-expression Example
$y = \text{sgn}(x)\left(1-\frac{1}{1+|x|A}\right)$ with $A := \sum_{j\in J}\theta_j |x|^j$ and $J\subseteq \mathbb{N}$
This is a memoryless NX (Non-linear eXogenous input) system, i.e., a normal scalar function depending only on $x$. This system shows that NARMAX expansions can be inserted into larger expressions to impose constraints or system properties (here quick convergence to $\text{sgn}(x)$ and low error around the origin) or to obtain complex fits. This specific expansion is designed to emulate tanh with another continuous rational function. The provided code also demonstrates how to choose the number of terms in such an expansion and how to create a custom validation function.
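The idea can be sketched with a plain polynomial least-squares fit; the degree set, sample grid and error check below are chosen arbitrarily for this sketch and are not the library tutorial's code:

```python
import numpy as np

# Fit A(x) = sum_j theta_j * x**j on x > 0 so that
# yhat = 1 - 1/(1 + x*A(x)) approximates tanh(x);
# odd symmetry via sgn(x) extends the fit to x < 0.
x = np.linspace(0.01, 3.0, 400)
t = np.tanh(x)

# Solving tanh(x) = 1 - 1/(1 + x*A) for A gives the target values:
A_target = t / ((1.0 - t) * x)

J = np.arange(10)  # J = {0, ..., 9}, an arbitrary choice for this sketch
V = np.stack([x ** j for j in J], axis=1)
theta, *_ = np.linalg.lstsq(V, A_target, rcond=None)

A = V @ theta
yhat = 1.0 - 1.0 / (1.0 + x * A)
print(np.max(np.abs(yhat - t)))  # small maximum approximation error
```

Because the expression is solved for $A$ first, the expansion itself stays linear in the parameters $\theta_j$ even though the overall function is rational.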
Tutorial for this example
### 4. MIMO & MISO Example
$y_1[k] = 0.2 x_1[k] + 0.3 x_2^3[k] + 0.7 |x_3[k]| + 0.5 x_2[k-3] x_1[k-2] - 0.3 y_2[k-1] x_2^2[k-2] - 0.8 |x_3[k-1] y_1[k-2]| - 0.7 x_1[k-1] x_2^2[k-1]$

$y_2[k] = 0.3 x_1[k-1] + 0.5 x_3^3[k] + 0.7 |y_1[k-1]| + 0.6 y_1[k-3] x_1[k-2] - 0.4 y_1[k-1] x_3^2[k-2] - 0.9 |x_3[k-1] y_2[k-2]| - 0.7 x_3[k-1] x_2^2[k-1]$
This is a MIMO (Multiple Input Multiple Output) system / function with 3 input channels / variables and 2 output channels / variables, composed of two MISO (Multiple Input Single Output) systems / functions: one per output. This demonstrates that the rFOrLSR can fit systems / functions of arbitrary input and output dimensionality: $\mathbb{R}^n \rightarrow \mathbb{R}^m$ (in this example $\mathbb{R}^3 \rightarrow \mathbb{R}^2$).
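In the linear-in-the-parameter setting, "one MISO fit per output" reduces to one least-squares problem per output column, as in this toy illustration (dictionary and coefficients invented, and ordinary least squares standing in for the rFOrLSR):

```python
import numpy as np

rng = np.random.default_rng(7)
D = rng.uniform(-1.0, 1.0, (200, 4))    # shared candidate dictionary
Theta_true = np.array([[0.2, -0.4],
                       [0.0,  0.7],
                       [0.5,  0.0],
                       [-0.3, 0.1]])    # one coefficient column per output
Y = D @ Theta_true                      # two output channels -> MIMO data

# Fitting the MIMO system = one independent MISO least-squares fit per output:
Theta = np.column_stack([
    np.linalg.lstsq(D, Y[:, m], rcond=None)[0] for m in range(Y.shape[1])
])
print(np.allclose(Theta, Theta_true))  # True
```

Since the outputs share the same dictionary, only the coefficient vectors differ between the MISO sub-fits.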
Tutorial for this example
## The NARMAX fitting steps

1. **Expansion Type Selection:** As usual in machine learning, one must first choose an appropriate expansion type (FIR, IIR, monomially-expanded IIR, RBF, wavelet, arbitrary non-linearities, etc.). As expected, the model quality strongly depends on how relevant the chosen expansion is. The advantage of this library's rFOrLSR is that any type of expansion, and any mix of expansions, is supported, as the rFOrLSR is based on vector-matching methods. This is achieved by creating a fitting "dictionary" $D_C \in \mathbb{R}^{p \times n_C}$ (PyTorch matrix) containing the candidate regressors $\underline{\varphi}_k \in \mathbb{R}^{p}$ stacked column-wise and passing it to the library.

2. **Model Structure Detection:** The first real challenge is to find the correct regressors among all those present in the user-defined dictionary $D_C$, as most system behaviors can be described sufficiently well with very few terms. To illustrate, the first system above contains a single cosine term, which needs to be retrieved from the set of cosines with relevant monomial expansions as arguments.

3. **Model Parameter Estimation:** Finally, once the correct expression terms are selected, their regression (= scaling) coefficients must be estimated. The rFOrLSR's optimization criterion is least squares.
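The three steps can be sketched end-to-end in NumPy. The greedy forward selection below is a generic stand-in, far simpler than the library's rFOrLSR; the system, lags and degrees are invented for this sketch, and the library itself expects the dictionary as a PyTorch matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, 1000)
k = np.arange(2, len(x))
y = 0.5 * x[k] - 0.3 * x[k - 1] ** 2           # invented system to recover

# Step 1: candidate dictionary D_C, columns = monomials x[k-j]**n
names, cols = [], []
for j in range(3):                              # lags 0..2
    for n in range(1, 4):                       # degrees 1..3
        names.append(f"x[k-{j}]^{n}")
        cols.append(x[k - j] ** n)
D_C = np.column_stack(cols)

# Steps 2+3: greedy forward selection with least-squares re-estimation
selected, residual = [], y.copy()
for _ in range(3):
    scores = np.abs(residual @ D_C) / np.linalg.norm(D_C, axis=0)
    scores[selected] = 0.0                      # never pick a column twice
    selected.append(int(np.argmax(scores)))
    theta, *_ = np.linalg.lstsq(D_C[:, selected], y, rcond=None)
    residual = y - D_C[:, selected] @ theta

print([names[i] for i in selected], theta)
```

After a few iterations, the two true regressors `x[k-0]^1` and `x[k-1]^2` are among the selected columns and the residual vanishes; the rFOrLSR performs this structure detection far more robustly via orthogonalization and its arborescent search.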
## NARMAX... why?
This section is dedicated to all the people who asked me something along the lines of "but AI is currently the thing, why not use neural networks like everyone else?".
First of all, I'd like to point out that artificial neural networks (ANNs), including our friend ChatGPT, are a subclass of NARMAX models. LLMs based on auto-regressive Transformers are certainly non-linear (N) and auto-regressive (AR), have internal random states affecting the computations (MA) and take exogenous inputs (X), namely whatever you ask them.
Thus, this section is really about symbolic fitting vs. black-box networks (which include neural networks and other algorithm classes this library can fit, such as RBF networks and, to some extent, wavelet networks).
Interpretability: Symbolic models often result in equations having a clear physical or at least mathematical interpretation. This is important in situations where understanding the underlying relationships between inputs and outputs is required, such as physics, biology, or engineering. Black-box networks, however, do not provide easily interpretable models.
To illustrate, NARMAX models are used, amongst others, in the field of robotics, as they allow one to determine, for example, a) which part of the system is affected by b) the input of which sensor c) at what time lag and d) how.
Prior Knowledge and constraints: Symbolic models allow the incorporation of domain knowledge and constraints into the model structure.
To illustrate, if the system is known to be linear, only linear terms are added to the dictionary; if the system is oscillatory in nature, one fills the dictionary with sine and cosine regressors.
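For instance, a hypothetical oscillatory target can be handled by filling the dictionary with harmonics; the frequencies, data and least-squares fit below are invented for this sketch:

```python
import numpy as np

x = np.linspace(0.0, 2.0 * np.pi, 200)
y = 0.8 * np.sin(x) + 0.2 * np.cos(2.0 * x)   # invented oscillatory target

# Dictionary of sine/cosine candidate regressors for harmonics 1..3:
# columns are sin(x), cos(x), sin(2x), cos(2x), sin(3x), cos(3x)
D = np.column_stack([f(j * x) for j in (1, 2, 3) for f in (np.sin, np.cos)])
theta, *_ = np.linalg.lstsq(D, y, rcond=None)
print(np.round(theta, 3))  # sin(x) and cos(2x) get 0.8 and 0.2, the rest ~0
```

Only the harmonics actually present in the signal receive non-zero coefficients, which is exactly the kind of structural prior a hand-built dictionary encodes.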
My rFOrLSR even allows imposing regressors, to further constrain the model and incorporate user knowledge.
Data Efficiency: Symbolic models require very little data for training, which allows fitting them in scarce-data scenarios or even keeping them updated in real time. This is a clear advantage when data acquisition is expensive or slow.