pyFAST: Flexible, Advanced Framework for Multi-source and Sparse Time Series Analysis in PyTorch
pyFAST (Forecasting And Sparse Time-Series) is a research-driven, modular Python framework built for advanced and efficient time series analysis, especially excelling in multi-source and sparse data scenarios. Leveraging PyTorch, pyFAST provides a unified and flexible platform for forecasting, imputation, and generative modeling, integrating cutting-edge LLM-inspired architectures, Variational Autoencoders, and classical time series models.
Update logs:
- 2025-12-05: Updated the overview figure to reflect the latest module structure.
- 2025-10-20: All models are categorized for better navigation and usability.
- 2025-09-15: `SMTDataset` and `SSTDataset` now support both CSV file(s) in directories and zipped files at the same time.
- 2025-08-26: Released the software, together with benchmarking results and a link to the datasets.
Unlock the Power of pyFAST for:
- Alignment-Free Multi-source Time Series Analysis: Process and fuse data from diverse sources without the need for strict temporal alignment, inspired by Large Language Model principles.
- Native Sparse Time Series Forecasting: Effectively handle and forecast sparse time series data with specialized metrics and loss functions, addressing a critical gap in existing libraries.
- Rapid Research Prototyping: Experiment and prototype novel time series models and techniques with unparalleled flexibility and modularity.
- Seamless Customization and Extensibility: Tailor and extend the library to your specific research or application needs with its component-based modular design.
- High Performance and Scalability: Benefit from optimized PyTorch implementations and multi-device acceleration for efficient handling of large datasets and complex models.
Key Capabilities:
- Pioneering LLM-Inspired Models: First-of-its-kind adaptations of Large Language Models specifically for alignment-free multi-source time series forecasting.
- Native Sparse Data Support: Comprehensive support for sparse time series, including specialized metrics, loss functions, and efficient data handling.
- Flexible Multi-source Data Fusion: Integrate and analyze time series data from diverse, potentially misaligned sources.
- Extensive Model Library: Includes a broad range of classical, deep learning (Transformers, RNNs, CNNs, GNNs), and generative time series models for both multivariate (MTS) and univariate (UTS) data.
- Modular and Extensible Architecture: Component-based design enables easy customization, extension, and combination of modules.
- Streamlined Training Pipeline: the `Trainer` class simplifies model training with built-in validation, early stopping, checkpointing, and multi-device support.
- Comprehensive Evaluation Suite: includes a wide array of standard and sparse-specific evaluation metrics via the `Evaluator` class.
- Built-in Generative Modeling: dedicated module for time series Variational Autoencoders (VAEs), including Transformer-based VAEs.
- Reproducibility Focus: utilities like `initial_seed()` ensure experiment reproducibility.
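To make the sparse-metric idea concrete, here is a minimal sketch of a mask-aware mean absolute error in plain PyTorch. The `masked_mae` helper is hypothetical, written only to illustrate the principle of computing metrics over observed entries; it is not pyFAST's `Evaluator` API.

```python
import torch

def masked_mae(pred: torch.Tensor, target: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Mean absolute error computed only over observed (mask == 1) entries."""
    abs_err = (pred - target).abs() * mask
    # Guard against division by zero when a batch contains no observed values.
    return abs_err.sum() / mask.sum().clamp(min=1)

pred = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
target = torch.tensor([[1.5, 2.0], [0.0, 5.0]])
mask = torch.tensor([[1.0, 1.0], [0.0, 1.0]])  # the zero marks a missing observation
print(masked_mae(pred, target, mask))  # tensor(0.5000)
```

The key point is that the large error at the masked position (`3.0` vs. `0.0`) does not pollute the score: only the three observed entries contribute.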
Explore the Core Modules (See Figure Above):
As depicted in the Software Overview Diagram above (Figure 1), pyFAST's fast/ library is structured into five core modules, ensuring a cohesive and versatile framework:
- `data/` package: handles data loading, preprocessing, and dataset creation for SST, SMT, MMT, and BDP data scenarios. Key features include efficient sparse data handling, multi-source data integration, scaling methods, patching, and data splitting utilities.
- `model/` package: houses a diverse collection of time series models, categorized into `uts/` (univariate), `mts/` (multivariate), and `base/` (building blocks) submodules. Includes classical models, deep learning architectures (CNNs, RNNs, Transformers, GNNs), fusion models, and generative models.
- `train.py` module: provides the `Trainer` class to streamline the entire model training pipeline. Features include device management, model compilation, optimizer and scheduler management, the training loop, validation, early stopping, checkpointing, and visualization integration.
- `metric/` package: offers a comprehensive suite of evaluation metrics for time series tasks, managed by the `Evaluator` class. Includes standard metrics (MSE, MAE, etc.) and specialized sparse metrics for masked data.
- `generative/` package: focuses on generative time series modeling, providing implementations of time series VAEs and Transformer-based VAEs.
Installation
Ensure you have Python installed. Then, to install pyFAST and its dependencies, run:

```shell
pip install -r requirements.txt
```
Getting Started
Basic Usage Example
Jumpstart your time series projects with pyFAST using this basic example:
```python
import torch

from fast import initial_seed, initial_logger, get_device
from fast.data import SSTDataset
from fast.train import Trainer
from fast.metric import Evaluator
from fast.model.mts.ar import ANN  # Example: using a simple ANN model

# Fix random seeds for experiment reproducibility
initial_seed(2025)

# Initialize logger for tracking training progress
logger = initial_logger()

# Prepare your time series data: replace with actual data loading.
ts = torch.sin(torch.arange(0, 100, 0.1)).unsqueeze(1)  # Shape: (1000, 1)
train_ds = SSTDataset(ts, input_window_size=10, output_window_size=1).split(0, 0.8, mark='train')
val_ds = SSTDataset(ts, input_window_size=10, output_window_size=1).split(0.8, 1., mark='val')

# Initialize the model (e.g., ANN)
model = ANN(
    input_window_size=train_ds.window_size,          # Input window size taken from the dataset
    output_window_size=train_ds.output_window_size,  # Output window size, a.k.a. prediction steps
    hidden_sizes=32                                  # Hidden layer size
)

# Set up the Trainer for model training and evaluation
device = get_device('cpu')  # Use 'cuda', 'cpu', or 'mps'
evaluator = Evaluator(['MAE', 'RMSE'])  # Evaluation metrics
trainer = Trainer(device, model, evaluator=evaluator)

# Train the model on the prepared datasets
history = trainer.fit(train_ds, val_ds, epoch_range=(1, 10))  # Train for 10 epochs
logger.info(str(history))

# After training, evaluate the model (here on the validation set;
# substitute a held-out test set if you have one)
val_results = trainer.evaluate(val_ds)
logger.info(str(val_results))
```
Data Structures Overview
pyFAST is designed to handle various time series data structures:
- Multiple Time Series (MTS):
  - Shape: `[batch_size, window_size, n_vars]`
  - For datasets with multiple variables recorded over time (e.g., sensor readings, stock prices of multiple companies).
- Univariate Time Series (UTS):
  - Shape: `[batch_size * n_vars, window_size, 1]`
  - For datasets focusing on single-variable sequences, often processed in batches for efficiency.
- Advanced Data Handling:
  - Sparse Data Ready: models and metrics are designed to work effectively with sparse time series and missing values, using masks for accurate computations.
  - Exogenous Variable Integration: seamlessly incorporate external factors (exogenous variables) to enrich your time series models.
  - Variable-Length Sequence Support: dynamic padding efficiently processes time series of varying lengths within batches, optimizing training and inference.
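The MTS and UTS layouts above are reshapes of one another. The following plain-PyTorch snippet (independent of pyFAST's API) shows how a multivariate batch folds into the univariate layout by moving the variable axis into the batch axis, and that the transformation is lossless:

```python
import torch

batch_size, window_size, n_vars = 8, 24, 3

# MTS layout: all variables of a window travel together in one sample.
mts_batch = torch.randn(batch_size, window_size, n_vars)
print(mts_batch.shape)  # torch.Size([8, 24, 3])

# UTS layout: each variable becomes its own single-channel sequence,
# folding the variable axis into the batch axis.
uts_batch = mts_batch.permute(0, 2, 1).reshape(batch_size * n_vars, window_size, 1)
print(uts_batch.shape)  # torch.Size([24, 24, 1])

# Unfolding recovers the original multivariate batch exactly.
recovered = uts_batch.reshape(batch_size, n_vars, window_size).permute(0, 2, 1)
assert torch.equal(recovered, mts_batch)
```

This is why UTS models can be trained on multivariate data "channel-independently": the batch simply grows by a factor of `n_vars` while each sample carries a single variable.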
Supporting Models
pyFAST offers a wide range of time series models, categorized as follows:
- Multivariate Time Series Forecasting:
  - `AR`, `GAR`, `VAR`: Autoregressive models.
  - `ANN`: Artificial Neural Networks.
  - `NLinear`: Normalization-Linear models.
  - `DLinear`: Decomposition-Linear models.
  - `RLinear`: Revisiting Long-term Time Series Forecasting.
  - `STD`: Seasonal-Trend Decomposition.
  - `TimeSeriesRNN`, `EncoderDecoder`: RNN-based forecasting architectures, such as RNN, GRU, LSTM, and miniLSTM.
  - `TemporalConvNet`: Temporal Convolutional Network.
  - `CNN1D`, `CNNRNN`, `CNNRNNRes`: Convolutional sequence models.
  - `LSTNet`: LSTM + CNN hybrid forecasting model.
  - `TSMixer`: Time Series Mixer.
  - `PatchMLP`: Patch-based MLP forecaster.
  - `KAN`: Kolmogorov-Arnold Networks.
  - `DeepResidualNetwork`: Deep residual forecasting network.
  - `Amplifier`: Feature amplification forecasting model.
  - `Transformer`: Attention Is All You Need.
  - `Informer`: Efficient long-sequence forecasting.
  - `Autoformer`: Decomposition Transformer.
  - `FEDformer`: Frequency-enhanced Transformer.
  - `FiLM`: Frequency improved Legendre Memory model.
  - `Triformer`: Tri-level Transformer.
  - `Crossformer`: Cross-dimension attention.
  - `TimesNet`: Multi-periodicity modeling.
  - `PatchTST`: Patch-based Transformer.
  - `STAEformer`: Spatio-Temporal Adaptive Embedding Transformer.