pyFAST: Flexible, Advanced Framework for Multi-source and Sparse Time Series Analysis in PyTorch

Figure 1: Software overview

pyFAST (Forecasting And Sparse Time-Series) is a research-driven, modular Python framework built for advanced and efficient time series analysis, especially excelling in multi-source and sparse data scenarios. Leveraging PyTorch, pyFAST provides a unified and flexible platform for forecasting, imputation, and generative modeling, integrating cutting-edge LLM-inspired architectures, Variational Autoencoders, and classical time series models.

Update logs:

  • 2025-12-05: Updated the overview figure to reflect the latest module structure.
  • 2025-10-20: All models are now categorized for easier navigation and usability.
  • 2025-09-15: SMTDataset and SSTDataset now support both CSV files in directories and zipped archives.
  • 2025-08-26: Released the software, along with benchmarking results and dataset links.

Unlock the Power of pyFAST for:

  • Alignment-Free Multi-source Time Series Analysis: Process and fuse data from diverse sources without the need for strict temporal alignment, inspired by Large Language Model principles.
  • Native Sparse Time Series Forecasting: Effectively handle and forecast sparse time series data with specialized metrics and loss functions, addressing a critical gap in existing libraries.
  • Rapid Research Prototyping: Experiment and prototype novel time series models and techniques with unparalleled flexibility and modularity.
  • Seamless Customization and Extensibility: Tailor and extend the library to your specific research or application needs with its component-based modular design.
  • High Performance and Scalability: Benefit from optimized PyTorch implementations and multi-device acceleration for efficient handling of large datasets and complex models.

Key Capabilities:

  • Pioneering LLM-Inspired Models: First-of-its-kind adaptations of Large Language Models specifically for alignment-free multi-source time series forecasting.
  • Native Sparse Data Support: Comprehensive support for sparse time series, including specialized metrics, loss functions, and efficient data handling.
  • Flexible Multi-source Data Fusion: Integrate and analyze time series data from diverse, potentially misaligned sources.
  • Extensive Model Library: Includes a broad range of classical, deep learning (Transformers, RNNs, CNNs, GNNs), and generative time series models for both multivariate (MTS) and univariate (UTS) data.
  • Modular and Extensible Architecture: Component-based design enables easy customization, extension, and combination of modules.
  • Streamlined Training Pipeline: Trainer class simplifies model training with built-in validation, early stopping, checkpointing, and multi-device support.
  • Comprehensive Evaluation Suite: Includes a wide array of standard and sparse-specific evaluation metrics via the Evaluator class.
  • Built-in Generative Modeling: Dedicated module for time series Variational Autoencoders (VAEs), including Transformer-based VAEs.
  • Reproducibility Focus: Utilities like initial_seed() ensure experiment reproducibility.
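To make the reproducibility point concrete, here is a minimal sketch of what a seeding utility like initial_seed() typically does. This is an illustration, not pyFAST's actual implementation, and the name seed_everything is hypothetical:

```python
import random

import numpy as np
import torch


def seed_everything(seed: int = 2025) -> None:
    """Hypothetical helper sketching what a utility like initial_seed() typically does."""
    random.seed(seed)        # Python's built-in RNG
    np.random.seed(seed)     # NumPy RNG, often used in data pipelines
    torch.manual_seed(seed)  # PyTorch CPU RNG
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)  # seed all CUDA devices


# Re-seeding before each draw makes random tensors repeat exactly.
seed_everything(2025)
a = torch.randn(3)
seed_everything(2025)
b = torch.randn(3)
```

After re-seeding, `a` and `b` contain identical values, which is the property that makes experiments repeatable run to run.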

Explore the Core Modules (See Figure Above):

As depicted in the Software Overview Diagram above (Figure 1), pyFAST's fast/ library is structured into five core modules, ensuring a cohesive and versatile framework:

  • data/ package: Handles data loading, preprocessing, and dataset creation for SST, SMT, MMT, and BDP data scenarios. Key features include efficient sparse data handling, multi-source data integration, scaling methods, patching, and data splitting utilities.
  • model/ package: Houses a diverse collection of time series models, categorized into uts/ (univariate), mts/ (multivariate), and base/ (building blocks) submodules. Includes classical models, deep learning architectures (CNNs, RNNs, Transformers, GNNs), fusion models, and generative models.
  • train.py Module: Provides the Trainer class to streamline the entire model training pipeline. Features include device management, model compilation, optimizer and scheduler management, training loop, validation, early stopping, checkpointing, and visualization integration.
  • metric/ package: Offers a comprehensive suite of evaluation metrics for time series tasks, managed by the Evaluator class. Includes standard metrics (MSE, MAE, etc.) and specialized sparse metrics for masked data.
  • generative/ package: Focuses on generative time series modeling, providing implementations of time series VAEs and Transformer-based VAEs.

Installation

Ensure you have Python installed. Then, from the repository root, install pyFAST's dependencies by running:

pip install -r requirements.txt

Getting Started

Basic Usage Example

Jumpstart your time series projects with pyFAST using this basic example:

import torch

from fast import initial_seed, initial_logger, get_device
from fast.data import SSTDataset
from fast.train import Trainer
from fast.metric import Evaluator
from fast.model.mts.ar import ANN  # Example: Using a simple ANN model

# Initialize components for reproducibility and evaluation
initial_seed(2025)

# Initialize logger for tracking training progress
logger = initial_logger()

# Prepare your time series data: replace with actual data loading.
ts = torch.sin(torch.arange(0, 100, 0.1)).unsqueeze(1)  # Shape: (1000, 1)
train_ds = SSTDataset(ts, input_window_size=10, output_window_size=1).split(0, 0.8, mark='train')
val_ds = SSTDataset(ts, input_window_size=10, output_window_size=1).split(0.8, 1., mark='val')

# Initialize the model (e.g., ANN)
model = ANN(
    input_window_size=train_ds.window_size,  # Adapt input window size from dataset
    output_window_size=train_ds.output_window_size,  # Adapt output window size from dataset, a.k.a. prediction steps
    hidden_sizes=32  # Hidden layer size
)

# Set up the Trainer for model training and evaluation
device = get_device('cpu')  # Use 'cuda', 'cpu', or 'mps'
evaluator = Evaluator(['MAE', 'RMSE'])  # Evaluation metrics
trainer = Trainer(device, model, evaluator=evaluator)

# Train model using prepared datasets
history = trainer.fit(train_ds, val_ds, epoch_range=(1, 10))  # Train for 10 epochs
logger.info(str(history))

# After training, evaluate the model (substitute a held-out test dataset here if available)
val_results = trainer.evaluate(val_ds)
logger.info(str(val_results))

Data Structures Overview

pyFAST is designed to handle various time series data structures:

  • Multiple Time Series (MTS):

    • Shape: [batch_size, window_size, n_vars]
    • For datasets with multiple variables recorded over time (e.g., sensor readings, stock prices of multiple companies).
  • Univariate Time Series (UTS):

    • Shape: [batch_size * n_vars, window_size, 1]
    • For datasets focusing on single-variable sequences, often processed in batches for efficiency.
  • Advanced Data Handling:

    • Sparse Data Ready: Models and metrics are designed to effectively work with sparse time series data and missing values, utilizing masks for accurate computations.
    • Exogenous Variable Integration: Seamlessly incorporate external factors (exogenous variables) to enrich your time series models.
    • Variable-Length Sequence Support: Utilizes dynamic padding to efficiently process time series with varying lengths within batches, optimizing training and inference.
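The MTS and UTS layouts above differ only by a reshape: each variable of a multivariate batch becomes its own univariate sequence. The following sketch shows that conversion with plain tensor operations; it is illustrative only, not a pyFAST API:

```python
import torch

batch_size, window_size, n_vars = 8, 24, 3

# MTS layout: [batch_size, window_size, n_vars]
mts = torch.randn(batch_size, window_size, n_vars)

# UTS layout: [batch_size * n_vars, window_size, 1] —
# move the variable axis next to the batch axis, then flatten the two together.
uts = mts.permute(0, 2, 1).reshape(batch_size * n_vars, window_size, 1)

# The conversion is lossless: reversing it recovers the original tensor.
mts_back = uts.reshape(batch_size, n_vars, window_size).permute(0, 2, 1)
assert torch.equal(mts, mts_back)
```

Processing variables as a flattened UTS batch like this is what allows univariate models to run efficiently over many series at once.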

Supporting Models

pyFAST offers a wide range of time series models, spanning classical methods, deep learning architectures (CNNs, RNNs, Transformers, GNNs), and generative models for both multivariate and univariate data; see the model/ package for the full catalogue.
