# MM4TSA

A professional list of Multi-Modalities for Time Series Analysis (MM4TSA) papers and resources.
## News
- 🔥 NeurIPS 2025 papers updated.
- 🔥 KDD 2025 papers updated.
- 🔥 ICML 2025 papers updated.
- 🔥 IJCAI 2025 papers updated.
Time series analysis (TSA) is a longstanding research topic in the data mining community with wide real-world significance. Compared to "richer" modalities such as language and vision, which have recently experienced explosive development and are densely interconnected, the time-series modality remains relatively underexplored and isolated. We observe that many recent TSA works have formed a new research field, i.e., Multiple Modalities for TSA (MM4TSA). In general, these MM4TSA works share a common motivation: how TSA can benefit from multiple modalities. This survey is the first to offer a comprehensive review and a detailed outlook for this emerging field. Specifically, we systematically discuss three benefits: (1) reusing foundation models of other modalities for efficient TSA, (2) multimodal extension for enhanced TSA, and (3) cross-modality interaction for advanced TSA. Within each perspective, we further group the works by the introduced modality type, including text, images, audio, tables, and others. Finally, we identify the gaps and future opportunities corresponding to the three benefits: reused-modality selection, heterogeneous modality combinations, and generalization to unseen tasks. We maintain this up-to-date GitHub repository of key papers and resources. For more details, please see our <a href="https://arxiv.org/abs/2503.11835"><strong>survey</strong></a>.
<div align="center"> <img src="https://github.com/AdityaLab/MM4TSA/blob/main/Survey_Logo_1_1.jpg" width="447"> <img src="https://github.com/AdityaLab/MM4TSA/blob/main/Survey_Logo_2.jpg" width="500"> </div>

## Contributing
🚀 We will continue to update this repo. If you find it helpful, please Star it or Cite Our Survey.
🤝 Contributions are welcome! Please feel free to submit a Pull Request.
## Citation
🤗 If you find this survey useful, please consider citing our paper. 🤗
```bibtex
@misc{liu2025timeseriesanalysisbenefit,
  title={How Can Time Series Analysis Benefit From Multiple Modalities? A Survey and Outlook},
  author={Haoxin Liu and Harshavardhan Kamarthi and Zhiyuan Zhao and Shangqing Xu and Shiyu Wang and Qingsong Wen and Tom Hartvigsen and Fei Wang and B. Aditya Prakash},
  year={2025},
  eprint={2503.11835},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2503.11835}
}
```
## Table of Contents

- 2.4.1 Heterogeneous Modality Combinations
- 2.4.2 Robust & Efficient Multimodal TS (Outlook)
- 2.5 Datasets & Benchmarks (Multimodal)
- 3.3 Time Series as Other Modalities
  - 3.3.1 Tabular Data
  - 3.3.2 Audio Data
  - 3.3.3 Other Modalities
## 1. Time2X and X2Time

### 1.1 Text to Time Series

#### 1.1.1 Generation
<a id="111-generation"></a>
| Title | Venue |
| ----- | ----- |
| VerbalTS: Generating Time Series from Texts | ICML 2025 |
| BRIDGE: Bootstrapping Text to Control Time-Series Generation via Multi-Agent Iterative Optimization and Diffusion Modeling | ICML 2025 |
| T2S: High-resolution Time Series Generation with Text-to-Series Diffusion Models | IJCAI 2025 |
| MedDiTPro: A Prompt-Guided Diffusion Transformer for Multimodal Longitudinal Medical Data Synthesis | KDD 2025 |
| TarDiff: Target-Oriented Diffusion Guidance for Synthetic Electronic Health Record Time Series Generation | KDD 2025 |
| Language Models Still Struggle to Zero-shot Reason about Time Series | EMNLP 2024 Findings |
| DiffuSETS: 12-lead ECG Generation Conditioned on Clinical Text Reports and Patient-Specific Information | arXiv 25.01 |
| ChatTS: Aligning Time Series with LLMs via Synthetic Data for Enhanced Understanding and Reasoning | arXiv 24.12 |
| Forging Time Series with Language: A Large Language Model Approach to Synthetic Data Generation | NeurIPS 2025 |
#### 1.1.2 Retrieval
<a id="112-retrieval"></a>
| Title | Venue |
| ----- | ----- |
| Evaluating Large Language Models on Time Series Feature Understanding: A Comprehensive Taxonomy and Benchmark | EMNLP 2024 |
| TimeSeriesExam: A Time Series Understanding Exam | NeurIPS 2024 Workshop on Time Series in the Age of Large Models |
| CLaSP: Learning Concepts for Time-Series Signals from Natural Language Supervision | arXiv 24.11 |
| TRACE: Grounding Time Series in Context for Multimodal Embedding and Retrieval | NeurIPS 2025 |
### 1.2 Time Series to Text

#### 1.2.1 Explanation
<a id="121-explanation"></a>
| Title | Venue |
| ----- | ----- |
| Inferring Event Descriptions from Time Series with Language Models | arXiv 25.03 |
| Explainable Multi-modal Time Series Prediction with LLM-in-the-Loop | NeurIPS 2025 |
| Xforecast: Evaluating natural language explanations for time series forecasting | arXiv 24.10 |
| Large language models can deliver accurate and interpretable time series anomaly detection | arXiv 24.05 |
#### 1.2.2 Captioning
<a id="122-captioning"></a>
| Title | Venue |
| ----- | ----- |
| Repr2Seq: A Data-to-Text Generation Model for Time Series | IJCNN 2023 |
| Insight miner: A time series analysis dataset for cross-domain alignment with natural language | NeurIPS 2023 AI for Science Workshop |
