
LMTrajectory

Official Code for "Can Language Beat Numerical Regression? Language-Based Multimodal Trajectory Prediction (CVPR 2024)" and "Social Reasoning-Aware Trajectory Prediction via Multimodal Language Model (TPAMI)"


<!--<h2 align="center">Can Language Beat Numerical Regression?<br>Language-Based Multimodal Trajectory Prediction</h2>--> <!--<h2 align="center">Social Reasoning-Aware Trajectory Prediction<br>via Multimodal Language Model</h2>--> <h2 align="center"> Can $\large{\color{Orange}{\textbf{\textsf{Language}}}}$ Beat $\large{\color{MidnightBlue}{\textbf{\textsf{Numerical Regression}}}}$?<br>Language-Based Multimodal Trajectory Prediction <br>$\tiny{—~and~—}$ <br>Social Reasoning-Aware Trajectory Prediction<br>via $\large{\color{Maroon}{\textbf{\textsf{Multimodal Language Model}}}}$ </h2> <p align="center"> <a href="https://InhwanBae.github.io/"><strong>Inhwan Bae</strong></a> · <a href="https://leejunoh.com/"><strong>Junoh Lee</strong></a> · <a href="https://scholar.google.com/citations?user=Ei00xroAAAAJ"><strong>Hae-Gon Jeon</strong></a> <br> CVPR 2024  &  TPAMI </p> <p align="center"> <a href="https://inhwanbae.github.io/publication/lmtrajectory/"><strong><code>Project Page</code></strong></a> <a href="https://arxiv.org/abs/2403.18447"><strong><code>CVPR Paper</code></strong></a> <a href="https://ieeexplore.ieee.org/abstract/document/11045841"><strong><code>TPAMI Paper</code></strong></a> <a href="https://github.com/InhwanBae/LMTrajectory"><strong><code>Source Code</code></strong></a> <a href="#-citation"><strong><code>Related Works</code></strong></a> </p> <div align='center'> <br><img src="img/lmtraj-model.gif" width=70%> <br>Traditional vs. Our language-based trajectory prediction, LMTraj. </div> <!--<br>This repository contains the code for the LMTrajectory framework.-->

<br>Summary: Language model-based, Multimodal input, Multimodal output, Multi-task training approach for Zero-shot and Supervised human trajectory prediction.

<br>

💬 LMTrajectory Framework 🗨️

  • Prompt-Based Approach: Moving away from conventional numerical regression models, we reframe the task into a prompt-based question-answering perspective.
  • Social Reasoning: Beyond physics-based mathematical interaction modeling, our approach leverages language models to incorporate social reasoning.
  • Multi-Task Training: Supplementary tasks enhance the model's ability to grasp higher-level context through multi-task training.
  • Numerical Tokenizer: Our numerical tokenizer effectively separates text and numbers, enabling the model to learn correlations in sequential data.
  • SOTA Performance: Our holistic solution achieves state-of-the-art results on trajectory prediction benchmarks traditionally dominated by numerical regressors.
<br>
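The numerical-tokenizer idea can be illustrated with a minimal sketch (this is an illustration of the concept, not the repository's actual tokenizer): keep words whole, but split every number into individual digit and decimal-point tokens so the model can learn digit-level structure.

```python
import re

def tokenize(prompt):
    """Split a prompt into word tokens and single-character numeric tokens.

    A minimal illustration of separating text from numbers: words stay
    whole, while each digit and each decimal point becomes its own token.
    """
    # Order matters: digits first, then '.', then words, then any other symbol.
    return re.findall(r"\d|[.]|[A-Za-z]+|\S", prompt)

print(tokenize("moves to (3.25, 7.1)"))
# → ['moves', 'to', '(', '3', '.', '2', '5', ',', '7', '.', '1', ')']
```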

❄️ Zero-Shot Evaluation ❄️

Setup

Environment <br>All models were tested on Ubuntu 20.04 with Python 3.10 and PyTorch 2.0.1 with CUDA 11.7. Dependencies include Python packages such as scipy, simdkalman and openai==0.28.0.

Dataset <br>Preprocessed ETH and UCY datasets are released in this repository. The train/validation/test splits are the same as those found in Social-GAN.
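If the preprocessed files follow the usual Social-GAN layout — one whitespace-separated `frame_id ped_id x y` row per line, which is an assumption to verify against the released files — loading per-pedestrian tracks can be sketched as:

```python
from collections import defaultdict

def load_trajectories(lines):
    """Group (x, y) positions by pedestrian id.

    Assumes each line holds whitespace-separated frame_id, ped_id, x, y
    (the Social-GAN ETH/UCY convention) and that lines are frame-ordered.
    """
    tracks = defaultdict(list)
    for line in lines:
        frame, ped, x, y = line.split()
        tracks[int(float(ped))].append((float(x), float(y)))
    return dict(tracks)

sample = ["0\t1\t0.0\t0.0", "10\t1\t0.5\t0.1", "0\t2\t3.0\t4.0"]
print(load_trajectories(sample))
# → {1: [(0.0, 0.0), (0.5, 0.1)], 2: [(3.0, 4.0)]}
```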

Sample <br>We provide our zero-shot prediction results in the release section. These results include all multimodal trajectories and are available for use in future zero-shot research.

Evaluate LMTraj-ZERO

Preliminary <br>To evaluate our LMTraj-ZERO model, you will need an OPENAI_API_KEY to access the OpenAI API. Create the API key following the instructions provided by OpenAI, and then paste the key into ./zero-shot/chatgpt_trajectory_predictor_v3.py at line 25.

Prediction <br>We provide scripts to evaluate our LMTraj-ZERO model on all datasets simultaneously: ./zero-shot/chatgpt_sequential_v3.sh and ./zero-shot/chatgpt_multi_v3.sh. The former evaluates the model step by step; the latter uses a thread pool for faster inference.

# Choose one of the following scripts to evaluate our LMTraj-ZERO model.
./chatgpt_sequential_v3.sh -d <DATASET_ID> -m <LLM_MODEL_ID>
./chatgpt_multi_v3.sh -d <DATASET_ID> -m <LLM_MODEL_ID>

# Supported dataset id: 0 (ETH), 1 (HOTEL), 2 (UNIV), 3 (ZARA1), 4 (ZARA2)
# Supported llm model id: 0 (gpt-3.5-turbo-0301), 1 (gpt-4-0314), 2 (gpt-3.5-turbo-1106), 3 (gpt-4-1106-preview)

# Examples
cd zero-shot
./chatgpt_multi_v3.sh -d 0 -m 3
./chatgpt_multi_v3.sh -d 1 -m 3

If an error is encountered, your progress will be saved. When you rerun the same script, it will skip the parts that were successfully executed and only regenerate the paths where issues occurred.
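This skip-and-resume behavior can be approximated with a simple pattern: write each query's prediction to its own file, and on rerun skip any query whose output file already exists. The sketch below uses a thread pool as the multi-script does; file names and the `predict` stand-in are illustrative, not the repository's actual code.

```python
import json
import os
from concurrent.futures import ThreadPoolExecutor

def predict(query):
    # Stand-in for an API call that may fail partway through a run.
    return {"query": query, "result": query * 2}

def run_one(query, out_dir):
    path = os.path.join(out_dir, f"{query}.json")
    if os.path.exists(path):  # finished on a previous run: skip it
        return path
    with open(path, "w") as f:
        json.dump(predict(query), f)
    return path

def run_all(queries, out_dir, workers=4):
    """Run all queries through a thread pool, resuming past completed ones."""
    os.makedirs(out_dir, exist_ok=True)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda q: run_one(q, out_dir), queries))
```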

If you want to run the model with custom hyperparameters or other models available from OpenAI, use ./zero-shot/chatgpt_trajectory_predictor_v3.py instead of the script file. <br>Warning: A misclick could upgrade you to OpenAI Tier 5, as it did for me :(

Evaluation <br>As the final step, we provide code to evaluate the trajectories generated by our LMTraj-ZERO. To evaluate, first combine the predicted trajectories into a single JSON file.

python ./zero-shot/chatgpt-fragmented_dump_combiner.py --dataset <DATASET_ID> --model <LLM_MODEL_ID>

# Supported dataset id: 0 (ETH), 1 (HOTEL), 2 (UNIV), 3 (ZARA1), 4 (ZARA2)
# Supported llm model id: 0 (gpt-3.5-turbo-0301), 1 (gpt-4-0314), 2 (gpt-3.5-turbo-1106), 3 (gpt-4-1106-preview)

# Examples
python ./zero-shot/chatgpt-fragmented_dump_combiner.py --dataset 0 --model 3
python ./zero-shot/chatgpt-fragmented_dump_combiner.py --dataset 1 --model 3
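The combining step can be sketched as merging per-fragment JSON files into one dictionary keyed by file name (illustrative only; the repository's combiner may use a different key scheme):

```python
import glob
import json
import os

def combine_dumps(fragment_dir, out_path):
    """Merge every *.json fragment in fragment_dir into a single JSON file."""
    combined = {}
    for path in sorted(glob.glob(os.path.join(fragment_dir, "*.json"))):
        with open(path) as f:
            combined[os.path.basename(path)] = json.load(f)
    with open(out_path, "w") as f:
        json.dump(combined, f)
    return combined
```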

Next, evaluate the combined trajectories using ADE and FDE metrics.

python ./zero-shot/compute_ade_fde_from_dump.py --dataset <DATASET_ID> --model <LLM_MODEL_ID>

# Supported dataset id: 0 (ETH), 1 (HOTEL), 2 (UNIV), 3 (ZARA1), 4 (ZARA2)
# Supported llm model id: 0 (gpt-3.5-turbo-0301), 1 (gpt-4-0314), 2 (gpt-3.5-turbo-1106), 3 (gpt-4-1106-preview)

# Examples
python ./zero-shot/compute_ade_fde_from_dump.py --dataset 0 --model 3
python ./zero-shot/compute_ade_fde_from_dump.py --dataset 1 --model 3
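ADE and FDE are the standard trajectory metrics: ADE averages the L2 error over all predicted timesteps, FDE takes the error at the final timestep, and for multimodal prediction the minimum over the K samples is typically reported. A minimal NumPy sketch of that protocol (not the repository's exact script):

```python
import numpy as np

def min_ade_fde(pred, gt):
    """Best-of-K ADE and FDE.

    pred: (K, T, 2) multimodal predicted trajectories
    gt:   (T, 2) ground-truth trajectory
    """
    dist = np.linalg.norm(pred - gt[None], axis=-1)  # (K, T) per-step L2 error
    ade = dist.mean(axis=1)                          # average over timesteps
    fde = dist[:, -1]                                # final-timestep error
    return ade.min(), fde.min()
```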

Results

<table><thead><tr><th rowspan="2"><sub><b>LMTraj-ZERO</b></sub></th><th colspan="2"><sub><b>ETH</b></sub></th><th colspan="2"><sub><b>HOTEL</b></sub></th><th colspan="2"><sub><b>UNIV</b></sub></th><th colspan="2"><sub><b>ZARA1</b></sub></th><th colspan="2"><sub><b>ZARA2</b></sub></th><th colspan="2"><sub><b>AVG</b></sub></th></tr> <tr><th><sub><b>ADE</b></sub></th><th><sub><b>FDE</b></sub></th><th><sub><b>ADE</b></sub></th><th><sub><b>FDE</b></sub></th><th><sub><b>ADE</b></sub></th><th><sub><b>FDE</b></sub></th><th><sub><b>ADE</b></sub></th><th><sub><b>FDE</b></sub></th><th><sub><b>ADE</b></sub></th><th><sub><b>FDE</b></sub></th><th><sub><b>ADE</b></sub></th><th><sub><b>FDE</b></sub></th></tr></thead><tbody> <tr><td><sub><b>gpt-3.5-turbo-0301</b></sub></td><td><sub>1.0668</sub></td><td><sub>1.8241</sub></td><td><sub>0.4229</sub></td><td><sub>0.6538</sub></td><td><sub>0.5570</sub></td><td><sub>0.9836</sub></td><td><sub>0.4715</sub></td><td><sub>0.9073</sub></td><td><sub>0.3878</sub></td><td><sub>0.7056</sub></td><td><sub>0.5812</sub></td><td><sub>1.0149</sub></td></tr> <tr><td><sub><b>gpt-3.5-turbo-1106</b></sub></td><td><sub></sub></td><td><sub></sub></td><td><sub>0.4713</sub></td><td><sub>0.6297</sub></td><td><sub></sub></td><td><sub></sub></td><td><sub></sub></td><td><sub></sub></td><td><sub></sub></td><td><sub></sub></td><td><sub></sub></td><td><sub></sub></td></tr> <tr><td><sub><b>gpt-4-0314</b></sub></td><td><sub>0.7978</sub></td><td><sub>1.6446</sub></td><td><sub>0.2001</sub></td><td><sub>0.3658</sub></td><td><sub>0.3709</sub></td><td><sub>0.7675</sub></td><td><sub>0.3268</sub></td><td><sub>0.6638</sub></td><td><sub>0.2386</sub></td><td><sub>0.4998</sub></td><td><sub>0.3868</sub></td><td><sub>0.7883</sub></td></tr> 
<tr><td><sub><b>gpt-4-1106-preview</b></sub></td><td><sub></sub></td><td><sub></sub></td><td><sub>0.1757</sub></td><td><sub>0.3279</sub></td><td><sub></sub></td><td><sub></sub></td><td><sub></sub></td><td><sub></sub></td><td><sub></sub></td><td><sub></sub></td><td><sub></sub></td><td><sub></sub></td></tr></tbody></table>

Evaluate Algorithmic Models

We provide four algorithmic models for comparison on the zero-shot trajectory prediction task, available in ./zero-shot/algorithmic_model_benchmark.py. The source code supports four extrapolation methods: stop, linear extrapolation, cubic extrapolation, and a Kalman filter.

python ./zero-shot/algorithmic_model_benchmark.py --model <MODEL_TYPE>

# Examples
python ./zero-shot/algorithmic_model_benchmark.py --model stop
python ./zero-shot/algorithmic_model_benchmark.py --model linear
python ./zero-shot/algorithmic_model_benchmark.py --model cubic
python ./zero-shot/algorithmic_model_benchmark.py --model kalman
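Of these baselines, constant-velocity linear extrapolation is the simplest to reproduce. A sketch under that assumption (not the repository's implementation):

```python
import numpy as np

def linear_extrapolate(obs, pred_len=12):
    """Extrapolate future positions using the last observed velocity.

    obs: (T_obs, 2) observed positions; returns (pred_len, 2) predictions.
    """
    velocity = obs[-1] - obs[-2]                 # last-step displacement
    steps = np.arange(1, pred_len + 1)[:, None]  # 1..pred_len as a column
    return obs[-1] + steps * velocity
```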
<br>

🔥 Supervised Training & Evaluation 🔥

Setup

Environment <br>All models were tested on Ubuntu 20.04 with Python 3.10 and PyTorch 2.0.1 with CUDA 11.7. Dependencies include Python packages such as transformers, accelerate, datasets, nltk and sentencepiece.

Dataset <br>Preprocessed ETH and UCY datasets are released in this repository. The train/validation/test splits are the same as those found in Social-GAN.

Preliminary

We provide preprocessed datasets, [pretrained tokenizers](https://github.com/InhwanBae/LMTr
