LMTrajectory

Official Code for "Can Language Beat Numerical Regression? Language-Based Multimodal Trajectory Prediction (CVPR 2024)" and "Social Reasoning-Aware Trajectory Prediction via Multimodal Language Model (TPAMI)"


<h2 align="center"> Can $\large{\color{Orange}{\textbf{\textsf{Language}}}}$ Beat $\large{\color{MidnightBlue}{\textbf{\textsf{Numerical Regression}}}}$? Language-Based Multimodal Trajectory Prediction $\tiny{—~and~—}$ Social Reasoning-Aware Trajectory Prediction via $\large{\color{Maroon}{\textbf{\textsf{Multimodal Language Model}}}}$ </h2> <a href="https://InhwanBae.github.io/">Inhwan Bae</a> · <a href="https://leejunoh.com/">Junoh Lee</a> · <a href="https://scholar.google.com/citations?user=Ei00xroAAAAJ">Hae-Gon Jeon</a> CVPR 2024 & TPAMI <a href="https://inhwanbae.github.io/publication/lmtrajectory/"><code>Project Page</code></a> <a href="https://arxiv.org/abs/2403.18447"><code>CVPR Paper</code></a> <a href="https://ieeexplore.ieee.org/abstract/document/11045841"><code>TPAMI Paper</code></a> <a href="https://github.com/InhwanBae/LMTrajectory"><code>Source Code</code></a> <a href="#-citation"><code>Related Works</code></a> <div align='center'> <img src="img/lmtraj-model.gif" width=70%> Traditional vs. Our language-based trajectory prediction, LMTraj. </div>

Summary: Language model-based, Multimodal input, Multimodal output, Multi-task training approach for Zero-shot and Supervised human trajectory prediction.

💬 LMTrajectory Framework 🗨️

Prompt-Based Approach: Moving away from conventional numerical regression models, we reframe the task into a prompt-based question-answering perspective.
Social Reasoning: Beyond physics-based mathematical interaction modeling, our approach leverages language models to incorporate social reasoning.
Multi-Task Training: Supplementary tasks enhance the model's ability to grasp higher-level context through multi-task training.
Numerical Tokenizer: Our numerical tokenizer effectively separates text and numbers, enabling the model to learn correlations in sequential data.
SOTA Performance: Our holistic solution achieves state-of-the-art results on trajectory prediction benchmarks traditionally dominated by numerical regressors.
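The prompt-based framing and the text/number separation listed above can be illustrated with a small sketch. This is not the repository's code: the prompt wording, the function names `build_prompt` and `split_text_and_numbers`, and the rule of splitting every number into single characters are assumptions chosen for illustration. The actual prompt templates and the trained tokenizer are the ones shipped with the repository.

```python
import re


def build_prompt(history, pred_len=12):
    """Turn observed (x, y) coordinates into a question-answering prompt.

    The wording is hypothetical, not the template used by LMTraj.
    """
    coords = ", ".join(f"({x:.2f}, {y:.2f})" for x, y in history)
    return (
        f"question: Pedestrian 0 moved along {coords}. "
        f"What are the next {pred_len} coordinates? answer:"
    )


# A number is an optional sign, digits, and an optional fractional part.
_NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def split_text_and_numbers(text):
    """Split a string into word tokens and per-character number tokens."""
    tokens = []
    pos = 0
    for match in _NUMBER.finditer(text):
        # Text between numbers is split on whitespace.
        tokens.extend(text[pos:match.start()].split())
        # Each character of the number becomes its own token.
        tokens.extend(match.group())
        pos = match.end()
    tokens.extend(text[pos:].split())
    return tokens
```

Keeping digits out of the word tokens means a coordinate such as `1.25` is never merged with neighbouring text, which is the property the numerical tokenizer is described as providing.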

❄️ Zero-Shot Evaluation ❄️

Setup

Environment All models were tested on Ubuntu 20.04 with Python 3.10 and PyTorch 2.0.1 with CUDA 11.7. Dependencies include Python packages such as scipy, simdkalman and openai==0.28.0.

Dataset Preprocessed ETH and UCY datasets are released in this repository. The train/validation/test splits are the same as those found in Social-GAN.

Sample We provide our zero-shot prediction results in the release section. These results include all multimodal trajectories and are available for use in future zero-shot research.

Evaluate LMTraj-ZERO

Preliminary To evaluate our LMTraj-ZERO model, you will need an OPENAI_API_KEY to access the OpenAI API. Create the API key by following the instructions provided by OpenAI, then paste it into line 25 of ./zero-shot/chatgpt_trajectory_predictor_v3.py.
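If you would rather not keep the key in a source file, where it can be committed by accident, one option is to read it from an environment variable. The sketch below is an assumption about how you might adapt the script yourself; the helper name `load_openai_key` is hypothetical and is not part of the repository.

```python
import os


def load_openai_key(env_var="OPENAI_API_KEY"):
    """Return the API key stored in an environment variable, or raise."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} before running the predictor.")
    return key


# With openai==0.28.0 the key is assigned to a module attribute:
#   import openai
#   openai.api_key = load_openai_key()
```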

Prediction We provide scripts to evaluate our LMTraj-ZERO model on all datasets. Two scripts are provided: ./zero-shot/chatgpt_sequential_v3.sh and ./zero-shot/chatgpt_multi_v3.sh. The former evaluates the model step-by-step, and the latter uses a thread pool for faster inference.

# Choose one of the following scripts to evaluate our LMTraj-ZERO model.
./chatgpt_sequential_v3.sh -d <DATASET_ID> -m <LLM_MODEL_ID>
./chatgpt_multi_v3.sh -d <DATASET_ID> -m <LLM_MODEL_ID>

# Supported dataset id: 0 (ETH), 1 (HOTEL), 2 (UNIV), 3 (ZARA1), 4 (ZARA2)
# Supported llm model id: 0 (gpt-3.5-turbo-0301), 1 (gpt-4-0314), 2 (gpt-3.5-turbo-1106), 3 (gpt-4-1106-preview)

# Examples
cd zero-shot
./chatgpt_multi_v3.sh -d 0 -m 3
./chatgpt_multi_v3.sh -d 1 -m 3

If an error is encountered, your progress will be saved. When you rerun the same script, it will skip the parts that were successfully executed and only regenerate the paths where issues occurred.
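The skip-on-rerun behaviour, combined with a thread pool, can be sketched as follows. This is a minimal illustration rather than the repository's implementation: the one-file-per-scene layout, the function name `run_resumable`, and the `query_fn` callback standing in for the API call are all assumptions.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_resumable(scene_ids, query_fn, out_dir, max_workers=4):
    """Query every scene once, skipping scenes that already have a saved result.

    `query_fn(scene_id)` stands in for the API call and must return
    JSON-serializable data. Returns the list of scene ids that failed.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Only scenes without a saved file still need to be queried.
    pending = [s for s in scene_ids if not (out_dir / f"{s}.json").exists()]

    def worker(scene_id):
        try:
            result = query_fn(scene_id)
        except Exception:
            return scene_id  # leave no file behind so a rerun retries it
        (out_dir / f"{scene_id}.json").write_text(json.dumps(result))
        return None

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        failed = [s for s in pool.map(worker, pending) if s is not None]
    return failed
```

Calling the function a second time with the same `out_dir` only re-queries the scenes returned in `failed`, because every successful scene already has a file on disk.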

If you want to run the model with custom hyperparameters or with other models available from OpenAI, use ./zero-shot/chatgpt_trajectory_predictor_v3.py directly instead of the shell scripts. Warning: A misclick could upgrade you to OpenAI Tier 5, as it did for me :(

Evaluation As the final step, we provide code to evaluate the trajectories generated by our LMTraj-ZERO. To evaluate, first combine the predicted trajectories into a single JSON file.

python ./zero-shot/chatgpt-fragmented_dump_combiner.py --dataset <DATASET_ID> --model <LLM_MODEL_ID>

# Supported dataset id: 0 (ETH), 1 (HOTEL), 2 (UNIV), 3 (ZARA1), 4 (ZARA2)
# Supported llm model id: 0 (gpt-3.5-turbo-0301), 1 (gpt-4-0314), 2 (gpt-3.5-turbo-1106), 3 (gpt-4-1106-preview)

# Examples
python ./zero-shot/chatgpt-fragmented_dump_combiner.py --dataset 0 --model 3
python ./zero-shot/chatgpt-fragmented_dump_combiner.py --dataset 1 --model 3
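Conceptually, the combining step merges many small result files into one. The sketch below shows the idea under an assumed layout in which each fragment is a JSON object keyed by scene id; the actual file names and structure used by `chatgpt-fragmented_dump_combiner.py` may differ, and `combine_dumps` is a hypothetical name.

```python
import json
from pathlib import Path


def combine_dumps(fragment_dir, combined_path):
    """Merge every per-scene JSON fragment in a directory into one JSON file.

    Assumes each fragment is a JSON object keyed by scene id.
    """
    combined = {}
    # Sort for a deterministic merge order across runs.
    for fragment in sorted(Path(fragment_dir).glob("*.json")):
        combined.update(json.loads(fragment.read_text()))
    Path(combined_path).write_text(json.dumps(combined))
    return combined
```

Write the combined file outside `fragment_dir`, otherwise a second run would pick it up as a fragment.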

Next, evaluate the combined trajectories using ADE and FDE metrics.

python ./zero-shot/compute_ade_fde_from_dump.py --dataset <DATASET_ID> --model <LLM_MODEL_ID>

# Supported dataset id: 0 (ETH), 1 (HOTEL), 2 (UNIV), 3 (ZARA1), 4 (ZARA2)
# Supported llm model id: 0 (gpt-3.5-turbo-0301), 1 (gpt-4-0314), 2 (gpt-3.5-turbo-1106), 3 (gpt-4-1106-preview)

# Examples
python ./zero-shot/compute_ade_fde_from_dump.py --dataset 0 --model 3
python ./zero-shot/compute_ade_fde_from_dump.py --dataset 1 --model 3
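For reference, ADE (average displacement error) is the mean Euclidean distance between predicted and ground-truth positions over all predicted time steps, and FDE (final displacement error) is that distance at the last step. The sketch below computes both in plain Python. The `best_of_k` helper follows the common convention for multimodal predictions on ETH/UCY of reporting the best of K samples; whether `compute_ade_fde_from_dump.py` aggregates in exactly this way is an assumption, and both function names are hypothetical.

```python
import math


def ade_fde(pred, truth):
    """ADE and FDE of one predicted path against the ground truth.

    Both arguments are equal-length lists of (x, y) points.
    """
    errors = [
        math.hypot(px - tx, py - ty)
        for (px, py), (tx, ty) in zip(pred, truth)
    ]
    return sum(errors) / len(errors), errors[-1]


def best_of_k(samples, truth):
    """Minimum ADE and minimum FDE over K sampled paths (chosen independently)."""
    scores = [ade_fde(sample, truth) for sample in samples]
    return min(a for a, _ in scores), min(f for _, f in scores)
```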

Results

<table><thead><tr><th rowspan="2">LMTraj-ZERO</th><th colspan="2">ETH</th><th colspan="2">HOTEL</th><th colspan="2">UNIV</th><th colspan="2">ZARA1</th><th colspan="2">ZARA2</th><th colspan="2">AVG</th></tr> <tr><th>ADE</th><th>FDE</th><th>ADE</th><th>FDE</th><th>ADE</th><th>FDE</th><th>ADE</th><th>FDE</th><th>ADE</th><th>FDE</th><th>ADE</th><th>FDE</th></tr></thead><tbody> <tr><td>gpt-3.5-turbo-0301</td><td>1.0668</td><td>1.8241</td><td>0.4229</td><td>0.6538</td><td>0.5570</td><td>0.9836</td><td>0.4715</td><td>0.9073</td><td>0.3878</td><td>0.7056</td><td>0.5812</td><td>1.0149</td></tr> <tr><td>gpt-3.5-turbo-1106</td><td></td><td></td><td>0.4713</td><td>0.6297</td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td></tr> <tr><td>gpt-4-0314</td><td>0.7978</td><td>1.6446</td><td>0.2001</td><td>0.3658</td><td>0.3709</td><td>0.7675</td><td>0.3268</td><td>0.6638</td><td>0.2386</td><td>0.4998</td><td>0.3868</td><td>0.7883</td></tr> <tr><td>gpt-4-1106-preview</td><td></td><td></td><td>0.1757</td><td>0.3279</td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td></tr></tbody></table>
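The AVG columns are consistent with an unweighted mean over the five scenes. The short check below recomputes them from the per-scene values in the table for the two fully reported models; the variable and function names are illustrative only.

```python
# Per-scene values copied from the table (ETH, HOTEL, UNIV, ZARA1, ZARA2).
ADE = {
    "gpt-3.5-turbo-0301": [1.0668, 0.4229, 0.5570, 0.4715, 0.3878],
    "gpt-4-0314": [0.7978, 0.2001, 0.3709, 0.3268, 0.2386],
}
FDE = {
    "gpt-3.5-turbo-0301": [1.8241, 0.6538, 0.9836, 0.9073, 0.7056],
    "gpt-4-0314": [1.6446, 0.3658, 0.7675, 0.6638, 0.4998],
}


def scene_mean(values):
    """Unweighted mean over scenes, rounded to four decimals like the table."""
    return round(sum(values) / len(values), 4)


averages = {
    model: (scene_mean(ADE[model]), scene_mean(FDE[model])) for model in ADE
}
```

The recomputed pairs match the reported AVG entries: 0.5812 / 1.0149 for gpt-3.5-turbo-0301 and 0.3868 / 0.7883 for gpt-4-0314.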

Evaluate Algorithmic Models

We provide four algorithmic models for comparison on the zero-shot trajectory prediction task, available in ./zero-shot/algorithmic_model_benchmark.py. The source code supports four extrapolation methods: stop, linear extrapolation, cubic extrapolation, and Kalman filter.

python ./zero-shot/algorithmic_model_benchmark.py --model <MODEL_TYPE>

# Examples
python ./zero-shot/algorithmic_model_benchmark.py --model stop
python ./zero-shot/algorithmic_model_benchmark.py --model linear
python ./zero-shot/algorithmic_model_benchmark.py --model cubic
python ./zero-shot/algorithmic_model_benchmark.py --model kalman
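To give a feel for the two simplest baselines, here is a minimal sketch of `stop` and `linear`. It is not taken from `algorithmic_model_benchmark.py`: the function names are hypothetical, the default horizon of 12 steps is an assumption based on the usual ETH/UCY protocol, and the linear variant shown here uses only the last two observations, whereas the repository's version may fit the velocity differently.

```python
def predict_stop(history, pred_len=12):
    """Repeat the last observed position for every future step."""
    return [history[-1]] * pred_len


def predict_linear(history, pred_len=12):
    """Constant-velocity extrapolation from the last two observations."""
    (x0, y0), (x1, y1) = history[-2], history[-1]
    vx, vy = x1 - x0, y1 - y0  # displacement per time step
    return [(x1 + vx * t, y1 + vy * t) for t in range(1, pred_len + 1)]
```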

🔥 Supervised Training & Evaluation 🔥

Setup

Environment All models were tested on Ubuntu 20.04 with Python 3.10 and PyTorch 2.0.1 with CUDA 11.7. Dependencies include Python packages such as transformers, accelerate, datasets, nltk and sentencepiece.

Dataset Preprocessed ETH and UCY datasets are released in this repository. The train/validation/test splits are the same as those found in Social-GAN.

Preliminary

We provide preprocessed datasets, [pretrained tokenizers](https://github.com/InhwanBae/LMTr
