# MultiTSF

MultiSensor-Home: A Wide-area Multi-modal Multi-view Dataset for Action Recognition and Transformer-based Sensor Fusion
This work was presented at the 19th IEEE International Conference on Automatic Face and Gesture Recognition (FG2025). Best Student Paper Award.
Authors: Trung Thanh Nguyen, Yasutomo Kawanishi, Vijay John, Takahiro Komamizu, Ichiro Ide
## Introduction

This repository contains the implementation of MultiTSF on the MultiSensor-Home dataset.

- Download dataset: https://huggingface.co/datasets/thanhhff/MultiSensor-Home1/

A simple way to download the dataset:

```shell
# Make sure the hf CLI is installed: pip install -U "huggingface_hub[cli]"
hf download thanhhff/MultiSensor-Home1 --repo-type=dataset --local-dir dataset
```
## Environment

The Python code was developed and tested in the environment specified in `requirements.txt`.
Experiments on the MultiSensor-Home dataset were conducted on four NVIDIA A100 GPUs, each with 32 GB of memory.
You can reduce the `batch_size` parameter in the code to accommodate GPUs with less memory.
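If you reduce the per-GPU batch size, gradient accumulation is a common way to preserve the effective batch size of the original setup. The sketch below is purely illustrative; the names `target_batch`, `per_gpu_batch`, and `num_gpus` are hypothetical and do not correspond to actual config keys in this repository.

```python
# Hypothetical sketch: keep the effective batch size when per-GPU memory is limited
# by accumulating gradients over several forward/backward passes.

def accumulation_steps(target_batch: int, per_gpu_batch: int, num_gpus: int) -> int:
    """Gradient-accumulation steps needed so that
    per_gpu_batch * num_gpus * steps == target_batch."""
    effective = per_gpu_batch * num_gpus
    if target_batch % effective != 0:
        raise ValueError("target_batch must be divisible by per_gpu_batch * num_gpus")
    return target_batch // effective
```

For example, reproducing an effective batch of 64 on 4 GPUs that each fit only 4 samples gives `accumulation_steps(64, 4, 4)`, i.e. 4 accumulation steps.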
## Dataset

Download the MultiSensor-Home dataset and place it in the `dataset/MultiSensor-Home` directory.
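Before launching training, it can be worth verifying that the dataset landed in the expected location. The helper below is a hypothetical sketch (not part of this repository); only the default path comes from the instructions above.

```python
# Hypothetical helper: sanity-check the dataset directory before training.
from pathlib import Path

def dataset_ready(root: str = "dataset/MultiSensor-Home") -> bool:
    """Return True if the dataset directory exists and contains at least one entry."""
    p = Path(root)
    return p.is_dir() and any(p.iterdir())
```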
## Training

To train the model, execute the following command:

```shell
bash ./scripts/train.sh
```
## Inference

To perform inference, use the following command:

```shell
bash ./scripts/infer.sh
```
## 📄 Citation

```bibtex
@inproceedings{nguyen2025multisensor,
  author    = {Trung Thanh Nguyen and Yasutomo Kawanishi and Vijay John and Takahiro Komamizu and Ichiro Ide},
  title     = {MultiSensor-Home: A Wide-area Multi-modal Multi-view Dataset for Action Recognition and Transformer-based Sensor Fusion},
  booktitle = {Proceedings of the 19th IEEE International Conference on Automatic Face and Gesture Recognition},
  year      = {2025},
  note      = {Best Student Paper Award}
}
```
## Acknowledgment

This work was partly supported by the Japan Society for the Promotion of Science (JSPS) KAKENHI Grants JP21H03519 and JP24H00733.
