TimestampActionSeg
Temporal Action Segmentation from Timestamp Supervision
This repository provides a PyTorch implementation of the paper Temporal Action Segmentation from Timestamp Supervision.
Tested with:
- PyTorch 1.1.0
- Python 3.6.10
Training:
- Download the data folder, which contains the features and the ground truth labels (~30GB). (Try to download it from here.)
- Extract it so that you have the data folder in the same directory as main.py.
- The three .npy files in 'data/' in this repository are the timestamp annotations. Put each one in the corresponding ground truth folder. For example, ./data/breakfast/groundTruth/ for the Breakfast dataset.
- To train the model, run
  python main.py --action=train --dataset=DS --split=SP
  where DS is breakfast, 50salads or gtea, and SP is the split number (1-5) for 50salads and (1-4) for the other datasets.
- The output of evaluation is saved in the result/ folder as an Excel file.
- The models/ folder saves the trained model and the results/ folder saves the predicted action labels of each video in the test dataset.
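To train every split of a dataset in one go, the training command can be wrapped in a small script. The following is a minimal sketch, not part of this repository: the helper names build_command and train_all_splits are hypothetical, and the only things taken from this README are the command-line flags, the dataset names, and the split counts. It assumes it is run from the directory that contains main.py.

```python
import subprocess

# Split counts as stated in this README: 5 for 50salads, 4 for the others.
NUM_SPLITS = {"breakfast": 4, "50salads": 5, "gtea": 4}


def build_command(action, dataset, split):
    """Return the argument list for one main.py invocation (hypothetical helper)."""
    if dataset not in NUM_SPLITS:
        raise ValueError(f"unknown dataset: {dataset}")
    if not 1 <= split <= NUM_SPLITS[dataset]:
        raise ValueError(
            f"{dataset} has splits 1-{NUM_SPLITS[dataset]}, got {split}"
        )
    return [
        "python", "main.py",
        f"--action={action}",
        f"--dataset={dataset}",
        f"--split={split}",
    ]


def train_all_splits(dataset, dry_run=True):
    """Build, print, and optionally run the training command for every split."""
    commands = [
        build_command("train", dataset, split)
        for split in range(1, NUM_SPLITS[dataset] + 1)
    ]
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the sweep at the first failing split.
            subprocess.run(cmd, check=True)
    return commands
```

With the default dry_run=True the script only prints the commands, which is a cheap way to check them before starting a long run; train_all_splits("gtea", dry_run=False) would actually launch the four GTEA trainings one after another.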
Prediction and Evaluation:
Prediction and evaluation are normally produced after training, so you do not have to run them separately.
To run prediction and evaluation again on a saved model, change time_data in main.py and run
python main.py --action=predict --dataset=DS --split=SP
Model
The model used in this paper is a refined MS-TCN model. Please refer to the paper MS-TCN: Multi-Stage Temporal Convolutional Network for Action Segmentation.
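As background, the MS-TCN paper builds each stage from dilated 1D convolutions with kernel size 3 and a dilation factor that doubles at every layer. The sketch below only computes the temporal receptive field that follows from that description. It is not code from this repository, the function name is hypothetical, and the refined model used here may differ from the original design.

```python
def receptive_field(num_layers, kernel_size=3):
    """Number of input frames seen by the last layer of one stage.

    Assumes dilations of 1, 2, 4, ... (doubling per layer), as described
    in the MS-TCN paper; this is an illustration, not the repository's code.
    """
    field = 1
    for layer in range(num_layers):
        dilation = 2 ** layer
        # Each dilated convolution widens the field by (kernel_size - 1) * dilation.
        field += (kernel_size - 1) * dilation
    return field
```

For kernel size 3 this reduces to 2^(L+1) - 1 for L layers, so receptive_field(10) gives 2047 frames.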
Citation:
If you use the code, please cite
Zhe Li, Yazan Abu Farha and Juergen Gall.
Temporal Action Segmentation from Timestamp Supervision.
In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021
