Unifying the Video and Question Attentions for Open-Ended Video Question Answering


Introduction

videoqa provides the dataset and the algorithms used in the paper "Unifying the Video and Question Attentions for Open-Ended Video Question Answering".

Datasets

  • file_map: the Tumblr URLs of the videos
  • QA: the question-answer pairs
  • Split: the dataset split used in the paper
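The on-disk layout of these files is not documented here; purely as an illustration, assuming `file_map` holds tab-separated `video_id, URL` lines and `QA` holds tab-separated `video_id, question, answer` triples (both layouts are our assumptions, not confirmed by the repository), the files could be loaded like this:

```python
import csv

def load_file_map(path):
    """Map each video id to its Tumblr URL.

    Assumes one tab-separated `video_id<TAB>url` pair per line
    (hypothetical layout).
    """
    mapping = {}
    with open(path, newline="") as f:
        for video_id, url in csv.reader(f, delimiter="\t"):
            mapping[video_id] = url
    return mapping

def load_qa_pairs(path):
    """Yield (video_id, question, answer) triples.

    Assumes one tab-separated triple per line (hypothetical layout).
    """
    with open(path, newline="") as f:
        yield from csv.reader(f, delimiter="\t")
```

If the actual files use a different separator or column order, only the `csv.reader` arguments and the unpacking need to change.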

Methods

Compared Algorithms

  • [E-SA](https://www.aaai.org/ocs/index.php/AAAI/AAAI17/paper/viewFile/14906/14319)
  • [SS-VQA](https://www.aaai.org/ocs/index.php/AAAI/AAAI17/paper/viewFile/14906/14319)
  • Mean-VQA: a baseline we designed in which image QA is performed on each frame
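The README does not describe Mean-VQA beyond this one line; as a minimal sketch of the idea, assuming the per-frame image-QA scores over a fixed answer vocabulary are simply averaged across frames (the aggregation rule is our assumption), the baseline could look like:

```python
import numpy as np

def mean_vqa(frame_scores, answer_vocab):
    """Frame-wise image-QA baseline (sketch, not the authors' code).

    frame_scores: (num_frames, num_answers) array; row t holds the image-QA
    model's scores for frame t over the answer vocabulary.
    Returns the answer whose mean score across all frames is highest.
    """
    scores = np.asarray(frame_scores, dtype=float)
    mean_scores = scores.mean(axis=0)   # average each answer's score over frames
    return answer_vocab[int(mean_scores.argmax())]
```

For example, with vocabulary `["a cat", "a dog"]` and two frames scoring `[0.6, 0.4]` and `[0.7, 0.3]`, the averaged scores favour "a cat".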

Results

Example 1

  • Question: What is a boy combing his hair with?
  • Groundtruth: with his fingers
  • Prediction: with his hands

Example 2

  • Question: What runs up a fence?
  • Groundtruth: a cat
  • Prediction: a cat

Example 3

  • Question: What is a young girl in a car adjusting?
  • Groundtruth: her dark glasses
  • Prediction: her hair

Dependency

Usage

```shell
python main.py
```

Reference

If you use the code or our dataset, please cite our paper:

```
@article{xue2017unifying,
  title={Unifying the Video and Question Attentions for Open-Ended Video Question Answering},
  author={Xue, Hongyang and Zhao, Zhou and Cai, Deng},
  journal={IEEE Transactions on Image Processing},
  year={2017},
  publisher={IEEE}
}
```
