Avobjects

Implementation for the ECCV 2020 paper "Self-Supervised Learning of Audio-Visual Objects from Video"

Project page | Paper

Self-Supervised Learning of Audio-Visual Objects from Video. Triantafyllos Afouras, Andrew Owens, Joon Son Chung, and Andrew Zisserman. In ECCV 2020.

Installing dependencies

conda env create -f environment.yml
conda activate avobjects

Demo

Download pretrained model weights

bash download_models.bash
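Before running the demo, it can help to confirm that the weights are in place. The following is a minimal sketch, assuming the download script writes the checkpoint to the path that the demo command passes to --resume (checkpoints/avobjects_loc_sep.pt). The helper name find_missing_checkpoints is hypothetical and not part of this repository.

```python
from pathlib import Path


def find_missing_checkpoints(paths, root="."):
    """Return the entries of `paths` that are absent or empty under `root`.

    Hypothetical helper for illustration; not part of the avobjects code.
    """
    missing = []
    for rel in paths:
        candidate = Path(root) / rel
        # Treat a zero-byte file as missing, since an interrupted
        # download can leave an empty file behind.
        if not candidate.is_file() or candidate.stat().st_size == 0:
            missing.append(rel)
    return missing


if __name__ == "__main__":
    # Assumption: this is where the weights end up, taken from the
    # --resume argument of the demo command.
    missing = find_missing_checkpoints(["checkpoints/avobjects_loc_sep.pt"])
    if missing:
        print("Missing checkpoints:", ", ".join(missing))
    else:
        print("All checkpoints found.")
```

If the check reports a missing file, re-running the download script is the likely fix.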

Run the separation demo

python main.py --resume checkpoints/avobjects_loc_sep.pt --input_video demo.mp4 --output_dir demo_out

Output

The output directory will contain videos with the separated audio for every tracked speaker.

Optional: You can point a web browser to the output directory to view the video results. If you are working on a remote machine, you can start a web server on port 8000 by running:

cd demo_out; python3 -m http.server 8000
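The same server can also be started from Python without changing directory. This is a sketch using only the standard library (Python 3.7 or later); the function name make_output_server is hypothetical, and the defaults simply mirror the command above.

```python
import functools
import http.server


def make_output_server(directory="demo_out", port=8000, host=""):
    """Create an HTTP server that serves files from `directory`.

    Hypothetical helper for illustration. An empty `host` binds to all
    interfaces, which matches the default of `python3 -m http.server`.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    return http.server.ThreadingHTTPServer((host, port), handler)


# Usage (blocks until interrupted with Ctrl+C):
#
#     with make_output_server("demo_out") as server:
#         server.serve_forever()
```

Passing host="127.0.0.1" restricts the server to local connections, which may be preferable when the port is reached through an SSH tunnel.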

Running on custom video

To run the model on a new video, add it to the media/ directory and select it using the --input_video argument.

You can specify the number of AV objects to track using the --n_peaks argument.

For example:

python main.py --resume checkpoints/avobjects_loc_sep.pt --n_peaks 2 --input_video trump_biden_debate.mp4 --output_dir trump_biden_debate_out
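When several videos need processing, the command line can be generated rather than typed by hand. The sketch below only builds the argument lists and uses just the flags shown in this README. It assumes that --input_video takes a bare file name resolved against media/, as the examples suggest, and the "<name>_out" output directory convention is an assumption based on those examples. build_command and commands_for_directory are hypothetical names.

```python
from pathlib import Path

# Checkpoint path taken from the demo command in this README.
CHECKPOINT = "checkpoints/avobjects_loc_sep.pt"


def build_command(video_name, n_peaks=None, checkpoint=CHECKPOINT):
    """Build the argument list for one run of main.py.

    Hypothetical helper; only flags documented in the README are used.
    """
    stem = Path(video_name).stem
    cmd = ["python", "main.py", "--resume", checkpoint]
    if n_peaks is not None:
        # Number of AV objects to track.
        cmd += ["--n_peaks", str(n_peaks)]
    # Assumption: output goes to "<video name without extension>_out".
    cmd += ["--input_video", video_name, "--output_dir", stem + "_out"]
    return cmd


def commands_for_directory(media_dir="media", n_peaks=None):
    """Build one command per .mp4 file found directly in `media_dir`."""
    names = sorted(p.name for p in Path(media_dir).glob("*.mp4"))
    return [build_command(name, n_peaks) for name in names]
```

Each returned list can be passed to subprocess.run(cmd, check=True) from the repository root, or joined with shlex.join(cmd) to print it for inspection first.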

Citation

If you use this code for your research, please cite:

@InProceedings{Afouras20b,
  author    = "Triantafyllos Afouras and Andrew Owens and Joon~Son Chung and Andrew Zisserman",
  title     = "Self-Supervised Learning of Audio-Visual Objects from Video",
  booktitle = "European Conference on Computer Vision",
  year      = "2020",
}
