LearningToCut
Official Code of ICCV 2021 Paper: Learning to Cut by Watching Movies
Learning to Cut by Watching Movies
[ ArXiv | Project Website | ICCV2021 ]
Learning to Cut by Watching Movies. Alejandro Pardo*, Fabian Caba Heilbron, Juan León Alcázar, Ali Thabet, Bernard Ghanem. In ICCV, 2021.
<div align="center" valign="middle"><img height="450px" src="./pull_figure.jpg"></div>

Installation
Clone the repository and move into its folder:
git clone https://github.com/PardoAlejo/LearningToCut.git
cd LearningToCut
Install the environment:
conda env create -f ltc-env.yml
Data
Download the following resources and extract their contents into the destination folder listed in the table.
| Resource | Drive File | Destination Folder |
| ---- |:-----: | :-----: |
| Train Annotations | link | ./data/|
| Val Annotations | link | ./data/|
| Video Durations | link | ./data/|
||||
| Video Features | link | ./data/|
| Audio Features | link | ./data/|
||||
| Best Model | link | ./checkpoints/|
If you want to extract the features yourself, or you need the original videos instead, please refer to data/DATA.md.
The folder structure should be as follows:
README.md
ltc-env.yml
│
├── data
│ ├── ResNexT-101_3D_video_features.h5
│ ├── ResNet-18_audio_features.h5
│ ├── subset_moviescenes_shotcuts_train.csv
│ ├── subset_moviescenes_shotcuts_val.csv
│ └── durations.csv
│
├── checkpoints
│ └── best_state.ckpt
│
└── scripts
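
Before running inference, it can help to confirm that the downloads landed where the tree above expects them. The following is a minimal sketch, not part of the repository: the helper name `find_missing` and the `EXPECTED_FILES` list are hypothetical, and the file names are simply copied from the folder structure above. It assumes the script is run from the repository root.

```python
from pathlib import Path

# Expected files, copied from the folder structure listed above.
# This list is an illustration and is not shipped with the repository.
EXPECTED_FILES = [
    "data/ResNexT-101_3D_video_features.h5",
    "data/ResNet-18_audio_features.h5",
    "data/subset_moviescenes_shotcuts_train.csv",
    "data/subset_moviescenes_shotcuts_val.csv",
    "data/durations.csv",
    "checkpoints/best_state.ckpt",
]


def find_missing(repo_root="."):
    """Return the expected relative paths that are not files under repo_root."""
    root = Path(repo_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = find_missing(".")
    if missing:
        print("Missing files:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("All expected files are in place.")
```

If any path is reported as missing, re-check the destination folder in the table above for that resource.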
Inference
Copy and paste the following commands into the terminal.
Load environment:
conda activate ltc
cd scripts/
Run inference on the validation set:
sh inference.sh
Expected results (Table 1 of the Paper):
| Method | AR1-D1 | AR3-D1 | AR5-D1 | AR10-D1 | AR1-D2 | AR3-D2 | AR5-D2 | AR10-D2 | AR1-D3 | AR3-D3 | AR5-D3 | AR10-D3 |
|--------|--------|--------|--------|---------|--------|--------|--------|---------|--------|--------|--------|---------|
| Random | 0.64% | 1.91% | 3.15% | 6.28% | 1.85% | 5.65% | 9.32% | 18.52% | 3.67% | 10.67% | 17.62% | 33.91% |
| Raw | 1.16% | 3.97% | 6.36% | 11.72% | 2.51% | 8.32% | 13.15% | 24.25% | 3.73% | 12.19% | 19.33% | 34.97% |
| LTC | 8.18% | 17.95% | 24.44% | 30.35% | 15.30% | 35.11% | 48.26% | 59.42% | 19.18% | 46.32% | 64.30% | 79.35% |
Cite us
@InProceedings{Pardo_2021_ICCV,
author = {Pardo, Alejandro and Caba, Fabian and Alcazar, Juan Leon and Thabet, Ali K. and Ghanem, Bernard},
title = {Learning To Cut by Watching Movies},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {October},
year = {2021},
pages = {6858-6868}
}
