LiteFlowNet

<p align="center"><img src="./figure/LiteFlowNet.png" width="800" /></p> <p align = "center">The network structure of LiteFlowNet. For the ease of representation, only a 3-level design is shown.</p> <p align="center"><img src="./figure/cascaded_flow_inference.png" width="400" /></p> <p align = "center">A cascaded flow inference module M:S in NetE.</p>

This repository (<strong>https://github.com/twhui/LiteFlowNet</strong>) is the official release of <strong>LiteFlowNet</strong> for my paper <a href="https://arxiv.org/pdf/1805.07036.pdf"><strong>LiteFlowNet: A Lightweight Convolutional Neural Network for Optical Flow Estimation</strong></a> in CVPR 2018 (Spotlight paper, 6.6%). <i>The up-to-date version of the paper is available on <a href="https://arxiv.org/pdf/1805.07036.pdf"><strong>arXiv</strong></a></i>.

LiteFlowNet is a lightweight, fast, and accurate optical flow CNN. We develop several specialized modules, including (1) pyramidal features, (2) cascaded flow inference (cost volume + sub-pixel refinement), (3) a feature warping (f-warp) layer, and (4) flow regularization by a feature-driven local convolution (f-lconv) layer. LiteFlowNet outperforms PWC-Net (CVPR 2018) on KITTI while its model is about 40% smaller. For more details about LiteFlowNet, you may visit <a href="http://mmlab.ie.cuhk.edu.hk/projects/LiteFlowNet/"><strong>my project page</strong></a>.
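The idea behind the f-warp layer is to displace the second image's feature map toward the first using the current flow estimate, so that each inference level only has to estimate a small residual flow. A minimal NumPy sketch of such bilinear feature warping (the function name and array layout are illustrative; the actual layer in this repo is a CUDA Caffe layer):

```python
import numpy as np

def f_warp(features, flow):
    """Warp a feature map by a flow field using bilinear interpolation.

    features: (H, W, C) feature map of the second image.
    flow:     (H, W, 2) flow field of (dx, dy) displacements.
    Returns the (H, W, C) warped feature map.
    """
    H, W, _ = features.shape
    # Each output pixel (y, x) samples the input at (x + dx, y + dy).
    ys, xs = np.mgrid[0:H, 0:W].astype(np.float64)
    sx = np.clip(xs + flow[..., 0], 0, W - 1)
    sy = np.clip(ys + flow[..., 1], 0, H - 1)
    # Integer corners around each (fractional) sampling location.
    x0 = np.floor(sx).astype(int); x1 = np.minimum(x0 + 1, W - 1)
    y0 = np.floor(sy).astype(int); y1 = np.minimum(y0 + 1, H - 1)
    wx = (sx - x0)[..., None]; wy = (sy - y0)[..., None]
    # Bilinear blend of the four neighbouring feature vectors.
    return ((1 - wy) * ((1 - wx) * features[y0, x0] + wx * features[y0, x1])
            + wy * ((1 - wx) * features[y1, x0] + wx * features[y1, x1]))
```

With a zero flow field this is the identity; an integer flow simply shifts the feature map, with borders clamped.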

<a href="https://www.youtube.com/watch?v=pfQ0zFwv-hM"> <p align="center"><img src="./figure/demo video thumbnail.png" width="600" /></p> </a>

Oral presentation at CVPR 2018 is also available on <a href="http://www.youtube.com/watch?v=LBJ20kxr1a0&t=60m33s"> <strong>YouTube</strong></a>.

<table> <thead> <tr> <th align="center"></th> <th align="center">KITTI12 Testing Set (Out-Noc)</th> <th align="center">KITTI15 Testing Set (Fl-all)</th> <th align="center">Model Size (M)</th> </tr> </thead> <tbody> <tr> <td align="center">FlowNet2 (CVPR17)</td> <td align="center">4.82%</td> <td align="center">10.41%</td> <td align="center">162.49</td> </tr> <tr> <td align="center">PWC-Net (CVPR18)</td> <td align="center">4.22%</td> <td align="center">9.60%</td> <td align="center">8.75</td> </tr> <tr> <td align="center"><strong>LiteFlowNet (CVPR18)</strong></td> <td align="center"><strong>3.27%</strong></td> <td align="center"><strong>9.38%</strong></td> <td align="center"><strong>5.37</strong></td> </tr> </tbody> </table>

LiteFlowNet2

<strong>NEW! Our extended work (LiteFlowNet2, TPAMI 2020) is now available at https://github.com/twhui/LiteFlowNet2</strong>.

LiteFlowNet2 (TPAMI 2020), another lightweight convolutional network, evolves from LiteFlowNet (CVPR 2018) to better address optical flow estimation by improving both flow accuracy and computation time. Compared to our earlier work, LiteFlowNet2 improves the optical flow accuracy on the Sintel clean pass by 23.3%, the Sintel final pass by 12.8%, KITTI 2012 by 19.6%, and KITTI 2015 by 18.8%. Its runtime is also 2.2 times faster!

<table> <thead> <tr> <th align="center"></th> <th align="center">Sintel Clean Testing Set</th> <th align="center">Sintel Final Testing Set</th> <th align="center">KITTI12 Testing Set (Out-Noc)</th> <th align="center">KITTI15 Testing Set (Fl-all)</th> <th align="center">Model Size (M)</th> <th align="center">Runtime* (ms) GTX 1080</th> </tr> </thead> <tbody> <tr> <td align="center">FlowNet2 (CVPR17)</td> <td align="center">4.16</td> <td align="center">5.74</td> <td align="center">4.82%</td> <td align="center">10.41%</td> <td align="center">162</td> <td align="center">121</td> </tr> <tr> <td align="center">PWC-Net+</td> <td align="center"><strong>3.45</strong></td> <td align="center"><strong>4.60</strong></td> <td align="center">3.36%</td> <td align="center">7.72%</td> <td align="center">8.75</td> <td align="center"><strong>40</strong></td> </tr> <tr> <td align="center"><strong>LiteFlowNet2</strong></td> <td align="center"><strong>3.48</strong></td> <td align="center"><strong>4.69</strong></td> <td align="center"><strong>2.63%</strong></td> <td align="center"><strong>7.62%</strong></td> <td align="center"><strong>6.42</strong></td> <td align="center"><strong>40</strong></td> </tr> </tbody> </table>

Note: *Runtime is averaged over 100 runs on a Sintel image pair of size 1024 × 436.

LiteFlowNet3

<strong>NEW! Our extended work (LiteFlowNet3, ECCV 2020) is now available at https://github.com/twhui/LiteFlowNet3</strong>.

We ameliorate the issue of outliers in the cost volume by amending each cost vector through an adaptive modulation prior to the flow decoding. We further improve the flow accuracy by exploring local flow consistency. To this end, each inaccurate optical flow is replaced with an accurate one from a nearby position through a novel warping of the flow field. LiteFlowNet3 not only achieves promising results on public benchmarks but also has a small model size and a fast runtime.
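The cost vectors being modulated here come from a correlation-type cost volume, as used in the cascaded flow inference: each spatial position holds the matching costs between a feature vector in the first image and candidate positions in the second. A minimal NumPy sketch of such a cost volume (the function name, zero padding, and dot-product normalization are assumptions for illustration, not the repo's exact layer):

```python
import numpy as np

def cost_volume(f1, f2, max_disp):
    """Correlation cost volume between two (H, W, C) feature maps.

    Returns an (H, W, (2*max_disp + 1)**2) array: for each pixel of f1,
    the normalized dot product with f2 over all displacements within
    max_disp, using zero padding outside the image.
    """
    H, W, C = f1.shape
    d = max_disp
    padded = np.zeros((H + 2 * d, W + 2 * d, C))
    padded[d:d + H, d:d + W] = f2
    costs = []
    for dy in range(-d, d + 1):           # displacement rows
        for dx in range(-d, d + 1):       # displacement columns
            shifted = padded[d + dy:d + dy + H, d + dx:d + dx + W]
            costs.append((f1 * shifted).sum(-1) / C)
    return np.stack(costs, axis=-1)
```

For identical feature maps, the zero-displacement channel (the center of the `(2*max_disp + 1)**2` channels) holds each pixel's squared feature norm divided by `C`.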

<table> <thead> <tr> <th align="center"></th> <th align="center">Sintel Clean Testing Set</th> <th align="center">Sintel Final Testing Set</th> <th align="center">KITTI12 Testing Set (Avg-All)</th> <th align="center">KITTI15 Testing Set (Fl-fg)</th> <th align="center">Model Size (M)</th> <th align="center">Runtime* (ms) GTX 1080</th> </tr> </thead> <tbody> <tr> <td align="center">LiteFlowNet (CVPR18)</td> <td align="center">4.54</td> <td align="center">5.38</td> <td align="center">1.6</td> <td align="center">7.99%</td> <td align="center">5.4</td> <td align="center">88</td> </tr> <tr> <td align="center">LiteFlowNet2 (TPAMI20)</td> <td align="center">3.48</td> <td align="center">4.69</td> <td align="center">1.4</td> <td align="center">7.64%</td> <td align="center">6.4</td> <td align="center"><strong>40</strong></td> </tr> <tr> <td align="center">HD3 (CVPR19)</td> <td align="center">4.79</td> <td align="center">4.67</td> <td align="center">1.4</td> <td align="center">9.02%</td> <td align="center">39.9</td> <td align="center">128</td> </tr> <tr> <td align="center">IRR-PWC (CVPR19)</td> <td align="center">3.84</td> <td align="center">4.58</td> <td align="center">1.6</td> <td align="center">7.52%</td> <td align="center">6.4</td> <td align="center">180</td> </tr> <tr> <td align="center"><strong>LiteFlowNet3 (ECCV20)</strong></td> <td align="center"><strong>3.03</strong></td> <td align="center"><strong>4.53</strong></td> <td align="center"><strong>1.3</strong></td> <td align="center"><strong>6.96%</strong></td> <td align="center"><strong>5.2</strong></td> <td align="center">59</td> </tr> </tbody> </table>

Note: *Runtime is averaged over 100 runs on a Sintel image pair of size 1024 × 436.

License and Citation

This software and associated documentation files (the "Software"), and the research paper (<i>LiteFlowNet: A Lightweight Convolutional Neural Network for Optical Flow Estimation</i>), including but not limited to the figures and tables (the "Paper"), are provided for academic research purposes only and without any warranty. Any commercial use requires my consent. When using any parts of the Software or the Paper in your work, please cite the following paper:

<pre><code>@InProceedings{hui18liteflownet,
  author    = {Tak-Wai Hui and Xiaoou Tang and Chen Change Loy},
  title     = {{LiteFlowNet: A Lightweight Convolutional Neural Network for Optical Flow Estimation}},
  booktitle = {{Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}},
  year      = {2018},
  pages     = {8981--8989},
  url       = {http://mmlab.ie.cuhk.edu.hk/projects/LiteFlowNet/}
}</code></pre>

Datasets

  1. <a href="https://lmb.informatik.uni-freiburg.de/data/FlyingChairs/FlyingChairs.zip"> FlyingChairs dataset</a> (31GB) and <a href="https://lmb.informatik.uni-freiburg.de/resources/datasets/FlyingChairs/FlyingChairs_train_val.txt">train-validation split</a>.
  2. <a href="https://lmb.informatik.uni-freiburg.de/data/SceneFlowDatasets_CVPR16/Release_april16/data/FlyingThings3D/raw_data/flyingthings3d__frames_cleanpass.tar"> RGB image pairs (clean pass)</a> (37GB) and <a href="https://lmb.informatik.uni-freiburg.de/data/SceneFlowDatasets_CVPR16/Release_april16/data/FlyingThings3D/derived_data/flyingthings3d__optical_flow.tar.bz2"> flow fields</a> (311GB) for Things3D dataset.
  3. <a href="http://files.is.tue.mpg.de/sintel/MPI-Sintel-complete.zip"> Sintel dataset (clean + final passes)</a> (5.3GB).
  4. <a href="http://www.cvlibs.net/download.php?file=data_stereo_flow.zip"> KITTI12 dataset</a> (2GB) and <a href="http://www.cvlibs.net/download.php?file=data_scene_flow.zip"> KITTI15 dataset</a> (2GB) (Simple registration is required).
<table> <thead> <tr> <th align="center"></th> <th align="center">FlyingChairs</th> <th align="center">FlyingThings3D</th> <th align="center">Sintel</th> <th align="center">KITTI</th> </tr> </thead> <tbody> <tr> <td align="center">Crop size</td> <td align="center">448 × 320</td> <td align="center">768 × 384</td> <td align="center">768 × 384</td> <td align="center">896 × 320</td> </tr> <tr> <td align="center">Batch size</td> <td align="center">8</td> <td align="center">4</td> <td align="center">4</td> <td align="center">4</td> </tr> </tbody> </table>
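During training, each image pair and its ground-truth flow are randomly cropped to the per-dataset size in the table above. A hypothetical NumPy sketch of that step (in the actual package, cropping is handled inside the Caffe data layers; the function and dictionary names here are illustrative):

```python
import numpy as np

# Training crop sizes from the table, as (width, height).
CROP = {"FlyingChairs": (448, 320), "FlyingThings3D": (768, 384),
        "Sintel": (768, 384), "KITTI": (896, 320)}

def random_crop(img1, img2, flow, dataset, rng=np.random):
    """Crop an image pair and its flow field to the dataset's training size.

    img1, img2: (H, W, 3) images; flow: (H, W, 2) ground-truth flow.
    All three are cropped at the same randomly chosen offset.
    """
    cw, ch = CROP[dataset]
    H, W = img1.shape[:2]
    x = rng.randint(0, W - cw + 1)     # random top-left corner
    y = rng.randint(0, H - ch + 1)
    window = (slice(y, y + ch), slice(x, x + cw))
    return img1[window], img2[window], flow[window]
```

Note that the two images and the flow must share one crop offset, since the flow at a pixel is only meaningful relative to that same pixel in both frames.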

Prerequisite

The code package is a modified Caffe, based on <a href="https://lmb.informatik.uni-freiburg.de/resources/software.php">DispFlowNet</a> and <a href="https://github.com/lmb-freiburg/flownet2">FlowNet2</a>, with our new layers, scripts, and trained models.

Reimplementations in <a href="https://github.com/twhui/LiteFlowNet#Reimplementations-in-PyTorch-and-TensorFlow">PyTorch and TensorFlow</a> are also available.

Installation was tested under Ubuntu 14.04.5/16.04.2 with CUDA 8.0, cuDNN 5.1, and OpenCV 2.4.8/3.1.0.

Edit Makefile.config (and Makefile) if necessary in order to fit your machine's settings.
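A typical build sequence for this modified Caffe might look like the following (a sketch using the standard Caffe Makefile targets; adjust the `-j` parallelism for your machine):

```shell
# After editing Makefile.config for your CUDA/cuDNN/OpenCV paths:
make -j8 all       # build the Caffe library including the new layers
make -j8 tools     # build the command-line tools
make -j8 pycaffe   # build the Python interface (optional)
```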

For OpenCV 3+, you may need to change <code>opencv2/gpu/gpu.hpp</code> to <code>opencv2/cudaarithm.hpp</code> in <code>/src/caffe/layers/resample_layer.cu</code>.

If your machine installed a newer version of
