Mvtorch

A PyTorch library for multi-view 3D understanding and generation

<p align="center"> <img src="./docs/misc/logomvtorch.png" width="35%"/>  </p>

MVTorch [paper]

A modular PyTorch library for multi-view research on 3D understanding and 3D generation. It is published as part of the MVTN IJCV journal paper.

Introduction

MVTorch provides efficient, reusable components for 3D computer vision and graphics research based on multi-view representations with PyTorch and PyTorch3D.

Key features include:

Differentiable rendering of multi-view images from meshes and point clouds, with 3D-2D correspondences.
Data loaders for 3D data and multi-view images (posed or unposed).
Visualization of 3D meshes, point clouds, and multi-view images.
Modular training of multi-view networks for different 3D tasks.
I/O for 3D data and multi-view images.

Benefits. MVTorch components:

Are implemented using PyTorch tensors on top of PyTorch3D.
Can handle minibatches of heterogeneous data.
Can be differentiated to obtain input gradients.
Can utilize GPUs for acceleration.
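The multi-view network idea and the input-gradient property listed above can be illustrated with plain PyTorch. The sketch below is not MVTorch's API: the `ViewPooling` class, the toy backbone, and all tensor sizes are hypothetical names and values chosen for illustration. It runs an arbitrary 2D network on every view of a batch, pools the per-view features into one global feature per object, and checks that gradients reach the input views.

```python
import torch
import torch.nn as nn


class ViewPooling(nn.Module):
    """Hypothetical wrapper (not an MVTorch class): applies a 2D network
    to every view, then pools the per-view features over the view axis."""

    def __init__(self, backbone: nn.Module, mode: str = "max"):
        super().__init__()
        if mode not in ("max", "mean"):
            raise ValueError(f"unknown pooling mode: {mode}")
        self.backbone = backbone
        self.mode = mode

    def forward(self, views: torch.Tensor) -> torch.Tensor:
        # views: (batch, num_views, channels, height, width)
        b, m = views.shape[:2]
        # Fold the view axis into the batch axis so any 2D network can be used.
        feats = self.backbone(views.flatten(0, 1))   # (b * m, feat_dim)
        feats = feats.reshape(b, m, -1)              # (b, m, feat_dim)
        if self.mode == "max":
            return feats.max(dim=1).values           # (b, feat_dim)
        return feats.mean(dim=1)                     # (b, feat_dim)


# Toy 2D backbone standing in for a real image classifier (an assumption).
backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 16),
)

model = ViewPooling(backbone, mode="max")
views = torch.rand(2, 4, 3, 32, 32, requires_grad=True)  # 2 objects, 4 views each
global_feats = model(views)
global_feats.sum().backward()  # gradients flow back to the input views
```

Folding the view axis into the batch axis keeps the 2D backbone unchanged, which is what makes it possible to swap in any image network. Moving `model` and `views` to a GPU with `.to("cuda")` works the same way when one is available.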

Installation

For detailed instructions refer to INSTALL.md.

Test

After installing mvtorch, download common 3D datasets (ModelNet40, ScanObjectNN, ShapeNet Parts, nerf_synthetic) and unzip them inside the data directory.

cd data/
wget https://shapenet.cs.stanford.edu/media/shapenet_part_seg_hdf5_data.zip --no-check-certificate # download ShapeNet Parts
# download the other datasets from the browser

Run any example from the examples directory:

cd examples/ && python classification.py

Tutorials

Get started with MVTorch by trying one of the following tutorials.

| <img src="./docs/misc/cls.png" width="310" height="310"/> | <img src="./docs/misc/seg.png" width="310" height="310"/> |
|:---:|:---:|
| Training MVCNN in 10 lines of code for 3D Classification | Training 3D Part Segmentation with Multi-View DeepLabV3 |

| <img src="https://user-images.githubusercontent.com/7057863/78473103-9353b300-7770-11ea-98ed-6ba2d877b62c.gif" width="378" height="378"/> | <img src="https://github.com/threedle/text2mesh/blob/main/images/vases.gif" width="378"/> |
|:---:|:---:|
| Fit A Simple Neural Radiance Field | Create Textured Meshes from Text |

Key Classes

MVRenderer (renders multi-view images of both point clouds and meshes)
MVNetwork (takes any 2D network as input and outputs its multi-view features)
Visualizer (handles multi-view and 3D visualization, both for saving on a server and for interactive visualization)
data I/O (loads any dataset: ModelNet, ShapeNet, ScanObjectNN, ShapeNet Parts, S3DIS, NeRF; also saves multi-view datasets)
ViewSelector (multi-view selector that chooses M viewpoints to render: random, circular, spherical, MVTN, etc.)
MVAggregate (a super model that accepts any 2D network as input and outputs the global multi-view features of the input multi-view images: MeanPool, MaxPool)
MVLifting (aggregates dense features from multi-view pixel features into 3D features, e.g. LabelPool, MeanPool, Voint aggregation and lifting)
Other useful utility functions and operations.
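To make the circular and spherical viewpoint strategies mentioned for ViewSelector concrete, here is a minimal sketch in standard-library Python. It does not reproduce MVTorch's implementation: the function names, the default elevation and distance, and the y-up axis convention are all assumptions, and the spherical variant uses a Fibonacci lattice as one common way to spread points roughly evenly on a sphere.

```python
import math


def circular_views(num_views, elevation_deg=35.0, distance=2.2):
    """Hypothetical helper: M viewpoints evenly spaced in azimuth at a fixed
    elevation. Returns (azimuth_deg, elevation_deg, distance) triples."""
    step = 360.0 / num_views
    return [(-180.0 + i * step, elevation_deg, distance) for i in range(num_views)]


def spherical_views(num_views, distance=2.2):
    """Hypothetical helper: roughly uniform viewpoints on a sphere,
    generated with a Fibonacci lattice (an assumed choice)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    views = []
    for i in range(num_views):
        height = 1.0 - 2.0 * (i + 0.5) / num_views  # strictly inside (-1, 1)
        elevation = math.degrees(math.asin(height))
        azimuth = math.degrees((i * golden_angle) % (2.0 * math.pi)) - 180.0
        views.append((azimuth, elevation, distance))
    return views


def to_cartesian(azimuth_deg, elevation_deg, distance):
    """Camera position for a viewpoint, assuming the y-axis points up."""
    az, el = math.radians(azimuth_deg), math.radians(elevation_deg)
    x = distance * math.cos(el) * math.sin(az)
    y = distance * math.sin(el)
    z = distance * math.cos(el) * math.cos(az)
    return (x, y, z)


# Print the camera positions for four circular viewpoints.
for azim, elev, dist in circular_views(4):
    print(azim, elev, to_cartesian(azim, elev, dist))
```

Each triple can then be handed to whichever renderer is in use. Learned viewpoint selection, as in MVTN, would replace these fixed formulas with values predicted by a network.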

Development

We welcome new contributions to MVTorch. Please follow this procedure for pull requests:

For code modifications, create an issue with the tag request and wait 10 days for the issue to be resolved.
If the issue is not resolved in 10 days, fork the repo and create a pull request on a new branch. Please make sure the main examples still run after your changes to the core library.
For additional examples, just create a pull request without creating an issue.
If you can contribute regularly to the library, please contact Abdullah to be added to the contributors list.

Citation

If you find mvtorch useful in your research, please cite the library paper:

@article{Hamdi2024,
  author    = {Abdullah Hamdi and Faisal AlZahrani and Silvio Giancola and Bernard Ghanem},
  title     = {MVTN: Learning Multi-view Transformations for 3D Understanding},
  journal   = {International Journal of Computer Vision},
  year      = {2024},
  doi       = {10.1007/s11263-024-02283-5},
  issn      = {1573-1405}
}

News

[July 23 2022]: MVTorch repo created

[December 26 2022]: MVTorch made public

Projects

Projects that MVTorch benefited from during development: MVTN, Voint Cloud, Text2Mesh, and NeRF.

Documentation

Detailed documentation of the library is coming soon.

Overview Video

Coming soon ...

License

MVTorch is released under the BSD License.
