
16-825 Assignment 1: Rendering Basics with PyTorch3D (Total: 100 Points + 10 Bonus)

Goals: In this assignment, you will learn the basics of rendering with PyTorch3D, explore 3D representations, and practice constructing simple geometry.

You may also find it helpful to follow the PyTorch3D tutorials.

Table of Contents

  0. Setup
  1. Practicing with Cameras (15 Points)
  2. Practicing with Meshes (10 Points)
  3. Re-texturing a mesh (10 Points)
  4. Camera Transformations (10 Points)
  5. Rendering Generic 3D Representations (45 Points)
  6. Do Something Fun (10 Points)
  7. Extra Credit (10 Points)

0. Setup

You will need to install PyTorch3D. See the directions for your platform here. You will also need to install PyTorch. If you do not have a GPU, you can install it directly with pip (pip install torch). Otherwise, follow the installation directions here.

Other miscellaneous packages that you will need can be installed using the requirements.txt file (pip install -r requirements.txt).

If you have access to a GPU, the rendering code may run faster, but everything should be able to run locally on a CPU. Below are some sample installation instructions for a Linux machine.

For GPU installation, we recommend CUDA>=11.6.

# GPU Installation on a CUDA 11.6 Machine
conda create -n learning3d python=3.10
conda activate learning3d
pip install torch --index-url https://download.pytorch.org/whl/cu116 # Modify according to your cuda version. For example, cu121 for CUDA 12.1
pip install fvcore iopath
conda install -c bottler nvidiacub  # required for CUDA older than 11.7
MAX_JOBS=8 pip install "git+https://github.com/facebookresearch/pytorch3d.git@stable" # This will take some time to compile!
pip install -r requirements.txt

# CPU Installation
conda create -n learning3d python=3.10
conda activate learning3d
pip install torch --index-url https://download.pytorch.org/whl/cpu
pip install fvcore iopath
MAX_JOBS=8 pip install "git+https://github.com/facebookresearch/pytorch3d.git@stable"
pip install -r requirements.txt

Make sure that you have gcc >= 4.9.

0.1 Rendering your first mesh

To render a mesh using PyTorch3D, you will need a mesh that defines the geometry and texture of an object, a camera that defines the viewpoint, and a PyTorch3D renderer that encapsulates rasterization and shading parameters. You can abstract away the renderer using the get_mesh_renderer wrapper function in utils.py:

renderer = get_mesh_renderer(image_size=512)

Meshes in PyTorch3D are defined by a list of vertices, faces, and texture information. We will be using per-vertex texture features that assign an RGB color to each vertex. You can construct a mesh using the pytorch3d.structures.Meshes class:

vertices = ...  # 1 x N_v x 3 tensor.
faces = ...  # 1 x N_f x 3 tensor.
textures = ...  # 1 x N_v x 3 tensor.
meshes = pytorch3d.structures.Meshes(
    verts=vertices,
    faces=faces,
    textures=pytorch3d.renderer.TexturesVertex(textures),
)

Note that PyTorch3D assumes that meshes are batched, so the first dimension of all parameters should be 1. You can easily add a batch dimension by calling tensor.unsqueeze(0).
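To see concretely what the batching means, here is a minimal sketch using numpy's `[None]` indexing, which adds a leading axis exactly like `tensor.unsqueeze(0)` does in PyTorch. The triangle data here is a made-up illustration, not part of the assignment:

```python
import numpy as np

# Hypothetical per-vertex data for a single triangle (3 vertices, 1 face).
vertices = np.array([[0.0, 0.0, 0.0],
                     [1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0]])   # shape (N_v, 3)
faces = np.array([[0, 1, 2]])            # shape (N_f, 3)

# Add a leading batch dimension, analogous to tensor.unsqueeze(0).
vertices_batched = vertices[None]        # shape (1, N_v, 3)
faces_batched = faces[None]              # shape (1, N_f, 3)

print(vertices_batched.shape, faces_batched.shape)
```

The same pattern applies to the per-vertex texture tensor, which must also be batched to shape (1, N_v, 3).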

Cameras can be constructed using a rotation, translation, and field-of-view (in degrees). A camera with identity rotation placed 3 units from the origin can be constructed as follows:

cameras = pytorch3d.renderer.FoVPerspectiveCameras(
    R=torch.eye(3).unsqueeze(0),
    T=torch.tensor([[0, 0, 3]]),
    fov=60,
)

Again, the rotation and translation must be batched. You should familiarize yourself with the camera coordinate system that PyTorch3D uses. This will save you a lot of headaches down the line.
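One point worth internalizing: PyTorch3D applies camera extrinsics in a row-vector convention, x_view = x_world @ R + T (check the official camera docs to confirm against your version). A quick numpy sanity check with the identity camera above, using placeholder values:

```python
import numpy as np

R = np.eye(3)                  # identity rotation, as in the example above
T = np.array([0.0, 0.0, 3.0])  # camera 3 units from the origin

def world_to_view(x_world, R, T):
    # Row-vector convention (as used by PyTorch3D): x_view = x_world @ R + T
    return x_world @ R + T

# A point at the world origin lands 3 units in front of the camera (+z).
print(world_to_view(np.zeros(3), R, T))   # [0. 0. 3.]
```

If your renders come out empty, checking where a known world point lands in view coordinates like this is a cheap first debugging step.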

Finally, to render the mesh, call the renderer on the mesh, camera, and lighting (optional). Our light will be placed in front of the cow at (0, 0, -3).

lights = pytorch3d.renderer.PointLights(location=[[0, 0, -3]])
rend = renderer(meshes, cameras=cameras, lights=lights)
image = rend[0, ..., :3].cpu().numpy()  # move to CPU before converting if rendering on a GPU

The output from the renderer is B x H x W x 4. Since our batch size is one, we can just take the first element of the batch to get an image of H x W x 4. The fourth channel contains silhouette information that we will ignore, so we keep only the 3 RGB channels.
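The slicing above is plain array indexing; here is the same operation on a dummy array standing in for the renderer output (random values, purely illustrative):

```python
import numpy as np

# Dummy renderer output with batch size 1: (B, H, W, 4), i.e. RGB + silhouette.
rend = np.random.rand(1, 8, 8, 4)

# Drop the batch dimension and the fourth (silhouette) channel.
image = rend[0, ..., :3]
print(image.shape)   # (8, 8, 3)
```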

An example of the entire process is available in starter/render_mesh.py, which loads a sample cow mesh and renders it. Please take a close look at the code and make sure you understand how it works. If you run python -m starter.render_mesh, you should see the following output:

Cow render

1. Practicing with Cameras (15 Points)

1.1. 360-degree Renders (5 points)

Your first task is to create a 360-degree gif showing many continuous views of the provided cow mesh. For many of your results this semester, you will be expected to show full turntable views of your outputs. You may find the following helpful:

  • pytorch3d.renderer.look_at_view_transform: Given a distance, elevation, and azimuth, this function returns the corresponding set of rotations and translations to align the world to view coordinate system.
  • Rendering a gif given a set of images:
import imageio
my_images = ...  # List of images [(H, W, 3)]
duration = 1000 // 15  # Convert FPS (frames per second) to duration (ms per frame)
imageio.mimsave('my_gif.gif', my_images, duration=duration)

You can pass an additional argument loop=0 to imageio.mimsave to make the gif loop.
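The geometry behind the turntable is simple: sweep the azimuth over 360 degrees at a fixed distance and elevation, and pass each (dist, elev, azim) triple to look_at_view_transform. A pure-python sketch of the camera positions this produces (for zero elevation; treat the exact axis convention as an assumption to verify against PyTorch3D's docs):

```python
import math

dist, elev = 3.0, 0.0
azimuths = range(0, 360, 30)   # one frame per 30 degrees -> 12 frames

# With zero elevation, azimuth `a` places the camera on a circle of radius
# `dist` around the object (this mirrors what look_at_view_transform computes).
positions = [(dist * math.sin(math.radians(a)), 0.0, dist * math.cos(math.radians(a)))
             for a in azimuths]

print(len(positions), positions[0])   # 12 frames; first camera at (0.0, 0.0, 3.0)
```

Render one image per azimuth, collect them in a list, and hand that list to imageio.mimsave as shown above.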

**Submission**: On your webpage, you should include a gif that shows the cow mesh from many continuously changing viewpoints.

1.2 Re-creating the Dolly Zoom (10 points)

The Dolly Zoom is a famous camera effect, first used in the Alfred Hitchcock film Vertigo. The core idea is to change the focal length of the camera while moving the camera in a way such that the subject is the same size in the frame, producing a rather unsettling effect.

In this task, you will recreate this effect in PyTorch3D, producing an output that should look something like this:

Dolly Zoom

You will make modifications to starter/dolly_zoom.py. You can render your gif by calling python -m starter.dolly_zoom.
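The key relationship behind the effect: the subject's apparent size scales with tan(fov/2) x 1/distance, so to hold it constant the camera distance must vary as 1/tan(fov/2). A minimal sketch of that computation (the reference fov/distance values here are arbitrary placeholders, not those in the starter code):

```python
import math

def keep_subject_size(fov_deg, ref_fov_deg=60.0, ref_dist=3.0):
    # To keep the subject the same size in frame, distance must scale with
    # 1 / tan(fov / 2): a narrower field of view means moving the camera back.
    c = ref_dist * math.tan(math.radians(ref_fov_deg) / 2)
    return c / math.tan(math.radians(fov_deg) / 2)

print(keep_subject_size(60.0))    # 3.0 (the reference pose itself)
print(keep_subject_size(120.0))   # 1.0 (wider fov -> camera moves in)
```

In your gif, interpolate fov across frames and set the distance from this relation each frame.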

**Submission**: On your webpage, include a gif with your dolly zoom effect.

2. Practicing with Meshes (10 Points)

2.1 Constructing a Tetrahedron (5 points)

In this part, you will practice working with the geometry of 3D meshes. Construct a tetrahedron mesh and then render it from multiple viewpoints. Your tetrahedron does not need to be a regular tetrahedron (i.e. not all faces need to be equilateral triangles) as long as it is obvious from the renderings that the shape is a tetrahedron.

You will need to manually define the vertices and faces of the mesh. Once you have the vertices and faces, you can define a single-color texture, similarly to the cow in render_mesh.py. Remember that the faces are the vertex indices of the triangle mesh.

It may help to draw a picture of your tetrahedron and label the vertices and assign 3D coordinates.
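As a sketch of what the manual definition looks like, here is one possible (non-regular) tetrahedron in plain Python lists; the specific coordinates are an arbitrary choice for illustration:

```python
# A tetrahedron has 4 vertices and 4 triangular faces.
vertices = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0),
    (0.0, 0.0, 1.0),
]
# Each face lists the indices of its 3 vertices.
faces = [
    (0, 1, 2),
    (0, 1, 3),
    (0, 2, 3),
    (1, 2, 3),
]

print(len(vertices), len(faces))   # 4 4
```

Note that the winding order of each face determines its normal direction; if a face renders black or disappears from some viewpoints, try flipping its index order.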

**Submission**: On your webpage, show a 360-degree gif animation of your tetrahedron. Also, list how many vertices and (triangle) faces your mesh should have.

2.2 Constructing a Cube (5 points)

Construct a cube mesh and then render it from multiple viewpoints. Remember that we are still working with triangle meshes, so you will need to use two sets of triangle faces to represent one face of the cube.
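Concretely, a cube has 8 vertices and 6 square faces, each split into 2 triangles for a total of 12 triangle faces. One possible (hypothetical, winding order unchecked) layout for a unit cube:

```python
# 8 vertices of the unit cube; index = 4*x + 2*y + z for (x, y, z) in {0, 1}.
vertices = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]

# Each square face of the cube becomes two triangles -> 12 faces total.
faces = [
    (0, 1, 3), (0, 3, 2),  # x = 0 face
    (4, 6, 7), (4, 7, 5),  # x = 1 face
    (0, 4, 5), (0, 5, 1),  # y = 0 face
    (2, 3, 7), (2, 7, 6),  # y = 1 face
    (0, 2, 6), (0, 6, 4),  # z = 0 face
    (1, 5, 7), (1, 7, 3),  # z = 1 face
]

print(len(vertices), len(faces))   # 8 12
```

As with the tetrahedron, check the winding order of each triangle so that all normals point outward.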

**Submission**: On your webpage, show a 360-degree gif animation of your cube. Also, list how many vertices and (triangle) faces your mesh should have.

3. Re-texturing a mesh (10 points)

Now let's practice re-texturing a mesh. For this task, we will be retexturing the cow mesh such that the color smoothly changes from the front of the cow to the back of the cow.

More concretely, you will pick 2 RGB colors, color1 and color2. We will assign the front of the cow a color of color1, and the back of the cow a color of color2. The front of the cow corresponds to the vertex with the smallest z-coordinate z_min, and the back of the cow corresponds to the vertex with the largest z-coordinate z_max. Then, we will assign the color of each vertex using linear interpolation based on the z-value of the vertex:

alpha = (z - z_min) / (z_max - z_min)
color = alpha * color2 + (1 - alpha) * color1

Your final output should look something like this:

Cow render

In this case, color1 = [0, 0, 1] and color2 = [1, 0, 0].
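The interpolation above vectorizes directly over all vertices. A minimal numpy sketch, using a made-up set of z-coordinates in place of the cow's real vertices:

```python
import numpy as np

# Hypothetical per-vertex z coordinates: front of the cow has the smallest z.
z = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
color1 = np.array([0.0, 0.0, 1.0])   # front (blue)
color2 = np.array([1.0, 0.0, 0.0])   # back (red)

alpha = (z - z.min()) / (z.max() - z.min())                        # shape (N_v,)
colors = alpha[:, None] * color2 + (1 - alpha[:, None]) * color1   # shape (N_v, 3)

print(colors[0], colors[-1])   # front vertex -> color1, back vertex -> color2
```

The resulting (N_v, 3) array, with a batch dimension added, is what you pass to TexturesVertex in place of the single-color texture.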

**Submission**: In your submission, describe your choice of color1 and color2, and include a gif of the rendered mesh.

4. Camera Transformations (10 points)

When working with 3D, finding a reasonable camera pose is often the first step to producing a useful visualization, and an important tool for debugging.

Running python -m starter.camera_transforms produces the following image using the camera extrinsics rotation R_0 and translation T_0:

Cow render

What are the relative camera transformations that would produce each of the following output images?
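One common way to think about this is composing a relative rotation and translation on top of the starting pose. The sketch below uses placeholder values for R_0 and T_0 and one plausible composition order; whether the relative transform multiplies on the left or the right depends on the starter code, so treat this as an assumption to check, not the official convention:

```python
import numpy as np

# Placeholder starting extrinsics (not the actual values in camera_transforms.py).
R_0 = np.eye(3)
T_0 = np.array([0.0, 0.0, 3.0])

def compose(R_rel, T_rel, R_0, T_0):
    # Apply a relative transform on top of the starting pose (one possible order).
    return R_rel @ R_0, R_rel @ T_0 + T_rel

# Sanity check: an identity relative transform leaves the pose unchanged.
R, T = compose(np.eye(3), np.zeros(3), R_0, T_0)
print(np.allclose(R, R_0) and np.allclose(T, T_0))   # True
```

For each target image, reason about which axis the camera rotated around or which direction it moved, then express that as R_rel and T_rel.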
