View-Independent Procedural Virtual Texture with Unity Terrain

Final Result

OVERVIEW

PVTUT is a Unity plugin that implements procedural virtual texturing for Unity terrain. It supports up to 12 splat maps per terrain. PVTUT improves frame rate at the expense of texture quality at a distance. It also takes up a lot of VRAM, so it is a trade-off between space and time.

PVTUT also supports vertically placed terrain decals. For this it uses a free Unity plugin called Dynamic Decals to provide an in-editor preview. At run time, all Dynamic Decals objects are disabled, and every decal you see is baked into the texture.

PVTUT supports multi-terrain layout, which means you can have more than 12 splat maps in your scene in total.

You can find all the references in the GDC Vault, but just in case, here are some other links to the same material.

SHOWCASE 1

PVTUT Composite (final result, LOD level, normal, tile layout)

SHOWCASE 2

PVTUT (top, 1.6 ms) vs. Unity Terrain with Standard Terrain Shader (shadow casting off, 3.5 ms)

SHOWCASE 3

PVTUT Composite (final result, LOD level, normal, tile layout)

PVTUT (top, 2.3 ms) vs. Unity Terrain with Standard Terrain Shader (shadow casting off, 6.9 ms)

PVTUT (top, 2.3 ms) vs. Unity Terrain with Standard Terrain Shader (shadow casting off, Dynamic Decals disabled, 4.2 ms)

SHOWCASE 4 (in motion)

Composite

LOD levels

DETAILS

Intro

PVTUT is based on the 2014 GDC talk Adaptive Virtual Texture presented by Ka Chen from Ubisoft. The idea is to decrease rendering time by reducing the number of times the terrain shader samples the splat textures.

Splatmap

Most terrain systems use a technique called splatting to render massive terrains while still producing a relatively unique look. Ideally, you would texture a terrain just like any other object, by supplying one unique texture. However, that texture would have to be unrealistically large to cover the terrain at a usable resolution. For a small scene, this naive method will do. For an open-world game it is not feasible unless further tricks are applied (e.g. Megatexture). So, in exchange for higher resolution, game developers accept a repeated pattern: if the terrain's UV coordinates are scaled so that they run past 1, the texture tiles across the terrain, each repeat covers a smaller area, and the effective resolution goes up. But how do we break up the repetition? Use more textures! If we blend two repeated patterns evenly, with a little offset and scaling, the result already looks better. If we use the RGB channels of a separate texture to control the blending weights, the repetition is almost gone.
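The weighted blend described above can be sketched in a few lines. This is only an illustration on the CPU, not PVTUT's shader code; the function name `splat_blend` and the example colors are made up for the sketch.

```python
def splat_blend(layers, weights):
    """Blend per-layer RGB colors using control-map weights.

    layers  -- list of (r, g, b) tuples, one sample per splat texture
    weights -- list of floats read from the control texture, one per layer
    """
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("control weights must sum to a positive value")
    # Normalize so the weights sum to 1, then accumulate the weighted colors.
    return tuple(
        sum(w * layer[c] for layer, w in zip(layers, weights)) / total
        for c in range(3)
    )


# Hypothetical layer colors: 75% grass, 25% rock at this pixel.
grass = (0.2, 0.6, 0.1)
rock = (0.5, 0.5, 0.5)
blended = splat_blend([grass, rock], [0.75, 0.25])
```

Every layer contributes one texture sample, plus one more for the control texture, which is where the per-pixel cost of splatting comes from.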

Virtual Texture

Splatting brings the problem of long sampling time in the shader. If we splat 4 textures together, we sample 5 textures in total (the 4 splat textures plus the control texture) to render one pixel. What if we could sample only once? This is the purpose of virtual texture. In general, if all the texture data needed to render the current frame is stored in one texture, then only that one texture has to be bound for the frame, which avoids switching textures and increases the frame rate. We can either generate this single massive texture for all objects (a.k.a. Megatexture), or use the technique only on terrains that use splat maps (a.k.a. Procedural Virtual Texture).

Procedural Virtual Texture

If we store the blended result in one texture, we save the time spent sampling multiple textures. Since terrains are often large, that texture cannot be kept in VRAM in its entirety, so we procedurally generate the parts we need and discard the parts that are not currently needed. (This may sound scary, since we are rendering textures every frame, but the good news is that they are small quads, so it is not that costly.) Keeping every inch of the terrain at the best quality at the same time would waste storage: we do not care how detailed a piece of terrain could be if it covers only one pixel in the distance, or is occluded by other objects. Besides, storing all of that data is unrealistic.
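The generate-and-discard idea boils down to comparing two sets on each update. The sketch below assumes tiles are identified by hypothetical (x, y) grid keys; it does not reflect PVTUT's actual data structures.

```python
def update_resident_tiles(resident, needed):
    """Return (to_bake, to_discard) for one update.

    resident -- set of tile keys currently baked in VRAM
    needed   -- set of tile keys required for the current camera position
    """
    to_bake = needed - resident      # missing tiles: render their quads now
    to_discard = resident - needed   # stale tiles: their storage can be reused
    return to_bake, to_discard


# The camera moved one tile to the right along x.
resident = {(0, 0), (0, 1), (1, 1)}
needed = {(0, 1), (1, 1), (2, 1)}
to_bake, to_discard = update_resident_tiles(resident, needed)
```

Only the difference is rendered, so the per-update cost depends on how far the camera moved, not on the size of the terrain.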

Level Of Detail

So, how do we decide which part of the terrain needs which level of detail? In Megatexture, id Software uses a feedback buffer to determine which parts of the texture are visible and puts only those parts into the single massive texture. In Procedural Virtual Texture (Battlefield 3 and Far Cry 4), position and view direction are enough for terrain rendering. My implementation simply uses position. This means that if you only look around without moving, no texture baking happens.
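A position-only LOD rule could look like the sketch below. The doubling-distance rule, the parameter names, and the convention that LOD 0 is the most detailed level are all assumptions made for illustration; the README does not state the formula PVTUT uses.

```python
import math


def tile_lod(camera_xz, tile_center_xz, lod0_radius, max_lod):
    """Pick an LOD from horizontal distance only (view direction is ignored)."""
    dx = tile_center_xz[0] - camera_xz[0]
    dz = tile_center_xz[1] - camera_xz[1]
    distance = math.hypot(dx, dz)
    if distance <= lod0_radius:
        return 0
    # Each doubling of distance beyond lod0_radius drops one level of detail.
    lod = math.ceil(math.log2(distance / lod0_radius))
    return min(lod, max_lod)
```

Because the camera's rotation never enters the function, turning in place returns the same LOD for every tile and triggers no baking.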

Storage

The texture is stored in VRAM, so how do we lay it out? There are two general options: store the tiles in one texture atlas (normally smaller than 8192x8192), or store them in a texture array. Either way, a tiling system is required to map different sections of the texture to different parts of the object (the terrain in our case) or even to different objects (in the case of Megatexture). The two options have their pros and cons. PVTUT uses texture arrays to store the data. Each LOD has one texture array assigned to it.

One of the reasons I did not use Unity's sparse textures is that, at the time, I could not find any interface in the sparse texture class for using a sparse texture as a render texture. Rendering into one seemed to involve some sort of CPU-side mapping or read/write, so I gave up on it and used texture arrays instead.
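With one texture array per LOD, something has to track which slice holds which tile. A minimal sketch of such bookkeeping follows; the class `TileArray` and its free-list strategy are hypothetical and not taken from PVTUT.

```python
class TileArray:
    """Tracks which slices of one LOD's texture array hold which tiles."""

    def __init__(self, slice_count):
        self.free = list(range(slice_count))  # unused slice indices
        self.slice_of = {}                    # tile key -> slice index

    def acquire(self, tile):
        """Return the slice for a tile, allocating one if it is not resident."""
        if tile in self.slice_of:
            return self.slice_of[tile]
        if not self.free:
            raise RuntimeError("texture array is full")
        index = self.free.pop()
        self.slice_of[tile] = index
        return index

    def release(self, tile):
        """Give a tile's slice back so another tile can be baked into it."""
        self.free.append(self.slice_of.pop(tile))
```

A released slice is simply overwritten by the next bake, so the array never needs to be resized at run time.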

Mipmap

The reason PVTUT uses texture arrays is that it is relatively easier (or faster) to generate mipmaps for a texture array than for a texture atlas. Why do mipmaps matter?

One of the general reasons to use mipmaps is to eliminate the flickering that appears when a texture is downsampled with a point filter. Take a textured floor as an example. When you look forward, the floor in the distance covers only a small proportion of the screen pixels but spans the same number of texels as the floor near you. You do not want each of those pixels to pick up just one of the many colors it covers. Instead, you want it to reflect the general look of all of them. So we generate smaller, blurred versions of the original texture and sample from those to get the general look that we need.

As mentioned above, PVTUT uses position to decide the LOD of a particular area of the terrain. If a hill and a plain lie in the same area, which is assigned to, say, LOD 3 with a corresponding texture resolution of 512x512, then we also need the same texture at 256x256, 128x128, 64x64, 32x32, 16x16, 8x8, 4x4, 2x2 and 1x1, because we do not know how many texels each pixel in this area will cover. Take this picture for example:

The face of the hill has the largest pixel/texel ratio. The side of the hill has a smaller one. The ground has the smallest. If we use a large texture with point or bilinear sampling, flickering will appear on the ground. If we use a small texture, the hill will be blurry. We need trilinear sampling to avoid these artifacts. That means the whole mipmap chain for that area has to exist in VRAM (basically, the LOD you chose for it only sets the maximum resolution).
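To get a feel for what keeping the whole chain costs, the sketch below lists the mip sizes for a 512x512 tile and adds up their texels. The function name is made up for the example.

```python
def mip_chain(size):
    """Return the edge lengths of a full mip chain for a square power-of-two texture."""
    if size < 1 or size & (size - 1):
        raise ValueError("size must be a positive power of two")
    sizes = []
    while size >= 1:
        sizes.append(size)
        size //= 2
    return sizes


chain = mip_chain(512)                 # 512, 256, ..., 2, 1
texels = sum(s * s for s in chain)     # texels in the whole chain
overhead = texels / (512 * 512)        # relative to mip 0 alone
```

The chain has 10 levels and holds roughly 4/3 as many texels as the 512x512 level on its own, so the full chain costs about a third more memory per tile.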

You might think we could blend the texture of the next LOD with the texture of the current LOD. But my implementation cannot guarantee that the texture of the next LOD always exists in GPU memory. Moreover, the texture of the next LOD alone is not enough to eliminate the artifact. We still need the rest of the chain, because the texel/pixel ratio is not guaranteed to be the same everywhere in that area; it may fall into several ranges. Blending a sample from a 512x512 texture with a sample from a 256x256 texture (e.g. the hill face) is different from blending a sample from a 128x128 texture with a sample from a 64x64 texture (e.g. the ground), and both are different from blending a sample from a 2x2 texture with a sample from a 1x1 texture (e.g. distant ground in the same tile). We need all of them for that tile, which is why we need the whole chain.

My implementation does not reuse mipmaps across LODs. To be specific, it does not shift the mipmaps of the current LOD up or down to adjacent LODs. This optimization is well explained in Adaptive Virtual Texture. For example, when an LOD 1 tile becomes LOD 2, I should assign LOD 1 mip n to LOD 2 mip n - 1 and render one extra mip, LOD 2 mip MAX. When an LOD 2 tile becomes LOD 1, I should assign LOD 2 mip n to LOD 1 mip n + 1 and render one extra mip, LOD 1 mip 0. This way, each time an LOD change happens, I only need to render one mip and can reuse the others instead of letting the GPU generate the whole chain.
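The three cases can be made concrete with a simplified model of trilinear mip selection. Real GPUs derive the ratio from screen-space UV derivatives; here it is passed in directly, and the function name is hypothetical.

```python
import math


def trilinear_mips(texels_per_pixel, mip_count):
    """Return (lower_mip, upper_mip, blend) for a given texel/pixel ratio."""
    # Ratios below 1 (magnification) clamp to mip 0.
    level = math.log2(max(texels_per_pixel, 1.0))
    # Ratios beyond the smallest mip clamp to the last level.
    level = min(level, mip_count - 1)
    lower = int(math.floor(level))
    upper = min(lower + 1, mip_count - 1)
    return lower, upper, level - lower
```

For a 10-level chain starting at 512x512, a ratio of 1 selects mips 0 and 1 (512 and 256), a ratio of 4 selects mips 2 and 3 (128 and 64), and a very large ratio ends up at mip 9 (1x1). A single tile can contain all of these ratios at once.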
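The shifting scheme can be sketched with plain lists, where each entry stands for one mip of a tile and `render_mip` is a hypothetical callback that bakes a single mip. This follows the description above, not code from PVTUT, which does not implement the optimization.

```python
def shift_to_coarser(mips, render_mip):
    """LOD k -> LOD k+1: old mip n becomes new mip n-1; the last mip is rendered."""
    return mips[1:] + [render_mip(len(mips) - 1)]


def shift_to_finer(mips, render_mip):
    """LOD k+1 -> LOD k: old mip n becomes new mip n+1; mip 0 is rendered."""
    return [render_mip(0)] + mips[:-1]
```

In both directions exactly one mip is rendered per LOD change and the rest of the chain is reused.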

Indirection Method

As mentioned above, there will always be a tiling system that maps the original UVs of the object to the ones used in the virtual texture. In Megatexture, I believe the UVs of the models are modified offline, because no textures need to be generated procedurally. In Procedural Virtual Texture, an indirection texture or indirection table can be used to handle the UV coordinate conversion. PVTUT uses one indirection texture for each LOD texture array.
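An indirection lookup for one LOD might work roughly as follows. The table layout (a square grid storing a texture array slice per tile, or `None` when the tile is not resident) is an assumption made for this sketch, not PVTUT's actual texture format.

```python
def lookup(indirection, tiles_per_side, u, v):
    """Map terrain UV in [0, 1) to (array slice, local u, local v).

    indirection -- 2D list indexed [ty][tx]; each entry is a slice index or None
    """
    tx = min(int(u * tiles_per_side), tiles_per_side - 1)
    ty = min(int(v * tiles_per_side), tiles_per_side - 1)
    slice_index = indirection[ty][tx]
    if slice_index is None:
        return None  # tile not baked at this LOD; the caller must fall back
    # The fractional part within the tile becomes the UV inside the slice.
    local_u = u * tiles_per_side - tx
    local_v = v * tiles_per_side - ty
    return slice_index, local_u, local_v


# A 2x2 tile grid where only two tiles are resident.
table = [[3, None],
         [None, 7]]
hit = lookup(table, 2, 0.75, 0.875)
miss = lookup(table, 2, 0.75, 0.25)
```

On the GPU the same steps would be one fetch from the indirection texture followed by one sample from the texture array.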

PVTUT

Install / Use
