Jos Stam Advection Scheme

In this chapter:

GPU Data Structures and Operations
- Eulerian Data on the GPU
- Slab Operations
Classical Advection Scheme
- Issues with Classical Advection Scheme
Stam’s Advection Scheme
- Issues with Stam’s Advection Scheme
A Note on Alternatives

We can now finally start with writing the first-step of our fluid simulator, the advection scheme. In order to implement this on the GPU as shaders, we need to review some information about GPU architecture so we can come up with an appropriate way to store and operate on our data.

GPU Data Structures and Operations

Whether you choose to implement the fluid simulator using compute shaders or fragment shaders is entirely up to you.

Eulerian Data on the GPU

Since we are doing an Eulerian simulation (which just means that it is a grid-based simulation), it makes sense to store our data in textures. For a fragment program, the position of the fragment would represent the position in the grid you are working on, and for a compute program, the invocation ((A.K.A. thread or GPU work unit) The smallest unit of GPU execution within a kernel function. For GPGPU programming, invocations are organized into 3D grids and are dispatched in batches called “workgroups”.
) index would tell you what position in the grid you are working on.

Alternatively, for a compute program you could also store the data in a 1D buffer and calculate the buffer index from the grid position. The benefit of doing this is the freedom to store more data. In a texture you only get up to 4 channels to store data to for each cell, however, a compute buffer can store contains structs of larger sizes making it easier to store data.

block columns 5 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24

You can get around the problem of texture storage space being limited to 4-channels by simply adding more textures to your shader of course. In this book we assume the use of textures as they make our explanations easier to understand.

Slab Operations

Every step that we are going to carry out in our fluid simulation (advection, diffusion, external forces, projection, etc.) is going to be carried out as a slab operation (often abbreviated as slabop).

A slab operation involves reading all of the old simulation data in parallel, computing the new values, writing the new values to a separate temporary texture, and finally copying the data from this new texture into the original texture (which is called a texture update). This can be summarized as:

Read old values
Calculate new values
Write new values to temporary storage
After all new values are written to temporary storage, copy this data to original buffer

Classical Advection Scheme

As we discussed earlier, advection is the process by which some fluid property is moved by the velocity of the fluid. Even the velocities themselves are moved each simulation step by their own velocity.

Describing the classical advection scheme using this information because very simple. Suppose we want to advect the velocities of the fluid (that means we want to move the position of the velocities, by the velocity itself):

flowchart TD id1["Look up velocity $\vec{v}$ at current position $\vec{p}$"] id2["Add velocity multiplied by time-step to current position: $$\vec{p}_{\text{new}} = \vec{p} + \vec{v} \cdot \Delta t$$"] id3["Look up the velocity at $\vec{p}_{\text{new}}$"] id4["Change the velocity at $\vec{p}_{\text{new}}$ to $\vec{v}$"] id1 e1@==> id2 id2 e2@==> id3 id3 e3@==> id4 e1@{ animate: true, animation: fast } e2@{ animate: true, animation: fast } e3@{ animate: true, animation: fast }

Issues with Classical Advection Scheme

This approach is unsuitable for our needs for a few reasons:

This scheme is highly unsuitable for GPU implementation because it means that multiple GPU invocations ((A.K.A. thread or GPU work unit) The smallest unit of GPU execution within a kernel function. For GPGPU programming, invocations are organized into 3D grids and are dispatched in batches called “workgroups”.
) can write to the same pixel. We want each pixel to be updated by one GPU invocation ((A.K.A. thread or GPU work unit) The smallest unit of GPU execution within a kernel function. For GPGPU programming, invocations are organized into 3D grids and are dispatched in batches called “workgroups”.
) at most.
The new lookup position $\vec{p}_{\text{new}}$ will very likely not lie exactly on a pixel center, so the location you are trying to write to is actually an interpolation of nearby pixels, which means you will have to update multiple pixels at once.

Stam’s Advection Scheme

A staple of writing your first real-time fluid simulator is reading the amazing paper by Stam (2003). Stam describes a different approach to advection that is much more suited to be implemented on the GPU:

Read the velocity $\vec{v}$ at the current position $\vec{p}$
Look “backwards” using this velocity: $\vec{p}_{\text{new}} = \vec{p} - \vec{v} \cdot \Delta t$
Look up the velocity at $\vec{p}_{\text{new}}$
Copy that velocity and use it as the new velocity for that point

It is very likely that $\vec{p}_{\text{new}}$ will not be located exactly on a pixel center, so instead, look up the 4 nearest pixels and perform a bilinear interpolation* to calculate the new velocity value.

If that’s difficult to visualize, I have made an animation to hopefully illustrate this advection scheme better. You can also see how part of the slab operation is performed, after this the “new velocity texture” data is copied to the “old velocity texture” so the calculation can be re-run with the new data on the next time step:

*You only need to do bilinear interpolation if you are looking up the values of the velocity texture using texel positions. If you are using a sampler2D and looking up values by UV coordinates instead, then all you need to do is ensure that your sampler2D is configured with bilinear filtering enabled and it will be done automatically for you!

// If you are using any of these methods to look up values
// then you need to do bilinear filtering yourself:
texelFetch(VELOCITY_TEXTURE, texel, 0);
imageLoad(VELOCITY_IMAGE, texel);

// If you are using a sampler, simply set up the sampler with bilinear filtering:
texture(VELOCITY_TEXTURE, uv);
texture(VELOCITY_TEXTURE, texel / (resolution - 1.0));

This type of advection scheme is called a semi-Lagrangian advection scheme. Lagrangian simulations are fluid simulations where the fluid is composed of simulated particles. On the other hand, what we are doing is called an Eulerian simulation, where our simulation data exists on a fixed grid. So the term semi-Lagrangian means that we pretend / imagine that there are particles in our fluid and evaluate their behavior, but we never actually store these particles as data anywhere. We are simply using the “concept” of particles.

Stam’s advection scheme is a better approach for two reasons:

It is very well suited for implementation on the GPU
It is unconditionally stable, which means that you can use it even when the time-step is large.

Issues with Stam’s Advection Scheme

Consider running Stam’s advection scheme on the following case. Here is an example current velocity texture:

block columns 5 0["0"] 1["0"] 2["0"] 3["0"] 4["0"] 5["0"] 6["0"] 7["0"] 8["0"] 9["0"] 10["0"] 11["0"] 12["$\rightarrow$"] 13["0"] 14["0"] 15["0"] 16["0"] 17["0"] 18["0"] 19["0"] 20["0"] 21["0"] 22["0"] 23["0"] 24["0"]

Only the center grid position has a velocity pointing to the right, the rest of the grid positions don’t have any velocity (represented in the diagram as “0”).

For all the positions with zero velocity, their lookup position is identical to their current position, which means their output velocity will also still be zero.

However, for the center grid position, it inverts it’s velocity, looks up the values to the left (which are all zero), and copies that velocity, resulting in the following output:

block columns 5 0["0"] 1["0"] 2["0"] 3["0"] 4["0"] 5["0"] 6["0"] 7["0"] 8["0"] 9["0"] 10["0"] 11["0"] 12["0"] 13["0"] 14["0"] 15["0"] 16["0"] 17["0"] 18["0"] 19["0"] 20["0"] 21["0"] 22["0"] 23["0"] 24["0"]

This is obviously incorrect, as advection tells us that the velocity must move by the velocity itself. So the correct result should be something like this:

block columns 5 0["0"] 1["0"] 2["0"] 3["0"] 4["0"] 5["0"] 6["0"] 7["0"] 8["0"] 9["0"] 10["0"] 11["0"] 12["0"] 13["$\rightarrow$"] 14["0"] 15["0"] 16["0"] 17["0"] 18["0"] 19["0"] 20["0"] 21["0"] 22["0"] 23["0"] 24["0"]

So what’s going on here? Why does Stam’s advection scheme work for simple fluid simulations if it gives us an obviously incorrect answer here? Take a moment to think about it.

The devil lies in the example problem itself. The example I have given you here is not a valid simulation state in the first place . Recall our incompressibility condition we discussed with the Navier-Stokes equations:

\nabla \cdot \vec{u} = 0 \; \textrm{(Incompressible Flow)}

It states that the divergence of the velocity field must always be zero. Divergence is a measure of how much the vectors of a vector field are “flowing out” of a certain point. Look back at this chapter for a review. A non-zero divergence means that matter is either being created or destroyed, which violates the law of conservation of mass.

If you look at the example above, it is not possible for the center grid cell to have a non-zero velocity value when all of it’s neighbors are zero. That means that there is fluid flowing out of that cell without any fluid entering that cell, which means matter is being mysteriously created there. A valid example might look something like this instead:

block columns 5 0["0"] 1["0"] 2["0"] 3["0"] 4["0"] 5["0"] 6["$\swarrow$"] 7["$\leftarrow$"] 8["$\nwarrow$"] 9["0"] 10["0"] 11["$\searrow$"] 12["$\rightarrow$"] 13["$\nearrow$"] 14["0"] 15["0"] 16["0"] 17["0"] 18["0"] 19["0"] 20["0"] 21["0"] 22["0"] 23["0"] 24["0"]

Notice how the velocities flow into each other so that the velocity field has no divergence.

When we get to the later parts of our simulator where we employ the Helmholtz-Hodge decomposition to obtain a divergence-free velocity field, Stam’s advection scheme works much better and proves to be a simple yet efficient advection model.

A Note on Alternatives

Keep in mind that other advection schemes also exist. Many are a lot more accurate than Stam’s advection scheme, however, Stam’s is one of the simplest to understand and easiest to implement.

After we have completed building a basic fluid simulator we will explore a different, more accurate advection scheme that you can choose to replace Stam’s advection scheme with.