Description

In this project, I reimplemented Position Based Fluids (PBF) by Macklin and Müller (2013) [1].

Essentially, PBF is just an extension of smoothed-particle hydrodynamics (SPH) into the Position Based Dynamics framework (Müller 2007).

In contrast to SPH, the major benefit of PBF is stability, which allows larger physics time-steps. Where SPH typically requires multiple physics substeps per animation frame, PBF can take just one physics step per frame.


Videos

Figure 1: Cube of fluid falling onto another.
Figure 2: Double dam break.
Figure 3: Quadruple dam break.

Algorithm

The PBF algorithm is simple and easily parallelisable.

Euler Integration

Integration

for all particles $i$ do:
    apply forces $\mathbf{v}_i \gets \mathbf{v}_i + \Delta t\, \mathbf{F}(\mathbf{x}_i)$
    predict position $\mathbf{x}^{\star}_i \gets \mathbf{x}_i + \Delta t\, \mathbf{v}_i$

First, the algorithm takes an unconstrained step: external forces are applied to the particle velocities, and the velocities are used to predict new positions. After this step alone, however, the particles do not yet move quite like a liquid.
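
As a concrete illustration, a minimal CUDA sketch of this step could look like the following. It assumes a structure-of-arrays layout with float4 buffers and gravity as the only external force; the names (integrateKernel, predPos, and so on) are illustrative, not the actual code.

    // A minimal sketch of the unconstrained step as a CUDA kernel. Assumptions:
    // structure-of-arrays float4 buffers, gravity as the only external force, and
    // illustrative names (integrateKernel, predPos, ...) rather than the real code.
    #include <cuda_runtime.h>

    __global__ void integrateKernel(const float4* pos, const float4* vel,
                                    float4* predPos, float4* newVel,
                                    float dt, int numParticles)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= numParticles) return;

        // Apply external forces (here just gravity) to the velocity:
        // v_i <- v_i + dt * F(x_i)
        float4 v = vel[i];
        v.y += dt * -9.81f;
        newVel[i] = v;

        // Predict the new position with an explicit Euler step:
        // x*_i <- x_i + dt * v_i
        float4 p = pos[i];
        predPos[i] = make_float4(p.x + dt * v.x, p.y + dt * v.y, p.z + dt * v.z, p.w);
    }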

Particle Sorting

Sorting

for all particles $i$ do:
    find neighbouring particles $N_i(\mathbf{x}^{\star}_i)$

Many of the subsequent computations depend on a particle’s neighbours.

  • Naively, for each particle we could check every other particle, which is $\mathcal{O}(n^2)$.
  • A better way is to discretise the simulation space into a uniform grid of bins, insert the particles into their bins, then sort them by bin index (a sketch of this binning step follows the list).
    • With counting sort, this step becomes $\mathcal{O}(n + k)$ [2].
    • Counting sort can be parallelised with a prefix sum (scan) implementation [3].
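
Below is one possible sketch of the binning step. It assumes the grid origin sits at (0, 0, 0) with a cell size of one smoothing radius, and it uses Thrust’s radix sort (sort_by_key) rather than a hand-rolled counting sort with a prefix sum; all names are illustrative.

    // A sketch of the binning step: hash each particle into a uniform grid cell and
    // sort particle indices by cell id so that particles in the same cell become
    // contiguous. Uses Thrust's radix sort instead of a hand-rolled counting sort.
    #include <cuda_runtime.h>
    #include <thrust/device_ptr.h>
    #include <thrust/sort.h>

    __global__ void computeCellIndices(const float4* predPos, unsigned int* cellIdx,
                                       unsigned int* particleIdx, float cellSize,
                                       int3 gridSize, int numParticles)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= numParticles) return;

        float4 p = predPos[i];
        // Clamp to the grid so out-of-bounds particles still get a valid cell.
        int gx = min(max(int(p.x / cellSize), 0), gridSize.x - 1);
        int gy = min(max(int(p.y / cellSize), 0), gridSize.y - 1);
        int gz = min(max(int(p.z / cellSize), 0), gridSize.z - 1);

        cellIdx[i]     = (gz * gridSize.y + gy) * gridSize.x + gx; // linearised cell id
        particleIdx[i] = i;                                        // original particle index
    }

    void sortParticlesByCell(unsigned int* d_cellIdx, unsigned int* d_particleIdx,
                             int numParticles)
    {
        thrust::device_ptr<unsigned int> keys(d_cellIdx);
        thrust::device_ptr<unsigned int> vals(d_particleIdx);
        // After sorting, per-cell start/end offsets can be found with one more pass
        // (or a prefix sum over per-cell counts).
        thrust::sort_by_key(keys, keys + numParticles, vals);
    }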

Jacobi Optimizer

Optimizer

while iter < solverIterations do:
    for all particles $i$ do: compute densities $\rho_i$
    for all particles $i$ do: compute Lagrange multipliers $\lambda_i$
    for all particles $i$ do: compute $\Delta\mathbf{p}_i$
    for all particles $i$ do: update positions $\mathbf{x}^{\star}_i \gets \mathbf{x}^{\star}_i + \Delta\mathbf{p}_i$

This step projects the predicted particle positions back towards a valid fluid configuration using a constraint function.
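
As a sketch of how this loop can be driven from the host, the following hypothetical driver launches one kernel per line of the pseudocode, with the density computation folded into the $\lambda$ kernel. computeLambdas and computeDeltaP are sketched after the equations below; all names and launch parameters are illustrative.

    // A hypothetical host-side driver for the solver loop. computeLambdas and
    // computeDeltaP are sketched after the equations below (with the density
    // computation folded into the lambda kernel); applyDeltaP is the trivial
    // Jacobi position update. Names and launch configuration are illustrative.
    #include <cuda_runtime.h>

    __global__ void computeLambdas(const float4*, const int*, const int*, float*,
                                   float, float, float, int, int);
    __global__ void computeDeltaP(const float4*, const int*, const int*, const float*,
                                  float4*, float, float, int, int);

    __global__ void applyDeltaP(float4* predPos, const float4* deltaP, int numParticles)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= numParticles) return;
        predPos[i].x += deltaP[i].x;   // x*_i <- x*_i + Delta p_i
        predPos[i].y += deltaP[i].y;
        predPos[i].z += deltaP[i].z;
    }

    void runSolver(float4* predPos, const int* neighbours, const int* neighbourCount,
                   float* lambda, float4* deltaP, float restDensity, float h,
                   float epsilon, int maxNeighbours, int numParticles,
                   int solverIterations)
    {
        int block = 256;
        int grid  = (numParticles + block - 1) / block;

        for (int iter = 0; iter < solverIterations; ++iter) {
            computeLambdas<<<grid, block>>>(predPos, neighbours, neighbourCount, lambda,
                                            restDensity, h, epsilon, maxNeighbours, numParticles);
            computeDeltaP<<<grid, block>>>(predPos, neighbours, neighbourCount, lambda,
                                           deltaP, restDensity, h, maxNeighbours, numParticles);
            applyDeltaP<<<grid, block>>>(predPos, deltaP, numParticles);
        }
    }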

Optimization

Next, define the constraint function for each particle as
$$C_i(\mathbf{p}_i) = \frac{\rho_i}{\rho_0} - 1,$$
where $\rho_i$ is the SPH density estimate at particle $i$ and $\rho_0$ is the rest density.

Due to the integration step we have that $C(\mathbf{p}_1, \dots, \mathbf{p}_n) \neq 0$. As a result, we optimize for a positional correction $\Delta\mathbf{p} \approx \nabla C_i(\mathbf{p})\,\lambda$ such that $C(\mathbf{p} + \Delta\mathbf{p}) = 0$, where $\lambda$ is a Lagrange multiplier given by

$$\lambda_i = -\frac{C_i(\mathbf{p})}{\sum_k \|\nabla_{\mathbf{p}_k} C_i\|^2 + \varepsilon}.$$
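
A sketch of the $\lambda_i$ kernel is shown below. It assumes unit particle mass, a precomputed flat neighbour list, and the standard poly6 and spiky kernels; the names and buffer layout are illustrative rather than taken from the actual implementation.

    // A sketch of the Lagrange multiplier kernel. Assumptions: unit particle mass,
    // a precomputed flat neighbour list (neighbours / neighbourCount), and the
    // standard poly6 and spiky kernels. Names are illustrative.
    #include <cuda_runtime.h>

    #define PI_F 3.14159265f

    __device__ float poly6(float r2, float h)
    {
        // W_poly6(r, h) = 315 / (64 pi h^9) (h^2 - r^2)^3 for |r| < h
        if (r2 >= h * h) return 0.0f;
        float d = h * h - r2;
        return 315.0f / (64.0f * PI_F * powf(h, 9.0f)) * d * d * d;
    }

    __device__ float3 spikyGrad(float3 r, float h)
    {
        // grad W_spiky(r, h) = -45 / (pi h^6) (h - |r|)^2 * r / |r| for 0 < |r| < h
        float len = sqrtf(r.x * r.x + r.y * r.y + r.z * r.z);
        if (len <= 1e-6f || len >= h) return make_float3(0.0f, 0.0f, 0.0f);
        float coeff = -45.0f / (PI_F * powf(h, 6.0f)) * (h - len) * (h - len) / len;
        return make_float3(coeff * r.x, coeff * r.y, coeff * r.z);
    }

    __global__ void computeLambdas(const float4* predPos, const int* neighbours,
                                   const int* neighbourCount, float* lambda,
                                   float restDensity, float h, float epsilon,
                                   int maxNeighbours, int numParticles)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= numParticles) return;

        float4 pi = predPos[i];
        float density = poly6(0.0f, h);               // self contribution to rho_i
        float3 gradI = make_float3(0.0f, 0.0f, 0.0f); // grad_{p_i} C_i
        float sumGradSq = 0.0f;                       // sum of ||grad_{p_k} C_i||^2 over k != i

        for (int n = 0; n < neighbourCount[i]; ++n) {
            int j = neighbours[i * maxNeighbours + n];
            float4 pj = predPos[j];
            float3 r = make_float3(pi.x - pj.x, pi.y - pj.y, pi.z - pj.z);

            density += poly6(r.x * r.x + r.y * r.y + r.z * r.z, h);

            float3 g = spikyGrad(r, h); // grad W(p_i - p_j, h)
            // grad_{p_j} C_i = -(1/rho_0) grad W(p_i - p_j, h)
            sumGradSq += (g.x * g.x + g.y * g.y + g.z * g.z) / (restDensity * restDensity);
            // grad_{p_i} C_i = (1/rho_0) sum_j grad W(p_i - p_j, h)
            gradI.x += g.x / restDensity;
            gradI.y += g.y / restDensity;
            gradI.z += g.z / restDensity;
        }

        float Ci = density / restDensity - 1.0f;
        sumGradSq += gradI.x * gradI.x + gradI.y * gradI.y + gradI.z * gradI.z;

        lambda[i] = -Ci / (sumGradSq + epsilon);
    }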

$\Delta\mathbf{p}$

The $\Delta\mathbf{p}$ computation is straightforward:
$$\Delta\mathbf{p}_i = \frac{1}{\rho_0}\sum_j\left(\lambda_i + \lambda_j + s\right)\nabla W(\mathbf{p}_i - \mathbf{p}_j, h),$$

where $s$ is a tensile instability term, and $\nabla W(\cdot)$ is the gradient of the spiky kernel (Müller 2003) [4].

The tensile instability term $s$ adds an artificial pressure that encourages particles to pull their neighbours together slightly, mimicking surface tension.
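
Continuing the sketch, the $\Delta\mathbf{p}_i$ kernel below reuses the poly6/spikyGrad helpers and the neighbour-list layout from the $\lambda$ sketch above. The artificial pressure constants ($k = 0.1$, $n = 4$, $|\Delta\mathbf{q}| = 0.3h$) are typical example values, not necessarily the ones used in this project.

    // A sketch of the Delta p kernel, reusing poly6/spikyGrad and the neighbour-list
    // layout from the lambda sketch above. The artificial pressure constants below
    // (k = 0.1, n = 4, |dq| = 0.3h) are typical example values.
    __global__ void computeDeltaP(const float4* predPos, const int* neighbours,
                                  const int* neighbourCount, const float* lambda,
                                  float4* deltaP, float restDensity, float h,
                                  int maxNeighbours, int numParticles)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= numParticles) return;

        const float k = 0.1f;                                 // artificial pressure strength
        const float n = 4.0f;                                 // artificial pressure exponent
        const float wDeltaQ = poly6(0.3f * h * 0.3f * h, h);  // W(dq, h) with |dq| = 0.3h

        float4 pi = predPos[i];
        float3 dp = make_float3(0.0f, 0.0f, 0.0f);

        for (int nb = 0; nb < neighbourCount[i]; ++nb) {
            int j = neighbours[i * maxNeighbours + nb];
            float4 pj = predPos[j];
            float3 r = make_float3(pi.x - pj.x, pi.y - pj.y, pi.z - pj.z);

            // s = -k (W(p_i - p_j, h) / W(dq, h))^n  (tensile instability term)
            float sCorr = -k * powf(poly6(r.x * r.x + r.y * r.y + r.z * r.z, h) / wDeltaQ, n);

            // (1/rho_0) (lambda_i + lambda_j + s) grad W(p_i - p_j, h)
            float3 g = spikyGrad(r, h);
            float coeff = (lambda[i] + lambda[j] + sCorr) / restDensity;
            dp.x += coeff * g.x;
            dp.y += coeff * g.y;
            dp.z += coeff * g.z;
        }

        deltaP[i] = make_float4(dp.x, dp.y, dp.z, 0.0f);
    }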

Typically, the optimizer is not run until convergence; instead, a small fixed number of iterations (4-6) already gives results that look good visually.

Viscosity and Vorticity

Viscosity and Vorticity

for all particles $i$ do:
    update velocity $\mathbf{v}_i \gets \frac{1}{\Delta t}\left(\mathbf{x}^{\star}_i - \mathbf{x}_i\right)$
    apply XSPH viscosity $\mathbf{v}_i \gets \mathbf{v}_i + \frac{c}{\rho_i}\, \nu_i$
    apply vorticity confinement $\mathbf{v}_i \gets \mathbf{v}_i + \varepsilon\,\Delta t\,\left(\hat{\eta}_i \times \omega_i\right)$
    update position $\mathbf{x}_i \gets \mathbf{x}^{\star}_i$

To make the results more visually appealing, XSPH viscosity and vorticity confinement are applied.

Vorticity confinement reintroduces energy that was lost in the integration and optimization steps, which makes the simulation look livelier; however, this technique is not physically based.
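
A sketch of the velocity update with XSPH viscosity follows, again reusing the poly6 helper and neighbour layout from the sketches above. For brevity it uses the plain XSPH form $\mathbf{v}_i \gets \mathbf{v}_i + c\sum_j(\mathbf{v}_j - \mathbf{v}_i)W(\mathbf{p}_i - \mathbf{p}_j, h)$ without the $\frac{1}{\rho_i}$ normalisation shown in the pseudocode, computes neighbour velocities on the fly, and omits the vorticity confinement pass and the final $\mathbf{x}_i \gets \mathbf{x}^{\star}_i$ copy; the viscosity constant $c$ is an illustrative value.

    // A sketch of the velocity update with XSPH viscosity, reusing poly6 and the
    // neighbour-list layout from the sketches above. Vorticity confinement and the
    // final position copy are omitted; the constant c is an illustrative value.
    __global__ void updateVelocities(const float4* pos, const float4* predPos,
                                     float4* vel, const int* neighbours,
                                     const int* neighbourCount, float dt, float h,
                                     int maxNeighbours, int numParticles)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= numParticles) return;

        // v_i <- (x*_i - x_i) / dt
        float3 vi = make_float3((predPos[i].x - pos[i].x) / dt,
                                (predPos[i].y - pos[i].y) / dt,
                                (predPos[i].z - pos[i].z) / dt);

        // XSPH viscosity: v_i <- v_i + c * sum_j (v_j - v_i) W(p_i - p_j, h)
        const float c = 0.01f;
        float3 xsph = make_float3(0.0f, 0.0f, 0.0f);
        for (int n = 0; n < neighbourCount[i]; ++n) {
            int j = neighbours[i * maxNeighbours + n];
            float3 vj = make_float3((predPos[j].x - pos[j].x) / dt,
                                    (predPos[j].y - pos[j].y) / dt,
                                    (predPos[j].z - pos[j].z) / dt);
            float3 r = make_float3(predPos[i].x - predPos[j].x,
                                   predPos[i].y - predPos[j].y,
                                   predPos[i].z - predPos[j].z);
            float w = poly6(r.x * r.x + r.y * r.y + r.z * r.z, h);
            xsph.x += (vj.x - vi.x) * w;
            xsph.y += (vj.y - vi.y) * w;
            xsph.z += (vj.z - vi.z) * w;
        }
        vel[i] = make_float4(vi.x + c * xsph.x, vi.y + c * xsph.y, vi.z + c * xsph.z, 0.0f);
    }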


Implementation

I used CUDA to implement the entire algorithm on the GPU to exploit its highly parallel nature.

Originally, I had trouble simulating ~150K particles in real time, for two main reasons.

  1. I had a lot of CPU ↔ GPU data transfers, which introduced some latency.
    • Instead, I used the CUDA/OpenGL interop features to map the OpenGL VBOs directly to the PBF simulation buffers (see the interop sketch after this list).
  2. GPUs are optimised for single precision.
    • Many of the library functions I was using caused type promotion from float to double; preventing this gave me a 10x speed-up.
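
For reference, the interop path can look roughly like the following: register the OpenGL VBO with CUDA once, then map it each frame and hand the raw device pointer to the simulation kernels. This is a generic sketch of the CUDA runtime interop API, not the exact code; error checking is omitted.

    // Illustrative sketch of the CUDA/OpenGL interop path: register the OpenGL VBO
    // once, then map it each frame so kernels can write positions directly into the
    // buffer that OpenGL renders from, avoiding CPU <-> GPU copies.
    #include <cuda_gl_interop.h>
    #include <cuda_runtime.h>

    cudaGraphicsResource* positionResource = nullptr;

    void registerPositionVBO(unsigned int vbo)
    {
        // One-time registration of the OpenGL buffer with CUDA.
        cudaGraphicsGLRegisterBuffer(&positionResource, vbo, cudaGraphicsMapFlagsNone);
    }

    float4* mapPositionVBO()
    {
        // Per-frame: map the buffer and get a device pointer the kernels can write to.
        float4* devicePositions = nullptr;
        size_t numBytes = 0;
        cudaGraphicsMapResources(1, &positionResource, 0);
        cudaGraphicsResourceGetMappedPointer(reinterpret_cast<void**>(&devicePositions),
                                             &numBytes, positionResource);
        return devicePositions;
    }

    void unmapPositionVBO()
    {
        // Unmap before OpenGL draws from the same buffer.
        cudaGraphicsUnmapResources(1, &positionResource, 0);
    }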

My renderer reuses the code from the particle simulation labs, but I replaced the shaders with a Blinn-Phong point-sprite sphere shader based on Green (2010) [5].

Initially, I wanted to implement screen-space fluid rendering [5] with an anisotropic kernel [6]. Unfortunately, I only managed to implement the kernel and not the shaders.