Physics Informed Neural Fluid Fields

Mengyu Chu1   Lingjie Liu1   Quan Zheng2   Erik Franz3   Hans-Peter Seidel1   Christian Theobalt1   Rhaleb Zayer1  

1Max Planck Institute for Informatics, Saarland Informatics Campus
   2Chinese Academy of Sciences   3Technical University of Munich

Supplemental Webpage

We provide the results in the form of this HTML page so that individual scenes can be watched easily and repeatedly.
Please click on each scene to play the corresponding video. It is helpful to change the zoom level of the browser (Ctrl and +/-) or to view in full screen (double click on videos or right click and open in new view). All videos below are located in the video sub-folder.

This document contains the following parts:
1. ScalarFlow Datasets
   1.1 Synthetic Scene
   1.2 Real Captures
2. Synthetic Scenes with Complex Lighting
   2.1 The Plume Scene with Velocity Comparisons
   2.2 Plume Scene with A Regular Obstacle
3. Complex Scenes with Arbitrary Obstacles
4. Training Details

1. ScalarFlow Datasets

-- 1.1 ScalarFlow Synthetic Smoke --

Rendering Comparisons

Taking multiview videos as input, our method learns the spatio-temporal density, color, and velocity fields.
"Rendering Comparisons" (left) shows the overall quality of density and color together. While the reference has a black background, we render the reconstructions with a blue background so that "ghost" density in the color of the original background becomes visible.
"Volumetric Density" (below) shows the density alone under uniform ambient lighting.
"Velocity" and "Vorticity" (further below) are learned implicitly through the density fields.

Note that only GlobalTrans requires lighting conditions and visual hulls as input. Other methods with unknown lighting conditions have to disentangle the ambiguity between density and color.
- Our rendering result is close to the reference.
- Neural Volumes has artifacts on novel views due to the existence of "ghost density".
- Although Global Transport (GlobalTrans) produces sharp results, it does not faithfully reconstruct the actual scene, but instead has a lot of noise in both density and velocity.
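
The reason for rendering the reconstructions against a blue background can be illustrated with a minimal sketch of front-to-back compositing along a single ray. This is not the renderer used for the videos; the function `composite`, the sample values, and the step size are hypothetical and chosen only for illustration. It assumes "ghost" density that carries the color of the original (black) background.

```python
import math

def composite(samples, step, background):
    """Front-to-back alpha compositing of (density, rgb) samples along one ray."""
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0
    for sigma, rgb in samples:
        # Opacity of one ray segment of length `step` with density `sigma`.
        alpha = 1.0 - math.exp(-sigma * step)
        for c in range(3):
            color[c] += transmittance * alpha * rgb[c]
        transmittance *= 1.0 - alpha
    # Background light that survives the whole ray.
    return [color[c] + transmittance * background[c] for c in range(3)]

BLACK = (0.0, 0.0, 0.0)
BLUE = (0.0, 0.0, 1.0)

# Hypothetical rays: one truly empty, one filled with black "ghost" density.
empty_ray = [(0.0, BLACK)] * 4
ghost_ray = [(2.0, BLACK)] * 4

# Against the original black background the two rays are indistinguishable.
print(composite(empty_ray, 0.25, BLACK), composite(ghost_ray, 0.25, BLACK))
# Against blue, the ghost density blocks the background and becomes visible.
print(composite(empty_ray, 0.25, BLUE), composite(ghost_ray, 0.25, BLUE))
```

In this toy setting the ghost density is invisible against black and only darkens the pixel once the background color changes, which is the effect the blue background is meant to expose.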

Volumetric Density, (front-side-top, rendered with uniform ambient lighting)
Warping Errors, (left: warp frame i to i+1; right: warp frame i and i+1 to i+0.5)

- NeuralVolumes, alpha2den: Due to the "ghost density", the alpha (defined in NV) fails to model the actual smoke density.
- NeuralVolumes, rgba2den: The result of alpha*(R+G+B) is closer to the reference, which indicates that the color and density are not properly disentangled.

- GlobalTrans-Warp-Error: GlobalTrans has minimal full-step warping error because it is trained at exactly this discrete time step.
- Ref-Warp-Error: The reference was simulated with a time step of 0.5, so warping it with a time step of 1.0 produces a high numerical error.
- Ours-Warp-Error and Ours-MidWarp-Error: Our continuous model has a slightly larger full-step warping error and minimal mid-step warping error. Note that we use the same training data as GlobalTrans; no mid-step frames are observed.
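
The two warping errors can be sketched in one dimension as follows. This is only an illustration under stated assumptions: it uses a simple semi-Lagrangian lookup with linear interpolation, it uses the mean absolute difference as the error, and it reads the mid-step error as the difference between frame i warped forward by half a step and frame i+1 warped backward by half a step. All function names and the toy data are hypothetical; the actual evaluation operates on 3D fields and may differ in these details.

```python
import math

def sample_linear(field, x):
    """Linearly interpolate a 1D field at position x (clamped at the borders)."""
    x = min(max(x, 0.0), len(field) - 1.0)
    i0 = int(math.floor(x))
    i1 = min(i0 + 1, len(field) - 1)
    w = x - i0
    return (1.0 - w) * field[i0] + w * field[i1]

def warp(density, velocity, dt):
    """Semi-Lagrangian warp: each cell reads the value at its backtraced position."""
    return [sample_linear(density, x - velocity[x] * dt) for x in range(len(density))]

def mean_abs_diff(a, b):
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)

def full_step_error(d_i, d_next, velocity):
    """Warp frame i to i+1 and compare with the observed frame i+1."""
    return mean_abs_diff(warp(d_i, velocity, 1.0), d_next)

def mid_step_error(d_i, d_next, velocity):
    """Warp frame i forward and frame i+1 backward to i+0.5 and compare them."""
    forward = warp(d_i, velocity, 0.5)
    backward = warp(d_next, velocity, -0.5)
    return mean_abs_diff(forward, backward)

# Toy data: a density blob that moves two cells to the right per frame.
d_i = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
d_next = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]
true_velocity = [2.0] * 10
zero_velocity = [0.0] * 10

print(full_step_error(d_i, d_next, true_velocity), mid_step_error(d_i, d_next, true_velocity))
print(full_step_error(d_i, d_next, zero_velocity))
```

With the correct velocity both errors vanish in this toy case, while an incorrect velocity leaves a residual. A low error therefore indicates that density and velocity are consistent with the transport equation.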

Velocity, (the middle slice of front-side-top, intensity reduced outside visual hull)

Vorticity, (the middle slice of front-side-top within the visual hull)
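
For readers unfamiliar with the quantity, vorticity is the curl of the velocity field. The following minimal sketch computes it on a single 2D slice with central differences; the function `vorticity_2d`, the grid layout `u[y][x]`, `v[y][x]`, and the test field are assumptions made for illustration and are not taken from the code that produced the videos.

```python
def vorticity_2d(u, v, h=1.0):
    """Central-difference curl (dv/dx - du/dy) of a 2D velocity slice.

    u[y][x] and v[y][x] are the x- and y-components on a grid with spacing h.
    Only interior cells are returned.
    """
    ny, nx = len(u), len(u[0])
    omega = []
    for y in range(1, ny - 1):
        row = []
        for x in range(1, nx - 1):
            dv_dx = (v[y][x + 1] - v[y][x - 1]) / (2.0 * h)
            du_dy = (u[y + 1][x] - u[y - 1][x]) / (2.0 * h)
            row.append(dv_dx - du_dy)
        omega.append(row)
    return omega

# Test field: rigid rotation around the grid center, u = -y, v = x.
n = 5
u = [[-(y - 2.0) for x in range(n)] for y in range(n)]
v = [[(x - 2.0) for x in range(n)] for y in range(n)]
omega = vorticity_2d(u, v)
print(omega)
```

A rigid rotation with unit angular velocity has a constant vorticity of 2, which the finite-difference estimate reproduces on the interior cells.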


-- 1.2 Real Smoke Captures --

Rendering Comparisons
Volumetric Density, (front-side-top)

Velocity (left) and Vorticity (right), (middle slices, front-side-top)

Warping Error, (left: warp frame i to i+1; right: warp frame i and i+1 to i+0.5)

The conclusions from the synthetic scene carry over to the real captures: NeuralVolumes contains "ghost density" and GlobalTrans is noisy. The noise is more visible when looking at the density alone in "Volumetric Density" (top right), as well as in the velocity and vorticity fields. "Warping Error" (bottom right) shows that our results fulfill the transport equation better than those of GlobalTrans on the real captures.


2. Synthetic Scenes with Complex Lighting

-- 2.1 The Plume Scene with Velocity Comparisons --

Rendering Comparisons

- NeuralVolumes: Because its reconstruction contains a lot of "ghost density", NeuralVolumes can easily render more details: the occluding ghost density removes the need to keep the views consistent.
- Deformation: Representing turbulent fluid dynamics as deformation fields is ill-posed and results in stretches due to rigid motion.
- Ours w.o. d2v: Compared to NeuralVolumes, Ours w.o. d2v significantly reduces the "ghost density", but has color-bleeding artifacts due to less accurate velocity fields.
- Ours: Our results best match the reference in both training and novel views, with properly disentangled density and color.

Volumetric Comparisons

The color-bleeding artifact is more visible in the density visualization. Our velocity is closer to the reference with enhanced vorticity.


-- 2.2 Plume Scene with A Regular Obstacle --

Comparisons with Related Work

Sphere Scene (panels: Ref, Ours, Ours w.o. d2v, NeRF+T, NeuralVolumes)

"Ghost density" is visible everywhere for NeRF+T. NeuralVolumes has some "ghost density" in white and some in the color of the sphere. Ours reconstructs the smoke nicely, while Ours w.o. d2v is slightly blurry.


Our Results

Unsupervised Separation of Static and Dynamic parts

Static and dynamic components are nicely separated.

Estimated Density and Velocity

With the model-based supervision, our full model presents more accurate velocity with clearer vorticity.


3. Complex Scenes with Arbitrary Obstacles

In the following, we test our algorithm on complex scenes with arbitrary obstacles. Our method successfully separates dynamic fluids from obstacles without any human labeling, which was previously impossible.

-- 3.1 The Car Scene (Fig.12 in the paper) --

-- 3.2 The Game Scene (Fig.13 in the paper) --

-- 3.3 Estimated Density and Velocity w.r.t. the Ground Truth (Fig.12, 13 in the paper) --

(panels: Car, Game)

4. Training Details

Across all scenes, we use the Adam optimizer with a learning rate of 0.0001. Other details of each scene are given in the following table (using a single NVIDIA Quadro RTX 8000 GPU):

Scenes | Image Resolution | Total Training Iterations | Total Training Time | Hyper-parameters for Radiance Supervision | Hyper-parameters for Velocity Supervision
ScalarFlow Synthetic | 360x640 | 200k | 30h | $\mathcal{L}_{\widetilde{\mathit{img}}} + 0.025\mathcal{L}_{VGG} + 0.1\mathcal{L}_{ghost}$ | $2\mathcal{L}_{\frac{D\sigma}{Dt}} + 0.0005\mathcal{L}_{NSE} + 6\mathcal{L}_{d2v}$
ScalarFlow Real | 540x960 | 500k | 74h | $\mathcal{L}_{\widetilde{\mathit{img}}} + 0.025\mathcal{L}_{VGG} + 0.1\mathcal{L}_{ghost}$ | $2\mathcal{L}_{\frac{D\sigma}{Dt}} + 0.0005\mathcal{L}_{NSE} + 6\mathcal{L}_{d2v}$
Plume | 400x400 | 200k | 31h | $\mathcal{L}_{\widetilde{\mathit{img}}} + 0.025\mathcal{L}_{VGG} + 0.05\mathcal{L}_{ghost}$ | $2\mathcal{L}_{\frac{D\sigma}{Dt}} + 0.0005\mathcal{L}_{NSE} + 6\mathcal{L}_{d2v}$
Sphere | 400x400 | 150k | 37h | $\mathcal{L}_{\widetilde{\mathit{img}}} + 0.025\mathcal{L}_{VGG} + 0.05\mathcal{L}_{ghost} + 0.05\mathcal{L}_{overlay}$ | $2\mathcal{L}_{\frac{D\sigma}{Dt}} + 0.0005\mathcal{L}_{NSE} + 6\mathcal{L}_{d2v}$
Car | 960x500 | 200k | 51h | $\mathcal{L}_{\widetilde{\mathit{img}}} + 0.025\mathcal{L}_{VGG} + 0.01\mathcal{L}_{ghost} + 0.05\mathcal{L}_{overlay}$ | $2\mathcal{L}_{\frac{D\sigma}{Dt}} + 0.0005\mathcal{L}_{NSE} + 6\mathcal{L}_{d2v}$
Game | 800x800 | 250k | 64h | $\mathcal{L}_{\widetilde{\mathit{img}}} + 0.025\mathcal{L}_{VGG} + 0.01\mathcal{L}_{ghost} + 0.05\mathcal{L}_{overlay}$ | $2\mathcal{L}_{\frac{D\sigma}{Dt}} + 0.0005\mathcal{L}_{NSE} + 6\mathcal{L}_{d2v}$

For the ScalarFlow scenes, the ghost density regularization is more heavily weighted because the smoke is more transparent and only front views are given. In other scenes, the density-color ambiguity is less challenging because they either have denser smoke or they use more cameras and wider viewing angles. Therefore, the regularization term can be reduced.
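
The per-scene weights listed above can be collected in a small configuration, sketched below. The dictionary layout, the term names (`img`, `vgg`, `ghost`, `overlay`, `transport`, `nse`, `d2v`), and the helper `weighted_losses` are hypothetical; only the numeric weights and the learning rate come from this section. The sketch returns the radiance and velocity sums separately, since how the two groups are scheduled or combined during training is not stated here.

```python
LEARNING_RATE = 1e-4  # Adam learning rate used across all scenes

# Weights for the radiance supervision terms, one entry per scene.
RADIANCE_WEIGHTS = {
    "ScalarFlow Synthetic": {"img": 1.0, "vgg": 0.025, "ghost": 0.1},
    "ScalarFlow Real": {"img": 1.0, "vgg": 0.025, "ghost": 0.1},
    "Plume": {"img": 1.0, "vgg": 0.025, "ghost": 0.05},
    "Sphere": {"img": 1.0, "vgg": 0.025, "ghost": 0.05, "overlay": 0.05},
    "Car": {"img": 1.0, "vgg": 0.025, "ghost": 0.01, "overlay": 0.05},
    "Game": {"img": 1.0, "vgg": 0.025, "ghost": 0.01, "overlay": 0.05},
}

# Weights for the velocity supervision terms, identical for all scenes.
VELOCITY_WEIGHTS = {"transport": 2.0, "nse": 0.0005, "d2v": 6.0}

def weighted_losses(scene, terms):
    """Return (radiance_sum, velocity_sum) for precomputed scalar loss terms."""
    radiance = sum(w * terms[name] for name, w in RADIANCE_WEIGHTS[scene].items())
    velocity = sum(w * terms[name] for name, w in VELOCITY_WEIGHTS.items())
    return radiance, velocity

# Placeholder loss values, purely for demonstration.
example_terms = {"img": 1.0, "vgg": 1.0, "ghost": 1.0, "overlay": 1.0,
                 "transport": 1.0, "nse": 1.0, "d2v": 1.0}
print(weighted_losses("ScalarFlow Synthetic", example_terms))
print(weighted_losses("Car", example_terms))
```

Comparing the two printed radiance sums makes the difference in ghost density weighting between the ScalarFlow scenes and the other scenes explicit.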