differentiable renderer knowledge
Knowledge framework
pipeline:
- Use DMTet to output an SDF value and an offset for every vertex of the tetrahedral grid
- Run Marching Tetrahedra to extract a mesh
- Use a differentiable rasterizer to compute texture and lighting, then render the result to a 2D image
- Compute the image loss and optimize
Optimization task:
L = Limage + Lmask + λLreg
- Limage: image reconstruction loss
- the loss between the rendered image and the target image
- Tone mapping is needed: the neural renderer outputs a linear image, so it has to be gamma corrected first.
- Purpose: both images need to be in the same color space before they are compared
tone map operator:
- Linear radiance values: $x$
- sRGB transfer function: $\Gamma(x)$
- Color value after tone mapping: $x' = \Gamma(\log(x + 1))$
$$
\Gamma(x) =
\begin{cases}
12.92x & x \leq 0.0031308 \\
(1 + a)x^{1/2.4} - a & x > 0.0031308
\end{cases}
\quad
a=0.055
$$
In short, tone mapping first applies $\log(x+1)$ and then the sRGB curve $\Gamma(x)$.
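The tone map operator above can be sketched for a single scalar value as follows. This is a minimal illustration in plain Python, not the renderer's own code. The function names `srgb_transfer` and `tone_map` are made up for this note, and the input is assumed to be a non-negative linear radiance value.

```python
import math

A = 0.055  # the constant a from the sRGB definition above


def srgb_transfer(x: float) -> float:
    """Piecewise sRGB transfer function Gamma(x)."""
    if x <= 0.0031308:
        return 12.92 * x
    return (1.0 + A) * x ** (1.0 / 2.4) - A


def tone_map(x: float) -> float:
    """x' = Gamma(log(x + 1)); assumes x >= 0 (linear radiance)."""
    return srgb_transfer(math.log1p(x))
```

`math.log1p(x)` computes $\log(x + 1)$ and stays accurate for very small `x`.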
L1 norm:
$$
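\| \mathbf{x}' \|_1 = \sum_i |x'_i|
$$
The L1 norm sums the absolute values of the tone-mapped RGB values; for the image loss it is applied to the difference between the tone-mapped rendered and target pixels.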
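A small sketch of the L1 image loss on values that have already been tone mapped. The helper names `l1_norm` and `l1_image_loss` are hypothetical, the inputs are flat lists of channel values, and the plain sum (no averaging over pixels) is an assumption made here for simplicity.

```python
def l1_norm(values):
    """Sum of absolute values, i.e. ||x'||_1."""
    return sum(abs(v) for v in values)


def l1_image_loss(rendered, target):
    """L1 norm of the difference between two tone-mapped value lists."""
    if len(rendered) != len(target):
        raise ValueError("rendered and target must have the same length")
    return l1_norm([r - t for r, t in zip(rendered, target)])
```

Real implementations often average over pixels instead of summing, which only rescales the loss.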
Lmask:
- supervises the foreground/background separation
- mask loss -> L2 loss on the foreground mask
- helps the model learn the object's boundary
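As a rough sketch of the mask term, the loss below compares a rendered coverage value per pixel with a target foreground mask using a squared (L2) error. The name `mask_loss`, the flat-list inputs, and the averaging over pixels are all assumptions for illustration.

```python
def mask_loss(pred_alpha, target_mask):
    """Mean squared error between rendered coverage and the target mask.

    pred_alpha:  per-pixel coverage in [0, 1] from the renderer (assumed).
    target_mask: per-pixel ground truth, 1.0 = foreground, 0.0 = background.
    """
    if len(pred_alpha) != len(target_mask) or not pred_alpha:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - m) ** 2 for p, m in zip(pred_alpha, target_mask)]
    return sum(squared) / len(squared)
```

Pixels near the silhouette contribute most to this loss, which is why it helps pin down the object's boundary.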
Lreg:
- Target: reduce floaters and internal geometry
$$
L_{\text{reg}} = \sum_{i,j \in S_e}
H(\sigma(s_i), \text{sign}(s_j)) + H(\sigma(s_j), \text{sign}(s_i))
$$
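Structurally, it is a binary cross entropy applied in both directions.
Purpose: make the SDF signs of two neighboring vertices agree.
- $\sigma(x)$: sigmoid function, with output between 0 and 1
- $\text{sign}(x)$: sign function, mapping positive values to 1 and negative values to 0
- $H(p,y)$: binary cross entropy (BCE) loss, taking a probability $p$ and a target value $y$
One way to read it:
Treat $\sigma(x)$ as the probability that $x$ is positive, i.e. the probability that the point lies outside the object
- If $s_i$ is positive (outside the object), $s_j$ is encouraged to be positive too
- If $s_i$ is negative (inside the object), $s_j$ is encouraged to be negative too
- So the BCE term pushes sigmoid(s) ≈ sign(the neighbor's s)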
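The regularizer can be sketched in plain Python as below. The names `sigmoid`, `sign01`, `bce`, and `sdf_reg_loss` are illustrative, `edges` is a hypothetical list of vertex index pairs standing in for $S_e$, and mapping an SDF value of exactly 0 to the target 0 is an assumption (the notes only define positive and negative values). The `eps` clamp is added for numerical safety and is not part of the formula.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sign01(x):
    """Positive -> 1.0, otherwise 0.0 (zero handled by assumption)."""
    return 1.0 if x > 0 else 0.0


def bce(p, y, eps=1e-7):
    """Binary cross entropy H(p, y) with p clamped away from 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def sdf_reg_loss(sdf, edges):
    """Sum of the two-directional BCE over the given edges (i, j)."""
    total = 0.0
    for i, j in edges:
        si, sj = sdf[i], sdf[j]
        total += bce(sigmoid(si), sign01(sj)) + bce(sigmoid(sj), sign01(si))
    return total
```

An edge whose endpoints share a sign gives a small loss, while an edge with opposite signs gives a large one, which is what discourages floaters and internal geometry.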
DMTet
Comparison:
- DMTet: hybrid 3D representation (represent a shape with a discrete SDF defined on vertices of a deformable tetrahedral grid)
- NeRF: volumetric representation
- NeuS: implicit surface representation
Deep Marching Tetrahedra (hybrid 3D representation that combines both implicit and explicit 3D surface representations)
Form: SDF defined on vertices of a deformable tetrahedral grid.
↓
use a differentiable Marching Tetrahedra (MT) layer to convert the SDF to a triangular mesh
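The core of the MT step is placing a mesh vertex on every grid edge whose endpoints have opposite SDF signs. A minimal sketch follows, assuming the usual linear interpolation of the SDF along the edge (these notes do not spell out the formula). The function name `edge_zero_crossing` is hypothetical.

```python
def edge_zero_crossing(va, vb, sa, sb):
    """Point on edge (va, vb) where the linearly interpolated SDF is zero.

    va, vb: endpoint positions as tuples of coordinates.
    sa, sb: SDF values at the endpoints; their signs must differ.
    """
    if (sa > 0) == (sb > 0):
        raise ValueError("edge has no sign change, so no surface vertex")
    denom = sb - sa
    # v = (va * sb - vb * sa) / (sb - sa), evaluated per coordinate
    return tuple((a * sb - b * sa) / denom for a, b in zip(va, vb))
```

Because the result is a smooth function of the vertex positions and SDF values, gradients from the image loss can flow back through the extracted mesh into both.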
↓
Process:
starts from a uniform tetrahedral grid of predefined resolution
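- A uniform tetrahedral grid is a tetrahedral mesh distributed evenly through 3D space; it can be seen as the triangulated version of a voxel grid
- DMTet has the network output not only the SDF value but also an offset vector for each vertex, describing how far to move from the regular grid to get closer to the true surface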
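The deformation itself is just a per-vertex addition of the predicted offset to the regular grid position. A minimal sketch, with the hypothetical name `deform_vertices` and plain tuples for positions; any bounding of the offsets is left out because it is not described in these notes.

```python
def deform_vertices(vertices, offsets):
    """Move each regular-grid vertex by its predicted offset vector."""
    if len(vertices) != len(offsets):
        raise ValueError("need exactly one offset per vertex")
    return [
        tuple(c + d for c, d in zip(v, o))
        for v, o in zip(vertices, offsets)
    ]
```

For example, `deform_vertices([(0.0, 0.0, 0.0)], [(0.25, 0.0, 0.0)])` moves the single vertex a quarter unit along x.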
↓
use a network to predict the SDF value and the deviation vector - encoded SDF: the SDF goes through, e.g., an MLP
↓