DDPM Background and Derivation of Its Principles (淡蓝小点) • Jianghc's Blog

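A streamer I have been following recently posted an explanation of the principles behind DDPM, so I worked through it and am keeping some brief notes here.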

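DDPM involves three main parts: the forward process (also called the diffusion process), the reverse process (also called the denoising process), and the sampling procedure. The forward process is fixed: it has no learnable parameters and is generally needed only during training. The reverse process is the generative process; it carries the parameters $\theta$ that training fits and determines how much noise is removed at each step. The sampling procedure then runs the learned reverse process step by step to generate data.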
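To make the "fixed forward process" concrete, here is a minimal sketch. It assumes the standard DDPM Gaussian transition $q(\mathbf{x}_t \mid \mathbf{x}_{t-1}) = \mathcal{N}(\sqrt{1-\beta_t}\,\mathbf{x}_{t-1}, \beta_t \mathbf{I})$, which this note does not spell out. The linear schedule values, the function names, and the toy four-pixel input are illustrative assumptions, not something taken from the video.

```python
import math
import random

def linear_beta_schedule(T, beta_start=1e-4, beta_end=0.02):
    """Hypothetical linear variance schedule beta_1..beta_T (values assumed)."""
    if T == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (T - 1)
    return [beta_start + i * step for i in range(T)]

def forward_step(x_prev, beta_t, rng):
    """Sample x_t ~ N(sqrt(1 - beta_t) * x_{t-1}, beta_t * I), element by element."""
    scale = math.sqrt(1.0 - beta_t)
    sigma = math.sqrt(beta_t)
    return [scale * v + sigma * rng.gauss(0.0, 1.0) for v in x_prev]

def forward_process(x0, betas, rng):
    """Run the fixed forward chain and return [x_0, x_1, ..., x_T]."""
    trajectory = [list(x0)]
    for beta_t in betas:
        trajectory.append(forward_step(trajectory[-1], beta_t, rng))
    return trajectory

rng = random.Random(0)
betas = linear_beta_schedule(T=1000)
x0 = [1.0, -0.5, 0.25, 2.0]  # toy "image" with four pixels (made up)
traj = forward_process(x0, betas, rng)

# Fraction of the original signal scale left after T steps: prod(1 - beta_t).
signal_kept = 1.0
for b in betas:
    signal_kept *= 1.0 - b
```

Nothing in this chain is learned, which is why the forward process can be treated as a fixed part of the training setup. With this assumed schedule, `signal_kept` is close to zero, so $\mathbf{x}_T$ is nearly pure Gaussian noise.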

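$\mathbf{x}_0$ is the training data. We want the probability of recovering $\mathbf{x}_0$ through reverse diffusion to be as large as possible, i.e. we want to maximize $\ln p_{\theta}(\mathbf{x}_0)$. Here $q$ is the forward probability distribution and $p_\theta$ is the reverse probability distribution.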

$\begin{aligned} \ln p_\theta\left(\mathbf{x}_0\right) & =\int q\left(\mathbf{x}_{1: T} \mid \mathbf{x}_0\right) \ln p_\theta\left(\mathbf{x}_0\right) \mathrm{d} \mathbf{x}_{1: T} \\ & =\int q\left(\mathbf{x}_{1: T} \mid \mathbf{x}_0\right) \ln \frac{p_\theta\left(\mathbf{x}_0\right) p_\theta\left(\mathbf{x}_{1: T} \mid \mathbf{x}_0\right)}{p_\theta\left(\mathbf{x}_{1: T} \mid \mathbf{x}_0\right)} \mathrm{d} \mathbf{x}_{1: T} \\ & =\int q\left(\mathbf{x}_{1: T} \mid \mathbf{x}_0\right) \ln \frac{p_\theta\left(\mathbf{x}_{0: T}\right)}{p_\theta\left(\mathbf{x}_{1: T} \mid \mathbf{x}_0\right)} \mathrm{d} \mathbf{x}_{1: T} \\ & =\int q\left(\mathbf{x}_{1: T} \mid \mathbf{x}_0\right) \ln \frac{p_\theta\left(\mathbf{x}_{0: T}\right)}{p_\theta\left(\mathbf{x}_{1: T} \mid \mathbf{x}_0\right)} \frac{q\left(\mathbf{x}_{1: T} \mid \mathbf{x}_0\right)}{q\left(\mathbf{x}_{1: T} \mid \mathbf{x}_0\right)} \mathrm{d} \mathbf{x}_{1: T} \\ & =\int q\left(\mathbf{x}_{1: T} \mid \mathbf{x}_0\right) \ln \frac{p_\theta\left(\mathbf{x}_{0: T}\right)}{q\left(\mathbf{x}_{1: T} \mid \mathbf{x}_0\right)} \frac{q\left(\mathbf{x}_{1: T} \mid \mathbf{x}_0\right)}{p_\theta\left(\mathbf{x}_{1: T} \mid \mathbf{x}_0\right)} \mathrm{d} \mathbf{x}_{1: T} \\ & =\int q\left(\mathbf{x}_{1: T} \mid \mathbf{x}_0\right) \ln \frac{p_\theta\left(\mathbf{x}_{0: T}\right)}{q\left(\mathbf{x}_{1: T} \mid \mathbf{x}_0\right)} \mathrm{d} \mathbf{x}_{1: T}+\int q\left(\mathbf{x}_{1: T} \mid \mathbf{x}_0\right) \ln \frac{q\left(\mathbf{x}_{1: T} \mid \mathbf{x}_0\right)}{p_\theta\left(\mathbf{x}_{1: T} \mid \mathbf{x}_0\right)} \mathrm{d} \mathbf{x}_{1: T} \\ & =\mathrm{E}_{q\left(\mathbf{x}_{1: T} \mid \mathbf{x}_0\right)}\left[\ln \frac{p_\theta\left(\mathbf{x}_{0: T}\right)}{q\left(\mathbf{x}_{1: T} \mid \mathbf{x}_0\right)}\right]+\mathrm{KL}\left(q\left(\mathbf{x}_{1: T} \mid \mathbf{x}_0\right) \| p_\theta\left(\mathbf{x}_{1: T} \mid \mathbf{x}_0\right)\right)\end{aligned}$

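Note that the KL divergence is a scalar, and each integral is a multiple integral taken over all of $\mathbf{x}_1, \dots, \mathbf{x}_T$ jointly. The first line holds because $\ln p_\theta(\mathbf{x}_0)$ does not depend on $\mathbf{x}_{1:T}$ and $q\left(\mathbf{x}_{1: T} \mid \mathbf{x}_0\right)$ integrates to 1.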

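Since the KL divergence is non-negative, the expectation term on the right is a lower bound on $\ln p_{\theta}(\mathbf{x}_0)$, so instead of maximizing $\ln p_{\theta}(\mathbf{x}_0)$ directly we maximize this bound. The right side is the variational lower bound of the left side: $\mathrm{E}_{q\left(\mathbf{x}_{1: T} \mid \mathbf{x}_0\right)}\left[\ln p_\theta\left(\mathbf{x}_0\right)\right] \geq \mathrm{E}_{q\left(\mathbf{x}_{1: T} \mid \mathbf{x}_0\right)}\left[\ln \frac{p_\theta\left(\mathbf{x}_{0: T}\right)}{q\left(\mathbf{x}_{1: T} \mid \mathbf{x}_0\right)}\right]$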
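The decomposition of the log-likelihood into a lower bound plus a KL term can be checked numerically on a tiny example. The sketch below is a made-up toy, not a diffusion model: a single discrete latent `z` with three values stands in for $\mathbf{x}_{1:T}$, so the integrals become sums, and all probabilities are invented for illustration.

```python
import math

# Hypothetical toy model: one observed x0 and a discrete latent z in {0, 1, 2}
# standing in for x_{1:T}. All numbers below are made up for illustration.
p_joint = [0.10, 0.25, 0.05]  # p_theta(x0, z) for z = 0, 1, 2
q_post = [0.5, 0.3, 0.2]      # q(z | x0): any distribution with full support

p_x0 = sum(p_joint)                     # marginal p_theta(x0)
p_post = [pj / p_x0 for pj in p_joint]  # posterior p_theta(z | x0)

# Lower bound: E_q[ ln p_theta(x0, z) / q(z | x0) ]
elbo = sum(q * math.log(pj / q) for q, pj in zip(q_post, p_joint))

# KL( q(z | x0) || p_theta(z | x0) )
kl = sum(q * math.log(q / pp) for q, pp in zip(q_post, p_post))

log_px0 = math.log(p_x0)
print("ln p(x0)  =", log_px0)
print("ELBO + KL =", elbo + kl)
```

The two printed values should agree up to floating-point error, and `elbo` never exceeds `log_px0` because `kl` is non-negative. Changing `q_post` moves weight between the two terms but leaves their sum unchanged.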

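What the neural network actually learns is the distribution of the reverse (denoising) step at time step $t$; we then sample from this distribution to recover the original image.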
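As a rough illustration of "learn a distribution at step $t$, then sample from it", the sketch below assumes the reverse step is Gaussian, $p_\theta(\mathbf{x}_{t-1} \mid \mathbf{x}_t) = \mathcal{N}(\mu_\theta(\mathbf{x}_t, t), \sigma_t^2 \mathbf{I})$, as in the standard DDPM setup. The function `predict_mean` is a hypothetical placeholder for a trained network, and the constant `sigma` is an assumption made to keep the example short.

```python
import random

def predict_mean(x_t, t):
    """Hypothetical stand-in for a trained network mu_theta(x_t, t).
    It only shrinks x_t slightly; a real model would be learned from data."""
    return [0.99 * v for v in x_t]

def reverse_step(x_t, t, sigma_t, rng):
    """Sample x_{t-1} ~ N(mu_theta(x_t, t), sigma_t^2 * I)."""
    mean = predict_mean(x_t, t)
    if t == 1:
        return mean  # common convention: no noise is added on the final step
    return [m + sigma_t * rng.gauss(0.0, 1.0) for m in mean]

def sample(dim, T, sigma, rng):
    """Start from Gaussian noise x_T and apply the reverse step T times."""
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for t in range(T, 0, -1):
        x = reverse_step(x, t, sigma, rng)
    return x

rng = random.Random(0)
x0_sample = sample(dim=4, T=50, sigma=0.05, rng=rng)
```

With a real trained network in place of `predict_mean`, the same loop would turn noise into a sample that resembles the training data. Here it only shows the control flow.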