\[\newcommand{\norm}[1]{\left\lVert#1\right\rVert}\]

Part A: Fun with Diffusion.

Part 1: Sampling Loops

1.1. Implementing the Forward Process

A key part of diffusion is the forward process (i.e. introducing noise to an image). For this part, we take the image of the Berkeley Campanile and incrementally add noise to it using:

\[x_t = \sqrt{\bar\alpha_t} x_0 + \sqrt{1 - \bar\alpha_t} \epsilon \quad \text{where}~ \epsilon \sim N(0, I)\]
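As a rough illustration of the equation above, here is a minimal NumPy sketch. The linear beta schedule, the helper names `make_alphas_cumprod` and `forward`, and the random stand-in image are assumptions made for this example; the pretrained model ships its own schedule, which may differ.

```python
import numpy as np

def make_alphas_cumprod(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Hypothetical linear beta schedule (an assumption, not the model's real one)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)  # alpha-bar_t for t = 0 .. num_steps - 1

def forward(x0, t, alphas_cumprod, rng):
    """Sample x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps with eps ~ N(0, I)."""
    abar_t = alphas_cumprod[t]
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(abar_t) * x0 + np.sqrt(1.0 - abar_t) * eps
    return x_t, eps

rng = np.random.default_rng(0)
alphas_cumprod = make_alphas_cumprod()
# Random stand-in for the Campanile image, scaled to [-1, 1].
x0 = rng.uniform(-1.0, 1.0, size=(3, 64, 64))
noisy = {t: forward(x0, t, alphas_cumprod, rng)[0] for t in (250, 500, 750)}
```

Because \(\bar\alpha_t\) shrinks as t grows, the signal term fades and the noise term dominates at later timesteps.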

Here are the results at noise levels [0, 250, 500, 750]:

Original	250	500	750

1.2 Classical Denoising.

For comparison with later methods, here are the results of classical denoising with a Gaussian low-pass filter:
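A low-pass baseline of this kind can be sketched with a separable Gaussian blur. The NumPy-only implementation below is an illustrative assumption (the function names, `sigma`, and the reflect padding are not taken from the project); in practice a library blur would likely be used instead.

```python
import numpy as np

def gaussian_kernel1d(sigma, radius):
    """Normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    xs = np.arange(-radius, radius + 1)
    k = np.exp(-(xs ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def gaussian_blur(img, sigma=2.0):
    """Separable Gaussian blur of a 2-D (H, W) array with reflect padding."""
    radius = int(3 * sigma)
    k = gaussian_kernel1d(sigma, radius)
    padded = np.pad(img, radius, mode="reflect")
    # Filter along rows, then along columns; 'valid' trims the padding off again.
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

rng = np.random.default_rng(0)
noisy_channel = rng.standard_normal((32, 32))  # stand-in for one noisy image channel
blurred = gaussian_blur(noisy_channel, sigma=2.0)
```

The blur suppresses high-frequency noise but also removes fine image detail, which is why it serves only as a baseline.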

250	500	750

1.3 One-Step Denoising.

For this part, we’ll use a pretrained U-Net to denoise our image. Notice the results are much better despite a bit of unwanted hallucination at higher noise levels:
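One-step denoising can be sketched by inverting the forward equation, assuming the U-Net outputs a noise estimate \(\hat\epsilon\) (an assumption about the model's parameterization). The function name `estimate_x0` and the value of \(\bar\alpha_t\) are hypothetical, and the true noise stands in for the network output here.

```python
import numpy as np

def estimate_x0(x_t, eps_hat, abar_t):
    """Solve x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps for x0, using eps_hat."""
    return (x_t - np.sqrt(1.0 - abar_t) * eps_hat) / np.sqrt(abar_t)

rng = np.random.default_rng(0)
abar_t = 0.3  # hypothetical alpha-bar value at some timestep
x0 = rng.uniform(-1.0, 1.0, size=(3, 8, 8))
eps = rng.standard_normal(x0.shape)
x_t = np.sqrt(abar_t) * x0 + np.sqrt(1.0 - abar_t) * eps

# With a perfect noise estimate the clean image is recovered exactly;
# a real network's estimate is imperfect, more so at high noise levels.
x0_hat = estimate_x0(x_t, eps, abar_t)
```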

250	500	750

1.4 Iterative Denoising

To achieve better results, we apply UNet denoising multiple times. Here are the results at each timestep:
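One common way to implement this loop is a strided DDPM-style update, sketched below. The update rule, the stride of 30, the linear schedule, and the `oracle` noise predictor (which stands in for the UNet) are all assumptions for illustration and not necessarily what was used here.

```python
import numpy as np

def denoise_step(x_t, eps_hat, abar_t, abar_prev, rng=None):
    """One strided update from timestep t to an earlier, less noisy timestep."""
    alpha = abar_t / abar_prev  # effective alpha over the stride
    beta = 1.0 - alpha
    # Current clean-image estimate from inverting the forward equation.
    x0_hat = (x_t - np.sqrt(1.0 - abar_t) * eps_hat) / np.sqrt(abar_t)
    mean = (np.sqrt(abar_prev) * beta / (1.0 - abar_t)) * x0_hat \
        + (np.sqrt(alpha) * (1.0 - abar_prev) / (1.0 - abar_t)) * x_t
    if rng is None:
        return mean
    var = beta * (1.0 - abar_prev) / (1.0 - abar_t)
    return mean + np.sqrt(var) * rng.standard_normal(x_t.shape)

def iterative_denoise(x, timesteps, alphas_cumprod, predict_noise, rng):
    """timesteps is a decreasing list such as [990, 960, ..., 0]."""
    for t, t_prev in zip(timesteps[:-1], timesteps[1:]):
        eps_hat = predict_noise(x, t)
        step_rng = rng if t_prev > 0 else None  # no fresh noise on the last step
        x = denoise_step(x, eps_hat, alphas_cumprod[t], alphas_cumprod[t_prev], step_rng)
    return x

rng = np.random.default_rng(0)
alphas_cumprod = np.cumprod(1.0 - np.linspace(1e-4, 0.02, 1000))  # assumed schedule
x0 = rng.uniform(-1.0, 1.0, size=(3, 8, 8))
timesteps = list(range(990, -1, -30))

def oracle(x, t):
    """Stand-in for the UNet: returns the exact noise given the known clean image."""
    abar = alphas_cumprod[t]
    return (x - np.sqrt(abar) * x0) / np.sqrt(1.0 - abar)

abar_start = alphas_cumprod[timesteps[0]]
x_start = np.sqrt(abar_start) * x0 + np.sqrt(1.0 - abar_start) * rng.standard_normal(x0.shape)
x_final = iterative_denoise(x_start, timesteps, alphas_cumprod, oracle, rng)
```

Starting the same loop from pure Gaussian noise instead of a noised image gives a sampler.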

For comparison, we include previous results here:

Gauss	OneStep	Iterative

1.5 Diffusion Model Sample.

To sample from the diffusion model, we iteratively denoise pure noise. Here are 5 samples:

1	2	3	4	5

1.6 CFG.

To achieve better samples, we use classifier-free guidance (CFG), which mixes the conditional and unconditional noise estimates. Here are 5 samples with a CFG scale of 7:
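The mixing step is usually written as \(\epsilon = \epsilon_u + \gamma(\epsilon_c - \epsilon_u)\), where \(\gamma\) is the CFG scale. A minimal sketch follows; the function name `cfg_noise` and the random stand-in estimates are assumptions for illustration.

```python
import numpy as np

def cfg_noise(eps_uncond, eps_cond, scale=7.0):
    """Extrapolate from the unconditional estimate toward the conditional one."""
    return eps_uncond + scale * (eps_cond - eps_uncond)

rng = np.random.default_rng(0)
eps_u = rng.standard_normal((3, 8, 8))  # stand-in for the unconditional UNet output
eps_c = rng.standard_normal((3, 8, 8))  # stand-in for the prompt-conditioned output
eps = cfg_noise(eps_u, eps_c, scale=7.0)
```

A scale of 1 reduces to the conditional estimate and a scale of 0 to the unconditional one; scales above 1 push past the conditional estimate.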

1	2	3	4	5

1.7. Image2Image Translation.

i_start=1	i_start=3	i_start=5	i_start=7	i_start=10	i_start=20

1.7.1 Editing Hand-Drawn and Web Images.

i_start=1	i_start=3	i_start=5	i_start=7	i_start=10	i_start=20

1.7.2 Inpainting.

Original	Mask	Inpainting

1.7.3 Text-Conditioning.

i_start=1	i_start=3	i_start=5	i_start=7	i_start=10	i_start=20

1.8. Visual Anagrams

Old man	Campfire

Old man	Barista

Old man	Dog

1.9 Hybrid Images.

Hybrid of Skull and Waterfall

Hybrid of Skull and Dog

Hybrid of Skull and Campfire

Part B: Fun with Diffusion.

Part 1: Training Single-Step Denoising UNet.

1.2. Forward

First, we implement a forward (noise-adding) process. Here’s the result of adding increasing noise to the original image:
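A minimal sketch of such a noising step, assuming the simple additive form \(z = x + \sigma\epsilon\) with \(\epsilon \sim N(0, I)\). The exact form used in this part is an assumption, as are the function name `add_noise` and the random stand-in image; the sigma values are the ones swept later in the out-of-distribution test.

```python
import numpy as np

def add_noise(x, sigma, rng):
    """Return z = x + sigma * eps with eps ~ N(0, I) (assumed noising form)."""
    return x + sigma * rng.standard_normal(x.shape)

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(32, 32))  # stand-in for a clean training image
sigmas = [0.0, 0.2, 0.4, 0.5, 0.6, 0.8, 1.0]
noisy = [add_noise(x, s, rng) for s in sigmas]
```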

1.2.1 Training.

Now, we train our UNet to denoise an image with noise level sigma=0.5 applied to it.

Here’s our loss curve:

and here are our denoising results at epoch 5:

and at epoch 1:

1.2.2 Out-of-Distribution Testing.

We’ll now try to apply our UNet to noise levels it wasn’t trained for. Here are the results for noise levels [0.0, 0.2, 0.4, 0.5, 0.6, 0.8, 1.0]:

Notice that the denoising quality gets progressively worse as the noise level increases.

Part 2: Training a Diffusion Model.

2.1 Time-Conditioning.

We now train our UNet with time-conditioning. Here’s the loss curve:

Here are our results sampling from the time-conditioned UNet at certain epochs:

Epoch 5	Epoch 20

2.4 Class-Conditioning

Now we train our UNet with class-conditioning (on top of time-conditioning). Here’s the loss curve:

Here are our results sampling from the class-conditioned UNet at certain epochs:

Epoch 5	Epoch 20

Bells & Whistles

Here are our GIFs for the time-conditioned UNet

Epoch 1	Epoch 5	Epoch 20

and for the class-conditioned UNet

Epoch 1	Epoch 5	Epoch 20