From c51ce71a79e83124778c7cabf123b67adcf79456 Mon Sep 17 00:00:00 2001
From: burtenshaw
Date: Fri, 12 Sep 2025 09:30:03 +0200
Subject: [PATCH 1/2] make latex fences conform to docs builder

---
 unit1/02_diffusion_models_from_scratch.ipynb | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/unit1/02_diffusion_models_from_scratch.ipynb b/unit1/02_diffusion_models_from_scratch.ipynb
index 70b3c94..5748741 100644
--- a/unit1/02_diffusion_models_from_scratch.ipynb
+++ b/unit1/02_diffusion_models_from_scratch.ipynb
@@ -1478,19 +1478,23 @@
 "source": [
 "### The Corruption Process\n",
 "\n",
- "The DDPM paper describes a corruption process that adds a small amount of noise for every 'timestep'. Given $x_{t-1}$ for some timestep, we can get the next (slightly more noisy) version $x_t$ with:

\n",
+ "The DDPM paper describes a corruption process that adds a small amount of noise for every 'timestep'. Given \\\\( x_{t-1} \\\\) for some timestep, we can get the next (slightly more noisy) version \\\\( x_t \\\\) with:

\n",
 "\n",
- "$q(\\mathbf{x}_t \\vert \\mathbf{x}_{t-1}) = \\mathcal{N}(\\mathbf{x}_t; \\sqrt{1 - \\beta_t} \\mathbf{x}_{t-1}, \\beta_t\\mathbf{I}) \\quad\n",
- "q(\\mathbf{x}_{1:T} \\vert \\mathbf{x}_0) = \\prod^T_{t=1} q(\\mathbf{x}_t \\vert \\mathbf{x}_{t-1})$

\n",
+ "$$q(\\mathbf{x}_t \\vert \\mathbf{x}_{t-1}) = \\mathcal{N}(\\mathbf{x}_t; \\sqrt{1 - \\beta_t} \\mathbf{x}_{t-1}, \\beta_t\\mathbf{I}) \\quad\n",
+ "q(\\mathbf{x}_{1:T} \\vert \\mathbf{x}_0) = \\prod^T_{t=1} q(\\mathbf{x}_t \\vert \\mathbf{x}_{t-1})$$\n",
+ "\n",
+ "

\n",
 "\n",
- "That is, we take $x_{t-1}$, scale it by $\\sqrt{1 - \\beta_t}$ and add noise scaled by $\\beta_t$. This $\\beta$ is defined for every t according to some schedule, and determines how much noise is added per timestep. Now, we don't necessarily want to do this operation 500 times to get $x_{500}$ so we have another formula to get $x_t$ for any t given $x_0$:

\n",
 "\n",
- "$\\begin{aligned}\n",
+ "That is, we take \\\\( x_{t-1} )\\\\, scale it by \\\\( \\sqrt{1 - \\beta_t} )\\\\ and add noise scaled by \\\\( \\beta_t )\\\\. This \\\\( \\beta )\\\\ is defined for every t according to some schedule, and determines how much noise is added per timestep. Now, we don't necessarily want to do this operation 500 times to get \\\\( x_{500} )\\\\ so we have another formula to get \\\\( x_t )\\\\ for any \\\\( t )\\\\ given \\\\( x_0 )\\\\:

\n",
+ "\n",
+ "$$\\begin{aligned}\n",
 "q(\\mathbf{x}_t \\vert \\mathbf{x}_0) &= \\mathcal{N}(\\mathbf{x}_t; \\sqrt{\\bar{\\alpha}_t} \\mathbf{x}_0, \\sqrt{(1 - \\bar{\\alpha}_t)} \\mathbf{I})\n",
- "\\end{aligned}$ where $\\bar{\\alpha}_t = \\prod_{i=1}^T \\alpha_i$ and $\\alpha_i = 1-\\beta_i$

\n",
+ "\\end{aligned}$$\n",
+ "\n",
+ "where \\\\( \\bar{\\alpha}_t = \\prod_{i=1}^T \\alpha_i )\\\\

\n",
 "\n",
- "The maths notation always looks scary! Luckily the scheduler handles all that for us (uncomment the next cell to check out the code). We can plot $\\sqrt{\\bar{\\alpha}_t}$ (labelled as `sqrt_alpha_prod`) and $\\sqrt{(1 - \\bar{\\alpha}_t)}$ (labelled as `sqrt_one_minus_alpha_prod`) to view how the input (x) and the noise are scaled and mixed across different timesteps:\n"
+ "The maths notation always looks scary! Luckily the scheduler handles all that for us (uncomment the next cell to check out the code). We can plot \\\\( \\sqrt{\\bar{\\alpha}_t} )\\\\ (labelled as `sqrt_alpha_prod`) and \\\\( \\sqrt{(1 - \\bar{\\alpha}_t)} )\\\\ (labelled as `sqrt_one_minus_alpha_prod`) to view how the input (x) and the noise are scaled and mixed across different timesteps:\n"
 ]
 },
 {
From b6681068e6df70a9e60b5fc4b70decf59816cee6 Mon Sep 17 00:00:00 2001
From: burtenshaw
Date: Fri, 12 Sep 2025 09:41:09 +0200
Subject: [PATCH 2/2] fix inline latex

---
 unit1/02_diffusion_models_from_scratch.ipynb | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/unit1/02_diffusion_models_from_scratch.ipynb b/unit1/02_diffusion_models_from_scratch.ipynb
index 5748741..ebfd622 100644
--- a/unit1/02_diffusion_models_from_scratch.ipynb
+++ b/unit1/02_diffusion_models_from_scratch.ipynb
@@ -1486,15 +1486,15 @@
 "

\n",
 "\n",
 "\n",
- "That is, we take \\\\( x_{t-1} )\\\\, scale it by \\\\( \\sqrt{1 - \\beta_t} )\\\\ and add noise scaled by \\\\( \\beta_t )\\\\. This \\\\( \\beta )\\\\ is defined for every t according to some schedule, and determines how much noise is added per timestep. Now, we don't necessarily want to do this operation 500 times to get \\\\( x_{500} )\\\\ so we have another formula to get \\\\( x_t )\\\\ for any \\\\( t )\\\\ given \\\\( x_0 )\\\\:

\n",
+ "That is, we take \\\\(x_{t-1}\\\\), scale it by \\\\(\\sqrt{1 - \\beta_t}\\\\) and add noise scaled by \\\\(\\beta_t\\\\). This \\\\(\\beta\\\\) is defined for every t according to some schedule, and determines how much noise is added per timestep. Now, we don't necessarily want to do this operation 500 times to get \\\\(x_{500}\\\\) so we have another formula to get \\\\(x_t\\\\) for any \\\\(t\\\\) given \\\\(x_0\\\\):

\n",
 "\n",
 "$$\\begin{aligned}\n",
 "q(\\mathbf{x}_t \\vert \\mathbf{x}_0) &= \\mathcal{N}(\\mathbf{x}_t; \\sqrt{\\bar{\\alpha}_t} \\mathbf{x}_0, \\sqrt{(1 - \\bar{\\alpha}_t)} \\mathbf{I})\n",
 "\\end{aligned}$$\n",
 "\n",
- "where \\\\( \\bar{\\alpha}_t = \\prod_{i=1}^T \\alpha_i )\\\\

\n",
+ "where \\\\(\\bar{\\alpha}_t = \\prod_{i=1}^T \\alpha_i\\\\)

\n",
 "\n",
- "The maths notation always looks scary! Luckily the scheduler handles all that for us (uncomment the next cell to check out the code). We can plot \\\\( \\sqrt{\\bar{\\alpha}_t} )\\\\ (labelled as `sqrt_alpha_prod`) and \\\\( \\sqrt{(1 - \\bar{\\alpha}_t)} )\\\\ (labelled as `sqrt_one_minus_alpha_prod`) to view how the input (x) and the noise are scaled and mixed across different timesteps:\n"
+ "The maths notation always looks scary! Luckily the scheduler handles all that for us (uncomment the next cell to check out the code). We can plot \\\\(\\sqrt{\\bar{\\alpha}_t}\\\\) (labelled as `sqrt_alpha_prod`) and \\\\(\\sqrt{(1 - \\bar{\\alpha}_t)}\\\\) (labelled as `sqrt_one_minus_alpha_prod`) to view how the input (x) and the noise are scaled and mixed across different timesteps:\n"
 ]
 },
 {
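
Note on the markdown cell touched by both patches: the relationship between the per-step corruption and the closed-form jump from x_0 to x_t can be checked numerically. The sketch below is not taken from the notebook or from any scheduler implementation. The function names are hypothetical, and the linear schedule with beta_start=1e-4 and beta_end=0.02 is an assumption chosen only for illustration. It follows the standard DDPM formulation, in which the cumulative product runs up to t (alpha_bar_t = prod_{i=1}^{t} (1 - beta_i)), each step adds noise with variance beta_t, and the closed form has variance 1 - alpha_bar_t, so the noise is multiplied by sqrt(1 - alpha_bar_t), the quantity the cell labels `sqrt_one_minus_alpha_prod`.

```python
import math


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced beta_t for t = 1..num_steps (assumed schedule, illustration only)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def alpha_bars(betas):
    """Cumulative products: alpha_bar_t = prod_{i<=t} (1 - beta_i)."""
    out, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        out.append(running)
    return out


def step_by_step(x0, noises, betas):
    """Apply x_t = sqrt(1 - beta_t) * x_{t-1} + sqrt(beta_t) * eps_t once per timestep."""
    x = x0
    for beta, eps in zip(betas, noises):
        x = math.sqrt(1.0 - beta) * x + math.sqrt(beta) * eps
    return x


def closed_form(x0, eps, alpha_bar_t):
    """Jump straight to x_t: sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    return math.sqrt(alpha_bar_t) * x0 + math.sqrt(1.0 - alpha_bar_t) * eps


def accumulated_noise_variance(betas):
    """Variance of the total noise after all steps, tracked through the recursion."""
    var = 0.0
    for beta in betas:
        # Earlier noise is shrunk by (1 - beta); fresh noise contributes beta.
        var = (1.0 - beta) * var + beta
    return var


betas = linear_betas(500)
abar = alpha_bars(betas)
# Signal scale: with all noise set to zero, 500 small steps match one closed-form jump.
signal_gap = abs(step_by_step(2.0, [0.0] * 500, betas) - closed_form(2.0, 0.0, abar[-1]))
# Noise scale: the accumulated variance equals 1 - alpha_bar_T.
variance_gap = abs(accumulated_noise_variance(betas) - (1.0 - abar[-1]))
print(signal_gap < 1e-9, variance_gap < 1e-9)
```

Both comparisons are deterministic, so they show that the signal and noise coefficients agree without relying on random sampling.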