add to final

This commit is contained in:
2025-12-06 18:32:08 +09:00
parent ac1d2e744d
commit 0fc412e690
21 changed files with 935 additions and 0 deletions

final/1110.md Normal file

@@ -0,0 +1,104 @@
# Study Guide: Generative Methods & Multivariate Gaussian Distributions
**Date:** 2025.12.01
**Topic:** Generative vs. Discriminative Models, Multivariate Gaussian Properties, Conditional and Marginal Distributions.
---
### **1. Generative vs. Discriminative Methods**
The lecture begins by contrasting the new topic (Generative Methods) with previous topics (Discriminative Methods like Linear Regression, Logistic Regression, and SVM).
* **Discriminative Methods (Separating):**
* These methods focus on finding a boundary (separating line or hyperplane) between classes.
* **Limitation:** They cannot generate new data samples because they do not model the data distribution; they only know the boundary.
    * **Hypothesis:** They assume a linear hypothesis (a separating line or hyperplane) that divides the data.
* **Generative Methods (Inferring Distribution):**
* **Goal:** To infer the **underlying distribution** (the rule or pattern) from which the data samples were drawn.
    * **Assumption:** Data is not arbitrary noise; it is assumed to be drawn from an underlying probability distribution with a specific structure.
* **Capabilities:** Once the Joint Probability Distribution (underlying distribution) is known:
        1. **Classification:** Can be performed using Bayes' Rule (see the formula after this list).
2. **Generation:** New samples can be created that follow the same patterns as the training data (e.g., generating new images or text).
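For reference, the classification step combines the learned class-conditional density $P(x \mid y)$ with the class prior $P(y)$ via Bayes' Rule:
$$P(y \mid x) = \frac{P(x \mid y)\,P(y)}{\sum_{y'} P(x \mid y')\,P(y')}$$
The class with the highest posterior $P(y \mid x)$ is then predicted.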
---
### **2. The Gaussian (Normal) Distribution**
The Gaussian distribution is the most popular choice for modeling the "hypothesis" of the underlying distribution in generative models.
#### **Why Gaussian?**
1. **Simplicity:** Defined entirely by two parameters: Mean ($\mu$) and Covariance ($\Sigma$).
2. **Central Limit Theorem:** Sums of independent random events tend to follow a Gaussian distribution.
3. **Mathematical "Closure":** The most critical reason for its use in AI is that **Conditional** and **Marginal** distributions of a Multivariate Gaussian are *also* Gaussian.
#### **Multivariate Gaussian Definition**
For a $D$-dimensional vector $x$:
$$P(x) = \frac{1}{(2\pi)^{D/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2} (x-\mu)^T \Sigma^{-1} (x-\mu)\right)$$
* $\mu$: Mean vector ($D$-dimensional).
* $\Sigma$: Covariance Matrix ($D \times D$).
*(Figure: 3-D surface plot of a multivariate Gaussian density.)*
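As a quick sanity check, the density formula above can be evaluated directly with NumPy and compared against `scipy.stats.multivariate_normal`; the mean, covariance, and query point below are made up purely for illustration.

```python
import numpy as np
from scipy.stats import multivariate_normal

def gaussian_pdf(x, mu, Sigma):
    """Evaluate the multivariate Gaussian density at x via the direct formula."""
    D = len(mu)
    diff = x - mu
    norm_const = 1.0 / np.sqrt((2 * np.pi) ** D * np.linalg.det(Sigma))
    quad = diff @ np.linalg.solve(Sigma, diff)   # (x - mu)^T Sigma^{-1} (x - mu)
    return norm_const * np.exp(-0.5 * quad)

# Illustrative 2-D parameters (not taken from the lecture)
mu = np.array([0.0, 1.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
x = np.array([0.5, 0.5])

print(gaussian_pdf(x, mu, Sigma))             # direct formula
print(multivariate_normal(mu, Sigma).pdf(x))  # SciPy reference, same value
```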
#### **Properties of the Covariance Matrix ($\Sigma$)**
* **Symmetric:** $\Sigma_{ij} = \Sigma_{ji}$.
* **Positive Definite:** All eigenvalues are strictly positive (so $\Sigma^{-1}$ exists and the density above is well defined).
* **Diagonal Terms:** Represent the variance of individual variables.
* **Off-Diagonal Terms:** Represent the correlation (covariance) between variables.
    * If $\Sigma_{12} = 0$, the variables are **independent** (zero covariance implies independence only for Gaussians).
* The matrix shape determines the geometry of the distribution contours (spherical vs. elliptical).
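A minimal numerical check of these properties; the 2×2 covariance matrix is an illustrative assumption, not a lecture example.

```python
import numpy as np

Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])        # illustrative covariance matrix

print(np.allclose(Sigma, Sigma.T))    # symmetric -> True
eigvals, eigvecs = np.linalg.eigh(Sigma)
print(eigvals)                        # all positive -> positive definite
# Contour geometry: the eigenvectors give the ellipse axis directions and
# sqrt(eigenvalues) the axis lengths; a diagonal Sigma with equal variances
# would give spherical (circular) contours instead.
```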
---
### **3. Independence and Factorization**
If the Covariance Matrix is **diagonal** (all off-diagonal elements are 0), the variables are independent.
* Mathematically, the inverse matrix $\Sigma^{-1}$ is also diagonal.
* The joint probability factorizes into the product of marginals:
$$P(x_1, x_2) = P(x_1)P(x_2)$$
* The "quadratic form" inside the exponential splits into a sum of separate squared terms.
---
### **4. Conditional Gaussian Distribution**
The lecture derives what happens when we observe a subset of variables (e.g., $x_2$) and want to determine the distribution of the remaining variables ($x_1$). This is $P(x_1 | x_2)$.
* **Concept:** Visually, this is equivalent to "slicing" the joint distribution at a specific value of $x_2$ (fixed constant).
* **Result:** The resulting cross-section is **also a Gaussian distribution**.
* **Parameters:** If we partition $x$, $\mu$, and $\Sigma$ into subsets, the conditional mean ($\mu_{1|2}$) and covariance ($\Sigma_{1|2}$) are given by:
* **Mean:** $\mu_{1|2} = \mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(x_2 - \mu_2)$.
* **Covariance:** $\Sigma_{1|2} = \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}$.
*(Note: The derivation involves completing the square to identify the Gaussian form).*
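A direct, minimal translation of these two formulas into NumPy; the joint parameters, the observed value, and the 1-D/1-D partition are all illustrative assumptions.

```python
import numpy as np

# Illustrative joint parameters for x = (x1, x2), each one-dimensional here
mu = np.array([1.0, -1.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])

mu1, mu2 = mu[:1], mu[1:]
S11, S12 = Sigma[:1, :1], Sigma[:1, 1:]
S21, S22 = Sigma[1:, :1], Sigma[1:, 1:]

x2 = np.array([0.5])                               # observed value of x2

# mu_{1|2} = mu1 + Sigma12 Sigma22^{-1} (x2 - mu2)
mu_cond = mu1 + S12 @ np.linalg.solve(S22, x2 - mu2)
# Sigma_{1|2} = Sigma11 - Sigma12 Sigma22^{-1} Sigma21
Sigma_cond = S11 - S12 @ np.linalg.solve(S22, S21)

print(mu_cond, Sigma_cond)
```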
---
### **5. Marginal Gaussian Distribution**
The lecture explains how to find the distribution of a subset of variables ($x_1$) by ignoring the others ($x_2$). This is $P(x_1)$.
* **Concept:** This is equivalent to integrating out the unobserved variables:
$$P(x_1) = \int P(x_1, x_2) dx_2$$
* **Result:** The marginal distribution is **also a Gaussian distribution**.
* **Parameters:** Unlike the conditional case, calculating the marginal parameters is trivial. You simply select the corresponding sub-vector and sub-matrix from the joint parameters.
* Mean: $\mu_1$.
* Covariance: $\Sigma_{11}$.
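The corresponding sketch for the marginal is just sub-block extraction; the parameters below are illustrative, and a sampling check confirms that the extracted entries match the empirical mean and variance of $x_1$.

```python
import numpy as np

mu = np.array([1.0, -1.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])

# Marginal of x1: take the matching entries of mu and Sigma directly
mu_marg = mu[:1]              # mu_1
Sigma_marg = Sigma[:1, :1]    # Sigma_11
print(mu_marg, Sigma_marg)

# Empirical check: sample from the joint and look only at the x1 coordinate
samples = np.random.default_rng(0).multivariate_normal(mu, Sigma, size=100_000)
print(samples[:, 0].mean(), samples[:, 0].var())   # ~ mu_1 and Sigma_11
```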
### **Summary Table**
| Distribution | Type | Parameters Derived From Joint $(\mu, \Sigma)$ |
| :--- | :--- | :--- |
| **Joint** $P(x)$ | Gaussian | Given as $\mu, \Sigma$ |
| **Conditional** $P(x_1 \| x_2)$ | Gaussian | Complex formula (involves matrix inversion of $\Sigma_{22}$) |
| **Marginal** $P(x_1)$ | Gaussian | Simple subset (extract $\mu_1$ and $\Sigma_{11}$) |
The lecture concludes by emphasizing that understanding these Gaussian properties is essential for the second half of the semester, as they form the basis for probabilistic generative models.