# Lecture Summary: Generative Methods & Probability Review

**Date:** 2025.11.06

**Topic:** Discriminative vs. Generative Models, Probability Theory, Probabilistic Inference, and Gaussian Distributions.

---

### 1. Classification Approaches: Discriminative vs. Generative

The lecture begins by distinguishing between two fundamental approaches to machine learning classification, specifically for binary problems (labels 0 or 1).

#### **Discriminative Methods (e.g., Logistic Regression)**

* **Goal:** Directly model the decision boundary or the conditional probability $P(y|x)$.
* **Mechanism:** Focuses on distinguishing classes. It learns a function that maps inputs $x$ directly to class labels $y$ (a minimal sketch follows after this list).
* **Limitation:** It does not model the underlying distribution of the data itself.
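
For concreteness, here is a minimal sketch of the discriminative approach on a made-up one-dimensional dataset; it assumes scikit-learn is available and is an illustration rather than code from the lecture.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Made-up 1-D binary dataset: class 0 centered at -1, class 1 at +1.
rng = np.random.default_rng(0)
x0 = rng.normal(-1.0, 1.0, size=(100, 1))
x1 = rng.normal(+1.0, 1.0, size=(100, 1))
X = np.vstack([x0, x1])
y = np.array([0] * 100 + [1] * 100)

# Discriminative model: fits P(y|x) directly and never models p(x|y).
clf = LogisticRegression().fit(X, y)
print(clf.predict_proba([[0.5]]))  # [P(y=0|x=0.5), P(y=1|x=0.5)]
```
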
#### **Generative Methods**

* **Goal:** Model the joint probability, i.e., the class-conditional density $P(x|y)$ together with the class prior $P(y)$.
* **Mechanism:** It learns "how the data is generated" for each class.
* **Classification:** To classify a new point, it uses **Bayes' Rule** to invert the probabilities (see the sketch after this list):

$$P(y|x) = \frac{P(x|y)P(y)}{P(x)}$$

* **Advantage:** If you know the generative model, you can solve the classification problem *and* generate new data samples.
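
Below is a minimal sketch of the generative counterpart on made-up data, assuming Gaussian class-conditional densities $p(x|y)$; the posterior is computed exactly by the Bayes' rule formula above, and the fitted model can also sample new points.

```python
import numpy as np
from scipy.stats import norm

# Made-up 1-D training data for each class.
rng = np.random.default_rng(0)
x0 = rng.normal(-1.0, 1.0, size=200)   # samples with label y=0
x1 = rng.normal(+1.0, 1.0, size=200)   # samples with label y=1

# Generative model: fit p(x|y) for each class plus the class prior P(y).
mu0, sigma0 = x0.mean(), x0.std()
mu1, sigma1 = x1.mean(), x1.std()
prior1 = len(x1) / (len(x0) + len(x1))
prior0 = 1.0 - prior1

def posterior_y1(x_new):
    """Bayes' rule: P(y=1|x) = p(x|y=1) P(y=1) / p(x)."""
    joint0 = norm.pdf(x_new, mu0, sigma0) * prior0
    joint1 = norm.pdf(x_new, mu1, sigma1) * prior1
    return joint1 / (joint0 + joint1)   # denominator is p(x)

print(posterior_y1(0.5))

# Because p(x|y) is modeled explicitly, we can also generate new class-1 samples.
print(rng.normal(mu1, sigma1, size=3))
```
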
---

### 2. Probability Theory Review

To understand Generative Methods, a strong foundation in probability is required.

#### **Random Variables**

* **Definition:** A random variable is technically a **function** (mapping) that assigns a real number to each outcome $\omega$ in the sample space $\Omega$.
* **Example:** Tossing a coin 4 times. One outcome is $\omega$ = "HHTH", and the random variable $X(\omega)$ could be "number of heads", so $X(\omega) = 3$ here (simulated in the sketch below).
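
A tiny sketch of the coin-toss example: the sample space, the outcome "HHTH", and the mapping $X(\omega)$ = number of heads. Everything here is illustrative.

```python
import itertools

# Sample space Omega: all 2^4 = 16 outcomes of tossing a coin 4 times.
Omega = ["".join(t) for t in itertools.product("HT", repeat=4)]

# The random variable X maps an outcome (a string) to a real number.
def X(omega):
    return omega.count("H")

print(X("HHTH"))  # 3

# Distribution of X under a fair coin: P(X=k) = (# outcomes with k heads) / 16.
print({k: sum(X(w) == k for w in Omega) / len(Omega) for k in range(5)})
```
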
#### **Probability vs. Probability Density Function (PDF)**

The lecture emphasizes distinguishing between discrete probability ($P$) and continuous density ($p$).

* **Discrete Probability ($P$):** Defined as a ratio of cardinalities (counts) or of areas in set diagrams (e.g., Venn diagrams).
* **Probability Density Function ($p$):** Used for continuous variables.
  * **Properties:** $p(x) \ge 0$ for all $x$, and $\int p(x)\,dx = 1$.
  * **Relationship:** The probability of $x$ falling within a range is the **integral** (area under the curve) of the PDF over that range. The probability of any single point, $P(x = x_0)$, is 0 (see the numerical check after this list).
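
A short numerical check of these properties for a standard normal density, assuming scipy is available; the numbers are only illustrative.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

pdf = norm(loc=0.0, scale=1.0).pdf   # standard normal density p(x)

# Property: the density is non-negative and integrates to 1.
total, _ = quad(pdf, -np.inf, np.inf)
print(total)          # ~1.0

# Probability of a range = area under the curve over that range.
p_range, _ = quad(pdf, -1.0, 1.0)
print(p_range)        # ~0.683 = P(-1 <= x <= 1)

# Probability of a single point is the area of a zero-width slice: 0.
p_point, _ = quad(pdf, 0.5, 0.5)
print(p_point)        # 0.0
```
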
#### **Key Statistics**

* **Expectation ($E[x]$):** The mean, i.e., the probability-weighted average of a random variable:

$$E[x] = \int x\, p(x)\, dx$$

* **Covariance:** Measures how the components of the data vary together (its spread); for random vectors this yields the covariance matrix (both statistics are computed numerically in the sketch below):

$$\mathrm{Cov}[x] = E\left[(x - \mu)(x - \mu)^T\right]$$
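
A quick numpy sketch that estimates both quantities from made-up 2-D samples; the sample mean and sample covariance approximate the integrals above.

```python
import numpy as np

# Made-up 2-D samples (rows = samples, columns = dimensions of x).
rng = np.random.default_rng(1)
x = rng.multivariate_normal(mean=[1.0, -2.0],
                            cov=[[2.0, 0.5], [0.5, 1.0]], size=1000)

mu = x.mean(axis=0)                          # sample estimate of E[x]
centered = x - mu
cov = centered.T @ centered / (len(x) - 1)   # estimate of E[(x-mu)(x-mu)^T]

print(mu)    # close to [1, -2]
print(cov)   # close to [[2, 0.5], [0.5, 1]]
print(np.allclose(cov, np.cov(x, rowvar=False)))  # matches numpy's built-in
```
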
---

### 3. The Trinity of Distributions: Joint, Conditional, and Marginal

Understanding the relationship between these three is crucial for probabilistic modeling.

#### **Joint PDF ($p(x_1, x_2)$)**

* This represents the probability density of $x_1$ and $x_2$ occurring together.
* **Importance:** If you know the joint PDF, you know *everything* about the system. You can derive all other distributions (marginal, conditional) from it.

#### **Conditional PDF ($p(x_1 \mid x_2)$)**

* Represents the density of $x_1$ given that $x_2$ is fixed to a specific value.
* Visually, this is like taking a "slice" of the joint distribution's 3D surface at $x_2 = a$ (and renormalizing it).

#### **Marginal PDF ($p(x_1)$)**

* Represents the density of $x_1$ regardless of $x_2$.
* **Calculation:** You "marginalize out" (integrate or sum over) the other variables, as in the sketch after this list.
  * Continuous: $p(x_1) = \int p(x_1, x_2)\, dx_2$.
  * Discrete: Summing the rows or columns of a probability table.
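
A minimal discrete sketch of all three objects, using a made-up 2×3 joint probability table; the numbers are arbitrary.

```python
import numpy as np

# Made-up joint table P(x1, x2): rows index x1 in {0, 1}, columns x2 in {0, 1, 2}.
joint = np.array([[0.10, 0.20, 0.10],
                  [0.25, 0.15, 0.20]])
assert np.isclose(joint.sum(), 1.0)          # a valid joint sums to 1

# Marginal P(x1): "marginalize out" x2 by summing across columns.
p_x1 = joint.sum(axis=1)                     # [0.40, 0.60]

# Marginal P(x2): sum across rows.
p_x2 = joint.sum(axis=0)                     # [0.35, 0.35, 0.30]

# Conditional P(x1 | x2 = 1): take the "slice" at x2 = 1, then renormalize.
p_x1_given_x2 = joint[:, 1] / p_x2[1]

print(p_x1, p_x2, p_x1_given_x2)
```
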
---

### 4. Probabilistic Inference

**Inference** is defined as calculating a desired probability (e.g., a prediction) starting from the joint probability function, using rules such as Bayes' theorem and marginalization.

#### **Handling Missing Data**

A major practical benefit of generative models (which model the joint PDF) over discriminative models (such as Logistic Regression) is the robust handling of missing data.

* **Scenario:** You have a model predicting disease ($y$) based on Age ($x_1$), Blood Pressure ($x_2$), and Oxygen ($x_3$).
* **Problem:** A patient arrives, but you cannot measure Age ($x_1$). A discriminative model might fail or require imputing the missing value (e.g., guessing the average).
* **Probabilistic Solution:** You integrate (marginalize) out the missing variable $x_1$ from the joint distribution to get the probability based only on the observed data (see the sketch after this list):

$$P(y \mid x_2, x_3) = \frac{\int p(x_1, x_2, x_3, y)\, dx_1}{p(x_2, x_3)}$$
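
A small discrete sketch of this idea with four binary variables and a made-up joint table: the missing variable is summed out and the result renormalized, mirroring the formula above.

```python
import numpy as np

# Made-up joint P(x1, x2, x3, y) over four binary variables (Age bin, Blood
# Pressure bin, Oxygen bin, disease), stored as a 2x2x2x2 table.
rng = np.random.default_rng(2)
joint = rng.random((2, 2, 2, 2))
joint /= joint.sum()                  # normalize into a valid joint distribution

x2_obs, x3_obs = 1, 0                 # observed blood pressure / oxygen values

# Marginalize out the missing variable x1 (axis 0) on the observed slice.
numer = joint[:, x2_obs, x3_obs, :].sum(axis=0)   # plays the role of the integral
p_y_given_obs = numer / numer.sum()               # dividing by P(x2, x3)

print(p_y_given_obs)                  # [P(y=0 | x2, x3), P(y=1 | x2, x3)]
```
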
---

### 5. The Gaussian Distribution

The lecture concludes with a review of the Gaussian (Normal) distribution, the most important distribution in AI/ML.

* **Univariate Gaussian:** Defined by a mean $\mu$ and a variance $\sigma^2$.
* **Multivariate Gaussian:** Defined for a vector $x \in \mathbb{R}^D$ (implemented directly in the sketch below):

$$p(x) = \frac{1}{(2\pi)^{D/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2} (x-\mu)^T \Sigma^{-1} (x-\mu)\right)$$

* **Parameters:**
  * $\mu$: Mean vector ($D$-dimensional).
  * $\Sigma$: Covariance matrix ($D \times D$). It must be **symmetric** and **positive definite**.
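
A direct sketch of the density formula above, checked against scipy's implementation; the mean, covariance, and query point are made up.

```python
import numpy as np
from scipy.stats import multivariate_normal

def gaussian_pdf(x, mu, Sigma):
    """Evaluate the multivariate Gaussian density exactly as in the formula above."""
    D = len(mu)
    diff = x - mu
    norm_const = 1.0 / ((2 * np.pi) ** (D / 2) * np.sqrt(np.linalg.det(Sigma)))
    return norm_const * np.exp(-0.5 * diff @ np.linalg.inv(Sigma) @ diff)

mu = np.array([0.0, 1.0])
Sigma = np.array([[2.0, 0.3],         # symmetric and positive definite
                  [0.3, 1.0]])
x = np.array([0.5, 0.5])

print(gaussian_pdf(x, mu, Sigma))
print(multivariate_normal(mean=mu, cov=Sigma).pdf(x))  # should agree
```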