# Study Guide: Undirected Graphical Models (Markov Random Fields)
**Date:** 2025.11.27

**Topic:** Potential Functions, Partition Function, and Conditional Independence in MRFs.

---
### **1. Recap: Decomposition in Undirected Graphs**
Unlike Directed Graphical Models (Bayesian Networks), which use conditional probabilities, **Undirected Graphical Models (Markov Random Fields - MRFs)** cannot directly use conditional probabilities because their edges carry no direction/causality. Instead, they decompose the joint distribution based on **Maximal Cliques**.

* **Cliques:** Subsets of nodes where every node is connected to every other node.
* **Maximal Clique:** A clique that cannot be enlarged by adding another node while staying fully connected. Together, the maximal cliques cover the graph (see the sketch after this list).
* **Decomposition Rule:** The joint distribution is the product of functions defined over these maximal cliques.
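
As a small illustration (not from the lecture), the maximal cliques of a graph can be enumerated with `networkx`; the graph below is hypothetical, chosen to match the cliques used in the example of Section 3:

```python
# Enumerate the maximal cliques of a small undirected graph using networkx.
# The graph is hypothetical, chosen to have maximal cliques {x1, x2, x3}
# and {x3, x4}, matching the example used later in this guide.
import networkx as nx

G = nx.Graph()
G.add_edges_from([("x1", "x2"), ("x1", "x3"), ("x2", "x3"),  # triangle
                  ("x3", "x4")])                             # pendant edge

# nx.find_cliques yields exactly the maximal cliques (not all cliques).
for clique in nx.find_cliques(G):
    print(sorted(clique))
# Output (order may vary):
# ['x1', 'x2', 'x3']
# ['x3', 'x4']
```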
---

### **2. Potential Functions ($\psi$)**

* **Definition:** For each maximal clique $C$, we define a **Potential Function** $\psi_C(x_C)$ (also commonly denoted $\phi$).
* It is a **non-negative function** ($\psi_C(x_C) \ge 0$) mapping each joint state of the clique's variables to a real-valued score.
* It represents the "compatibility" of that configuration (the negative log of a potential is often called its "energy").
* **Key Distinction:** A potential function is **NOT a probability**. It does not have to sum to 1; it is just a non-negative score.
* *Example:* $\psi_{12}(x_1, x_2)$ scores the interaction between $x_1$ and $x_2$ (see the sketch after this list).
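
A minimal sketch of a potential table for a clique of two binary variables; the values are made up for illustration:

```python
# Hypothetical potential table for the clique {x1, x2}: psi_12[x1, x2] is the
# non-negative compatibility score of configuration (x1, x2) -- here,
# agreement between x1 and x2 is favored.
import numpy as np

psi_12 = np.array([[3.0, 1.0],   # psi(x1=0, x2=0), psi(x1=0, x2=1)
                   [1.0, 3.0]])  # psi(x1=1, x2=0), psi(x1=1, x2=1)

print(psi_12[0, 1])  # 1.0 -- a score, not a probability
print(psi_12.sum())  # 8.0 -- the entries do not sum to 1
```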
---

### **3. The Partition Function ($Z$)**

Since the product of potential functions does not, in general, sum to 1, it is not yet a probability distribution; we must normalize it.

* **Definition:** The normalization constant is called the **Partition Function** ($Z$):

$$Z = \sum_{x} \prod_{C} \psi_C(x_C)$$

* **Role:** $Z$ ensures that the resulting distribution sums to 1, making it a valid probability distribution:

$$P(x) = \frac{1}{Z} \prod_{C} \psi_C(x_C)$$

* **Calculation:** To find $Z$, we must sum the product of potentials over **all possible joint states** of the random variables. This summation grows exponentially with the number of variables, so it is often computationally expensive.

#### **Example Calculation**

The lecture walks through a simple example with 4 binary variables and two maximal cliques: $\{x_1, x_2, x_3\}$ and $\{x_3, x_4\}$.
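
Under the decomposition rule, the joint distribution for this example factorizes as

$$P(x_1, x_2, x_3, x_4) = \frac{1}{Z}\,\psi_{123}(x_1, x_2, x_3)\,\psi_{34}(x_3, x_4)$$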

* **Step 1:** Define potential tables for $\psi_{123}$ and $\psi_{34}$ (a code version appears after this list).
* **Step 2:** Compute the unnormalized score $\psi_{123}(x_1, x_2, x_3)\,\psi_{34}(x_3, x_4)$ for each of the $2^4 = 16$ joint states.
* **Step 3:** Sum all of these scores to get $Z$. In the example, $Z = 10$.
* **Step 4:** The probability of any specific state, e.g. $P(1,0,0,0)$, is that state's score, $\psi_{123}(1,0,0)\,\psi_{34}(0,0)$, divided by $Z$.
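
A minimal brute-force sketch of these steps. The table values below are hypothetical, chosen so that $Z = 10$ to match the lecture's total; the lecture's actual tables may differ:

```python
# Steps 1-4 by brute force: 4 binary variables, cliques {x1, x2, x3} and
# {x3, x4}. Potential values are illustrative, not the lecture's.
from itertools import product

import numpy as np

# Step 1: potential tables. psi_123[x1, x2, x3] and psi_34[x3, x4] are
# non-negative scores, not probabilities.
psi_123 = np.array([[[1.0, 0.5],
                     [0.5, 0.5]],
                    [[0.5, 0.5],
                     [0.5, 1.0]]])
psi_34 = np.array([[1.5, 0.5],
                   [1.0, 1.0]])

# Steps 2-3: score all 2**4 = 16 joint states and sum them to get Z.
Z = sum(psi_123[x1, x2, x3] * psi_34[x3, x4]
        for x1, x2, x3, x4 in product([0, 1], repeat=4))
print(Z)  # 10.0

# Step 4: probability of one specific state, here P(x1=1, x2=0, x3=0, x4=0).
p = psi_123[1, 0, 0] * psi_34[0, 0] / Z
print(p)  # 0.5 * 1.5 / 10 = 0.075
```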
---

### **4. Parameter Estimation**

* **Discrete Case:** If the variables are discrete (as in the email spam example), the parameters are the entries of the potential tables. We estimate these values from data to maximize the likelihood.
* **Continuous Case:** If the variables are continuous, the potential functions typically take a Gaussian form, and we estimate means and covariances.
* **Reduction:** Just as in Bayesian Networks, using the graph structure reduces the number of parameters (a quick count appears after this list):
  * *Without the graph:* A full joint table for 4 binary variables needs $2^4 = 16$ entries.
  * *With the graph:* We only need tables for the maximal cliques: $2^3 + 2^2 = 12$ entries for the cliques $\{x_1, x_2, x_3\}$ and $\{x_3, x_4\}$ above, a saving that grows rapidly with the number of variables.
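
A one-line sanity check of that count:

```python
# Table-entry count: full joint table vs. per-clique tables, for 4 binary
# variables with maximal cliques {x1, x2, x3} and {x3, x4}.
n_full = 2 ** 4              # 16 entries for the full joint table
n_cliques = 2 ** 3 + 2 ** 2  # 8 + 4 = 12 entries across the two clique tables
print(n_full, n_cliques)     # 16 12
```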
---

### **5. Verifying Conditional Independence**

The lecture demonstrates analytically that the potential-function formulation preserves the conditional independence properties of the graph.

* **Scenario:** A chain graph $x_1 - x_2 - x_3 - x_4$.
* **Question:** Is $x_4$ independent of $x_1$ given $x_3$?
* **Analytical Check:**
  * Calculate $P(x_4=1 \mid x_1=0, x_2=1, x_3=1)$.
  * Calculate $P(x_4=1 \mid x_1=0, x_2=0, x_3=1)$.
* **Result:** As long as $x_3$ is fixed (given), every potential involving $x_1$ and $x_2$ appears in both the numerator and the denominator and cancels out of the ratio, leaving (with $\psi_{34}$ the potential on the clique $\{x_3, x_4\}$):

$$P(x_4 = 1 \mid x_1, x_2, x_3) = \frac{\psi_{34}(x_3, 1)}{\psi_{34}(x_3, 0) + \psi_{34}(x_3, 1)}$$

which depends only on the potential involving $x_3$ and $x_4$.

* **Conclusion:** This confirms that $x_4 \perp \{x_1, x_2\} \mid x_3$ (a numerical check follows below). The formulation correctly encodes the global Markov property.
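
A minimal numerical sketch of this check, with hypothetical random potentials on the edge cliques $\{x_1, x_2\}$, $\{x_2, x_3\}$, $\{x_3, x_4\}$ of the chain; the independence should hold for any positive choice of values:

```python
# Numerical check that x4 is independent of (x1, x2) given x3 in the chain MRF
# x1 - x2 - x3 - x4. Potentials are arbitrary positive values; the
# cancellation derived above makes the result independent of (x1, x2).
import numpy as np

rng = np.random.default_rng(0)
psi_12 = rng.uniform(0.1, 1.0, size=(2, 2))  # hypothetical edge potentials
psi_23 = rng.uniform(0.1, 1.0, size=(2, 2))
psi_34 = rng.uniform(0.1, 1.0, size=(2, 2))

def score(x1, x2, x3, x4):
    """Unnormalized score of a joint state under the chain factorization."""
    return psi_12[x1, x2] * psi_23[x2, x3] * psi_34[x3, x4]

def p_x4_is_1_given(x1, x2, x3):
    """P(x4 = 1 | x1, x2, x3); Z cancels, so only the scores are needed."""
    return score(x1, x2, x3, 1) / (score(x1, x2, x3, 0) + score(x1, x2, x3, 1))

# Same x3 = 1 but different (x1, x2): the conditional is identical, and it
# equals the closed form psi_34(x3, 1) / (psi_34(x3, 0) + psi_34(x3, 1)).
print(p_x4_is_1_given(0, 1, 1))
print(p_x4_is_1_given(0, 0, 1))
print(psi_34[1, 1] / (psi_34[1, 0] + psi_34[1, 1]))
```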
|