02 Multiple random variables

Random vector

A random vector is an $n$-dimensional vector $X = (X_{i})$ whose components are jointly distributed random variables. Similarly, $X$ can be an $m \times n$ random matrix.

Below, we consider $X = (X_{1}, X_{2})$ , where $X_{1} : (S_{1}; F_{1}, v_{1}) \to R_{1}$ and $X_{2} : (S_{2}; F_{2}, v_{2}) \to R_{2}$ .

A random vector is itself a random variable $X : (S_{1} \times S_{2}; F_{1} \times F_{2}, v) \to (R_{1} \times R_{2})$ .

Marginalization

The marginalization property of the joint (product) probability space gives $Pr(X_{1} \in E_{1}, X_{2} \in S_{2}) = Pr(X_{1} \in E_{1})$, so $\int_{E_{1} \times S_{2}} f_{X}(x) d v = \int_{E_{1}} \int_{x_{2} \in S_{2}} f_{X}(x_{1}, x_{2}) d v_{2} d v_{1} = \int_{E_{1}} f_{X_{1}}(x_{1}) d v_{1}$.

Hence, $\int_{x_{2} \in S_{2}} f_{X} (x_{1}, x_{2}) d v_{2} = f_{X_{1}} (x_{1})$ .
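As a minimal sketch of this identity, take the discrete case, where $v_{2}$ is the counting measure and the integral over $x_{2}$ becomes a sum. The joint pmf `joint` and the helper `marginal_x1` below are hypothetical names and numbers chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical joint pmf of (X1, X2), keyed by (x1, x2); the values sum to 1.
joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 4),
}

def marginal_x1(joint):
    """Sum the joint pmf over x2 to obtain the pmf of X1."""
    out = {}
    for (x1, _x2), p in joint.items():
        out[x1] = out.get(x1, Fraction(0)) + p
    return out

f_x1 = marginal_x1(joint)
print(f_x1)
```

Exact fractions are used so that the marginal can be compared with hand computation without rounding error.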

Conditional pdf

The definition (described elsewhere) of conditional probabilities of the form $Pr(A | B)$ breaks down if $Pr(B)$, the probability measure of the event $B$, is 0.

One can craft a similar definition to cover events $X_{2} = b$ with $v_{2}(X_{2} = b) = 0$ but with density $f_{X_{2}}(b) > 0$. Then, $Pr(X_{1} \in E_{1} | X_{2} = b) = \frac{\int_{E_{1}} f_{X}(x_{1}, b) d v_{1}}{f_{X_{2}}(b)} = \int_{E_{1}} f_{X}(x_{1}, b) f_{X_{2}}(b)^{-1} d v_{1}$.

$f_{X}(x_{1}, b) f_{X_{2}}(b)^{-1} = f_{X_{1} | X_{2} = b}(x_{1})$ is called the conditional pdf of $X_{1}$ given $X_{2} = b$.
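The ratio of joint to marginal can be sketched in the discrete case as follows. The pmf `joint` and the function `conditional_x1_given_x2` are hypothetical, and the guard against $f_{X_{2}}(b) = 0$ is an added assumption about how one would want the undefined case handled.

```python
from fractions import Fraction

# Hypothetical joint pmf of (X1, X2), keyed by (x1, x2).
joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 4),
}

def conditional_x1_given_x2(joint, b):
    """Return the map x1 -> f_X(x1, b) / f_X2(b)."""
    # Marginal of X2 at b, obtained by summing the joint over x1.
    f_x2_b = sum(p for (_x1, x2), p in joint.items() if x2 == b)
    if f_x2_b == 0:
        raise ValueError("f_X2(b) is zero, so the conditional is undefined")
    return {x1: p / f_x2_b for (x1, x2), p in joint.items() if x2 == b}

cond = conditional_x1_given_x2(joint, 1)
print(cond)
```

Dividing by $f_{X_{2}}(b)$ renormalizes the slice of the joint at $x_{2} = b$, so the returned values sum to 1.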

Inversion

As with Bayes' rule, one can use this definition to invert the conditional pdf.

$f_{X_{2} | X_{1} = x_{1}}(x_{2}) = \frac{f_{X_{1} | X_{2} = x_{2}}(x_{1}) f_{X_{2}}(x_{2})}{f_{X_{1}}(x_{1})} = \frac{f_{X_{1} | X_{2} = x_{2}}(x_{1}) f_{X_{2}}(x_{2})}{\int_{x_{2}' \in S_{2}} f_{X_{1} | X_{2} = x_{2}'}(x_{1}) f_{X_{2}}(x_{2}') d v_{2}}$, where $x_{2}'$ is the variable of integration.
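A small discrete sketch of the inversion, with the integral in the denominator replaced by a sum. The names `prior_x2`, `likelihood` and `invert`, and all the numbers, are hypothetical and serve only to illustrate the formula.

```python
from fractions import Fraction

# Hypothetical prior f_X2 and conditional f_{X1 | X2 = x2}(x1).
prior_x2 = {0: Fraction(3, 8), 1: Fraction(5, 8)}
likelihood = {  # indexed as likelihood[x2][x1]
    0: {0: Fraction(1, 3), 1: Fraction(2, 3)},
    1: {0: Fraction(3, 5), 1: Fraction(2, 5)},
}

def invert(prior_x2, likelihood, x1):
    """Return the map x2 -> f_{X2 | X1 = x1}(x2)."""
    # Numerator of the inversion formula for every value of x2.
    unnorm = {x2: likelihood[x2][x1] * p for x2, p in prior_x2.items()}
    # Denominator: f_X1(x1), obtained by summing the numerator over x2.
    evidence = sum(unnorm.values())
    return {x2: w / evidence for x2, w in unnorm.items()}

posterior = invert(prior_x2, likelihood, 0)
print(posterior)
```

Note that only the numerator depends on $x_{2}$; the denominator is a normalizing constant for a fixed $x_{1}$.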

Improper densities

Note that the construction of $f_{X_{2} | X_{1} = x_{1}}(x_{2})$ works even if the prior pdf $f_{X_{2}}(x_{2})$ is an improper density, one which does not integrate to 1, as long as the normalizing integral in the denominator is finite. This sometimes makes the task of modeling random processes easier.
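To illustrate, here is a rough numerical sketch with a flat prior $f_{X_{2}}(x_{2}) = 1$ on the real line and an assumed Gaussian conditional $X_{1} | X_{2} = x_{2} \sim N(x_{2}, 1)$. The model, the grid bounds and the names `STEP`, `normal_pdf` and `posterior_flat_prior` are all assumptions made for this example; truncating to a finite grid only approximates the flat prior on the whole line.

```python
import math

STEP = 0.01  # grid spacing for the Riemann sums (an arbitrary choice)

def normal_pdf(x, mean, sd=1.0):
    """Density of N(mean, sd^2) evaluated at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def posterior_flat_prior(x1, lo=-20.0, hi=20.0):
    """Posterior of X2 given X1 = x1 on a grid, using the flat prior f_X2 = 1."""
    n = int(round((hi - lo) / STEP)) + 1
    grid = [lo + i * STEP for i in range(n)]
    # Numerator of the inversion formula: likelihood times the constant prior.
    unnorm = [normal_pdf(x1, mean=x2) * 1.0 for x2 in grid]
    # Normalizing integral; it is finite although the prior is improper.
    evidence = sum(unnorm) * STEP
    return grid, [u / evidence for u in unnorm], evidence

grid, post, evidence = posterior_flat_prior(x1=0.7)
post_mean = sum(x2 * p for x2, p in zip(grid, post)) * STEP
print(evidence, post_mean)
```

Under these assumptions the normalizing integral should come out close to 1 and the posterior mean close to the observed value 0.7, which is what one expects for a Gaussian location model with a flat prior.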

Independence

One can extend the notion of independence of events to a pair of random variables, which represent a pair of algebras of events.

Suppose that $f_{X}(x) = f_{X_{1}}(x_{1}) f_{X_{2}}(x_{2})$. Then, $\forall E_{1}, E_{2} : Pr(X_{1} \in E_{1}, X_{2} \in E_{2}) = Pr(X_{1} \in E_{1}) Pr(X_{2} \in E_{2})$. In such a case, $X_{1}$ and $X_{2}$ are independent. This is denoted by $I(X_{1}, X_{2})$ or $X_{1} ⊥ X_{2}$.

Also, independence of events corresponds to independence of the corresponding indicator random variables: $A ⊥ B$ iff $I_{A} ⊥ I_{B}$.
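The factorization condition can be checked mechanically for a discrete joint pmf. The sketch below uses hypothetical helpers `marginals` and `is_independent` and two made-up joint pmfs, one that factorizes and one that does not.

```python
from fractions import Fraction
from itertools import product

def marginals(joint):
    """Return the pmfs of X1 and X2 from a joint pmf keyed by (x1, x2)."""
    f1, f2 = {}, {}
    for (x1, x2), p in joint.items():
        f1[x1] = f1.get(x1, Fraction(0)) + p
        f2[x2] = f2.get(x2, Fraction(0)) + p
    return f1, f2

def is_independent(joint):
    """Check f_X(x1, x2) == f_X1(x1) * f_X2(x2) for every pair of values."""
    f1, f2 = marginals(joint)
    return all(
        joint.get((x1, x2), Fraction(0)) == f1[x1] * f2[x2]
        for x1, x2 in product(f1, f2)
    )

# Two fair coins tossed separately: the joint is a product by construction.
product_joint = {(a, b): Fraction(1, 4) for a in (0, 1) for b in (0, 1)}
# A hypothetical joint that does not factorize.
dependent_joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 4),
}

print(is_independent(product_joint), is_independent(dependent_joint))
```

The check iterates over the full product of supports, so a pair of values missing from the joint counts as probability 0 rather than being skipped.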

Conditional Independence

Conditional: $X ⊥ Y | Z$ $\equiv f_{X Y | Z}(x, y | z) = f_{X | Z}(x | z) f_{Y | Z}(y | z)$ $\equiv f_{X | Y, Z}(x | y, z) = f_{X | Z}(x | z)$ $\equiv f_{X, Y, Z}(x, y, z) = \frac{f_{X, Z}(x, z) f_{Y, Z}(y, z)}{f_{Z}(z)}$.

Marginal: $X ⊥ Y$ is the special case $Z = \emptyset$.

Amongst sets of variables: $\{X_{i}\} ⊥ \{Y_{j}\} | \{Z_{k}\}$ iff $f_{X_{i} | Y_{j}, \{Z_{k}\}}(x_{i} | y_{j}, \{z_{k}\}) = f_{X_{i} | \{Z_{k}\}}(x_{i} | \{z_{k}\}) \forall i, j$.

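Marginal independence without conditional independence: for non-degenerate $X ⊥ Y$, one has $X \not\perp Y | X + Y$. Conditional independence without marginal independence: consider a suitable Bayesian network, e.g. one in which $Z$ is a common parent of $X$ and $Y$.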

Graphical models can be used to specify such independence relations.
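Both directions can be checked on small discrete examples, using the product form $f_{X, Y, Z} f_{Z} = f_{X, Z} f_{Y, Z}$ of the definition to avoid dividing by zero. Everything below is a hypothetical illustration: `coin_sum` takes $X, Y$ to be independent fair coins with $Z = X + Y$, and `common_cause` takes $Z$ to be a fair coin with $X$ and $Y$ as independent noisy copies of $Z$ (each equal to $Z$ with probability 3/4).

```python
from fractions import Fraction
from itertools import product

def cond_independent(joint):
    """Check f(x,y,z) * f(z) == f(x,z) * f(y,z) for a pmf keyed by (x, y, z)."""
    fz, fxz, fyz = {}, {}, {}
    for (x, y, z), p in joint.items():
        fz[z] = fz.get(z, 0) + p
        fxz[(x, z)] = fxz.get((x, z), 0) + p
        fyz[(y, z)] = fyz.get((y, z), 0) + p
    xs = {x for (x, _y, _z) in joint}
    ys = {y for (_x, y, _z) in joint}
    return all(
        joint.get((x, y, z), 0) * fz[z] == fxz.get((x, z), 0) * fyz.get((y, z), 0)
        for x, y, z in product(xs, ys, fz)
    )

def marginal_only(joint):
    """Sum out z and attach a constant dummy z, giving the marginal case."""
    out = {}
    for (x, y, _z), p in joint.items():
        out[(x, y, 0)] = out.get((x, y, 0), 0) + p
    return out

def noisy(copy, z):
    """Probability that a noisy copy of z takes the value `copy`."""
    return Fraction(3, 4) if copy == z else Fraction(1, 4)

# X, Y independent fair coins, Z = X + Y.
coin_sum = {(x, y, x + y): Fraction(1, 4) for x in (0, 1) for y in (0, 1)}
# Z fair coin; X and Y are independent noisy copies of Z.
common_cause = {
    (x, y, z): Fraction(1, 2) * noisy(x, z) * noisy(y, z)
    for x, y, z in product((0, 1), repeat=3)
}

print(cond_independent(marginal_only(coin_sum)), cond_independent(coin_sum))
print(cond_independent(common_cause), cond_independent(marginal_only(common_cause)))
```

In the first example, knowing $X + Y = 1$ forces $Y = 1 - X$, which is why conditioning destroys the independence. In the second, $X$ and $Y$ are dependent only through their shared parent $Z$.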