It is sometimes referred to as the **likelihood** of the data, and sometimes referred to as a **statistical model**. The difference is whether we are looking at \(p(x | \theta)\) as…

## a function of \(x\), where \(\theta\) is known

If \(\theta\) is a known model parameter, then \(p_x(x|\theta) = p(x; \theta) = p_\theta(x)\) is the probability of \(x\) according to a model parameterized by \(\theta\), also known as a model/statistical model/observation model measuring uncertainty about \(x\) given \(\theta\).

(If \(\theta\) is itself a random variable, \(p(x|\theta)\) is just a **conditional probability**, \(\frac{p(x, \theta)}{p(\theta)}\).)
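As a concrete sketch (a hypothetical Binomial observation model, not from the text): with \(\theta\) fixed, \(p(x|\theta)\) is a genuine probability distribution over \(x\), so its values sum to 1 across all possible outcomes.

```python
import math

# Hypothetical example: a Binomial(n=10, theta) observation model.
# With theta held fixed, p(x | theta) is a distribution over x.
def p_x_given_theta(x, theta, n=10):
    return math.comb(n, x) * theta**x * (1 - theta) ** (n - x)

theta = 0.3  # a known, fixed parameter
probs = [p_x_given_theta(x, theta) for x in range(11)]
print(sum(probs))  # sums to 1: it is a distribution over x
```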

## a function of \(\theta\), where \(x\) is known

Unlike the above, the emphasis is on investigating the unknown \(\theta\).

Here \(p(x|\theta)\) is the probability of the observed data \(x\) under the different values that \(\theta\) may take.

When doing MLE to find the assignment \(\hat{\theta}\) of \(\theta\) that maximizes the likelihood \(p(x|\theta)\), the maximized value \(p(x|\hat{\theta})\) is also called the **maximum likelihood of \(\theta\) given \(x\)**, written \(\mathcal L(\hat\theta|x)\).

In other words, it's a function of \(\theta\) (written more explicitly as \(p_\theta(x|\theta)\)) that measures the extent to which observed \(x\) supports particular values of \(\theta\) in a parametric model.
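A minimal MLE sketch under assumed data (a hypothetical sequence of Bernoulli coin flips, not from the text): with \(x\) fixed, we evaluate \(p(x|\theta)\) at many candidate values of \(\theta\) and keep the one that best supports the observations.

```python
import math

# Hypothetical example: observed coin flips; the likelihood
# L(theta | x) = p(x | theta) is a function of theta, with x fixed.
x = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]  # made-up data, 7 heads out of 10

def likelihood(theta, data):
    # Product of per-observation Bernoulli probabilities.
    return math.prod(theta if xi == 1 else 1 - theta for xi in data)

# Grid-search MLE: the theta-hat that maximizes p(x | theta).
grid = [i / 1000 for i in range(1, 1000)]
theta_hat = max(grid, key=lambda t: likelihood(t, x))
print(theta_hat)  # matches the closed-form Bernoulli MLE, the sample mean 0.7
```

For a Bernoulli model the closed-form maximizer is the sample mean, so the grid search recovers \(\hat\theta = 0.7\) here.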