Exponential Family of distributions

The probability distributions we have discussed in the previous chapter (and many other frequently used ones) are members of the exponential family. The concept of exponential families is credited to E. J. G. Pitman, G. Darmois, and B. O. Koopman in 1935–36.

Definition: An exponential family is a parametrized set $\{p_{\theta}(x) \mid \theta \in \mathbb{R}^k, x \in \mathbb{R}^d \}$ of
PMFs or PDFs that can be written in the following form:
\begin{eqnarray}
p_{\theta}(x)=\frac{h(x)}{Z(\theta)}\mathrm{exp}\left( \sum_{i=1}^m \eta_i(\theta) s_i(x) \right)
\end{eqnarray}
where
\begin{eqnarray}
\eta_i : \mathbb{R}^k &\rightarrow& \mathbb{R} \\
s_i : \mathbb{R}^d &\rightarrow& \mathbb{R} \\
h : \mathbb{R}^d &\rightarrow& [0,\infty) \\
Z : \mathbb{R}^k &\rightarrow& (0,\infty]
\end{eqnarray}
We have not yet discussed “sufficient statistics”, but the $s_i(x)$ will turn out to be the sufficient statistics of $p_{\theta}(x)$. The function $h(x)$ determines the support of the distribution (it is commonly called the base measure), and $Z(\theta)$ is known as the “partition function”.
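The definition above can be turned into a small evaluator. The following is a minimal sketch, not a reference implementation: the function name `exp_family_density` and the Poisson illustration (rate $\theta > 0$) are assumptions chosen here for demonstration.

```python
import math

def exp_family_density(x, theta, h, Z, etas, stats):
    """Evaluate h(x) / Z(theta) * exp(sum_i eta_i(theta) * s_i(x))."""
    exponent = sum(eta(theta) * s(x) for eta, s in zip(etas, stats))
    return h(x) / Z(theta) * math.exp(exponent)

# Illustration (assumed example): Poisson PMF with rate theta > 0.
poisson_h = lambda k: 1.0 / math.factorial(k) if k >= 0 else 0.0  # h(k) = 1/k!
poisson_Z = lambda theta: math.exp(theta)                         # Z(theta) = e^theta
poisson_etas = [math.log]                                         # eta(theta) = log(theta)
poisson_stats = [lambda k: k]                                     # s(k) = k

def poisson_pmf(k, theta):
    """Poisson PMF assembled from its exponential-family ingredients."""
    return exp_family_density(k, theta, poisson_h, poisson_Z,
                              poisson_etas, poisson_stats)
```

Calling `poisson_pmf(2, 3.0)` should agree with the direct formula $3^2 e^{-3}/2!$ up to rounding; any other member of the family can be plugged in by supplying its own `h`, `Z`, `etas`, and `stats`.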
During our discussion of exponential families, we will need the following functions:

Definition: For $x \in \mathbb{R}$,
\begin{equation}
R_{[a,b]}(x)=
\begin{cases}
&1, \qquad x \in [a,b]\\
&0, \qquad \mathrm{Otherwise}\\
\end{cases}
\end{equation}
Definition: For $x \in \mathbb{Z}$ (the integers),
\begin{equation}
I_{[a,b]}(x)=
\begin{cases}
&1, \qquad x \in [a,b]\\
&0, \qquad \mathrm{Otherwise}\\
\end{cases}
\end{equation}
Example 1: Bernoulli trials belong to the exponential family. The PMF of a Bernoulli trial is
\begin{equation}
P_{\theta}(x)=
\begin{cases}
&1-\theta, \qquad x=0\\
&\theta, \qquad x=1\\
&0, \qquad \mathrm{Otherwise}\\
\end{cases}
\end{equation}
or, written more compactly
\begin{eqnarray}
P_{\theta}(x)= \theta^{x}(1-\theta)^{1-x} I_{[0,1]}(x) R_{[0,1]}(\theta)
\end{eqnarray}
which can be rewritten as
\begin{eqnarray}
P_{\theta}(x)&=&\mathrm{exp}\left[ \mathrm{log} \left( \theta^{x}(1-\theta)^{1-x} \right) \right]I_{[0,1]}(x) R_{[0,1]}(\theta)\\
&=&\mathrm{exp}\left[ \mathrm{log}\, \theta^{x}+\mathrm{log}(1-\theta)^{1-x} \right]I_{[0,1]}(x) R_{[0,1]}(\theta)\\
&=&\mathrm{exp}\left[ x\, \mathrm{log}\, \theta+ (1-x)\,\mathrm{log}(1-\theta) \right]I_{[0,1]}(x) R_{[0,1]}(\theta)
\end{eqnarray}
Hence
\begin{eqnarray}
\eta_1(\theta)&=&\mathrm{log}(\theta)\\
\eta_2(\theta)&=&\mathrm{log}(1-\theta)\\
s_1(x)&=&x\\
s_2(x)&=&1-x\\
h(x)&=&I_{[0,1]}(x)\\
\frac{1}{Z(\theta)}&=&R_{[0,1]}(\theta)
\end{eqnarray}
The exponential-family representation of a distribution is not unique.

Example 2: It is possible to write a different exponential representation for the Bernoulli PMF:
\begin{eqnarray}
P_{\theta}(x)&=&\theta^x(1-\theta)^{1-x}I_{[0,1]}(x) R_{[0,1]}(\theta) \\
&=& (1-\theta)\left(\frac{\theta}{1-\theta}\right)^x I_{[0,1]}(x) R_{[0,1]}(\theta)\\
&=& (1-\theta)\mathrm{exp}\left[x\mathrm{log}\left(\frac{\theta}{1-\theta}\right)\right] I_{[0,1]}(x) R_{[0,1]}(\theta)
\end{eqnarray}
which will yield quite different exponential family parameters:
\begin{eqnarray}
\eta(\theta)&=&\mathrm{log}\left(\frac{\theta}{1-\theta}\right)\\
s(x)&=&x\\
h(x)&=&I_{[0,1]}(x)\\
\frac{1}{Z(\theta)}&=&(1-\theta)R_{[0,1]}(\theta)
\end{eqnarray}
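As a quick numerical sanity check of Example 2, the sketch below compares the direct Bernoulli PMF with its single-statistic exponential form. The function names are hypothetical, and the code assumes $0 < \theta < 1$ so that the log-odds are finite.

```python
import math

def bernoulli_direct(x, theta):
    """theta^x (1 - theta)^(1 - x) for x in {0, 1}, and 0 otherwise."""
    if x not in (0, 1):
        return 0.0
    return theta**x * (1.0 - theta)**(1 - x)

def bernoulli_exp_form(x, theta):
    """(1/Z(theta)) h(x) exp(eta(theta) s(x)) with s(x) = x.

    Assumes 0 < theta < 1 so that the log-odds eta(theta) is finite.
    """
    h = 1.0 if x in (0, 1) else 0.0         # h(x) = I_[0,1](x)
    inv_Z = 1.0 - theta                     # 1/Z(theta) = 1 - theta
    eta = math.log(theta / (1.0 - theta))   # eta(theta) = log-odds
    return inv_Z * h * math.exp(eta * x)
```

For any $\theta$ strictly between 0 and 1, the two functions should return the same values for $x = 0$ and $x = 1$ up to floating-point rounding, and 0 for any other integer $x$.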

Example 3: The binomial PMF (with the number of trials $n$ fixed) is a member of the exponential family. It can be written as
\begin{eqnarray}
P(X=k)&=&\binom{n}{k}\theta^k (1-\theta)^{n-k}I_{[0,n]}(k) R_{[0,1]}(\theta) \\
&=&\binom{n}{k}(1-\theta)^n \exp\left( k\log\left[\frac{\theta}{1-\theta} \right]\right)I_{[0,n]}(k) R_{[0,1]}(\theta) \nonumber
\end{eqnarray}
where
\begin{eqnarray}
\eta(\theta)&=&\mathrm{log}\left(\frac{\theta}{1-\theta}\right)\\
s(k)&=&k\\
h(k)&=&\binom{n}{k}I_{[0,n]}(k)\\
\frac{1}{Z(\theta)}&=&(1-\theta)^n R_{[0,1]}(\theta)
\end{eqnarray}
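The role of $Z(\theta)$ as a normalizer in Example 3 can be checked numerically. This is a minimal sketch with hypothetical function names, assuming $0 < \theta < 1$: summing $h(k)\exp(\eta(\theta)k)$ over the support should reproduce $Z(\theta) = (1-\theta)^{-n}$.

```python
import math

def binomial_exp_form(k, n, theta):
    """h(k) / Z(theta) * exp(eta(theta) * k) for the binomial PMF (0 < theta < 1)."""
    if not 0 <= k <= n:
        return 0.0
    h = math.comb(n, k)                     # h(k) = C(n, k) on the support 0..n
    inv_Z = (1.0 - theta) ** n              # 1/Z(theta) = (1 - theta)^n
    eta = math.log(theta / (1.0 - theta))   # eta(theta) = log-odds
    return h * inv_Z * math.exp(eta * k)

def partition_function(n, theta):
    """Z(theta) obtained by summing h(k) * exp(eta(theta) * k) over k = 0..n."""
    eta = math.log(theta / (1.0 - theta))
    return sum(math.comb(n, k) * math.exp(eta * k) for k in range(n + 1))
```

With, say, `n = 5` and `theta = 0.3`, `partition_function(5, 0.3)` should match `(1 - 0.3) ** -5` up to rounding, and the values of `binomial_exp_form` over `k = 0..5` should sum to 1.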

Example 4: The geometric PMF is a member of the exponential family:
\begin{eqnarray}
P(k)&=&(1-\theta)^{k-1}\theta I_{[1,\infty]}(k) R_{[0,1]}(\theta)\\
&=& \exp\left( (k-1)\log(1-\theta)\right)\theta I_{[1,\infty]}(k) R_{[0,1]}(\theta)
\end{eqnarray}
Here
\begin{eqnarray}
\eta(\theta)&=&\mathrm{log}\left(1-\theta\right)\\
s(k)&=&k-1\\
h(k)&=&I_{[1,\infty]}(k)\\
\frac{1}{Z(\theta)}&=&\theta R_{[0,1]}(\theta)
\end{eqnarray}

Example 5: The Pascal PMF (with $r$ fixed) is a member of the exponential family:
\begin{eqnarray}
p_{L_r}(k) &=&\binom{k-1}{r-1}\theta^r (1-\theta)^{k-r} I_{[r,\infty]}(k) R_{[0,1]}(\theta)\\
&=& \binom{k-1}{r-1}\left(\frac{\theta}{1-\theta} \right)^r \exp( k \log(1-\theta))I_{[r,\infty]}(k) R_{[0,1]}(\theta) \nonumber
\end{eqnarray}
where
\begin{eqnarray}
\eta(\theta)&=&\mathrm{log}\left(1-\theta\right)\\
s(k)&=&k\\
h(k)&=&\binom{k-1}{r-1}I_{[r,\infty]}(k)\\
\frac{1}{Z(\theta)}&=&\left(\frac{\theta}{1-\theta} \right)^r R_{[0,1]}(\theta)
\end{eqnarray}
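The factorization in Example 5 can be checked against the direct Pascal PMF, and setting $r = 1$ should recover the geometric PMF of Example 4. The sketch below uses hypothetical function names, assumes $0 < \theta < 1$, and takes $h(k)$ to vanish for $k < r$, as in the PMF itself.

```python
import math

def pascal_direct(k, r, theta):
    """C(k-1, r-1) theta^r (1 - theta)^(k - r) for k >= r, and 0 otherwise."""
    if k < r:
        return 0.0
    return math.comb(k - 1, r - 1) * theta**r * (1.0 - theta) ** (k - r)

def pascal_exp_form(k, r, theta):
    """h(k) / Z(theta) * exp(eta(theta) * k) with eta = log(1 - theta), s(k) = k."""
    if k < r:
        return 0.0                               # h(k) vanishes below the support
    h = math.comb(k - 1, r - 1)
    inv_Z = (theta / (1.0 - theta)) ** r         # 1/Z(theta)
    return h * inv_Z * math.exp(k * math.log(1.0 - theta))

def geometric_direct(k, theta):
    """(1 - theta)^(k - 1) theta for k >= 1, and 0 otherwise."""
    return (1.0 - theta) ** (k - 1) * theta if k >= 1 else 0.0
```

For a fixed `r` the two Pascal functions should agree for every integer `k`, and `pascal_exp_form(k, 1, theta)` should match `geometric_direct(k, theta)`.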

Example 6: The Poisson PMF is a member of the exponential family:
\begin{eqnarray}
P_X(k)&=&\frac{\theta^k e^{-\theta}}{k!} R_{[0,\infty]}(\theta)I_{[0,\infty]}(k)\\
&=&\frac{e^{k\log(\theta)} e^{-\theta}}{k!} R_{[0,\infty]}(\theta)I_{[0,\infty]}(k)
\end{eqnarray}
where
\begin{eqnarray}
\eta(\theta)&=&\mathrm{log}\left(\theta\right)\\
s(k)&=&k\\
h(k)&=&\frac{1}{k!} I_{[0,\infty]}(k)\\
\frac{1}{Z(\theta)}&=&e^{-\theta} R_{[0,\infty]}(\theta)
\end{eqnarray}

Example 7: The exponential PDF is a member of the exponential family:\footnote{Note that in the exponential PDF both $x$ and $\theta$ must be positive, which is enforced by the unit step functions.}
\begin{eqnarray}
p_{\theta}(x)=\theta e^{-\theta x}U(x)U(\theta)
\end{eqnarray}
Here $\eta(\theta)=\theta$, $s(x)=-x$, $h(x)=U(x)$, and $\frac{1}{Z(\theta)}=\theta U(\theta)$.
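For a continuous member such as Example 7, normalization is an integral rather than a sum. The sketch below is illustrative only: the function names and the midpoint-rule integrator are assumptions made for this check. It evaluates the exponential PDF from its exponential-family ingredients and integrates it numerically.

```python
import math

def exponential_exp_form(x, theta):
    """(1/Z(theta)) h(x) exp(eta(theta) s(x)) with eta = theta and s(x) = -x."""
    if x <= 0 or theta <= 0:
        return 0.0                 # the unit steps U(x) and U(theta) vanish here
    inv_Z = theta                  # 1/Z(theta) = theta
    return inv_Z * math.exp(theta * (-x))

def midpoint_integral(f, a, b, n=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / n
    return width * sum(f(a + (i + 0.5) * width) for i in range(n))
```

For example, `midpoint_integral(lambda x: exponential_exp_form(x, 2.0), 0.0, 30.0)` should be very close to 1, since the tail beyond 30 is negligible for $\theta = 2$.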

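Example 8: The Erlang PDF (with the shape parameter $r$ a fixed positive integer) is a member of the exponential family:
\begin{eqnarray}
p_{\theta}(x)=\frac{\theta^r x^{r-1}}{(r-1)!} e^{-\theta x}U(x)U(\theta)
\end{eqnarray}
Here $\eta(\theta)=\theta$, $s(x)=-x$, $h(x)=\frac{x^{r-1}}{(r-1)!}U(x)$, and $\frac{1}{Z(\theta)}=\theta^r U(\theta)$.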

Definition: An exponential family is said to be in natural or canonical form when $\eta_i(\theta)=\theta_i$. It is always possible to put an exponential family into natural/canonical form by reparametrizing, that is, by taking the $\eta_i$ themselves as the new parameters.
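As an illustration of the natural form, the Bernoulli family can be reparametrized by its log-odds $\eta = \log\left(\frac{\theta}{1-\theta}\right)$, which gives $p_{\eta}(x) = e^{\eta x}/(1+e^{\eta})$ for $x \in \{0,1\}$. The sketch below uses hypothetical function names and assumes $0 < \theta < 1$.

```python
import math

def natural_parameter(theta):
    """Map the Bernoulli success probability (0 < theta < 1) to eta = log-odds."""
    return math.log(theta / (1.0 - theta))

def bernoulli_natural(x, eta):
    """Bernoulli PMF in natural form: exp(eta * x) / Z(eta), Z(eta) = 1 + exp(eta)."""
    if x not in (0, 1):
        return 0.0
    return math.exp(eta * x) / (1.0 + math.exp(eta))
```

With `eta = natural_parameter(0.3)`, `bernoulli_natural(1, eta)` should return approximately 0.3 and `bernoulli_natural(0, eta)` approximately 0.7, so the reparametrized family describes the same distributions.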

Example 9: A counterexample is the uniform distribution on $(0,\theta)$, where $\theta$ is the parameter; it does not belong to the exponential family. In an exponential family, the parameter does not govern the support of the distribution; the support is determined by $h(x)$.

Example 10: A single uniform distribution on $(0,a)$, where $a$ is a constant, is trivially an “exponential family” with a single
member, as it does not have a parameter.

Example 11: The Poisson PMF with mean $\lambda t$ constitutes an exponential family:
\begin{eqnarray}
P(x)=\frac{e^{-\lambda t} }{x!} e^{x \ln (\lambda t)}U(x)U(\lambda t)
\end{eqnarray}

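Example 12: The Gaussian PDF with mean $\mu$ and variance $\sigma^2$ constitutes an exponential family:
\begin{eqnarray}
f(x)&=&\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)\\
&=&\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{\mu^2}{2\sigma^2}\right)\exp\left(\frac{\mu}{\sigma^2}x-\frac{1}{2\sigma^2}x^2\right)
\end{eqnarray}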

Conjugate Priors

In a certain sense, exponential families are the only families of distributions that admit conjugate priors, which are very important in Bayesian statistics.
Exponential families also arise as solutions to certain types of maximum entropy problems.
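To illustrate the remark on conjugate priors, consider the standard Beta prior for the Bernoulli parameter $\theta$: after observing Bernoulli outcomes, the posterior is again a Beta distribution with updated parameters. The sketch below is illustrative only; the function names and the Beta-Bernoulli pairing are chosen here as an example and are not taken from the text above.

```python
def beta_bernoulli_update(a, b, observations):
    """Posterior Beta parameters after observing a list of 0/1 outcomes."""
    successes = sum(observations)
    failures = len(observations) - successes
    return a + successes, b + failures

def beta_kernel(theta, a, b):
    """Unnormalized Beta(a, b) density: theta^(a-1) (1 - theta)^(b-1)."""
    return theta ** (a - 1) * (1.0 - theta) ** (b - 1)

def bernoulli_likelihood(theta, observations):
    """Product of theta^x (1 - theta)^(1 - x) over the observations."""
    result = 1.0
    for x in observations:
        result *= theta**x * (1.0 - theta) ** (1 - x)
    return result
```

For a prior `Beta(2, 2)` and observations `[1, 0, 1, 1]`, the update returns `(5, 3)`, and the prior kernel times the likelihood equals the `Beta(5, 3)` kernel at every $\theta$, which is what conjugacy means in this case.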