
LECTURE 11: EXPONENTIAL FAMILY AND GENERALIZED LINEAR MODELS

HANI GOODARZI AND SINA JAFARPOUR

1. EXPONENTIAL FAMILY.
The exponential family comprises a set of flexible distributions spanning both continuous and discrete random variables. The members of this family share many important properties, which makes it worthwhile to discuss them in a general format. Many of the probability distributions that we have studied so far are specific members of this family:
• Gaussian: R^p
• Multinomial: categorical
• Bernoulli: binary {0, 1}
• Binomial: counts of successes/failures
• von Mises: sphere
• Gamma: R^+
• Poisson: non-negative integers
• Laplace: R
• Exponential: R^+
• Beta: (0, 1)
• Dirichlet: ∆ (simplex)
• Weibull: R^+
• Wishart: symmetric positive-definite matrices
All these distributions follow the general format:

(1)   p(x|\eta) = h(x) \exp\{\eta^\top t(x) - a(\eta)\},

where \eta is called the "natural parameter", t(x) is the "sufficient statistic" (a statistic is a function of the data), h(x) is the "underlying measure", and a(\eta) is called the "log normalizer", which ensures that the distribution integrates to one. Hence,

a(\eta) = \log \int h(x) \exp\{\eta^\top t(x)\}\, dx.

We start by showcasing a number of known distributions and illustrate that they are indeed members of the exponential family.
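As a concrete aid, the short sketch below (our illustration, not part of the lecture; the helper name is hypothetical) evaluates a generic exponential-family log-density directly from user-supplied h, t, and a, exactly as in Equation (1).

```python
# Minimal sketch: log p(x|eta) = log h(x) + eta . t(x) - a(eta).
# Only NumPy is assumed; all names are illustrative.
import numpy as np

def log_density(x, eta, h, t, a):
    """Generic exponential-family log-density with base measure h,
    sufficient statistic t, and log normalizer a."""
    eta = np.atleast_1d(np.asarray(eta, dtype=float))
    tx = np.atleast_1d(np.asarray(t(x), dtype=float))
    return np.log(h(x)) + eta @ tx - a(eta)
```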

1.1. Bernoulli. The Bernoulli distribution is defined on a binary (0 or 1) random variable using the parameter \pi, where \pi = Pr(x = 1). The Bernoulli distribution can be written as:

(2)   p(x|\pi) = \pi^x (1 - \pi)^{1-x}.

In order to convert Equation (2) to the general exponential format (Equation (1)), we rewrite it as

(3)   p(x|\pi) = \exp\{\log(\pi^x (1 - \pi)^{1-x})\}
              = \exp\{x \log \pi + (1 - x) \log(1 - \pi)\}
              = \exp\Big\{x \log \frac{\pi}{1 - \pi} + \log(1 - \pi)\Big\}.
In Equation (3),
• \eta = \log \frac{\pi}{1 - \pi},
• t(x) = x,
• a(\eta) = -\log(1 - \pi),
• and h(x) = 1.
To put a(\eta) in its correct form (as a function of \eta), we use the relationship between \eta and \pi:

(4)   \eta = \log \frac{\pi}{1 - \pi}
      \;\Rightarrow\; -\eta = \log \frac{1 - \pi}{\pi} = \log\Big(\frac{1}{\pi} - 1\Big)
      \;\Rightarrow\; e^{-\eta} = \frac{1}{\pi} - 1
      \;\Rightarrow\; \pi = \frac{1}{1 + e^{-\eta}} = \sigma(\eta).

Consequently,

(5)   a(\eta) = \log(1 + e^{\eta}),

and

(6)   p(x|\eta) = \sigma(-\eta)\, e^{\eta x}.
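A quick numerical sanity check (our sketch, not from the lecture; plain NumPy assumed) that the exponential-family form in Equation (6) reproduces the standard Bernoulli pmf of Equation (2):

```python
import numpy as np

def sigma(z):
    # logistic function sigma(z) = 1 / (1 + e^{-z})
    return 1.0 / (1.0 + np.exp(-z))

pi = 0.3
eta = np.log(pi / (1 - pi))                       # natural parameter, Eq. (4)

for x in (0, 1):
    standard = pi**x * (1 - pi)**(1 - x)          # Eq. (2)
    exp_family = sigma(-eta) * np.exp(eta * x)    # Eq. (6)
    assert np.isclose(standard, exp_family)
```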

1.2. Multinomial. Although not discussed in class, it is important to see this process for the multinomial distribution as well. While the Bernoulli is defined with the single parameter \pi, the multinomial has a vector of parameters \mu_k, where k goes from 1 to M:

p(x|\mu) = \prod_{k=1}^{M} \mu_k^{x_k} = \exp\Big\{\sum_{k=1}^{M} x_k \log \mu_k\Big\},

where x = (x_1, x_2, \cdots, x_M)^\top and \sum_{k=1}^{M} \mu_k = 1. Because of this sum-to-one constraint only M - 1 of the parameters are free; following the same process as for the Bernoulli, we have:

p(x|\eta) = \exp\Big\{\eta^\top x + \log\Big(1 + \sum_{k=1}^{M-1} e^{\eta_k}\Big)^{-1}\Big\},

where

(7)   \mu_k = \frac{e^{\eta_k}}{1 + \sum_j e^{\eta_j}} = \mathrm{softmax}(k, \eta).
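The softmax map in Equation (7) is easy to check numerically; the sketch below (illustrative, not from the lecture) converts the M − 1 free natural parameters into a full length-M mean vector:

```python
import numpy as np

def softmax_mean(eta):
    """mu_k = exp(eta_k) / (1 + sum_j exp(eta_j)) for the M-1 free parameters;
    the reference category gets the remaining mass 1 / (1 + sum_j exp(eta_j))."""
    eta = np.asarray(eta, dtype=float)
    denom = 1.0 + np.exp(eta).sum()
    mu = np.exp(eta) / denom
    return np.append(mu, 1.0 / denom)   # length M, sums to one

print(softmax_mean([0.5, -1.0, 2.0]))   # a four-category example
```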

1.3. Poisson. The Poisson is a discrete distribution that expresses the number of events occurring in a unit of time or space. This distribution, which plays a role similar to the Gaussian but for count data, is given by

(8)   p(x|\lambda) = \frac{\lambda^x e^{-\lambda}}{x!} = \frac{1}{x!} \exp\{x \log \lambda - \lambda\},

where
• \eta = \log \lambda,
• t(x) = x,
• a(\eta) = \lambda = e^{\eta},
• and h(x) = \frac{1}{x!}.
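As with the Bernoulli, the rewriting in Equation (8) can be verified numerically (our sketch; only the Python standard library is assumed):

```python
import math

lam = 2.5
eta = math.log(lam)                     # natural parameter
for x in range(6):
    standard = lam**x * math.exp(-lam) / math.factorial(x)
    exp_family = math.exp(x * eta - math.exp(eta)) / math.factorial(x)
    assert math.isclose(standard, exp_family)
```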

1.4. Univariate Gaussian. Similarly, the Gaussian distribution can also be rewritten in the general exponential format:

(9)   p(x|\mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\Big\{\frac{-(x - \mu)^2}{2\sigma^2}\Big\}
                        = \frac{1}{\sqrt{2\pi}} \exp\Big\{\frac{\mu}{\sigma^2} x - \frac{1}{2\sigma^2} x^2 - \frac{1}{2\sigma^2} \mu^2 - \log \sigma\Big\},

where
• \eta = \langle \frac{\mu}{\sigma^2}, \frac{-1}{2\sigma^2} \rangle,
• t(x) = \langle x, x^2 \rangle,
• h(x) = \frac{1}{\sqrt{2\pi}},
• and a(\eta) = \frac{\mu^2}{2\sigma^2} + \log \sigma = \frac{-\eta_1^2}{4\eta_2} - \frac{1}{2} \log(-2\eta_2).
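The two expressions for a(\eta) above agree, which is easy to confirm numerically (our sketch, assuming NumPy):

```python
import numpy as np

mu, sigma2 = 1.5, 0.7
eta1, eta2 = mu / sigma2, -1.0 / (2 * sigma2)     # natural parameters

a_from_eta = -eta1**2 / (4 * eta2) - 0.5 * np.log(-2 * eta2)
a_from_moments = mu**2 / (2 * sigma2) + 0.5 * np.log(sigma2)   # log(sigma) = 0.5 log(sigma^2)
assert np.isclose(a_from_eta, a_from_moments)
```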

2. MOMENTS OF THE EXPONENTIAL FAMILY.

In the exponential family, the function a(\eta) is in fact the cumulant generating function: its derivatives yield the moments of the sufficient statistic. We show this by differentiating this term:

(10)  \frac{d\, a(\eta)}{d\eta} = \frac{d}{d\eta} \log \int \exp\{\eta^\top t(x)\} h(x)\, dx
      = \frac{\int \frac{d}{d\eta} \exp\{\eta^\top t(x)\}\, h(x)\, dx}{\int \exp\{\eta^\top t(x)\} h(x)\, dx}
      = \frac{\int t(x) \exp\{\eta^\top t(x)\} h(x)\, dx}{\int \exp\{\eta^\top t(x)\} h(x)\, dx}
      = \frac{\int t(x) \exp\{\eta^\top t(x)\} h(x)\, dx}{\exp\{a(\eta)\}}
      = \int t(x) \exp\{\eta^\top t(x) - a(\eta)\} h(x)\, dx
      = \mathbb{E}[t(x)].
Likewise, it can be shown that:

(11)  \frac{d^2 a(\eta)}{d\eta^2} = \mathrm{Var}(t(x)) = \mathbb{E}[t(x)^2] - \mathbb{E}[t(x)]^2.
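Equations (10) and (11) can be checked with finite differences; the sketch below (illustrative only) does so for the Bernoulli log normalizer a(\eta) = \log(1 + e^{\eta}), whose mean and variance are \pi and \pi(1 - \pi):

```python
import numpy as np

def a(eta):
    return np.log(1 + np.exp(eta))      # Bernoulli log normalizer

eta, eps = 0.4, 1e-4
pi = 1.0 / (1.0 + np.exp(-eta))         # E[t(x)] = E[x] = pi

da = (a(eta + eps) - a(eta - eps)) / (2 * eps)               # ~ mean, Eq. (10)
d2a = (a(eta + eps) - 2 * a(eta) + a(eta - eps)) / eps**2    # ~ variance, Eq. (11)

assert np.isclose(da, pi, atol=1e-6)
assert np.isclose(d2a, pi * (1 - pi), atol=1e-6)
```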

For example, in the Bernoulli distribution we have a(\eta) = \log(1 + e^{\eta}) as the cumulant generating function. The first derivative of this function is given by

(12)  \frac{d\, a(\eta)}{d\eta} = \frac{\frac{d}{d\eta}(1 + e^{\eta})}{1 + e^{\eta}} = \frac{e^{\eta}}{1 + e^{\eta}} = \frac{1}{1 + e^{-\eta}} = \pi = \mathbb{E}[x].

In this context, \mu, defined as \mathbb{E}[t(x)], can be computed from \frac{d\, a(\eta)}{d\eta}, which is solely a function of \eta. This relationship connects \mu and \eta, and since a(\eta) is convex (i.e., its second derivative, being a variance, is greater than 0), the relationship is invertible. Thus we can define

(13)  \eta = \Psi(\mu),

where \Psi is a function that maps the mean parameter to the natural (canonical) parameter.
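For concreteness, here is what \Psi looks like for two members of the family (a small sketch of ours, not the lecture's notation):

```python
import numpy as np

def psi_bernoulli(mu):
    # inverse of mu = sigma(eta): the logit function
    return np.log(mu / (1 - mu))

def psi_poisson(mu):
    # inverse of mu = e^eta: the log function
    return np.log(mu)

print(psi_bernoulli(0.25))   # eta for a Bernoulli with mean 0.25
print(psi_poisson(3.0))      # eta for a Poisson with mean 3
```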

3. GENERALIZED LINEAR MODELS

The generalized linear model (GLM) is a powerful generalization of linear regression to responses drawn from a general exponential-family distribution. Figure 1 shows the graphical model representation of a generalized linear model. The model is based on the following assumptions:

FIGURE 1. Representation of a generalized linear model

• The observed input enters the model through a linear function, \beta^\top X.
• The conditional mean of the response is represented as a function of this linear combination:

(14)  \mathbb{E}[Y|X] = \mu = f(\beta^\top X).

• The observed response is drawn from an exponential-family distribution with conditional mean \mu, which is tied to the natural parameter through \Psi as in Equation (13).
Figure 2 summarizes the relationships between the variables in a GLM.

FIGURE 2. Relationship between the variables in a generalized linear model

It is usually convenient to work with overdispersed exponential families. We assume that the observed response comes from the following probability distribution:
(15)  p(y|\eta) = h(y, \sigma) \exp\Big\{\frac{\eta^\top y - a(\eta)}{\sigma}\Big\}.

For a fixed \sigma, Equation (15) is an exponential family, but as a function of \sigma it is not an exponential family, since h is a function of both y and \sigma.
As a simple example, in the case of linear regression (where \sigma plays the role of the dispersion, i.e. the Gaussian variance):
• h(y, \sigma) = \frac{1}{\sqrt{2\pi\sigma}} \exp\Big\{\frac{-y^2}{2\sigma}\Big\},
• a(\eta) = \frac{\eta^2}{2},
• f: identity,
• \Psi: identity.

Consequently,

(16)  p(y|\eta) = \frac{1}{\sqrt{2\pi\sigma}} \exp\Big\{\frac{-y^2}{2\sigma}\Big\} \exp\Big\{\frac{\eta y - \eta^2/2}{\sigma}\Big\}
               = \frac{1}{\sqrt{2\pi\sigma}} \exp\Big\{\frac{-(y - \eta)^2}{2\sigma}\Big\}.
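A quick numerical check (our sketch; \sigma here denotes the dispersion, which for the Gaussian equals the variance) that the two sides of Equation (16) agree:

```python
import numpy as np

y, eta, sigma = 0.8, 0.3, 1.7           # sigma is the dispersion (Gaussian variance)

h = np.exp(-y**2 / (2 * sigma)) / np.sqrt(2 * np.pi * sigma)
overdispersed = h * np.exp((eta * y - eta**2 / 2) / sigma)    # first form in Eq. (16)
gaussian = np.exp(-(y - eta)**2 / (2 * sigma)) / np.sqrt(2 * np.pi * sigma)
assert np.isclose(overdispersed, gaussian)
```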
Generally, we have two choice points in specifying a generalized linear model: the choice of the response function f, i.e. how to treat the linear combination of the observed input, and the choice of the exponential-family distribution of the observed output y. Note that \Psi is completely determined by the choice of exponential family. As a result, choosing an appropriate response function and exponential family is one of the major tasks in probabilistic modeling; once these choices are made, the general framework of the exponential family can be applied to the modeled data.
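To make the two choice points concrete, the sketch below (our illustration on synthetic data, not code from the lecture) picks the Bernoulli family with its canonical sigmoid response function and fits \beta by gradient ascent on the log-likelihood:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                            # observed inputs
beta_true = np.array([1.0, -2.0, 0.5])
y = rng.binomial(1, 1 / (1 + np.exp(-X @ beta_true)))    # Bernoulli responses

def f(z):
    # response function: canonical sigmoid, so E[y|x] = f(beta^T x) as in Eq. (14)
    return 1 / (1 + np.exp(-z))

beta = np.zeros(3)
for _ in range(2000):
    mu = f(X @ beta)                                     # conditional means
    beta += 0.1 * X.T @ (y - mu) / len(y)                # gradient of the log-likelihood

print(beta)    # roughly recovers beta_true
```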
