
Generalized linear models (GLMs) are used to do regression modelling for non-normal data with a minimum of extra complication compared with normal linear regression. GLMs are flexible enough to include a wide range of common situations, but at the same time allow most of the familiar ideas of normal linear regression to carry over.

**The normal linear model.**
Let **y** be a vector of observations, and let X be a matrix of
covariates. The usual multiple regression model takes the form

(1) **m** = X**b**

where **m** = E(**y**) and **b** is a vector of regression coefficients. Typically we
assume that the y_{i} are normal and independent with standard deviation s, so that we estimate **b**
by minimizing the sum of squares

(**y** - **m**)^{T}(**y** - **m**)
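As a minimal sketch of the least-squares fit described above (not from the original article; assumes NumPy and simulated data), the sum of squares (**y** - X**b**)^{T}(**y** - X**b**) is minimized by an ordinary least-squares solve:

```python
import numpy as np

# Simulated data: n observations, an intercept plus two covariates.
rng = np.random.default_rng(0)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
b_true = np.array([1.0, 2.0, -0.5])  # hypothetical "true" coefficients
y = X @ b_true + rng.normal(scale=0.1, size=n)

# Minimize (y - Xb)^T (y - Xb): the least-squares solution.
b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b_hat)  # should be close to b_true
```

With small noise, the recovered coefficients are close to the ones used to simulate the data.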

**Why is linearity not enough?**
The most important and common case is that in which the y_{i} and m_{i} are bounded. For example, if y represents the amount of
some physical substance then we may have y > 0 and m > 0.
On the other hand, if y is binary, say y = 1 if an animal survives and y = 0 if it does not,
then 0 < m < 1. The linear model (1) is inadequate in
these cases because complicated and unnatural constraints on **b**
would be required to ensure that **m** stays in
the possible range. Generalized linear models instead assume a *link linear*
relationship

(2) g(**m**) = X**b**

where g() is some known monotonic function which acts pointwise on **m**. Typically g() is used to transform the m_{i} to a scale on which they are unconstrained. For example,
we might use g(m) = log(m) if m > 0, or g(m) = log[ m / (1-m) ] if 0 < m < 1.
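The two link functions just mentioned, and their inverses, can be written out directly (a small sketch, not part of the original article; assumes NumPy):

```python
import numpy as np

# Log link: maps m > 0 onto the whole real line.
def log_link(m):
    return np.log(m)

# Logit link: maps 0 < m < 1 onto the whole real line.
def logit_link(m):
    return np.log(m / (1 - m))

# Inverse links recover m from the linear predictor eta = Xb,
# guaranteeing m > 0 and 0 < m < 1 respectively for any eta.
def inv_log(eta):
    return np.exp(eta)

def inv_logit(eta):
    return 1 / (1 + np.exp(-eta))
```

Because the inverse links map any real-valued linear predictor back into the valid range, no constraints on **b** are needed.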

**Why is normality not enough?**
In some situations, typically when s is small, the
normal approximation to the distribution of **y** is accurate. More
often, though, responses are not normal.

If y is bounded, then the variance of y must depend on its mean. Specifically, if m is close to a boundary for y, then var(y) must also be small. For example, if y > 0, then we must have var(y) -> 0 as m -> 0. For this reason, strictly positive data almost always show increasing variability with increasing size. If 0 < y < 1, then var(y) -> 0 as m -> 0 or m -> 1. For this reason, generalized linear models assume that

(3) var(y) = f V(m)

where V() is some known variance function appropriate for the data at hand, and f is a dispersion parameter.
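As an illustration (not from the original article), the standard GLM families pair each data type with a variance function matching the boundary behaviour described above:

```python
def V_poisson(m):
    # Counts (y >= 0): variance equals the mean, so var -> 0 as m -> 0.
    return m

def V_binomial(m):
    # Proportions (0 < y < 1): var -> 0 as m -> 0 or m -> 1.
    return m * (1 - m)

def V_gamma(m):
    # Positive continuous data (y > 0): variability grows with size.
    return m ** 2
```

Each V() vanishes exactly where the corresponding response is forced toward a boundary.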

We therefore estimate the nonlinear regression equation (2) by weighting the observations
inversely according to their variance functions V(m_{i}).
This weighting procedure turns out to be exactly equivalent to maximum likelihood
estimation when the observations actually come from an exponential family distribution.
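The weighted estimation scheme is usually carried out by iteratively reweighted least squares (IRLS). The following is a minimal sketch for the Poisson case with log link, where the working weight (dm/deta)^2 / V(m) simplifies to m (assumes NumPy and simulated data; not code from the original article):

```python
import numpy as np

def irls_poisson(X, y, n_iter=25):
    """Fit a Poisson GLM with log link by iteratively reweighted
    least squares: repeated weighted least-squares solves with
    weights derived from the variance function V(m) = m."""
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ b
        m = np.exp(eta)            # inverse link: m = g^{-1}(eta)
        w = m                      # weight: (dm/deta)^2 / V(m) = m^2/m
        z = eta + (y - m) / m      # working response on the link scale
        WX = X * w[:, None]
        b = np.linalg.solve(X.T @ WX, X.T @ (w * z))
    return b

# Hypothetical simulated count data.
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(200), rng.normal(size=200)])
b_true = np.array([0.5, 0.3])
y = rng.poisson(np.exp(X @ b_true))
print(irls_poisson(X, y))  # should be close to b_true
```

For exponential family responses these IRLS iterations converge to the maximum likelihood estimate of **b**.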

Copyright © Gordon Smyth 1996-2003. Last modified: 13 January 2003