models with random effects
- Fits a generalized linear model including random factors using the method of Schall
(1991). Produces an object of class "glm" with some additional components.
- reglm(formula = formula(data), random = NULL, family = gaussian(), data =
sys.parent(), weights = NULL, dispersion = NULL, subset = NULL, na.action = na.fail,
contrasts = NULL, start = NULL, mustart = NULL, betastart = NULL, sigmastart = NULL,
control = reglm.control(...), ykeep = T, xkeep = F, ukeep = F, ...)
- REQUIRED ARGUMENTS
||a formula expression as for glm, of the form response ~ predictors.
This specifies the fixed part of the model. See the documentation of lm and formula
for details. As for glm, this specifies the linear predictor for modelling the
mean. A term of the form offset(expression) is allowed.
- OPTIONAL ARGUMENTS
- Almost all arguments for reglm are as for glm. The following arguments
are new or altered. The first is the important one. The others are likely to be of only
||a list object, the components of which are the factors to be treated as random. This specifies the random part of the
||specifies a prior value for the dispersion of the glm. This argument defaults to 1 for
the binomial and Poisson families. Same as the $scale command in GLIM.
||numeric vector giving starting values for the fitted values or expected responses
(including the predicted values of the random effects). This is the same as the usual glm
argument start except on the scale of the means rather than of the linear
predictors. Must be of the same length as the response, or of length 1 if a constant
starting vector is desired. Ignored if betastart is supplied.
||numeric vector giving starting values for the regression coefficients in the
link-linear model for the mean (including predicted values of the random effects).
||numeric vector giving starting values for the variance components
||a list of iteration and algorithmic constants. See reglm.control for their
names and default values. These can also be set as arguments to dglm itself. This
argument is the same as control.glm except that the defaults have been changed.
||logical flag: if TRUE, the vector of responses is returned. Same as y
||logical flag: if TRUE, the model.matrix for the mean model is
returned. Same as x for glm.
||logical flag: if TRUE, the model.matrix for the variance components
- an object of class glm is returned. The following components are additional to
the usual glm components:
||vector of estimated variance components.
||estimated dispersion. If the family had a prior dispersion, then the dispersion is
estimated only at convergence of the iteration and returned to diagnose over- or
||prior value of the dispersion for the glm. Same as the input argument dispersion if
that was specified.
||vector of dimensions of the random factors..
||vector of effective degrees of freedom used to estimate the variance components.
||design matrix for the random factors, if ukeep=T on input.
- We assume the linear predictor for the generalized linear model is of the form
g(m) = Xb + U1b1
+ ... + Ucbc
where b is a vector of fixed coefficients and b1
... bc are vectors of random effects. The random effects
are assumed to have variances s1 ... sc. The design matrix X is
specified using the model formula and the design matrices U1 ... Uc
are specified using the input argument random.
The parameters are estimated here using the REML method described in Schall (1991).
- Schall, R (1991). Estimation in generalized linear models with random effects. Biometrika,
Candy, S. G. (2004). Modelling catch and effort data using generalized linear
models, the Tweedie distribution, random vessel effects and random
stratum-by-year effects. CCAMLR Science 11, 59-80.
- Thanks for Dr Steve Candy, Australian Antarctic Division, for contributing
a bug fix, 6 Nov 2004.
- SEE ALSO
- The following program is included in the reglm software distribution, but is
not separately documented: reglm.control.
- On output the regression parameters don't have labels.
- Here we demonstrate the reglm function on the radiation of cancer cells data
example from Schall (1991).
Basic analysis of the data shows that there is strong evidence for differences between the occasions, and also good evidence for overdispersion between the dishes (observations) within occasions.
> radiatio <- read.table("radatio.txt",header=T)
> n <- rep(400,27)
We specify a binomial model with two random effects. The factor Occasion represents the
radiation occasion. There is also a random effect at the observation level, represented by
(1:27) since there are 27 observations, which picks up any over-dispersion in the data.
Trace is turned on so that we can see the progress of the iteration. The variance
components and dispersion are output at each iteration.
> out <- reglm(Survived/n~1,list(Occasion=Occasion,Units=(1:27)),
Iter 1 0.0000505604239786021 0.0000166989428102248 1
Iter 2 0.00283092775872843 0.000306570522656396 1
Iter 3 0.0914639350937484 0.00247127680331739 1
Iter 4 0.207417308402352 0.00381847728525275 1
Iter 5 0.222886503891888 0.00539847167615311 1
Iter 6 0.223151486991139 0.00690093186214274 1
Iter 7 0.222683195620167 0.00805926586181473 1
Iter 8 0.222303061660932 0.00882599204674674 1
Iter 9 0.222049310739124 0.00928435931728272 1
Iter 10 0.22189665203915 0.0095420092039735 1
Iter 11 0.221810447442714 0.00968186587710608 1
Iter 12 0.221763517700708 0.0097563502955682 1
Iter 13 0.221738481620298 0.00979561756518506 1
Iter 14 0.221725270519245 0.00981620792679515 1
Iter 15 0.221718339587564 0.00982697440004381 1
Iter 16 0.221714714509054 0.00983259577809794 1
Iter 17 0.221712821520788 0.00983552854676926 1
 0.221712822 0.009835529
The final variance components, and residual variance estimate, are the same as in
Schall (1991). Note that the prior dispersion in this case is 1.
 9 27
 7.744181 8.232174
The random effects have dimensions 9 and 27 respectively. After smoothing the random
effects, the effective degrees of freedom are 7.7 and 8.2 respectively. Including Occasion
as a fixed effect would have used 8 df. In this case the effect of Occasion is so strong
that there is very little difference between the fixed and random models for it.
out$rank gives the total number of parameters in the linear predictor including both
the fixed and random parameters. Therefore out$rank-sum(out$q) gives the number of fixed
parameters, in this case 1. The total effective degrees of freedom used in modelling the
means is just under 17.0, including the 1 degree of freedom for the fixed model.
 0.69126760 -0.18631757 -0.70870278 -0.01694652 0.19448300 -0.02442938
 0.60688086 -0.54311626 -0.01311896
> contrasts(Occasion) <- contr.sum(levels(Occasion))
> out.fixed <- glm(Survived/n~Occasion,family=binomial,weights=n)
Occasion1 Occasion2 Occasion3 Occasion4 Occasion5 Occasion6
0.7128812 -0.1915751 -0.7354824 -0.01624658 0.2027526 -0.02395969
Note that the estimated coefficients for Occasion (there are 9 of them and they add to
zero) are similar to the equivalent parameters from the fixed model, but shrunk somewhat
towards zero. In the fixed model, the last coefficient is not included explicitly because
the coefficients are constrained to add to zero.
Copyright © 1996-2016. Last modified:
10 February 2004