/ Home
S-Archive Download Script
dglm Double generalized linear models
DESCRIPTION
Fits a generalized linear model with a link-linear model for the dispersion as well as for the mean. Produces an object of class "dglm" which inherits from "glm" and "lm".

Note: for this routine to work correctly with gamma responses you will need also the functions associated with the Digamma family and the polygamma functions.  If you wish to use general power variance functions you will also need to the functions associated Tweedie family.
 
USAGE
function(formula, dformula = ~ 1, family = gaussian(), dlink = "log", data = sys.parent(), subset = NULL, weights = NULL, contrasts = NULL, method = "ml", mustart = NULL, betastart = NULL, phistart = NULL, control = dglm.control(...), ykeep = T, xkeep = F, zkeep = F, ...)
 
REQUIRED ARGUMENTS
formula a formula expression as for glm, of the form response ~ predictors. See the documentation of lm and formula for details. As for glm, this specifies the linear predictor for modelling the mean. A term of the form offset(expression) is allowed.
 
OPTIONAL ARGUMENTS
Most arguments for dglm are as for glm. The following arguments are new or altered:
dformula a formula expression of the form  ~ predictor, the response being ignored. This specifies the linear predictor for modelling the dispersion. A term of the form offset(expression) is allowed.
dlink link function for modelling the dispersion. Any link function accepted by the quasi family is allowed, including power(x). See details below.
method the method used to estimate the dispersion parameters; the default is "reml" for restricted maximum likelihood and the alternative is "ml" for maximum likelihood. Upper case and partial matches are allowed.
mustart numeric vector giving starting values for the fitted values or expected responses. Must be of the same length as the response, or of length 1 if a constant starting vector is desired. Ignored if betastart is supplied.
betastart numeric vector giving starting values for the regression coefficients in the link-linear model for the mean.
phistart numeric vector giving starting values for the dispersion parameters.
control a list of iteration and algorithmic constants. See dglm.control for their names and default values. These can also be set as arguments to dglm itself.
ykeep logical flag: if TRUE, the vector of responses is returned.
xkeep logical flag: if TRUE, the model.matrix for the mean model is returned.
zkeep logical flag: if TRUE, the model.matrix for the dispersion model is returned.
 
VALUE
an object of class dglm is returned, which inherits from glm and lm. See dglm.object for details.
 
DETAILS
Write mi = E( yi) for the expectation of the ith response. Then Var( yi) = siV(mi) where V is the variance function and si is the dispersion of the ith response (often denoted as the Greek character phi). We assume the link linear models

g(mi) = xiTb,    h(si) = ziTa,

where xi and zi are vectors of covariates, and b and a are vectors of regression cofficients affecting the mean and dispersion respectively. The argument dlink specifies h. See family in the S-Plus help for how to specify g. The optional arguments mustart, betastart and phistart specify starting values for mi, b and si respectively.

The parameters b are estimated as for an ordinary glm. The parameters a are estimated by way of a dual glm in which the deviance components of the ordinary glm appear as responses. The estimation procedure alternates between one iteration for the mean submodel and one iteration for the dispersion submodel until overall convergence.

The output from dglm, out say, consists of two glm objects (that for the dispersion submodel is out$dispersion.fit) with a few more components for the outer iteration and overall likelihood. The summary and anova functions have special methods for dglm objects. Any generic function which has methods for glms or lms will work on out, giving information about the mean submodel. Information about the dispersion submodel can be obtained by using out$dispersion.fit as argument rather than out itself. In particular drop1(out,scale=1) gives correct score statistics for removing terms from the mean submodel, while drop1(out$dispersion.fit,scale=2) gives correct score statistics for removing terms from the dispersion submodel.

The dispersion submodel is treated as a gamma family unless the original reponses are gamma, in which case the dispersion submodel is digamma. (Note that the digamma and trigamma functions are required to fit a digamma family.) This is exact if the original glm family is gaussian, Gamma or inverse.gaussian. In other cases it can be justified by the saddle-point approximation to the density of the responses. The results will therefore be close to exact ML or REML when the dispersions are small compared to the means. In all cases the dispersion submodel as prior weights 1, and has its own dispersion parameter which is 2.
 
REFERENCES
Smyth, G. K. (1989). Generalized linear models with varying dispersion. J. R. Statist. Soc. B, 51, 47--60.

Smyth, G. K., and Verbyla, A. P. (1999). Adjusted likelihood methods for modelling dispersion in generalized linear models. Environmetrics 10, 696-709. (Abstract - Zipped PostScript)

Smyth, GK, and Verbyla, AP (2009). Leverage adjustments for dispersion modelling in generalized nonlinear models. Australian and New Zealand Journal of Statistics 51, 433-448.
 
SEE ALSO
dglm.object, Digamma Family, Polygamma Functions
 
The following programs are included in the dglm software distribution, but are not separately documented: dglm.control, summary.dglm, print.summary.dglm, anova.dglm, glm.constant.
 
WARNING
The anova method is questionable when applied to an dglm object with method="reml" (stick to "ml").
 
EXAMPLES
# Fit a Gamma double generalized linear model.
# A log-linear model with covariates z is used for the dispersion.
out <- dglm(y~x,~z,family=Gamma(link=log))
summary(out)
anova(out)

# Summarize the mean model as for a glm
summary.glm(out)

# Summarize the dispersion model as for a glm
summary(out$dispersion.fit)

# Examine goodness of fit of dispersion model by plotting residuals
plot(fitted(out$dispersion.fit),residuals(out$dispersion.fit))

 

S-Archive Download Script

Gordon Smyth. Copyright © 1996-2016. Last modified: 10 February 2004