S-Archive Download Script
tariff Estimate insurance tariffs
DESCRIPTION
Estimate a mean and dispersion model for the cost and frequency of insurance claims. Allows the estimation of insurance tariffs. Produces a double generalized linear model object of class "dglm" which inherits from "glm" and "lm".

Note: To use this function, you will also need the functions associated with dglm and the Tweedie family.
 
USAGE
tariff <- function(formula = formula(data), dformula = ~1, nclaims = NULL, exposure = NULL, link.power = 0, dlink.power = 0, var.power = 1.5, data = sys.parent(), subset = NULL, contrasts = NULL, method = "ml", mustart = NULL, betastart = NULL, phistart = NULL, control = dglm.control(...), ykeep = T, xkeep = F, zkeep = F, ...)
 
REQUIRED ARGUMENTS
formula a formula expression as for glm, of the form response ~ predictors. See the documentation of lm and formula for details. As for glm, this specifies the linear predictor for modelling the mean. A term of the form offset(expression) is allowed. The response should be the total cost of claims divided by the number of claims.
 
OPTIONAL ARGUMENTS
dformula a formula expression of the form  ~ predictor, the response being ignored. This specifies the linear predictor for modelling the dispersion. A term of the form offset(expression) is allowed. For insurance modelling, this will often be the same as the mean model.
nclaims vector giving the number of claims.
exposure vector giving a measure of exposure to risk, usually proportional to policy years.
link.power link function for modelling the mean. A linear predictor is used for the mean raised to link.power, with 0 indicating the log-link.
dlink.power link function for modelling the dispersion. A linear predictor is used for the dispersion raised to dlink.power, with 0 indicating the log-link.
var.power Scalar. The variance is assumed proportional to the mean raised to this power. Must be between 1 and 2.
data as for the glm function; see S-Plus documentation.
subset as for the glm function; see S-Plus documentation.
contrasts as for the glm function; see S-Plus documentation.
method the method used to estimate the dispersion parameters; the default is "ml" for maximum likelihood and the alternative is "reml" for restricted maximum likelihood. Upper case and partial matches are allowed.
mustart numeric vector giving starting values for the fitted values or expected responses. Must be of the same length as the response, or of length 1 if a constant starting vector is desired. Ignored if betastart is supplied.
betastart numeric vector giving starting values for the regression coefficients in the link-linear model for the mean.
phistart numeric vector giving starting values for the dispersion parameters.
control a list of iteration and algorithmic constants. See dglm.control for their names and default values. These can also be set as arguments to tariff itself.
ykeep logical flag: if TRUE, the vector of responses is returned.
xkeep logical flag: if TRUE, the model.matrix for the mean model is returned.
zkeep logical flag: if TRUE, the model.matrix for the dispersion model is returned.
 
VALUE
an object of class dglm is returned, which inherits from glm and lm. See dglm.object for details.
 
DETAILS
Let zi be the total cost of claims in the ith category, and let ni be the number of claims. We assume that the ni are Poisson and that the size of each claim follows a gamma distribution. This implies that the average observed claim size yi = zi/ni follows Tweedie's compound Poisson distribution. The function tariff computes maximum likelihood or restricted maximum likelihood estimators for the parameters based on the joint likelihood of yi and ni.

The function is similar in structure to the double generalized linear model function dglm, and it returns an object of the same class.
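
The mean and dispersion submodels implied by the arguments link.power, dlink.power and var.power can be written out as follows. This is only a sketch in generic double generalized linear model notation: the symbols mu_i (mean), phi_i (dispersion), p (var.power), a (link.power), b (dlink.power), the covariate vectors x_i and u_i, and the coefficient vectors beta and gamma are introduced here for illustration, and prior weights are omitted.

```latex
\begin{aligned}
\operatorname{Var}(y_i) &\propto \phi_i\,\mu_i^{\,p}, \qquad 1 < p < 2,\\[4pt]
g(\mu_i) &= \mathbf{x}_i^{\mathsf T}\boldsymbol{\beta},
\qquad
g(\mu) =
\begin{cases}
\mu^{a}, & a \neq 0,\\
\log \mu, & a = 0,
\end{cases}\\[4pt]
h(\phi_i) &= \mathbf{u}_i^{\mathsf T}\boldsymbol{\gamma},
\qquad
h(\phi) =
\begin{cases}
\phi^{b}, & b \neq 0,\\
\log \phi, & b = 0.
\end{cases}
\end{aligned}
```

With the defaults a = b = 0, both submodels are log-linear, so the coefficients act multiplicatively on the mean and on the dispersion.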
 
REFERENCES
Smyth, G. K., and Verbyla, A. P. (1999). Adjusted likelihood methods for modelling dispersion in generalized linear models. Environmetrics 10, 696-709.

Smyth, G. K., and Jørgensen, B. (To appear). Fitting Tweedie's Compound Poisson Model to Insurance Claims Data: Dispersion Modelling. ASTIN Bulletin.
 
SEE ALSO
dglm, dglm.object, Tweedie family.
 
WARNING
The anova method is questionable when applied to a dglm object with method="reml" (stick to "ml").
 
EXAMPLES
Estimate tariffs for the Swedish 3rd party motor insurance data. This reproduces results from Smyth and Jørgensen (in press).
motorins <- read.table("c:/gordon/www/data/general/motorins.txt",header=T)
motorins <- motorins[motorins$Zone == 1 & motorins$Make != 9,]
motorins$Bonus <- factor(motorins$Bonus)
motorins$Make <- factor(motorins$Make)
motorins$Kilometres <- factor(motorins$Kilometres)
contrasts(motorins$Bonus) <- contr.treatment(levels(motorins$Bonus))
contrasts(motorins$Make) <- contr.treatment(levels(motorins$Make))
contrasts(motorins$Kilometres) <- contr.treatment(levels(motorins$Kilometres))
attach(motorins)

out <- tariff(Payment/Insured~Bonus+Make+Kilometres,~Bonus+Make+Kilometres,nclaims=Claims,exposure=Insured,var.power=1.72)
summary(out)

# Base risk
tapply(fitted(out),list(Bonus,Make,Kilometres),mean)[1,1,1]

# Multiplicative tariff factors for other factor levels
exp(coef(out))
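
As a hedged illustration of how the base risk and the multiplicative factors combine under the default log link (link.power = 0), the sketch below builds the tariff for one cell by hand. The coefficient values and the names Bonus2 and Make3 are made up for illustration; they are not output from tariff and are not taken from the motorins data.

```r
# Hypothetical coefficients on the log scale (not real tariff output).
beta <- c("(Intercept)" = log(100), Bonus2 = log(0.8), Make3 = log(1.25))

# With a log link, exponentiated coefficients are multiplicative factors.
factors <- exp(beta)

# Base risk: every factor at its baseline (treatment-contrast) level.
base.risk <- factors[["(Intercept)"]]

# Tariff for a policy in the hypothetical Bonus class 2 and Make 3.
tariff.cell <- base.risk * factors[["Bonus2"]] * factors[["Make3"]]

# The same value obtained directly from the linear predictor.
eta <- sum(beta)
all.equal(tariff.cell, exp(eta))
```

Because the contrasts are set with contr.treatment, the intercept corresponds to the cell in which every factor is at its first level, which is why the base risk is read off from that cell.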

Gordon Smyth. Copyright © 1996-2016. Last modified: 10 February 2004