/ Home

OzDASL

# Accidental Oil Spills

Keywords: regression, weighted regression, regression through the origin.

## Description

The following data from the Statistical Abstract of the United States give the number of accidental oil spills at sea and the amount of oil lost in these spills for the years 1973 - 1985.

 Variable Description Year Year Spills Number of spills Oil Amount of oil lost (thousands of metric tonnes)

Data File (tab-delimited text)

## Source

 Statistical Abstract of the United States 1987. Hamilton, L. C. (1992). Regression with Graphics. Duxbury Press, Belmont, California, page 49. Hamilton (1992) gives the units for Oil as millions of metric tons. Berwin Turlach, University of Western Australia, points out using historical data that the units are more likely to be thousands of metric tonnes.

## Analysis

Clearly the number of spills is clearly decreasing over time. It is reasonable to look then at the amount of oil per spill.

An unweighted regression of Oil on Spill is inappropriate:

#### Regression Analysis

```The regression equation is
Oil = - 44 + 6.96 Spills```
```Predictor        Coef       StDev          T        P
Constant        -44.5       102.7      -0.43    0.673
Spills          6.957       2.832       2.46    0.032```
`S = 166.5       R-Sq = 35.4%     R-Sq(adj) = 29.6%`
`Analysis of Variance`
```Source            DF          SS          MS         F        P
Regression         1      167218      167218      6.03    0.032
Residual Error    11      304844       27713
Total             12      472062```
```Unusual Observations
Obs     Spills        Oil         Fit   StDev Fit    Residual    St Resid
7       65.0      723.5       407.7       103.3       315.8        2.42R
11       17.0      387.8        73.8        63.5       314.0        2.04R ```
`R denotes an observation with a large standardized residual`

The variance is proportional to the number of spills included. This changes the regression coefficient somewhat and reduces the standard errors:

#### Regression Analysis

`Weighted analysis using weights in 1/Spills`
```The regression equation is
Oil = - 13.6 + 6.00 Spills```
```Predictor        Coef       StDev          T        P
Constant       -13.56       68.32      -0.20    0.846
Spills          6.002       2.564       2.34    0.039```
`S = 29.89       R-Sq = 33.3%     R-Sq(adj) = 27.2%`
`Analysis of Variance`
```Source            DF          SS          MS         F        P
Regression         1      4896.2      4896.2      5.48    0.039
Residual Error    11      9826.5       893.3
Total             12     14722.7```
```Unusual Observations
Obs     Spills        Oil         Fit   StDev Fit    Residual    St Resid
11       17.0     387.80       88.47       40.82      299.33        2.57R ```
`R denotes an observation with a large standardized residual`

The intercept should in principle not be required. The average size of a spill is about 5.6:

#### Regression Analysis

`Weighted analysis using weights in 1/Spills`
```The regression equation is
Oil = 5.58 Spills```
```Predictor        Coef       StDev          T        P
Noconstant
Spills          5.583       1.397       4.00    0.002```
```S = 28.67
Analysis of Variance```
```Source            DF          SS          MS         F        P
Regression         1       13123       13123     15.97    0.002
Residual Error    12        9862         822
Total             13       22985```
```Unusual Observations
Obs     Spills        Oil         Fit   StDev Fit    Residual    St Resid
11       17.0     387.80       94.91       23.75      292.89        2.53R ```
`R denotes an observation with a large standardized residual`

Is the average spill size changing with time? Now we must weight proportion to the number of spills:

#### Regression Analysis

`Weighted analysis using weights in Spills`
```The regression equation is
Oil/Spills = - 991 + 0.504 Year```
```Predictor        Coef       StDev          T        P

Constant       -991.2       863.4      -1.15    0.275
Year           0.5040      0.4366       1.15    0.273```
`S = 28.28       R-Sq = 10.8%     R-Sq(adj) = 2.7%`
`Analysis of Variance`
```Source            DF          SS          MS         F        P
Regression         1      1065.8      1065.8      1.33    0.273
Residual Error    11      8795.9       799.6
Total             12      9861.7```
```Unusual Observations
Obs       Year   Oil/Spil         Fit   StDev Fit    Residual    St Resid
11       1983      22.81        8.25        2.69       14.57        2.31R ```
`R denotes an observation with a large standardized residual`

The year 1983 appears to be an outlier.