Purpose
To work through exercise 6.3 from Dobson's book on generalized linear models.

> folder <- "C:/Cauldron/garage/R/soulcraft/Volatility/Learn/Dobson-GLM/"
> file.input <- paste(folder, "Table 6.3 Carbohydrate diet.csv",
+     sep = "")
> data <- read.csv(file.input, header = TRUE, stringsAsFactors = FALSE)

Plot the response y against each of the explanatory variables x1, x2 and
x3 to see whether y appears to be linearly related to them.

> par(mfrow = c(1, 1))
> plot(data$carbohydrate, data$age, pch = 19, col = "sienna")

Chap-6_3-002.jpg

> par(mfrow = c(1, 1))
> plot(data$carbohydrate, data$weight, pch = 19, col = "sienna")

Chap-6_3-003.jpg

> par(mfrow = c(1, 1))
> plot(data$carbohydrate, data$protein, pch = 19, col = "sienna")

Chap-6_3-004.jpg

Fit the model (6.6) and examine the residuals to assess the adequacy of the model and the assumptions.

> fit1 <- lm(carbohydrate ~ age + weight + protein, data)
> summary(fit1)

Call:
lm(formula = carbohydrate ~ age + weight + protein, data = data)

Residuals:
     Min       1Q   Median       3Q      Max
-10.3424  -4.8203   0.9897   3.8553   7.9087

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 36.96006   13.07128   2.828  0.01213 *
age         -0.11368    0.10933  -1.040  0.31389
weight      -0.22802    0.08329  -2.738  0.01460 *
protein      1.95771    0.63489   3.084  0.00712 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 5.956 on 16 degrees of freedom
Multiple R-squared: 0.4805,     Adjusted R-squared: 0.3831
F-statistic: 4.934 on 3 and 16 DF,  p-value: 0.01297

> fit2 <- lm(carbohydrate ~ age + protein, data)
> summary(fit2)

Call:
lm(formula = carbohydrate ~ age + protein, data = data)

Residuals:
     Min       1Q   Median       3Q      Max
-11.2692  -5.9968   0.9902   5.7952   9.5474

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 15.08848   12.16239   1.241   0.2316
age         -0.09167    0.12818  -0.715   0.4842
protein      1.68189    0.73693   2.282   0.0356 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 7.002 on 17 degrees of freedom
Multiple R-squared: 0.2372,     Adjusted R-squared: 0.1475
F-statistic: 2.643 on 2 and 17 DF,  p-value: 0.1001

> fit3 <- lm(carbohydrate ~ protein, data)
> summary(fit3)

Call:
lm(formula = carbohydrate ~ protein, data = data)

Residuals:
    Min      1Q  Median      3Q     Max
-12.498  -5.983   0.902   4.887  10.662

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  12.4787    11.4435   1.090   0.2899
protein       1.5800     0.7131   2.216   0.0399 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 6.907 on 18 degrees of freedom
Multiple R-squared: 0.2143,     Adjusted R-squared: 0.1706
F-statistic: 4.909 on 1 and 18 DF,  p-value: 0.03986

> fit4 <- lm(carbohydrate ~ weight + protein, data)
> summary(fit4)

Call:
lm(formula = carbohydrate ~ weight + protein, data = data)

Residuals:
     Min       1Q   Median       3Q      Max
-10.6812  -3.9135   0.9464   4.0880   9.7948

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 33.13032   12.57155   2.635   0.0174 *
weight      -0.22165    0.08326  -2.662   0.0164 *
protein      1.82429    0.62327   2.927   0.0094 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 5.971 on 17 degrees of freedom
Multiple R-squared: 0.4454,     Adjusted R-squared: 0.3802
F-statistic: 6.827 on 2 and 17 DF,  p-value: 0.006661
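
The exercise also asks for the residuals to be examined. A minimal sketch of two common checks (standardized residuals against fitted values, and a normal Q-Q plot) is given here. It runs on simulated stand-in data, not the Table 6.3 values; the data frame `sim`, its made-up coefficients and the object name `fit_sim` are assumptions for illustration only. With the real data the same calls could be applied to `fit1`.

```r
# Simulated stand-in data with the same column names as the diet data.
# The coefficients and noise level below are invented for illustration.
set.seed(1)
n <- 20
sim <- data.frame(age     = runif(n, 20, 60),
                  weight  = runif(n, 80, 130),
                  protein = runif(n, 10, 20))
sim$carbohydrate <- 37 - 0.1 * sim$age - 0.2 * sim$weight +
    2 * sim$protein + rnorm(n, sd = 6)

fit_sim <- lm(carbohydrate ~ age + weight + protein, data = sim)
res <- rstandard(fit_sim)          # standardized residuals

op <- par(mfrow = c(1, 2))
# A patternless band around zero supports linearity and constant variance
plot(fitted(fit_sim), res, pch = 19, col = "sienna",
     xlab = "Fitted values", ylab = "Standardized residuals")
abline(h = 0, lty = 2)
# Points close to the reference line support the Normality assumption
qqnorm(res, pch = 19, col = "sienna")
qqline(res)
par(op)
```

Curvature or a funnel shape in the first panel, or strong departures from the line in the second, would suggest that the linear model or its assumptions need revisiting.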

Compare nested models using their ANOVA tables

> summary(aov(carbohydrate ~ age + weight + protein, data))
            Df Sum Sq Mean Sq F value   Pr(>F)
age          1   3.82    3.82  0.1076 0.747136
weight       1 183.98  183.98  5.1856 0.036859 *
protein      1 337.34  337.34  9.5082 0.007121 **
Residuals   16 567.66   35.48
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual sum of squares: 567.66 on 16 degrees of freedom.

> summary(aov(carbohydrate ~ weight + protein, data))
            Df Sum Sq Mean Sq F value  Pr(>F)
weight       1 181.38  181.38   5.088 0.03756 *
protein      1 305.40  305.40   8.567 0.00941 **
Residuals   17 606.02   35.65
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual sum of squares: 606.02 on 17 degrees of freedom.

> pf((606.02 - 567.66)/(567.66/16), 1, 16)
[1] 0.6861142

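pf() returns the lower-tail probability, so the p-value of this F test is 1 - 0.686 = 0.314, the same as the t-test p-value for age in fit1. There is no evidence that age improves the model once weight and protein are included, so the age variable can be dropped.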
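
The comparison above can be written out step by step. This is a minimal sketch that uses only the residual sums of squares and degrees of freedom copied from the two ANOVA tables; the variable names are illustrative. The point to watch is that `pf()` gives the lower-tail probability by default, so the p-value needs `lower.tail = FALSE`.

```r
# Values copied from the ANOVA tables above
rss_full    <- 567.66   # age + weight + protein
df_full     <- 16
rss_reduced <- 606.02   # weight + protein
df_reduced  <- 17

# F statistic for dropping age:
# (extra sum of squares / extra df) / (full-model residual mean square)
df_diff <- df_reduced - df_full
f_stat  <- ((rss_reduced - rss_full) / df_diff) / (rss_full / df_full)

# Upper-tail probability is the p-value of the F test
p_value <- pf(f_stat, df_diff, df_full, lower.tail = FALSE)

print(c(F = f_stat, p = p_value))
```

A large p-value here means the extra term does not significantly reduce the residual sum of squares.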

> summary(aov(carbohydrate ~ age + protein, data))
            Df Sum Sq Mean Sq F value  Pr(>F)
age          1   3.82    3.82  0.0779 0.78358
protein      1 255.41  255.41  5.2089 0.03562 *
Residuals   17 833.57   49.03
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual sum of squares: 833.57 on 17 degrees of freedom.

> summary(aov(carbohydrate ~ protein, data))
            Df Sum Sq Mean Sq F value  Pr(>F)
protein      1 234.15  234.15  4.9085 0.03986 *
Residuals   18 858.65   47.70
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual sum of squares: 858.65 on 18 degrees of freedom.

> pf((858.65 - 833.57)/(833.57/17), 1, 17)
[1] 0.5157968

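pf() gives the lower-tail probability, so the p-value is 1 - 0.516 = 0.484, matching the t-test p-value for age in fit2. Age does not significantly improve the protein-only model, so it can be dropped here as well.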
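
R can also carry out this kind of nested-model F test directly by passing two fitted `lm` objects to `anova()`. The sketch below checks that against the hand calculation. It uses simulated stand-in data, not the Table 6.3 values; `sim`, `fit_reduced`, `fit_full` and the made-up coefficients are assumptions for illustration. With the real data the equivalent call would be along the lines of `anova(fit4, fit1)`.

```r
# Simulated stand-in data; age has no true effect in this made-up model
set.seed(42)
n <- 20
sim <- data.frame(age     = round(runif(n, 20, 60)),
                  weight  = round(runif(n, 80, 130)),
                  protein = round(runif(n, 10, 20)))
sim$carbohydrate <- 35 - 0.2 * sim$weight + 2 * sim$protein +
    rnorm(n, sd = 6)

fit_reduced <- lm(carbohydrate ~ weight + protein, data = sim)
fit_full    <- lm(carbohydrate ~ age + weight + protein, data = sim)

# Built-in comparison: row 2 holds the F statistic and its p-value
tab <- anova(fit_reduced, fit_full)
print(tab)

# Same test by hand; deviance() of an lm fit is its residual sum of squares
rss0 <- deviance(fit_reduced)
rss1 <- deviance(fit_full)
f_manual <- (rss0 - rss1) / (rss1 / df.residual(fit_full))
p_manual <- pf(f_manual, 1, df.residual(fit_full), lower.tail = FALSE)
```

Using `anova()` avoids retyping rounded sums of squares and reports the upper-tail p-value directly.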