Analysis of Variance

The ASTA team

One way analysis of variance

Example

library(mosaic)
gf_boxplot(weight ~ feed, data = chickwts)

The ANOVA Model

Estimation of mean values

Estimates

mean(weight ~ feed, data = chickwts)
##    casein horsebean   linseed  meatmeal   soybean sunflower 
##  323.5833  160.2000  218.7500  276.9091  246.4286  328.9167

Contrast coding

Example

model <- lm(weight ~ feed, data = chickwts)
summary(model)
## 
## Call:
## lm(formula = weight ~ feed, data = chickwts)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -123.909  -34.413    1.571   38.170  103.091 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    323.583     15.834  20.436  < 2e-16 ***
## feedhorsebean -163.383     23.485  -6.957 2.07e-09 ***
## feedlinseed   -104.833     22.393  -4.682 1.49e-05 ***
## feedmeatmeal   -46.674     22.896  -2.039 0.045567 *  
## feedsoybean    -77.155     21.578  -3.576 0.000665 ***
## feedsunflower    5.333     22.393   0.238 0.812495    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 54.85 on 65 degrees of freedom
## Multiple R-squared:  0.5417, Adjusted R-squared:  0.5064 
## F-statistic: 15.36 on 5 and 65 DF,  p-value: 5.936e-10

Overall test for effect

Graphical representation of models

## Warning in geom_point(aes(x = red_dot), color = "red"): All aesthetics have length 1, but the data has 71 rows.
## ℹ Please consider using `annotate()` or provide this layer with data containing
##   a single row.

Hypotheses and test statistic

Interpretation of \(F\) statistic - Variance between/within groups

Example

model <- lm(weight ~ feed, data = chickwts)
summary(model)
## 
## Call:
## lm(formula = weight ~ feed, data = chickwts)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -123.909  -34.413    1.571   38.170  103.091 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    323.583     15.834  20.436  < 2e-16 ***
## feedhorsebean -163.383     23.485  -6.957 2.07e-09 ***
## feedlinseed   -104.833     22.393  -4.682 1.49e-05 ***
## feedmeatmeal   -46.674     22.896  -2.039 0.045567 *  
## feedsoybean    -77.155     21.578  -3.576 0.000665 ***
## feedsunflower    5.333     22.393   0.238 0.812495    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 54.85 on 65 degrees of freedom
## Multiple R-squared:  0.5417, Adjusted R-squared:  0.5064 
## F-statistic: 15.36 on 5 and 65 DF,  p-value: 5.936e-10

Two way analysis of variance

Additive effects

Dummy coding

\[ \begin{array}{cccc} & LO & ME & HI \\ OJ & \mu & \mu+\beta_2 & \mu+ \beta_3\\ VC & \mu +\beta_1 & \mu+\beta_1 + \beta_2 & \mu+ \beta_1 + \beta_3\\ \end{array} \]

Main effect model in R

MainEff <- lm(len ~ supp + dose, data = ToothGrowth)
summary(MainEff)
## 
## Call:
## lm(formula = len ~ supp + dose, data = ToothGrowth)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -7.085 -2.751 -0.800  2.446  9.650 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  12.4550     0.9883  12.603  < 2e-16 ***
## suppVC       -3.7000     0.9883  -3.744 0.000429 ***
## doseME        9.1300     1.2104   7.543 4.38e-10 ***
## doseHI       15.4950     1.2104  12.802  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.828 on 56 degrees of freedom
## Multiple R-squared:  0.7623, Adjusted R-squared:  0.7496 
## F-statistic: 59.88 on 3 and 56 DF,  p-value: < 2.2e-16

Testing effect of supp

doseEff <- lm(len ~ dose, data = ToothGrowth)
anova(doseEff, MainEff)
## Analysis of Variance Table
## 
## Model 1: len ~ dose
## Model 2: len ~ supp + dose
##   Res.Df     RSS Df Sum of Sq      F    Pr(>F)    
## 1     57 1025.78                                  
## 2     56  820.43  1    205.35 14.017 0.0004293 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Testing effect of dose

suppEff <- lm(len ~ supp, data = ToothGrowth)
anova(suppEff, MainEff)
## Analysis of Variance Table
## 
## Model 1: len ~ supp
## Model 2: len ~ supp + dose
##   Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
## 1     58 3246.9                                  
## 2     56  820.4  2    2426.4 82.811 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Interaction

Example

with(ToothGrowth, interaction.plot(dose, supp, len, col = 2:3))

Dummy coding

\[ \begin{array}{cccc} & LO & ME & HI \\ OJ & \mu & \mu+\beta_2 & \mu+ \beta_3\\ VC & \mu +\beta_1 & \mu+\beta_1 + \beta_2 +\beta_4 & \mu+ \beta_1 + \beta_3 + \beta_5\\ \end{array} \]

Example

Interaction <- lm(len ~ supp*dose, data = ToothGrowth)
anova(MainEff, Interaction)
## Analysis of Variance Table
## 
## Model 1: len ~ supp + dose
## Model 2: len ~ supp * dose
##   Res.Df    RSS Df Sum of Sq     F  Pr(>F)  
## 1     56 820.43                             
## 2     54 712.11  2    108.32 4.107 0.02186 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(Interaction)
## 
## Call:
## lm(formula = len ~ supp * dose, data = ToothGrowth)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
##  -8.20  -2.72  -0.27   2.65   8.27 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     13.230      1.148  11.521 3.60e-16 ***
## suppVC          -5.250      1.624  -3.233  0.00209 ** 
## doseME           9.470      1.624   5.831 3.18e-07 ***
## doseHI          12.830      1.624   7.900 1.43e-10 ***
## suppVC:doseME   -0.680      2.297  -0.296  0.76831    
## suppVC:doseHI    5.330      2.297   2.321  0.02411 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.631 on 54 degrees of freedom
## Multiple R-squared:  0.7937, Adjusted R-squared:  0.7746 
## F-statistic: 41.56 on 5 and 54 DF,  p-value: < 2.2e-16

Hierarchical principle