The ASTA team
living
, which is a Danish survey on standard of living. The variables are
B
(Housing, Bolig): bad/acceptable/goodH
(Health, Helbred): bad/goodI
(Isolated, Isoleret): yes/noA
(Anxiety, Angst): yes/noN
: The number of respondents for each combination of the 4 factors aboveliving <- read.delim("https://asta.math.aau.dk/datasets?file=living.txt", stringsAsFactors = TRUE)
living$B <- relevel(living$B,"Bad")
living$I <- relevel(living$I,"Yes")
living$A <- relevel(living$A,"Yes")
B
and H
without controlling for I
and A
:BH <- aggregate(N ~ B + H, FUN = sum, data = living)
BH
## B H N
## 1 Bad Bad 211
## 2 Acceptable Bad 327
## 3 Good Bad 1734
## 4 Bad Good 145
## 5 Acceptable Good 211
## 6 Good Good 1855
B
and H
(the reference level has all dummy variables equal to 0): \[
\log(f_e) = \alpha + \beta_{b1}z_{b1} + \beta_{b2}z_{b2} +
\beta_{h1}z_{h1} + \beta_{b1h1}z_{b1}z_{h1} + \beta_{b2h1}z_{b2}z_{h1}.
\]B
and H
in the model.glm
:model <- glm(N ~ B * H, family = poisson, data = BH)
B: Bad, H: Bad
) aresummary(model)
##
## Call:
## glm(formula = N ~ B * H, family = poisson, data = BH)
##
## Deviance Residuals:
## [1] 0 0 0 0 0 0
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 5.35186 0.06884 77.740 < 2e-16 ***
## BAcceptable 0.43810 0.08830 4.961 7.00e-07 ***
## BGood 2.10633 0.07291 28.889 < 2e-16 ***
## HGood -0.37512 0.10787 -3.478 0.000506 ***
## BAcceptable:HGood -0.06298 0.13940 -0.452 0.651438
## BGood:HGood 0.44258 0.11292 3.919 8.88e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 4.2103e+03 on 5 degrees of freedom
## Residual deviance: -2.9532e-14 on 0 degrees of freedom
## AIC: 59.485
##
## Number of Fisher Scoring iterations: 2
Good:Good
increases the response, i.e. an over representation of people with good housing conditions and good health.drop1
:drop1(model, test = "Chisq")
## Single term deletions
##
## Model:
## N ~ B * H
## Df Deviance AIC LRT Pr(>Chi)
## <none> 0.000 59.485
## B:H 2 40.766 96.251 40.766 1.405e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
satmodel <- glm(N ~ B * H * A * I, family = poisson, data = living)
summary(satmodel)
##
## Call:
## glm(formula = N ~ B * H * A * I, family = poisson, data = living)
##
## Deviance Residuals:
## [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [24] 0
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 2.079e+00 3.536e-01 5.882 4.06e-09 ***
## BAcceptable 5.596e-01 4.432e-01 1.263 0.206710
## BGood 1.417e+00 3.941e-01 3.596 0.000323 ***
## HGood -2.438e+01 4.225e+04 -0.001 0.999540
## ANo 4.855e-01 4.494e-01 1.080 0.279943
## INo 1.749e+00 3.831e-01 4.566 4.96e-06 ***
## BAcceptable:HGood 2.284e+01 4.225e+04 0.001 0.999569
## BGood:HGood 2.268e+01 4.225e+04 0.001 0.999572
## BAcceptable:ANo 5.349e-02 5.613e-01 0.095 0.924076
## BGood:ANo -3.077e-02 5.015e-01 -0.061 0.951069
## HGood:ANo 2.343e+01 4.225e+04 0.001 0.999558
## BAcceptable:INo 6.192e-03 4.801e-01 0.013 0.989710
## BGood:INo 5.473e-01 4.244e-01 1.290 0.197159
## HGood:INo 2.405e+01 4.225e+04 0.001 0.999546
## ANo:INo 6.557e-01 4.802e-01 1.365 0.172142
## BAcceptable:HGood:ANo -2.345e+01 4.225e+04 -0.001 0.999557
## BGood:HGood:ANo -2.254e+01 4.225e+04 -0.001 0.999574
## BAcceptable:HGood:INo -2.303e+01 4.225e+04 -0.001 0.999565
## BGood:HGood:INo -2.267e+01 4.225e+04 -0.001 0.999572
## BAcceptable:ANo:INo -2.516e-01 6.007e-01 -0.419 0.675370
## BGood:ANo:INo 2.827e-01 5.329e-01 0.531 0.595707
## HGood:ANo:INo -2.339e+01 4.225e+04 -0.001 0.999558
## BAcceptable:HGood:ANo:INo 2.365e+01 4.225e+04 0.001 0.999553
## BGood:HGood:ANo:INo 2.301e+01 4.225e+04 0.001 0.999565
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 1.0965e+04 on 23 degrees of freedom
## Residual deviance: 4.1199e-10 on 0 degrees of freedom
## AIC: 179.1
##
## Number of Fisher Scoring iterations: 20
drop1
and update
as explained in the following.drop1(satmodel, test = "Chi")
## Single term deletions
##
## Model:
## N ~ B * H * A * I
## Df Deviance AIC LRT Pr(>Chi)
## <none> 0.0000 179.10
## B:H:A:I 2 3.5112 178.61 3.5112 0.1728
drop1
reveals that the four-way interaction is insignificant and we remove it and save the updated model like this:reducedmodel <- update(satmodel, .~.-B:H:A:I)
drop1
again:drop1(reducedmodel, test = "Chi")
## Single term deletions
##
## Model:
## N ~ B + H + A + I + B:H + B:A + H:A + B:I + H:I + A:I + B:H:A +
## B:H:I + B:A:I + H:A:I
## Df Deviance AIC LRT Pr(>Chi)
## <none> 3.5112 178.61
## B:H:A 2 7.1419 178.24 3.6307 0.1628
## B:H:I 2 3.5316 174.63 0.0205 0.9898
## B:A:I 2 5.5332 176.63 2.0220 0.3639
## H:A:I 1 4.5857 177.68 1.0745 0.2999
B:H:I
could be removed and use update
again:reducedmodel <- update(reducedmodel, .~.-B:H:I)
anova
as shown in the following.simplemodel <- glm(N ~ B*H + B*A + B*I + H*A + H*I + I*A, family = poisson, data = living)
anova
:anova(simplemodel, satmodel, test = "Chisq")
## Analysis of Deviance Table
##
## Model 1: N ~ B * H + B * A + B * I + H * A + H * I + I * A
## Model 2: N ~ B * H * A * I
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 9 10.697
## 2 0 0.000 9 10.697 0.2971
drop1
.drop1(simplemodel, test = "Chisq")
## Single term deletions
##
## Model:
## N ~ B * H + B * A + B * I + H * A + H * I + I * A
## Df Deviance AIC LRT Pr(>Chi)
## <none> 10.697 171.79
## B:H 2 38.170 195.27 27.474 1.082e-06 ***
## B:A 2 37.912 195.01 27.215 1.231e-06 ***
## B:I 2 35.245 192.34 24.548 4.671e-06 ***
## H:A 1 41.947 201.04 31.250 2.268e-08 ***
## H:I 1 56.048 215.14 45.352 1.646e-11 ***
## A:I 1 26.449 185.54 15.753 7.218e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
simplemodel
.summary(simplemodel)
##
## Call:
## glm(formula = N ~ B * H + B * A + B * I + H * A + H * I + I *
## A, family = poisson, data = living)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.72955 -0.48421 -0.00248 0.31663 1.23187
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 2.17042 0.22962 9.452 < 2e-16 ***
## BAcceptable 0.65323 0.26323 2.482 0.013081 *
## BGood 1.13785 0.23365 4.870 1.12e-06 ***
## HGood -1.76785 0.21028 -8.407 < 2e-16 ***
## ANo 0.35067 0.19249 1.822 0.068488 .
## INo 1.74803 0.23116 7.562 3.97e-14 ***
## BAcceptable:HGood -0.04121 0.14119 -0.292 0.770370
## BGood:HGood 0.37773 0.11441 3.302 0.000961 ***
## BAcceptable:ANo -0.12759 0.15834 -0.806 0.420355
## BGood:ANo 0.39413 0.13265 2.971 0.002966 **
## BAcceptable:INo -0.14021 0.25873 -0.542 0.587878
## BGood:INo 0.72038 0.22799 3.160 0.001579 **
## HGood:ANo 0.44076 0.07938 5.552 2.82e-08 ***
## HGood:INo 1.12352 0.17982 6.248 4.16e-10 ***
## ANo:INo 0.66875 0.16277 4.109 3.98e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 10964.662 on 23 degrees of freedom
## Residual deviance: 10.697 on 9 degrees of freedom
## AIC: 171.79
##
## Number of Fisher Scoring iterations: 4
We see that all significant interactions are positive - as expected(why?).
(Intercept)
is log of the expected number, when all factors are “bad”. So the expected number is \(\exp(2.17) = 8.76\), whereas the observed number is \(8\).
summary(simplemodel)
##
## Call:
## glm(formula = N ~ B * H + B * A + B * I + H * A + H * I + I *
## A, family = poisson, data = living)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.72955 -0.48421 -0.00248 0.31663 1.23187
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 2.17042 0.22962 9.452 < 2e-16 ***
## BAcceptable 0.65323 0.26323 2.482 0.013081 *
## BGood 1.13785 0.23365 4.870 1.12e-06 ***
## HGood -1.76785 0.21028 -8.407 < 2e-16 ***
## ANo 0.35067 0.19249 1.822 0.068488 .
## INo 1.74803 0.23116 7.562 3.97e-14 ***
## BAcceptable:HGood -0.04121 0.14119 -0.292 0.770370
## BGood:HGood 0.37773 0.11441 3.302 0.000961 ***
## BAcceptable:ANo -0.12759 0.15834 -0.806 0.420355
## BGood:ANo 0.39413 0.13265 2.971 0.002966 **
## BAcceptable:INo -0.14021 0.25873 -0.542 0.587878
## BGood:INo 0.72038 0.22799 3.160 0.001579 **
## HGood:ANo 0.44076 0.07938 5.552 2.82e-08 ***
## HGood:INo 1.12352 0.17982 6.248 4.16e-10 ***
## ANo:INo 0.66875 0.16277 4.109 3.98e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 10964.662 on 23 degrees of freedom
## Residual deviance: 10.697 on 9 degrees of freedom
## AIC: 171.79
##
## Number of Fisher Scoring iterations: 4
living$expected <- fitted(simplemodel)
living[1:12,]
## B H I A N expected
## 1 Bad Good Yes No 5 3.300261
## 2 Bad Good No No 107 113.784094
## 3 Bad Good Yes Yes 0 1.495663
## 4 Bad Good No Yes 33 26.419981
## 5 Bad Bad Yes No 13 12.442145
## 6 Bad Bad No No 144 139.473499
## 7 Bad Bad Yes Yes 8 8.761930
## 8 Bad Bad No Yes 46 50.322426
## 9 Acceptable Good Yes No 5 5.357140
## 10 Acceptable Good No No 155 160.536219
## 11 Acceptable Good Yes Yes 3 2.758237
## 12 Acceptable Good No Yes 48 42.348403
I.e. the model cannot contain \(\text{A}*\text{H}*\text{I}\), \(\text{A}*\text{B}*\text{H}\) or \(\text{A}*\text{B}*\text{H}*\text{I}\).
E.g. \(\text{A} * \text{H} + \text{A}*\text{I}*\text{B} + \text{B}*\text{H}\)
\[ \text{B} * \text{I} + \text{H} * \text{I} + \text{I} * \text{A} + \text{B}*\text{H}+ \text{B}*\text{A} + \text{H}*\text{A} \]