Applied statistics (ESB: RISK)
Multiple linear regression
Literature
[WMMY] Chapter 12.1, 12.2, 12.4, 12.5 until p.476m, 12.6, 12.8
Lecture material
This lecture as: slideshow (html), Rmarkdown (Rmd), notes (pdf).
The entire module: notes (pdf).
Exercises
Consider the following prediction equation without interaction: ŷ = 2 + 0.4x1 + 0.7x2. What does the equation for the predicted response look like when x1 = 1? What does the equation for the predicted response look like when x1 = 5?
Consider the following prediction equation with interaction: ŷ = 2 + 0.4x1 + 0.7x2 − 0.1x1x2. What does the equation for the predicted response look like when x1 = 1? What does the equation for the predicted response look like when x1 = 5? Explain the difference between the model without interaction and the model with interaction.
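One way to explore the two prediction equations above is to code them as R functions and look at how the line in x2 changes when x1 is held fixed. The sketch below is only an aid for checking your own algebra; the function names and the helper line_in_x2 are hypothetical and not part of the course material.

```r
# The two prediction equations from the exercises above.
yhat_additive <- function(x1, x2) 2 + 0.4 * x1 + 0.7 * x2
yhat_interaction <- function(x1, x2) 2 + 0.4 * x1 + 0.7 * x2 - 0.1 * x1 * x2

# Hypothetical helper: for a fixed x1 the prediction is a straight line in x2.
# The intercept is the prediction at x2 = 0, and the slope is the change in
# the prediction when x2 increases from 0 to 1.
line_in_x2 <- function(f, x1) {
  intercept <- f(x1, 0)
  slope <- f(x1, 1) - f(x1, 0)
  c(intercept = intercept, slope = slope)
}

# Compare the two models at the x1 values asked for in the exercises.
for (x1 in c(1, 5)) {
  cat("x1 =", x1, "\n")
  print(rbind(
    additive = line_in_x2(yhat_additive, x1),
    interaction = line_in_x2(yhat_interaction, x1)
  ))
}
```

Comparing the slope column across the two values of x1 shows whether the effect of x2 depends on x1 in each model.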
For this exercise we use data from [WMMY] Exercise 12.5. You find the exercise in the Rmarkdown file Exercise_12-5.Rmd.
Answer the questions below using the Rmarkdown file GNP.Rmd.
- In this exercise we consider a dataset containing macroeconomic figures for the USA, collected in the years 1947 to 1962. The dataset contains 7 variables:
- GNP.deflator: GNP implicit price deflator
- GNP: Gross National Product
- Unemployed: Number of people that are unemployed
- Armed.Forces: Number of staff in the armed forces
- Population: Population size (age >=14)
- Year: Year
- Employed: Number of people that are employed
- Fit a linear regression model with ‘GNP’ as response and ‘Population’ as the explanatory variable.
- What is the interpretation of the estimates? Is the population size significant for GNP?
- Repeat the analysis above, but now with ‘Year’ as explanatory variable. Is time significant?
- Repeat the analysis above, but now with both ‘Population’ and ‘Year’ as explanatory variables. Conclusion?
- Make pairwise scatter plots (e.g. ggscatmat from the GGally package) of GNP, Population and Year.
- Does the plot explain your results?
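The steps in the GNP exercise can be sketched in R as follows. This is a minimal sketch, not the contents of GNP.Rmd: it assumes that R's built-in longley data frame, which has the same seven variable names and covers 1947 to 1962, can stand in for the course dataset. The object names fit_pop, fit_year and fit_both are illustrative.

```r
# Assumption: the built-in `longley` data frame stands in for the course data.
data(longley)

# One explanatory variable at a time.
fit_pop <- lm(GNP ~ Population, data = longley)
fit_year <- lm(GNP ~ Year, data = longley)

# Both explanatory variables in the same model.
fit_both <- lm(GNP ~ Population + Year, data = longley)

# Coefficient tables: estimate, standard error, t value and p-value.
print(summary(fit_pop)$coefficients)
print(summary(fit_year)$coefficients)
print(summary(fit_both)$coefficients)

# Correlation between the two explanatory variables.
print(cor(longley$Population, longley$Year))

# Pairwise scatter plots. ggscatmat is used if GGally is installed;
# otherwise base R's pairs() gives a comparable plot.
vars <- c("GNP", "Population", "Year")
if (requireNamespace("GGally", quietly = TRUE)) {
  print(GGally::ggscatmat(longley[, vars]))
} else {
  pairs(longley[, vars])
}
```

Comparing the p-values across the three coefficient tables, together with the correlation and the scatter plots, gives the material needed to answer the questions above.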