Applied statistics

Multiple linear regression

Literature

[A] Chapter 11, except sections 11.6 - 11.8.

Lecture material

This lecture as: slideshow (html), Rmarkdown (Rmd), notes (pdf).

The entire module: notes (pdf).

Exercises

  1. Agresti exercises 11.1 and 11.5 (can the model be simplified?). Answer these exercises with pen and paper, using RStudio only as a calculator.

  2. Solve Agresti exercises 11.11 (a description of the data is available on Alan Agresti's homepage) and 11.21 using the Rmarkdown file Agresti-11-11_and_11-21.Rmd.

  3. Answer the questions below using the Rmarkdown file GNP.Rmd.
    • In this exercise we consider a dataset of macroeconomic figures for the USA, collected in the years 1947 to 1962. The dataset contains 7 variables:
      1. GNP.deflator: GNP implicit price deflator
      2. GNP: Gross National Product
      3. Unemployed: Number of people that are unemployed
      4. Armed.Forces: Number of staff in the armed forces
      5. Population: Population size (age >=14)
      6. Year: Year
      7. Employed: Number of people that are employed
    • Fit a linear regression model with 'GNP' as response and 'Population' as the only explanatory variable.
    • What is the interpretation of the estimates? Is the population size significant for GNP?
    • Repeat the analysis above, but now with 'Year' as explanatory variable. Is time significant?
    • Repeat the analysis above, but now with both 'Population' and 'Year' as explanatory variables. What do you conclude?
    • Make pairwise scatter plots (splom) of GNP, Population and Year.
      - Does the plot explain your results?
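
The mechanics of exercise 3 can be sketched in R as follows. This is only a minimal sketch under one assumption: that the data used in GNP.Rmd match R's built-in `longley` data set, which has the same seven variables for the years 1947 to 1962. The object names `fit_pop`, `fit_year` and `fit_both` are illustrative and do not come from the exercise file. The interpretation of the output is left to the exercise.

```r
# Assumption: the exercise data match R's built-in `longley` data set
# (7 macroeconomic variables for the USA, 1947 to 1962).
data(longley)

# Regression of GNP on each explanatory variable separately
fit_pop  <- lm(GNP ~ Population, data = longley)
fit_year <- lm(GNP ~ Year, data = longley)

# Multiple regression with both explanatory variables
fit_both <- lm(GNP ~ Population + Year, data = longley)

# Coefficient tables: estimates, standard errors, t statistics and p-values
summary(fit_pop)$coefficients
summary(fit_year)$coefficients
summary(fit_both)$coefficients

# Pairwise scatter plots of the three variables (base R graphics)
vars <- c("GNP", "Population", "Year")
pairs(longley[, vars])

# Pairwise correlations, as a numerical companion to the plots
cor(longley[, vars])
```

Base R's `pairs()` is used here so the sketch needs no extra packages; `splom()` from the lattice package draws a comparable scatter plot matrix.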