Applied statistics (CPH: BD)

Multiple linear regression

Literature

[A] Chapter 11, except sections 11.6 - 11.8.

Lecture material

This lecture as: slideshow (html), Rmarkdown (Rmd), notes (pdf).

The entire module: notes (pdf).

Answers to exercises in the book

Exercises

  1. Agresti exercises 11.1 and 11.5 (can the model be simplified?). These exercises should be answered with pen and paper, using RStudio only as a calculator.

  2. Solve Agresti exercises 11.11 (a description of the data is available on Alan Agresti's homepage) and 11.21 using the Rmarkdown file Agresti-11-11_and_11-21.Rmd.

  3. Answer the questions below using the Rmarkdown file GNP.Rmd.
    • In this exercise we consider a dataset containing macroeconomic figures for the USA, collected in the years 1947 to 1962. The dataset contains 7 variables:
      1. GNP.deflator: GNP implicit price deflator
      2. GNP: Gross National Product
      3. Unemployed: Number of people that are unemployed
      4. Armed.Forces: Number of staff in the armed forces
      5. Population: Population size (age >=14)
      6. Year: Year
      7. Employed: Number of people that are employed
    • Fit a linear regression model with 'GNP' as the response and 'Population' as the explanatory variable.
    • What is the interpretation of the estimates? Is population size a significant predictor of GNP?
    • Repeat the analysis above, but now with 'Year' as the explanatory variable. Is time significant?
    • Repeat the analysis above, but now with both 'Population' and 'Year' as explanatory variables. What do you conclude?
    • Make pairwise scatter plots (splom) of GNP, Population and Year.
      - Does the plot explain your results?
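
A minimal sketch of how the steps in exercise 3 could be set up in R. It assumes the course data matches R's built-in `longley` data set, which has the same seven variables for 1947 to 1962; GNP.Rmd may load the data differently. The object names `fit_pop`, `fit_year` and `fit_both` are illustrative only, and base-R `pairs()` is used here in place of `splom`.

```r
# Assumption: the course data matches R's built-in 'longley' data set
data(longley)

# One explanatory variable at a time
fit_pop  <- lm(GNP ~ Population, data = longley)
fit_year <- lm(GNP ~ Year, data = longley)

# Both explanatory variables in the same model
fit_both <- lm(GNP ~ Population + Year, data = longley)

# Coefficient tables: estimate, standard error, t value and p-value
summary(fit_pop)$coefficients
summary(fit_year)$coefficients
summary(fit_both)$coefficients

# Pairwise scatter plots of the three variables
# (base-R pairs() stands in for splom here)
pairs(longley[, c("GNP", "Population", "Year")])

# Correlation between the two explanatory variables
pop_year_cor <- cor(longley$Population, longley$Year)
pop_year_cor
```

Comparing the three coefficient tables side by side, together with the scatter plots and the correlation, gives the material needed to answer the questions above; the interpretation itself is left to the exercise.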