Applied statistics (CPH: BD)
Data collection 2
Literature
[A] Section 10.2.
Also look at R for Data Science, especially chapters 5 (Data transformation), 11 (Data import) and 12 (Tidy data).
Lecture material
This lecture as: slideshow (html), Rmarkdown (Rmd), notes (pdf).
The entire module: notes (pdf).
Exercises
In any order (read about both before starting):
- PISA 2015
  - Go to http://www.oecd.org/pisa/data/2015database/
  - Download the questionnaires in English (extract the archive and delete everything except CY6_QST_MS_SCQ_CBA_Final.pdf)
  - Download the SPSS School questionnaire data file and extract it
  - Discuss the PISA study (in terms of sampling etc.); see the general site http://www.oecd.org/pisa/
  - Carry out relevant analyses (e.g. plots, tables, statistical tests)
- Data for the energy sector (technical exercise for practising R):
  - Go to https://ens.dk/service/statistik-data-noegletal-og-kort/data-oversigt-over-energisektoren and download "Data for eksisterende og afmeldte møller" (data for existing and decommissioned turbines)
  - Load the data into R (use the sheet IkkeAfmeldte-Existing turbines, though perhaps not the entire sheet; see the range argument in ?readxl::read_xlsx)
  - Get an impression of the data (format, variables, ...)
  - Construct some relevant tables and plots
  - Discuss the concepts of sample and population
  - Restrict the dataset to rows where Manufacture %in% c("Vestas Wind Systems A/S", "NEG Micon", "BONUS", "SIEMENS") (hint: dplyr::filter())
  - Do a descriptive analysis of Manufacture (Fabrikat) and Capacity (kW) (Kapacitet (kW)), e.g. tables and/or figures
  - Assuming this is a random sample from a larger population, do a statistical analysis (e.g. are the population means of Capacity (kW) the same for each of the four manufacturers?)
  - Now include Rotor-diameter (m) and do a descriptive analysis
  - Assuming this is a random sample from a larger population, do a statistical analysis
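As a starting point for the turbine exercise, the restrict / describe / test steps could be sketched roughly as follows. This is only one possible approach, not a model solution. The data frame below is invented toy data (it is not taken from the ens.dk file), the column name capacity_kw and the manufacturer "Other maker" are hypothetical, and a one-way ANOVA is just one reasonable choice for comparing the four means.

```r
library(dplyr)

# Toy data, invented for illustration only; NOT values from the ens.dk file.
# Column names are hypothetical stand-ins for Fabrikat and Kapacitet (kW).
turbines <- data.frame(
  Manufacture = rep(
    c("Vestas Wind Systems A/S", "NEG Micon", "BONUS", "SIEMENS", "Other maker"),
    each = 3
  ),
  capacity_kw = c(600, 850, 2000,
                  750, 900, 1000,
                  150, 300, 600,
                  2300, 3000, 3600,
                  10, 25, 55)
)

# The four manufacturers named in the exercise
keep <- c("Vestas Wind Systems A/S", "NEG Micon", "BONUS", "SIEMENS")

# Restrict the dataset to rows from those manufacturers
turbines_sub <- dplyr::filter(turbines, Manufacture %in% keep)

# Descriptive table: count, mean and standard deviation per manufacturer
summary_tbl <- turbines_sub %>%
  group_by(Manufacture) %>%
  summarise(
    n       = n(),
    mean_kw = mean(capacity_kw),
    sd_kw   = sd(capacity_kw),
    .groups = "drop"
  )
print(summary_tbl)

# One-way ANOVA: the null hypothesis is that the four population means are equal
fit <- aov(capacity_kw ~ Manufacture, data = turbines_sub)
print(summary(fit))
```

With the real data, the toy data frame would be replaced by the result of readxl::read_xlsx() (using its sheet and range arguments), and the model assumptions (roughly normal residuals, similar variances across groups) should be checked before trusting the ANOVA p-value.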
Afterwards, finish exercises from previous lectures.