Applied statistics (CPH: BD)
Data collection 2
Literature
[A] Section 10.2.
Also look at R for Data Science, especially chapters 5 (Data transformation), 11 (Data import) and 12 (Tidy data).
Lecture material
This lecture as: slideshow (html), Rmarkdown (Rmd), notes (pdf).
The entire module: notes (pdf).
Exercises
In any order (read about both before starting):
- PISA 2015
- Go to http://www.oecd.org/pisa/data/2015database/
- Download the questionnaires in English (extract the archive and delete all files except CY6_QST_MS_SCQ_CBA_Final.pdf)
- Download the SPSS "School questionnaire data file" and extract it
- Discuss the PISA study (in terms of sampling etc.), see the general site http://www.oecd.org/pisa/
- Make relevant analyses (e.g. plots, tables, statistical)
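One possible way to get the SPSS file into R is sketched below. It assumes the haven package is installed; the file name is a hypothetical placeholder and must be replaced with the name of the .sav file you actually extracted. The block only reads the file if it exists, so it runs without error before the download is done.

```r
# Hypothetical file name: replace with the .sav file extracted from the download.
sav_path <- "school_questionnaire.sav"

if (file.exists(sav_path)) {
  # haven::read_sav() reads SPSS .sav files into a tibble
  school <- haven::read_sav(sav_path)

  # Size of the dataset (rows = schools, columns = variables)
  print(dim(school))

  # SPSS variable labels are stored by haven in the "label" attribute
  var_labels <- vapply(
    school,
    function(x) {
      lab <- attr(x, "label")
      if (is.null(lab)) NA_character_ else as.character(lab)[1]
    },
    character(1)
  )
  print(head(var_labels))
} else {
  message("File not found: ", sav_path)
}
```

The variable labels can then be matched against the question numbers in the questionnaire PDF when choosing what to analyse.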
- Data for energy sectors (technical exercise practising R):
- Go to https://ens.dk/service/statistik-data-noegletal-og-kort/data-oversigt-over-energisektoren and download "Data for eksisterende og afmeldte møller" (data on existing and decommissioned wind turbines)
- Load the data in R (use the sheet IkkeAfmeldte-Existing turbines; you may not want the entire sheet, see the range argument in ?readxl::read_xlsx)
- Get an impression of the data (format, variables, ...)
- Construct some relevant tables and plots
- Discuss the concepts of sample and population
- Restrict the dataset to Manufacture %in% c("Vestas Wind Systems A/S", "NEG Micon", "BONUS", "SIEMENS") (hint: dplyr::filter(), part of the tidyverse)
- Do a descriptive analysis of Manufacture (Fabrikat) and Capacity (kW) (Kapacitet (kW)), e.g. tables and/or figures
- Assuming this is a random sample from a larger population, do a statistical analysis (e.g. are the population means of Capacity (kW) the same for each of the four manufacturers?)
- Now include Rotor-diameter (m) and do a descriptive analysis
- Assuming this is a random sample from a larger population, do a statistical analysis
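The steps of the turbine exercise can be sketched as follows. This is only one possible approach, not a model solution. It assumes the dplyr package is installed. The helper load_turbines(), the file name and the cell range are hypothetical placeholders, and the small data frame contains made-up values so that the code runs without the download. The one-way ANOVA and the linear model are common choices for these questions; both rest on assumptions (roughly normal residuals, similar variances) that should be checked on the real data.

```r
library(dplyr)

# Hypothetical helper: read only part of the sheet named in the exercise.
# Not called here; e.g. load_turbines("turbines.xlsx", range = "A1:F100"),
# where both the file name and the range are placeholders.
load_turbines <- function(path, range = NULL) {
  readxl::read_xlsx(path, sheet = "IkkeAfmeldte-Existing turbines", range = range)
}

# Made-up toy data standing in for the real sheet (NOT real turbine values).
turbines <- data.frame(
  Manufacture = rep(
    c("Vestas Wind Systems A/S", "NEG Micon", "BONUS", "SIEMENS", "Other maker"),
    times = c(3, 3, 3, 3, 2)
  ),
  `Capacity (kW)` = c(600, 850, 2000, 750, 900, 1500, 150, 450, 600,
                      2300, 3000, 3600, 225, 660),
  `Rotor-diameter (m)` = c(44, 52, 80, 48, 52, 64, 23, 37, 44,
                           93, 101, 120, 29, 47),
  check.names = FALSE  # keep the column names with spaces and brackets
)

# Restrict to the four manufacturers and switch to syntactic column names.
keep <- c("Vestas Wind Systems A/S", "NEG Micon", "BONUS", "SIEMENS")
sub <- turbines %>%
  filter(Manufacture %in% keep) %>%
  rename(capacity_kw = `Capacity (kW)`, rotor_m = `Rotor-diameter (m)`)

# Descriptive analysis: counts per manufacturer and group summaries.
print(table(sub$Manufacture))
sub %>%
  group_by(Manufacture) %>%
  summarise(n = n(), mean_kw = mean(capacity_kw), sd_kw = sd(capacity_kw)) %>%
  print()

# One-way ANOVA: are the mean capacities equal across manufacturers?
fit_aov <- aov(capacity_kw ~ Manufacture, data = sub)
print(summary(fit_aov))

# Including rotor diameter: capacity explained by diameter and manufacturer.
fit_lm <- lm(capacity_kw ~ rotor_m + Manufacture, data = sub)
print(summary(fit_lm))
```

If the ANOVA assumptions look doubtful, kruskal.test(capacity_kw ~ Manufacture, data = sub) is a rank-based alternative for comparing the groups.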
Afterwards, finish exercises from previous lectures.