Data collection and wrangling

The ASTA team

Data collection

Ronald Fisher (1890-1962):

To consult the statistician after an experiment is finished is often merely to ask him to conduct a post mortem examination. He can perhaps say what the experiment died of.

Said about Fisher:

Population and sample

Sample 3 of size \(n = 30\):

shape  color  n_sample  p_sample  p_pop  p_diff
baby   black         2      0.07   0.04   -0.02
baby   blue          1      0.03   0.04    0.01
baby   red           0      0.00   0.01    0.01
man    black         5      0.17   0.12   -0.04
man    blue          8      0.27   0.22   -0.04
man    red           3      0.10   0.08   -0.02
woman  black         3      0.10   0.23    0.13
woman  blue          8      0.27   0.22   -0.05
woman  red           0      0.00   0.02    0.02
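
As a rough illustration of how the sample columns relate to each other, the sketch below recomputes the sample proportions from the counts and compares them with the population proportions. It rests on assumptions that the slides do not state: that p_sample is n_sample divided by the sample size, and that p_diff is p_pop minus p_sample (this is inferred from the numbers). The names `rows` and `compare` are made up for this example. Only the rounded population proportions are available here, so some differences may deviate from the table by 0.01.

```python
# Counts from Sample 3 and the *rounded* population proportions,
# copied from the table. The tuple layout is a choice made for this sketch.
rows = [
    # (shape, color, n_sample, p_pop)
    ("baby", "black", 2, 0.04),
    ("baby", "blue", 1, 0.04),
    ("baby", "red", 0, 0.01),
    ("man", "black", 5, 0.12),
    ("man", "blue", 8, 0.22),
    ("man", "red", 3, 0.08),
    ("woman", "black", 3, 0.23),
    ("woman", "blue", 8, 0.22),
    ("woman", "red", 0, 0.02),
]


def compare(rows):
    """Return (shape, color, p_sample, p_diff) for each group.

    Assumed definitions (not stated in the source):
      p_sample = n_sample / total sample size
      p_diff   = p_pop - p_sample
    Both results are rounded to two decimals.
    """
    total = sum(count for _, _, count, _ in rows)
    result = []
    for shape, color, count, p_pop in rows:
        p_sample = count / total
        p_diff = p_pop - p_sample
        result.append((shape, color, round(p_sample, 2), round(p_diff, 2)))
    return result


n = sum(count for _, _, count, _ in rows)  # sample size implied by the counts
summary = compare(rows)

print(f"n = {n}")
for shape, color, p_sample, p_diff in summary:
    print(f"{shape:<6}{color:<6}{p_sample:>6.2f}{p_diff:>7.2f}")
```

The largest gap is for the group woman/black, which is clearly under-represented in this particular sample. With only 30 observations, deviations of this size can occur by chance alone.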

Example: United States presidential election, 1936

(Based on Agresti, this and this.)

Example: Bullet holes of honor

(Based on this.)

Theory: Biases / sampling

Biases

Agresti section 2.3:

Sampling

Agresti section 2.4:

Data wrangling

This will be illustrated with two specific cases.

The material is on Moodle.