---
output: html_document
---

# Exam exercise: Reading scores and sample size calculation

It is highly recommended that you answer the exam using Rmarkdown
(you can simply use the exam Rmarkdown file as a starting point).

Remember to load the `mosaic` package first:
```{r message=FALSE}
library(mosaic)
```

## Part I: Directed reading activities

An educator conducted an experiment to test whether new directed reading
activities in the classroom will help elementary school pupils improve
some aspects of their reading ability.

She arranged for a third grade class of 21 students
to follow these activities for an 8-week period. A control classroom of
23 third graders followed the same curriculum without the activities. At
the end of the 8 weeks, all students took a Degree of Reading Power
(DRP) test, which measures the aspects of reading ability that the
treatment is designed to improve.


Read in the data:
```{r}
reading <- read.delim("https://asta.math.aau.dk/datasets?file=reading.txt")
head(reading)
```

Use a boxplot to visually compare the measurements of `Score` (the student's DRP score)
for `Treated` (directed reading activities) and `Control`.
```{r}
## Delete this line and write a command using gf_boxplot(...)
```

Use `favstats` to make a numerical summary of the measurements for `Treated` and `Control`.

```{r}
## Delete this line and write a command using favstats(...)
```
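
Both commands above take a formula of the form `response ~ group` together with a `data` argument. The following is only a sketch of that syntax on the built-in `ToothGrowth` dataset (not the exam data), where `len` is a numeric measurement and `supp` is a grouping factor with two levels; the variable names in `reading` will be different, so check them in the output of `head(reading)`.

```{r message=FALSE}
library(mosaic)

# ToothGrowth is a built-in dataset, not the exam data:
# len = numeric response, supp = factor with two groups (OJ, VC)
gf_boxplot(len ~ supp, data = ToothGrowth)

# One row of summary statistics per group
stats <- favstats(len ~ supp, data = ToothGrowth)
stats
```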

-   Write down a point estimate of the mean of the DRP score for students
    following the new *directed reading activities* and explain how this
    is calculated.

-   Write down a point estimate of the standard deviation of the DRP score for
    this group and explain how this is calculated.

-   Write down a 95% confidence interval for the mean of the DRP score for this
    group and explain how this is calculated.
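
As a reminder, and assuming that the $t$-based interval from the course notes is the intended method, the three quantities can be written as follows. The notation is chosen here for illustration: $y_1, \dots, y_n$ are the $n = 21$ scores in the treated group.

$$
\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i, \qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2}, \qquad
\bar{y} \pm t_{0.025,\,n-1}\,\frac{s}{\sqrt{n}}
$$

Here $t_{0.025,\,n-1}$ is the critical value of the $t$ distribution with $n-1$ degrees of freedom that leaves probability 0.025 in the upper tail; in R it can be found with `qt(0.975, df = n - 1)`.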

Use the command `t.test` to compare the mean DRP scores of the two groups.

```{r}
## Delete this line and write a command using t.test(...)
```
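
`t.test` accepts the same `response ~ group` formula. The sketch below again uses the built-in `ToothGrowth` dataset rather than the exam data, and shows how the individual parts of the result can be extracted, which may be convenient when you go through the output.

```{r}
# ToothGrowth is a built-in dataset, not the exam data:
# len = numeric response, supp = factor with two groups
res <- t.test(len ~ supp, data = ToothGrowth)
res

# Individual parts of the result
res$statistic   # observed test statistic
res$parameter   # degrees of freedom
res$p.value     # two-sided p-value
res$conf.int    # 95% confidence interval for the difference in means
```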

Go through the details of the output from `t.test`. Your analysis must
include an account of

-   What the relevant null hypothesis and the corresponding alternative
    hypothesis are.

-   Choice and calculation of the test statistic.

-   Calculation of the $p$-value and its interpretation in relation to the
    conclusion of the analysis.

-   Calculation and interpretation of a relevant confidence interval.
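
For orientation, and assuming the default two-sided test without an equal-variance assumption (which is what `t.test` performs unless `alternative` or `var.equal` is changed), the hypotheses and the test statistic take the form below. The subscripts 1 and 2 are notation chosen here for the two groups; which group R treats as the first depends on the ordering of the factor levels.

$$
H_0: \mu_1 = \mu_2 \quad \text{against} \quad H_a: \mu_1 \neq \mu_2,
\qquad
t_{\text{obs}} = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
$$

The confidence interval reported by `t.test` is for the difference $\mu_1 - \mu_2$ and is built from the same standard error as the denominator of $t_{\text{obs}}$.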

## Part II: Determining sample size

_In this part there is no dataset to load into R and analyze. You are only supposed to
use R as a calculator where you apply the relevant formulas (which you find towards
the end of the lecture notes for Module 1)._


A study is being planned to estimate the proportion of the Danish population that smokes regularly.
How large a sample is needed to obtain an estimate that is at most 0.05 away from the true
proportion with a confidence level of 0.90? A similar study from 2015 estimated the proportion of smokers to be 22.5%.
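
One way to set up the calculation, assuming the lecture notes use the usual formula $n = \pi(1-\pi)\left(z/M\right)^2$ for a proportion, is sketched below. Here $M$ is the margin of error, $z$ is the standard normal critical value for the chosen confidence level, and $\pi$ is a guess at the true proportion. The variable names are chosen here for illustration only.

```{r}
# Planning values taken from the exercise text
margin     <- 0.05    # maximal distance from the true proportion
conf_level <- 0.90    # confidence level
p_guess    <- 0.225   # estimate from the 2015 study

# z-score leaving (1 - conf_level)/2 probability in each tail
z <- qnorm(1 - (1 - conf_level) / 2)

# Required sample size, rounded up to a whole number of respondents
n_needed <- ceiling(p_guess * (1 - p_guess) * (z / margin)^2)
n_needed
```

If no earlier estimate were available, the conservative choice would be to replace `p_guess` with 0.5, which gives the largest possible value of $\pi(1-\pi)$.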