---
title: 'Module 3: Exercises for binomial model'
author: ""
date: ""
output:
  pdf_document: default
  html_document: default
---

### Exercise 1 (solve by inserting code in the Rmd file)

The following figure shows the likelihood and log-likelihood for the binomial model when $n=10$ and $y=3$:

```{r out.width="90%", fig.height=5, fig.width=12}
lik <- function(parm, y, n){parm^y * (1 - parm)^(n - y)}
loglik <- function(parm, y, n){y * log(parm) + (n - y) * log(1 - parm)}
par(mfrow=c(1, 2), mar = c(3,3,3,1))
n <- 10; y <- 3
curve(lik(x, y, n), main = "Likelihood")
abline(v = y/n, lty = 2, col = 2)
curve(loglik(x, y, n), main = "Log-likelihood", ylim = c(-20, -5))
abline(v = y/n, lty = 2, col = 2)
```

Redo the likelihood plot above for $n=20, y=6$, for $n=50, y=15$, and for $n=500, y=150$.

```{r out.width="90%", fig.height=5, fig.width=15}
par(mfrow=c(1, 3), mar = c(3,3,3,1))
n <- 20; y <- 6
curve(lik(x, y, n), main = paste("Likelihood:", "n =", n, "y =", y))
n <- 50; y <- 15
curve(lik(x, y, n), main = paste("Likelihood:", "n =", n, "y =", y))
n <- 500; y <- 150
curve(lik(x, y, n), main = paste("Likelihood:", "n =", n, "y =", y))
```

- Do the plots surprise you?
- What do you conclude?

As $n$ (the number of trials) grows, the likelihood becomes more sharply peaked around $y/n$, so the uncertainty of the parameter estimate becomes smaller.

### Exercise 2 (solve by hand with pen and paper)

For the binomial model

- Differentiate $l(\theta)$ to obtain $l'(\theta)$ and verify that the solution to $l'(\theta)=0$ is $\hat\theta = y/n$.

$$l'(\theta) = \frac{y}{\theta} - \frac{n-y}{1-\theta}$$

$$l'(\theta)=0 \Leftrightarrow \frac{y}{\theta} = \frac{n-y}{1-\theta} \Leftrightarrow \theta = y/n$$

**Only do the next two bullets if you have solved all other exercises (also Exercise 3):**

- Differentiate $l'(\theta)$ to obtain $l''(\theta)$.

$$l''(\theta) = -\frac{y}{\theta^2} - \frac{n-y}{(1-\theta)^2}$$

- For the MLE it generally holds that the variance of $\hat\theta$ is approximately
$$
\text{Var}(\hat\theta) \approx - 1 / l''(\hat\theta).
$$
Verify by a direct computation that this in fact results in the estimated variance found in the text:
$$
\hat\theta(1-\hat\theta)/n = y(n-y)/n^3.
$$

$$
l''(\hat\theta) = -\frac{y}{(y/n)^2} - \frac{n-y}{(1-y/n)^2} = -\frac{n^2}{y} - \frac{n-y}{(n-y)^2/n^2} = -n^2\left(\frac{1}{y} + \frac{1}{n-y}\right)
$$
$$
= -n^2\,\frac{n-y+y}{y(n-y)} = -\frac{n^3}{y(n-y)},
\qquad \text{so} \qquad
- 1 / l''(\hat\theta) = \frac{y(n-y)}{n^3}.
$$
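The closed-form result can also be checked numerically. Below is a minimal sketch (not part of the exercise) that reuses `loglik` from Exercise 1: `optimize` locates the maximum of the log-likelihood, and a central finite difference approximates $l''(\hat\theta)$; the step size `h` is an arbitrary choice.

```{r}
n <- 10; y <- 3

# The log-likelihood should be maximized at theta-hat = y/n = 0.3
optimize(loglik, interval = c(0.001, 0.999), y = y, n = n, maximum = TRUE)$maximum

# Central finite-difference approximation of l''(theta-hat)
h <- 1e-5
theta_hat <- y/n
d2 <- (loglik(theta_hat + h, y, n) - 2 * loglik(theta_hat, y, n) +
         loglik(theta_hat - h, y, n)) / h^2

-1 / d2             # approximate Var(theta-hat)
y * (n - y) / n^3   # closed form y(n-y)/n^3 = 0.021; the two should agree
```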
### Exercise 3 (solve by inserting code in the Rmd file)

For the Bayesian example with discrete prior:

1. Think about the effect the data have on the posterior when compared to the prior.

2. Repeat the computations (mean and variance of posterior) and plots of the text but with $n=100, y=30$. Do the results surprise you?

```{r fig.height=2.5, fig.width=7.5}
theta <- c(.1, .3, .5, .7, .9)
prior <- c(0.10, 0.15, 0.25, 0.30, 0.20)
n <- 100; y <- 30
likval <- lik(theta, y, n)
posterior <- likval * prior
posterior <- posterior / sum(posterior)
par(mfrow=c(1,3), mar = c(3, 3, 3, 0.5))
barplot(prior, main="prior", names.arg=theta, ylim = c(0, .55))
barplot(likval, main="likelihood", names.arg=theta)
barplot(posterior, main="posterior", names.arg=theta, ylim = c(0, .55))
sum(theta * posterior)  # posterior mean closer to the purely data-driven estimate y/n
sum(theta^2 * posterior) - sum(theta * posterior)^2  # posterior variance decreases
```

3. Repeat the computations and plots for the case where the prior has a uniform distribution (i.e. if all five values have prior probability $0.20$), and $n=10, y=3$. What is the "relationship" between the posterior and the likelihood in this case?

**The posterior and likelihood are now directly proportional: with a uniform prior, the posterior is simply the likelihood renormalized to sum to one.**

```{r fig.height=2.5, fig.width=7.5}
theta <- c(.1, .3, .5, .7, .9)
prior <- rep(.2, 5)
n <- 10; y <- 3
likval <- lik(theta, y, n)
posterior <- likval * prior
posterior <- posterior / sum(posterior)
par(mfrow=c(1,3), mar = c(3, 3, 3, 0.5))
barplot(prior, main="prior", names.arg=theta, ylim = c(0, .55))
barplot(likval, main="likelihood", names.arg=theta)
barplot(posterior, main="posterior", names.arg=theta, ylim = c(0, .55))
sum(theta * posterior)
sum(theta^2 * posterior) - sum(theta * posterior)^2
```

4. Lastly, repeat the computations and plots for the case where $\pi(0.1)=\pi(0.3)=\pi(0.5)=\pi(0.9)=0.01$ and $\pi(0.7)=0.96$ (still $n=10, y=3$). Comment on the result.

**Now the strong prior shifts the results away from the purely data-driven results: most of the posterior mass ends up at $\theta=0.7$, even though the likelihood favours $\theta=0.3$.**

```{r fig.height=2.5, fig.width=7.5}
theta <- c(.1, .3, .5, .7, .9)
prior <- rep(.01, 5)
prior[4] <- .96  # put 96% of the prior mass on theta = 0.7
n <- 10; y <- 3
likval <- lik(theta, y, n)
posterior <- likval * prior
posterior <- posterior / sum(posterior)
par(mfrow=c(1,3), mar = c(3, 3, 3, 0.5))
barplot(prior, main="prior", names.arg=theta, ylim = c(0, .9))
barplot(likval, main="likelihood", names.arg=theta)
barplot(posterior, main="posterior", names.arg=theta, ylim = c(0, .9))
sum(theta * posterior)
sum(theta^2 * posterior) - sum(theta * posterior)^2
```
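As a closing illustration (a small sketch going beyond the exercises; the choice $n=100, y=30$ is ours), even the strong prior from point 4 is overwhelmed once more data arrive, because the likelihood factor in $\pi(\theta) \cdot L(\theta)$ becomes far more peaked than the fixed prior:

```{r}
# Strong prior from point 4, but with ten times as much data
theta <- c(.1, .3, .5, .7, .9)
prior <- c(.01, .01, .01, .96, .01)
n <- 100; y <- 30

posterior <- lik(theta, y, n) * prior
posterior <- posterior / sum(posterior)
round(posterior, 4)     # almost all posterior mass is back at theta = 0.3
sum(theta * posterior)  # posterior mean again close to y/n = 0.3
```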