{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "6e898384",
   "metadata": {},
   "source": [
    "# Experiments with random numbers"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "id": "0beabb3c",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "import scipy.stats as stats"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5f8c03ce",
   "metadata": {},
   "source": [
    "## Coin flip\n",
    "\n",
    "Make a variable `coin` corresponding to 1000 fair coin flips using `np.random.binomial` with the following command:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "21f1350c",
   "metadata": {},
   "outputs": [],
   "source": [
    "# 1000 independent draws with n=1 trial and success probability 0.5; each entry is 0 or 1\n",
    "coin = np.random.binomial(1, 0.5, 1000)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1cb06f80",
   "metadata": {},
   "source": [
    "The command `np.cumsum(coin)` returns the running (cumulative) sum of the values in the vector `coin`. Create a variable `cumsum_coin` with `cumsum_coin = np.cumsum(coin)` and inspect the first 10 entries of `coin` and `cumsum_coin` with the commands `coin[0:10]` and `cumsum_coin[0:10]`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "67ad161f",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Delete this line and add the correct code yourself\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "543e1bba",
   "metadata": {},
   "source": [
    "The command `x = np.arange(1, 1001)` generates a vector of integers from 1\n",
    "to 1000. Therefore `y = cumsum_coin/x` is the relative frequency of ones\n",
    "among the first `x` entries of `coin`. Plot\n",
    "`y` against `x` and add a horizontal red line at 0.5 on the\n",
    "y-axis (hint: use `plt.axhline(y=0.5, color='red')` for the horizontal line).\n",
    "Discuss how the shape of the curve compares with your expectations."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "id": "ff2857e4",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Delete this line and add the correct code yourself"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0dd67e6e",
   "metadata": {},
   "source": [
    "## Uniform random numbers\n",
    "\n",
    "Make a variable with 1000 uniformly distributed random numbers drawn in the interval from 0 to 1:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "id": "179fcfe8",
   "metadata": {},
   "outputs": [],
   "source": [
    "rand1 = np.random.rand(1000)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5d67ab31",
   "metadata": {},
   "source": [
    "Make a histogram of the variable. Try to change the arguments `bins`, `color` and `edgecolor` in the `plt.hist` command."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "id": "45bcd84e",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Delete this line and add the correct code yourself"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "624041ca",
   "metadata": {},
   "source": [
    "The histogram probably doesn't look like a normal distribution at all.\n",
    "Convince yourselves that the theoretical frequency curve, i.e. the density function, is a horizontal line over the interval from 0 to 1.\n",
    "\n",
    "Convince yourselves that the population mean (expected value) is 1/2.\n",
    "\n",
    "It can be shown that the population standard deviation is approximately 0.289.\n",
    "How do these theoretical values fit with your empirical quantities? (Use the commands `rand1.std()` and `rand1.mean()`.)"
   ]
  },
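  {
   "cell_type": "markdown",
   "id": "7a3c9e12",
   "metadata": {},
   "source": [
    "As a rough check of the value 0.289 quoted above, here is a short calculation. It assumes $U$ denotes a single draw from the uniform distribution on the interval from 0 to 1 and takes the mean 1/2 as given; the symbols $U$ and $\\sigma$ are notation introduced only for this sketch.\n",
    "\n",
    "$$\\mathrm{E}[U^2] = \\int_0^1 u^2 \\, du = \\frac{1}{3}, \\qquad \\sigma^2 = \\mathrm{E}[U^2] - \\left(\\frac{1}{2}\\right)^2 = \\frac{1}{3} - \\frac{1}{4} = \\frac{1}{12}, \\qquad \\sigma = \\frac{1}{\\sqrt{12}} \\approx 0.289$$"
   ]
  },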
  {
   "cell_type": "code",
   "execution_count": 40,
   "id": "c7b1b2c8",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Delete this line and add the correct code yourself"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5946efc2",
   "metadata": {},
   "source": [
    "Make two extra random variables `rand2` and `rand3` like `rand1` above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "id": "563e5145",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Delete this line and add the correct code yourself"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "687600ca",
   "metadata": {},
   "source": [
    "Make a new variable `mean12` holding the element-wise average of `rand1` and `rand2`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "id": "a52494a3",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Use code like this: mean12 = (rand1 + rand2) / 2"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b5576aeb",
   "metadata": {},
   "source": [
    "Make a histogram for this variable."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "id": "69107ec9",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Delete this line and add the correct code yourself"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8d8e53a1",
   "metadata": {},
   "source": [
    "The histogram probably looks more like a normal distribution curve.\n",
    "It can be shown that the theoretical frequency curve, i.e. the density function, is a triangle.\n",
    "Convince yourselves that the population mean (expected value) is 1/2.\n",
    "Convince yourselves that the population standard deviation is approximately 0.289 divided by the square root of 2 (remember from the CLT that the mean of $n$ independent observations has standard deviation $\\sigma/\\sqrt{n}$).\n",
    "\n",
    "How does this fit with your empirical quantities?"
   ]
  },
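  {
   "cell_type": "markdown",
   "id": "d04b6f58",
   "metadata": {},
   "source": [
    "The square-root-of-2 factor can be sketched as follows. This assumes $U_1$ and $U_2$ stand for one entry of `rand1` and the matching entry of `rand2`, that the two are independent, and that each has standard deviation $\\sigma \\approx 0.289$; the symbols are notation introduced only for this sketch.\n",
    "\n",
    "$$\\operatorname{Var}\\left(\\frac{U_1 + U_2}{2}\\right) = \\frac{\\sigma^2 + \\sigma^2}{4} = \\frac{\\sigma^2}{2}, \\qquad \\sqrt{\\frac{\\sigma^2}{2}} = \\frac{\\sigma}{\\sqrt{2}} \\approx \\frac{0.289}{\\sqrt{2}} \\approx 0.204$$"
   ]
  },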
  {
   "cell_type": "code",
   "execution_count": 44,
   "id": "2d0e55a5",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Delete this line and add the correct code yourself"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "443869f6",
   "metadata": {},
   "source": [
    "Make a new variable `mean123` with the average of the three random variables. Draw the histogram."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "id": "a430dae2",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Delete this line and add the correct code yourself"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c95106d9",
   "metadata": {},
   "source": [
    "Hopefully this illustrates that averages of independent variables tend towards a normal distribution as more variables are averaged.\n",
    "This is one of the reasons that the normal distribution is by far the most used distribution to describe measurement data."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "44452d31",
   "metadata": {},
   "source": [
    "## Quantile comparison plots\n",
    "\n",
    "Another way of comparing a sample to the normal distribution is by a quantile comparison plot (QQ-plot).\n",
    "A QQ-plot for the `rand1` variable is made like this (the marker colour, transparency and line width can be adjusted afterwards through the line objects returned by `plt.gca().get_lines()`):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "60b0d520",
   "metadata": {},
   "outputs": [],
   "source": [
    "stats.probplot(rand1, plot=plt)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "42a7fbc2",
   "metadata": {},
   "source": [
    "For a normally distributed sample this plot should look approximately linear. \n",
    "Try this for `mean12` and `mean123`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e2c7d0d7",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Delete this line and add the correct code yourself"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bc2a9aa5",
   "metadata": {},
   "source": [
    "Finally, make a variable which is a sample from a normal distribution with mean 0 and standard deviation 1:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "id": "4e084b83",
   "metadata": {},
   "outputs": [],
   "source": [
    "x = np.random.normal(loc=0, scale=1, size=1000)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "119d404f",
   "metadata": {},
   "source": [
    "Make the QQ-plot for `x`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "id": "c92f14b5",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Delete this line and add the correct code yourself"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}