The ASTA team
\[ X \sim N(\mu, \sigma) \]
Assumes \(\mu\) and \(\sigma\) are some fixed values:
\[ f(x) = f(X = x; \mu, \sigma) = {\frac {1}{\sigma {\sqrt {2\pi }}}} \exp \left [ {-{\frac {1}{2}}\left({\frac {x-\mu }{\sigma }}\right)^{2}} \right ] \]
E.g. for \(\mu = 0\) and \(\sigma = 1\):
What if instead the data is fixed and we let \(\mu\) and \(\sigma\) vary?
\[ L(\mu, \sigma) = L(\mu, \sigma; X = x) = f(X = x; \mu, \sigma) \]
E.g. assume that \(X=2\) was observed:
The value of \(\mu\) that gives the highest probability (density) of observing \(X=2\) is \(\mu = 2\).
Assume independence for \(n\) observations, then
\[\begin{align} L(\mu, \sigma) = L(\mu, \sigma; X_1, X_2, \ldots, X_n) = \prod_{i=1}^n L(\mu, \sigma; X_i) = \prod_{i=1}^n L(\mu, \sigma; X_i) . \end{align}\]
Do you know a function that turns products into sums?
\[ \log (a \cdot b) = \log(a) + \log(b) \]
Assume independence for \(n\) observations, then
\[\begin{align} L(\mu, \sigma) = L(\mu, \sigma; X_1, X_2, \ldots, X_n) &= \prod_{i=1}^n L(\mu, \sigma; X_i) = \prod_{i=1}^n L(\mu, \sigma; X_i) \\ l(\mu, \sigma) = \log L(\mu, \sigma) &= \log L(\mu, \sigma; X_1, X_2, \ldots, X_n) \\ &= \log \left ( \prod_{i=1}^n L(\mu, \sigma; X_i) \right ) \\ &= \sum_{i=1}^n \log L(\mu, \sigma; X_i) \\ &= \sum_{i=1}^n l(\mu, \sigma; X_i) . \end{align}\]
Note that \(\log\) is a monotonic (increasing) function, so maximum of \(l = \log L\) is the same as that of \(L\).
\[\begin{align} L(\mu, \sigma) &= L(\mu, \sigma; X = x) = f(X = x; \mu, \sigma) \\ l(\mu, \sigma; X) &= \log L(\mu, \sigma; X) = \log f(X; \mu, \sigma) \end{align}\]
## Girth Height Volume
## 1 8.3 70 10
## 2 8.6 65 10
## 3 8.8 63 10
## 4 10.5 72 16
## 5 10.7 81 19
## 6 10.8 83 20
If Height is normally distributed, what are the parameters (mean and standard deviation)?
## [1] 6.4
single_loglik <- function(mu) {
sum(log(dnorm(trees$Height, mean = mu, sd = 6)))
}
single_loglik(65)
## [1] -153
## [1] -159
## [1] -211 -153 -116
Maximum around 76.
## $minimum
## [1] 4.6e-05
##
## $objective
## [1] -2588
## $maximum
## [1] 76
##
## $objective
## [1] -101
Maximum of \(l\) is minimum of \(-l\).
single_loglik <- function(mu) {
-sum(log(dnorm(trees$Height, mean = mu, sd = 6)))
}
optimise(single_loglik, interval = c(0, 100))
## $minimum
## [1] 76
##
## $objective
## [1] 101
Both parameters:
single_loglik <- function(pars) {
mu <- pars[1]
sigma <- pars[2]
-sum(log(dnorm(trees$Height, mean = mu, sd = sigma)))
}
optim(c(1, 50), single_loglik)
## Warning in dnorm(trees$Height, mean = mu, sd = sigma): NaNs produced
## Warning in dnorm(trees$Height, mean = mu, sd = sigma): NaNs produced
## Warning in dnorm(trees$Height, mean = mu, sd = sigma): NaNs produced
## $par
## [1] 76.0 6.3
##
## $value
## [1] 101
##
## $counts
## function gradient
## 87 NA
##
## $convergence
## [1] 0
##
## $message
## NULL
single_loglik <- function(pars) {
mu <- pars[1]
sigma <- exp(pars[2])
-sum(log(dnorm(trees$Height, mean = mu, sd = sigma)))
}
mle <- optim(c(1, 50), single_loglik)
mle
## $par
## [1] 76.0 1.8
##
## $value
## [1] 101
##
## $counts
## function gradient
## 113 NA
##
## $convergence
## [1] 0
##
## $message
## NULL
## [1] 6.3
## [1] 76
## [1] 6.4
## [1] 6.4