Monte Carlo simulations

The ASTA team

Basic principle

Monte Carlo (MC) estimation of the probability of event \(A\):

\[ P(A) \approx \frac{N_A}{N} \]

(Frequentist)

Many, many simulations

Letting the number of experiments approach infinity, \(\approx\) becomes \(=\):

\[ P(A) = \lim_{N \rightarrow \infty} \frac{N_A}{N} \]

Example 1

Recall (lecture 1.2):

(The horizontal axis is on a log scale).

(Pseudo) random number generation

sample(1:10, 2)
## [1] 9 2
sample(1:10, 2)
## [1] 10  1
set.seed(48)
sample(1:10, 2)
## [1] 5 3
set.seed(48)
sample(1:10, 2)
## [1] 5 3

Example 1 - cont’d

Let the computer toss the coin (MC simulation):

set.seed(48)
x2 <- sample(c(0, 1), 10000, replace = TRUE, prob = c(0.5, 0.5))
head(x2)
## [1] 0 1 1 1 1 1

Example 2

Recall (lection 2.1):

prob <- 1 - pdist("norm", mean = 0, sd = 1, q = 2.7, xlim = c(-4, 4)) 

prob
## [1] 0.0035

Use MC simulation:

set.seed(48)
N <- 10000
result <- rnorm(n = N, mean = 5, sd = 0.1/sqrt(100))
sprintf("%0.5f", head(result))
## [1] "5.00200" "4.97220" "4.99304" "5.02075" "5.00790" "5.00490"
hist(result)

sum(result >= 5.027)
## [1] 34
sum(result >= 5.027)/N
## [1] 0.0034
prob1 <- mean(result >= 5.027)
prob1
## [1] 0.0034
prob1
## [1] 0.0034
prob
## [1] 0.0035
prob1 - prob
## [1] -6.7e-05
(prob1 - prob)/prob
## [1] -0.019

More complicated simulations

set.seed(48)
x2 <- sample(c(0, 1), 10000, replace = TRUE, prob = c(0.5, 0.5))
mean(x2)
## [1] 0.49
set.seed(48)
x2 <- replicate(N, {
  sample(c(0, 1), 1, prob = c(0.5, 0.5))
})
mean(x2)
## [1] 0.49

Variance is important

Averages are not always sufficient. Think of budgets:

Example 3: Bivariate reliability problem

Limit state function:

Model

\[ g(B, C) = B - C \leq 0 \]

Assess

\[ P(g(B, C) < 0) \]

Simulations

set.seed(48)
N <- 100000
Bdata <- rnorm(N, mean = 100, sd = 10)
Cdata <- rnorm(N, mean = 50, sd = 10)
gdata <- Bdata - Cdata
sum(gdata < 0)
## [1] 21
prob1 <- mean(gdata < 0)
prob1
## [1] 0.00021

Results

plot(Bdata, Cdata)
abline(a = 0, b = 1, col = "red", lwd = 3)
points(Bdata[gdata < 0], Cdata[gdata < 0], col = "red")

Theoretical answer

For the particular case, we can exploit that the sum of two normal distributions is also a normal distribution, and due to independence the variance is the sum of the two variances:

g_mean <- 100 - 50
g_var <- 10^2 + 10^2
prob <- pdist("norm", q = 0, 
              mean = g_mean, 
              sd = sqrt(g_var), plot = FALSE)
prob
## [1] 2e-04
prob - prob1
## [1] -6.5e-06
(prob - prob1) / prob
## [1] -0.032

Sensitivity

How does \[ P(g(B, C) < 0) \] depend on the variance/standard deviations?

Illustrating \(g(B, C) < 0\)

Area where red is larger than blue illustrates \(P(g(B, C) < 0)\):

plotDist("norm", mean = 100, sd = 10, 
         xlim = c(10, 175), col = "blue")
plotDist("norm", mean =  50, sd = 10, 
         add = TRUE, col = "red")

plotDist("norm", mean = 100, sd = 50, 
         xlim = c(10, 175), col = "blue")
plotDist("norm", mean =  50, sd = 50, 
         add = TRUE, col = "red")

sapply, lapply, for loops

Using sapply (other possibilities: lapply, for loops):

set.seed(48)
N <- 100000
sigmas <- seq(5, 50, length.out = 50)
probs <- sapply(sigmas, function(sigma) {
  Bdata <- rnorm(N, mean = 100, sd = sigma)
  Cdata <- rnorm(N, mean = 50, sd = sigma)
  gdata <- Bdata - Cdata
  mean(gdata < 0)
})
plot(sigmas, probs, type = "l")
abline(h = prob1)
abline(v = 10)

Example 4: Wholesales offer

No patent

The expected sales for next year is $1-$9 mio. with $3 mio. being the most likely value.

Triangular distribution:

plotDist("tri", params = list(min = 1, max = 9, mode = 3))

With patent

Sales goes up 25%-75% with 50% percent being the most likely value

Triangular distribution:

plotDist("tri", params = list(min = 0.25, max = 0.75, mode = 0.5))

Simulating sales

set.seed(48)
N <- 10000
got_patent <- sample(x = c(0, 1), prob = c(0.6, 0.4), size = N, replace = TRUE)
table(got_patent)
## got_patent
##    0    1 
## 5935 4065
sales_base <- rtri(N, min = 1, max = 9, mode = 3)
patent_markup <- rtri(N, min = 0.25, max = 0.75, mode = 0.5)
sales <- sales_base + got_patent*patent_markup*sales_base
hist(sales)

sales_avg <- mean(sales)
sales_avg
## [1] 5.2
quantile(sales, probs = c(0.025, 0.975))
##  2.5% 97.5% 
##   1.8  10.9

Wholesale offer

Wholesale offer to buy your entire production for the next year for $6 mio.

Take the deal?

Many, many aspects:

Initially: How likely is it that we can sell the production for more ourselves?

prob <- mean(sales > 6)
prob
## [1] 0.33

Does not answer the question on its own, but it helps (e.g. in assessing the risk).