---
title: "Stochastic processes III"
author: "The ASTA team"
output:
  slidy_presentation:
    fig_caption: no
    highlight: tango
  pdf_document:
    fig_caption: no
    highlight: tango
    keep_tex: yes
    number_section: yes
    toc: yes
---

# Continuous time stochastic processes

## Data example

In this lecture we will study a type of data that on the surface looks
like data from time series analysis as presented in the previous
lectures. As an illustration, we shall study the data set `Irates`, that
contains information about interest rates in the U.S. between 1946 and
1991.

We are interested in the relation between the two variables:

-   `t=time`: The time point of the measurement.

-   `X_t=rate`: The interest rate at the corresponding time point.

```{r echo=FALSE,message=FALSE,warning=FALSE}
library(Sim.DiffProc)
data(Irates)
plot(Irates[,'r1'],ylab='rates')
```

The plot should be understood as follows: For each time point (between
1946 and 1991) there is a value of the interest rate called $X_t$. So
$X_t$ could be seen as a function of $t$, and this function is plotted.

In principle we imagine that there are infinitely many data points,
simply because there are infinitely many time points between 1946 and
1991. Of course this is never true: In practice we will always only have
finitely many data points.

But it makes sense to believe that the real data actually contains all
the data points. We are just not able to measure them (and to store them
in a computer).

With a model for all datapoints, we are - through simulation - able to
describe the behaviour of data. Also between the observations.

# Wiener process

A key example of a process in continuous time will be the so–called
Wiener process.

Three simulated realizations (black, blue and red) of this process can be seen
here

```{r echo=FALSE}
set.seed(123)
f<- expression(0)    ## the drift term as a function of x
g<-expression(1)      ## the diffusion term as a function of x
res <- snssde1d(drift=f,diffusion=g,M=3,N=1000,t0=0,T=10,x0=0)
plot(res,plot.type='single',col=c('black','blue', 'red'))
```

A Wiener process has the following properties:

-   It starts in 0: $W_0=0$.

-   It has independent increments: For $0<s<t$ it holds that $W_t-W_s$
    is independent of everything that has happened up to time $s$, that
    is $W_u$ for all $u\leq s$.

-   It has normally distributed increments: For $0<s<t$ it holds that the increment $W_t-W_s$ is normally distributed with variance $t-s$: $$W_t-W_s\sim N(\mu=0,\sigma^2=t-s).$$

The intuition of this process is that it somehow changes direction all
the time: How the process changes after time $s$ will be independent of
what has happened before time $s$. So whether the process should
increase or decrease after $s$ will not be affected by how much it was
increasing or decreasing before.

This gives the very bumpy behaviour over time.

<!-- Stochastic process with drift and variance -->
<!-- ================== -->

<!-- If we start out with a Wiener process $W_t$ and two parameters $a$ and -->
<!-- $\sigma$, where $\sigma\geq 0$, we can define a new process $X_t$ by -->

<!-- $$X_t=at+\sigma W_t$$ -->

<!-- When $X_t$ is defined like this, we say that $X_t$ is a continuous time -->
<!-- stochastic process with **drift** $a$ and **variance** $\sigma^2$ -->
<!-- (that we use $\sigma^2$ instead of $\sigma$ is explained further below). -->

<!-- Now we simulate three Brownian motions to illustrate the impact of the -->
<!-- drift parameter $a$. -->

<!-- In the plot three Brownian motions are simulated. All have $\sigma=1$. -->
<!-- Furthermore: -->

<!-- -   The black has $a=0$ (so this is in fact a Wiener process. It may -->
<!--     look differently because of the changed scale on the $y$–axis) -->

<!-- -   The blue has $a=2$. Here the drift is positive, so the process has -->
<!--     an increasing tendency. -->

<!-- -   The red has $a=-4$. Here the drift is negative, so the process has a -->
<!--     decreasing tendency. -->

<!-- Now we shall try to illustrate the impact of the variance parameter. -->

<!-- All of these three Brownian motions have $a=0$. Furthermore: -->

<!-- -   The black has $\sigma^2=1$ (so this is again a Wiener process) -->

<!-- -   The blue has $\sigma^2=2$. Here higher variance makes this process -->
<!--     vary more. -->

<!-- -   The red has $\sigma^2a=6$. This process varies even more than the -->
<!--     blue process. -->

# Differential equation models

## Ordinary differential equations

A common way to define a continuous time stochastic process model is through a stochastic differential
equation (SDE) which we will turn to shortly, but before doing so we will recall some
basic things about ordinary differential equations.

Suppose $f$ is a differentiable function. Recall the mathematical
description of a differential equation

$$\frac{df(t)}{dt}=-4f(t)$$

This equation has the solution

$$f(t)=C\cdot \exp(-4t)$$
for any constant $C$. If we furthermore know
that $f(0)=1$, then $C=1$ and

$$f(t)=\exp(-4t)$$

The solution can be seen below

```{r echo=FALSE}
t <- seq(0,10,by=0.1)
plot(t, exp(-4*t),xlab='t',ylab='f(t)',type='l',lwd=2)
```

With a slightly unusual notation we can
rewrite this as $$df(t)=-4\cdot f(t)dt$$
This equation has the
following (hopefully intuitive) interpretation:

-   We imagine that we increase the time point from $t$ to $t+dt$, where
    $dt$ is something small. So the time is increased by $dt$.

-   Then the value of $f$ is (approximately) changed from $f(t)$ to
    $f(t)-4f(t)dt$. So actually the value of $f$ is decreased by
    $4f(t)\:dt$

So when $t$ is increased, then $f(t)$ is decreased. And the decrease is
determined by the value of $f(t)$. That is why $f$ decreases slower and
slower, when $t$ is increased.

We say that the function has a **drift** towards zero, and this drift is
determined by the value of the function.

----

## Stochastic differential equations

It will probably never be true that data behaves exactly like the
exponentially decreasing curve on the previous slide.

Instead we will consider a model, where some random noise from a
Wiener process has been added. Two different (black/blue) simulated
realizations can be seen below

```{r echo=FALSE}
f <- expression(-4*x)    ## the drift term as a function of x
g <-expression(0.1)      ## the diffusion term as a function of x
res <- snssde1d(drift=f,diffusion=g,M=2,N=1000,t0=0,T=1,x0=1)
plot(res, plot.type='single', col=c('black','blue'))
```

The type of process that is simulated above is most often described
formally by the equation
$$d X_t=-4 X_tdt+0.1d W_t$$
This is called a
**Stochastic Differential Equation** (SDE), and the processes simulated
above are called solutions of the stochastic differential equation.

----

The SDE $d X_t=-4 X_tdt+0.1d W_t$ has two terms:

-   $-4X_t dt$ is the **drift term**.

-   $0.1dW_t$ is the **diffusion term**.

The intuition behind this notation is very similar to the intuition in
the equation $df(t)=-4\cdot f(t)\;dt$ for an ordinary differential
equation. When the time is increased by the small amount $dt$, then the
process $X_t$ is increased by $-4X_t\,dt$ AND by how much the process
$0.1W_t$ has increased on the time interval $[t,t+dt]$.

So this process has a **drift** towards zero, but it is also pushed in a
random direction (either up or down) by the Wiener process (more
precisely, the process $0.1W_t$)

<!-- We can see the Brownian motion as a special case of a stochastic -->
<!-- differential equation. The Brownian motion with drift $3$ and variance -->
<!-- $2$ simply has the equation -->

<!-- $$d X_t=3dt+2dW_t$$ Or in the other words: The drift term is $3dt$ and -->
<!-- the diffusion term is $2dW_t$. -->

----

### Simulation examples

Firstly we simulate the SDE from before
$$d X_t=-4 X_tdt+0.1d W_t$$
For
this, we need the package `Sim.DiffProc`. We use
the function `snssde1` for which we have to specify the drift term and
the diffusion term as R-functions of $x$. In this case, the function for
the diffusion term is constantly equal to $0.1$.

The parameters needed in the function input are:

-   `drift` is the function determining the drift term.

-   `diffusion` is the function determining the diffusion term.

-   `M` is the desired number of realizations of the process.

-   `N` is the number of simulation steps (`R` does not simulate a
    continuous curve but a lot of connected dots, and this is the number
    of dots).

-   `t0` is the initial time of the simulated process.

-   `T` is the ending time of the simulated process.

-   `x0` is the initial value of the process (the value of $X$ at
    time `t0`).
    
```{r}
library(Sim.DiffProc)
f <- expression(-4*x)    ## the drift term as a function of x
g <- expression(0.1)      ## the diffusion term as a function of x
res <- snssde1d(drift=f, diffusion=g, M=2, N=1000, t0=0, T=10, x0=1)
plot(res, plot.type = 'single', col = c('black','blue'))
```

With the ending time being larger than before, we see that the process
stabilizes around 0: There is a drift towards 0, but also some noise
pushing the process away from 0.

----

Increasing the diffusion parameter from $0.1$ to $1$ (i.e. $dX_t = -4X_tdt + 1 dW_t$) makes the process more varying: 
```{r}
f <- expression(-4*x)
g <-expression(1)
res <- snssde1d(drift=f, diffusion=g, M=2, N=1000, t0=0, T=10, x0=1)
plot(res, plot.type='single', col=c('black','blue'))
```

----

Making the drift parameter positive (e.g. $dX_t = 0.2X_tdt + 0.1 dW_t$) drives the process away from 0:

```{r}
f <- expression(0.2*x)
g <-expression(0.1)
res <- snssde1d(drift=f, diffusion=g, M=2, N=1000, t0=0, T=10, x0=0.2)
plot(res,plot.type='single',col=c('black','blue'))
```

----

Sometimes the noise depends on the value of the process itself. In this case the
diffusion term includes $X_t$. E.g. $dX_t = 0.2X_t dt + 0.1 \sqrt{X_t} dW_t$:

```{r}
f <- expression(0.2*x)
g <- expression(0.1*sqrt(x))
res <- snssde1d(drift=f,diffusion=g,M=2,N=1000,t0=0,T=10,x0=0.2)
plot(res,plot.type='single',col=c('black','blue'))
```

----

## Fitting SDE models to data

When we want to fit a model to data, we will work with the following more general stochastic
differential equation
$$dX_t = (\theta_1+\theta_2 X_t)dt + \theta_3X_t^{\theta_4}dW_t$$
where
$\theta_1,\theta_2,\theta_3,\theta_4$ are parameters, such that
$\theta_3,\theta_4\geq 0$. For a given dataset the goal is then to find (estimate)
parameter values such that the model describes the data as well as possible.

We note that:

-   The SDE above given by $d X_t=-4 X_tdt+0.1d W_t$ is the special case
    with $\theta_1=0$, $\theta_2=-4$, $\theta_3=0.1$ and $\theta_4=0$
    (recall the mathematical convention that $x^0=1$).

-   The Wiener process $X_t = W_t$ is the special case with $\theta_1=0$, $\theta_2=0$, $\theta_3=1$ and $\theta_4=0$.
    
-   The ordinary differential equation is the special case with
    $\theta_1=0$, $\theta_2=-4$, $\theta_3=0$ and $\theta_4=0$ (in principle,
    $\theta_4$ could be anything, when $\theta_3=0$).

-   The parameters $\theta_1$ and $\theta_2$ control the drift, and in fact it
    can be shown that the process will drift towards $-\frac{\theta_1}{\theta_2}$
    if this is positive, and otherwise it will drift away from this. For example
    if $dX_t = (2-4X_t) dt + 0.1 dW_t$ then $X_t$ will drift towards $-\frac{2}{-4}=0.5$:

```{r}
f <- expression(2-4*x)
g <- expression(0.1)
res <- snssde1d(drift=f, diffusion=g, M=2, N=1000, t0=0, T=10, x0=1)
plot(res, plot.type='single', col=c('black','blue'))
abline(h = 0.5, col = "red")
```

----

### Fitting an SDE model to interest rate data

Recall the data for U.S. interest rates between 1946 and 1991.

This could look like a stochastic differential equation with
$\theta_1=0$ and $\theta_4$ being positive, since the process varies
more, when it has high values. We can use the function `fitsde` to find
the best choice of parameters. Note that in the function we have to give
a (good) guess on the parameters (here we use $\theta_1=0$,
$\theta_2=0$, $\theta_3=0.5$ and $\theta_4=0.5$ )

```{r}
data(Irates)
rate <- Irates[ , "r1"]
fx <- expression(theta[1]+theta[2]*x)
## the drift term as a function of x
gx <- expression(theta[3]*x^theta[4])
## the diffusion term as a function of x
fitmod <- fitsde(rate, drift = fx, diffusion = gx,
                 start = list(theta1=0, theta2=0, theta3=0.5, theta4=0.5))
summary(fitmod)
```

Thus, the estimated model is
$$dX_t = (0.89 - 0.16 X_t) dt + 0.71X_t^{0.59} dW_t.$$

We can have the confidence intervals by
```{r}
confint(fitmod)
```

To draw random realizations from the fitted model we just have to extract the fitted
parameters and then use `snssde1` as before:
```{r}
theta <- coef(fitmod)
t <- time(Irates)
s <- snssde1d(drift=fx, diffusion = gx, M = 3, t0 = min(t), T = max(t), x0 = 0.325)
```

A plot of three realizations overlayed the original data:
```{r}
plot(s, plot.type = "single", col = c("red", "blue", "cyan"))
lines(Irates[,'r1'], col = "black", lwd = 2)
```

----

## Comparing fitted SDE models

If we believe that the data can be better (or equally well) described by another
model we can compare the model using the AIC as we did previously for discrete processes.

If we propose a model with no drift (i.e. $\theta_1=0$ and $\theta_2=0$) we get the following
fitted model:

```{r}
f2 <- expression(0)
## No drift term
g2 <- expression(theta[1]*x^theta[2])
## the diffusion term as a function of x
fitmod2 <- fitsde(rate, drift = f2, diffusion = g2,
                  start = list(theta1=0.5, theta2=0.5))
summary(fitmod2)
```

The AIC of the original model with four parameters is lower than the AIC of this new
model so we prefer the original model:
```{r}
AIC(fitmod)
AIC(fitmod2)
```
