Stochastic processes I

The ASTA team

Concepts, terminology and examples

AP <- AirPassengers
plot(AP, ylab = "Bookings (1000s)")

APyear <- aggregate(AP, FUN = mean)
plot(APyear)

Such a systematic change in a time series that is not periodic is known as the trend of the series.

cyc <- cycle(AP)
cyc <- factor(cyc, labels = month.abb)
boxplot(AP ~ cyc)

APJun55Dec56 <- window(AP, start = c(1955, 6), end = c(1956, 12))
APJun55Dec56
##      Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
## 1955                     315 364 347 312 274 237 278
## 1956 284 277 317 313 318 374 413 405 355 306 271 306
APAug <- window(AP, start = c(1949, 8), end = end(AP), freq = TRUE)

Stochastic trend

www <- "https://asta.math.aau.dk/eng/static/datasets?file=pounds_nz.dat"
exchange_data <- read.table(www, header = TRUE)
exchange <- ts(exchange_data, start = 1991, freq = 4)
plot(exchange)

Multiple series

CBEdata <- read.table("https://asta.math.aau.dk/eng/static/datasets?file=cbe.dat",
                      header = TRUE)
CBE <- ts(CBEdata, start = 1958, freq = 12)
plot(CBE)

APCBE <- ts.intersect(AP, CBE)
plot(APCBE)

cor(APCBE)
##                  AP   CBE.choc   CBE.beer   CBE.elec
## AP        1.0000000  0.6375808 -0.4434747  0.8841668
## CBE.choc  0.6375808  1.0000000 -0.6204467  0.7644756
## CBE.beer -0.4434747 -0.6204467  1.0000000 -0.3120179
## CBE.elec  0.8841668  0.7644756 -0.3120179  1.0000000

However, this does not imply that US air passengers really influence the electricity consumption in Australia (or vice versa)! Time series with similar trends and seasonality will typically show strong correlation even though they are unrelated. Better explanations may be due to similar population growth, seasonal patterns, etc.

Decomposition

Decomposition - trend term

choc <- CBE[, "choc"]
i <- 7
(choc[i-6]/2 + sum(choc[(i-5):(i+5)]) + choc[i+6]/2)/12
## [1] 2413.625

This computation can be done for all time points (except the first six and last six) using decompose, which returns a list with a component trend (and other things to be discussed shortly):

choc_decomp <- decompose(choc)
choc_trend <- choc_decomp$trend
window(choc_trend, end = c(1958, 8))
##           Jan      Feb      Mar      Apr      May      Jun      Jul
## 1958       NA       NA       NA       NA       NA       NA 2413.625
##           Aug
## 1958 2409.500

Decomposition - seasonal term

choc_no_trend <- choc - choc_trend
plot(choc_no_trend)

choc_jan <- mean(window(choc_no_trend, st = c(1958, 1), freq = TRUE), na.rm = TRUE)
choc_jan
## [1] -2450.538
choc_may <- mean(window(choc_no_trend, st = c(1958, 5), freq = TRUE), na.rm = TRUE)
choc_may
## [1] 1171.215

It appears that chocolate consumption typically is much higher in May than in January. The season effects are also calculated by decompose and stored in the list as seasonal:

plot(choc_decomp$seasonal)

Decomposition - random term

plot(choc_decomp)

Stationarity and autocorrelation

choc_rand <- na.omit(choc_decomp$random)
plot(choc_rand)

This is a single realization of something we may think of as a random experiment. If things had been different (different weather, different chocolate advertisements, different economic fluctuations, …) we could imagine another series.

This hypothetical scenario applies to any time series \(x_t\) where we imagine our dataset is somehow randomly selected among an entire ensemble of possible time series.

i <- 1:(length(choc)-1)
plot(choc[i], choc[i+1])
abline(lsfit(choc[i], choc[i+1]))

cor(choc[i], choc[i+1])
## [1] 0.8025919
i <- 1:(length(choc_rand)-1)
plot(choc_rand[i], choc_rand[i+1])
abline(lsfit(choc_rand[i], choc_rand[i+1]))

cor(choc_rand[i], choc_rand[i+1])
## [1] -0.02278652

Correlogram (empirical acf)

choc_acf <- acf(choc)

This is called the correlogram.

acf(choc_rand)

Typical patterns in correlograms

It can be illustrative to consider a few specific examples of correlograms:

x <- rnorm(100)
par(mfrow = c(2,1), mar = c(4,4,0.5,0.5))
ts.plot(x)
acf(x, main = "", lag.max = 30)

x <- 1:100
par(mfrow = c(2,1), mar = c(4,4,0.5,0.5))
ts.plot(x)
acf(x, main = "", lag.max = 30)

x <- sin((1:100)*2*pi/10)
par(mfrow = c(2,1), mar = c(4,4,0.5,0.5))
ts.plot(x)
acf(x, main = "", lag.max = 100)

x0 <- c(3,5,2,1,4)
x <- rep(x0, 20)
par(mfrow = c(2,1), mar = c(4,4,0.5,0.5))
ts.plot(x)
acf(x, main = "", lag.max = 30)

Partial autocorrelation function

x <- 1:100
par(mfrow = c(2,1), mar = c(4,4,0.5,0.5))
ts.plot(x)
pacf(x, main = "")

Models for time series data

There are two fundamentally different model classes for time series data.

Next lecture will focus on discrete time models and in the remainder we will touch upon continuous time models.

Continuous time stochastic processes

In this setup we see the underlying \(X_t\) as a continuous function of \(t\) for \(t\) in some interval \([0,T]\).

In principle we imagine that there are infinitely many data points, simply because there are infinitely many time points between 0 and \(T\). Of course this is never true: In practice we will always only have finitely many data points.

But it makes sense to believe that the real data actually contains all the data points. We are just not able to measure them (and to store them in a computer).

With a model for all datapoints, we are - through simulation - able to describe the behaviour of data. Also between the observations.

Wiener process

A key example of a process in continuous time will be the so–called Wiener process.

Three simulated realizations (black, blue and red) of this process can be seen here

## Warning in rgl.init(initValue, onlyNULL): RGL: unable to open X11 display
## Warning: 'rgl_init' failed, running with rgl.useNULL = TRUE
## Warning: .onUnload failed in unloadNamespace() for 'rgl', details:
##   call: fun(...)
##   error: object 'rgl_quit' not found
## Package 'Sim.DiffProc', version 4.0
## browseVignettes('Sim.DiffProc') for more informations.

A Wiener process has the following properties:

The intuition of this process is that it somehow changes direction all the time: How the process changes after time \(s\) will be independent of what has happened before time \(s\). So whether the process should increase or decrease after \(s\) will not be affected by how much it was increasing or decreasing before.

This gives the very bumpy behaviour over time.

Stochastic differential equations

A common way to define a continuous time stochastic process model is through a stochastic differential equation (SDE) which we will turn to shortly, but before doing so we will recall some basic things about ordinary differential equations.

Suppose \(f\) is an unknown differentiable function and \(f(0)=1\). Recall the mathematical description of a differential equation

\[\frac{df(t)}{dt}=-4f(t)\]

With the condition \(f(0)=1\) this equation has the solution

\[f(t)=\exp(-4t)\]

With a slightly unusual notation we can rewrite this as \[df(t)=-4\cdot f(t)dt\] This equation has the following (hopefully intuitive) interpretation:

So when \(t\) is increased, then \(f\) is decreased. And the decrease is determined by the value of \(f(t)\). That is why \(f\) decreases slower and slower, when \(t\) is increased.

We say that the function has a drift towards zero, and this drift is determined by the value of the function.

Stochastic differential equations

It will probably never be true that data behaves exactly like the exponentially decreasing curve on the previous slide.

Instead we will consider a model, where some random noise from a Wiener process has been added. Two different (black/blue) simulated realizations can be seen below

The type of process that is simulated above is described formally by the equation \[d X_t=-4 X_tdt+0.1d W_t\] This is called a Stochastic Differential Equation (SDE), and the processes simulated above are called solutions of the stochastic differential equation.

The SDE \(d X_t=-4 X_tdt+0.1d W_t\) has two terms:

The intuition behind this notation is very similar to the intuition in the equation \(df(t)=-4\cdot f(t)\;dt\) for an ordinary differential equation. When the time is increased by the small amount \(dt\), then the process \(X_t\) is increased by \(-4X_t\,dt\) AND by how much the process \(0.1W_t\) has increased on the time interval \([t,t+dt]\).

So this process has a drift towards zero, but it is also pushed in a random direction (either up or down) by the Wiener process (more precisely, the process \(0.1W_t\))