The ASTA team
alpha = 0.9; gamma = 1; n = 100                   # AR coefficient, regression slope, sample size
x = as.ts(5*sin(1:n/5))                           # deterministic sinusoidal covariate
eps = arima.sim(model=list(ar=alpha), n=n)        # AR(1) noise
y = gamma*x + eps                                 # response: gamma*x plus AR(1) errors
ts.plot(x, y, col=1:2)                            # x in black, y in red
- We should think of the red curve as data we want to model, and the black curve as another variable which we believe may influence the data.
- We can also plot \(x_t\) against \(y_t\) to get a view of the relation between the two variables rather than the time evolution.
plot(as.numeric(x), as.numeric(y))                # scatter plot of x_t against y_t
mod = arima(y, order=c(1,0,0), xreg=x); mod       # AR(1) errors with x as external regressor
##
## Call:
## arima(x = y, order = c(1, 0, 0), xreg = x)
##
## Coefficients:
## ar1 intercept x
## 0.8168 -0.2677 1.0179
## s.e. 0.0597 0.5480 0.1121
##
## sigma^2 estimated as 1.063: log likelihood = -145.49, aic = 298.98
plot(resid(mod))                                  # residuals should resemble white noise
acf(resid(mod))                                   # check: no significant autocorrelation left
nnew = 20                                         # forecast horizon
xnew = lag(as.ts(5*sin(((n+1):(n+nnew))/5)), -n)  # future covariate values at times n+1,...,n+nnew
ts.plot(x, y, xnew, col=c(1,2,1), lty=c(1,1,2))   # dashed black: future covariate
- To forecast future values of the series, we use the predict function.
p = predict(mod, n.ahead=nnew, newxreg=xnew)      # forecasting requires the future regressor values
ts.plot(x, y, xnew, p$pred, p$pred+2*p$se, p$pred-2*p$se,
        col=c(1,2,1,2,2,2), lty=c(1,1,2,2,3,3))   # forecast (dashed) with 95% prediction limits (dotted)
alpha = 0.5; gamma = 1; n = 100; delay = 5        # y now depends on x with a delay of 5 time steps
x = as.ts(5*sin(1:(n+delay)/5))
eps = arima.sim(model=list(ar=alpha), n=n+delay)
y = gamma*lag(x,-delay) + eps                     # y_t = gamma*x_{t-delay} + AR(1) noise
dat = ts.intersect(x, y)                          # restrict both series to common time points
ts.plot(dat[,1], dat[,2], col=1:2)
cc = ccf(dat[,1], dat[,2], lag.max=10)            # cross-correlation function between x and y
estlag = cc$lag[which(cc$acf==max(cc$acf))]; estlag   # lag with maximal cross-correlation
## [1] -5
dat2 = ts.intersect(as.ts(dat[,1]), lag(dat[,2], k=-estlag))  # shift y by the estimated lag to align with x
ts.plot(dat2[,1], dat2[,2], col=1:2)
mod = arima(dat2[,2], order=c(1,0,0), xreg=dat2[,1]); mod     # refit with the aligned covariate
##
## Call:
## arima(x = dat2[, 2], order = c(1, 0, 0), xreg = dat2[, 1])
##
## Coefficients:
## ar1 intercept dat2[, 1]
## 0.4150 -0.2019 1.0210
## s.e. 0.0939 0.1797 0.0493
##
## sigma^2 estimated as 1.064: log likelihood = -137.86, aic = 283.72
There are two fundamentally different model classes for time series data:
- Discrete time stochastic processes
- Continuous time stochastic processes
So far we have only looked at the discrete time case. We will finish today's lecture by looking a bit at the continuous time case, just to give you an idea of this topic.
In this setup we see the underlying \(X_t\) as a continuous function of \(t\) for \(t\) in some interval \([0,T]\).
In principle we imagine that there are infinitely many data points, simply because there are infinitely many time points between 0 and \(T\). Of course this is never true: in practice we always have only finitely many data points.
But it makes sense to believe that the underlying phenomenon actually contains all the data points; we are just not able to measure them (and to store them in a computer).
With a model for all time points we are able, through simulation, to describe the behaviour of the data also between the observations.
A key example of a process in continuous time is the so-called Wiener process.
Three simulated realizations (black, blue and red) of this process can be seen here:
A Wiener process has the following properties:
- It starts at 0: \(W_0=0\).
- It has independent increments: for \(0<s<t\) it holds that \(W_t-W_s\) is independent of everything that has happened up to time \(s\), that is, of \(W_u\) for all \(u\leq s\).
- It has normally distributed increments: for \(0<s<t\) the increment \(W_t-W_s\) is normally distributed with mean 0 and variance \(t-s\): \[W_t-W_s \sim \text{Normal}(\mu=0,\sigma^2=t-s).\]
The intuition behind this process is that it changes direction all the time: how the process changes after time \(s\) is independent of what happened before time \(s\). So whether the process increases or decreases after \(s\) is not affected by how much it was increasing or decreasing before.
This gives the very bumpy behaviour over time.
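These properties translate directly into a simulation recipe. Below is a minimal base-R sketch (our own addition, not from the original slides; the grid size and seed are illustrative choices) that builds a Wiener process on \([0,1]\) by cumulating independent \(\text{Normal}(0,dt)\) increments:
set.seed(1)
n = 1000                                  # number of increments
dt = 1/n                                  # step size on [0, 1]
W = c(0, cumsum(rnorm(n, mean=0, sd=sqrt(dt))))   # W_0 = 0; increments ~ Normal(0, dt), independent of the past
plot(seq(0, 1, by=dt), W, type="l", xlab="t", ylab="W")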
A common way to define a continuous time stochastic process model is through a stochastic differential equation (SDE) which we will turn to shortly, but before doing so we will recall some basic things about ordinary differential equations.
Suppose \(f\) is an unknown differentiable function and \(f(0)=1\). Recall the mathematical description of a differential equation
\[\frac{df(t)}{dt}=-4f(t)\]
With the condition \(f(0)=1\) this equation has the solution
\[f(t)=\exp(-4t),\] since \(\frac{d}{dt}\exp(-4t)=-4\exp(-4t)\) and \(\exp(0)=1\).
With a slightly unusual notation we can rewrite this as \[df(t)=-4\cdot f(t)\,dt.\] This equation has the following (hopefully intuitive) interpretation: when the time \(t\) is increased by the small amount \(dt\), the value of \(f\) is changed by \(-4\cdot f(t)\,dt\).
So when \(t\) is increased, \(f\) is decreased, and the size of the decrease is determined by the current value of \(f(t)\). That is why \(f\) decreases more and more slowly as \(t\) increases.
We say that the function has a drift towards zero, and this drift is determined by the value of the function.
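As a small illustration of this interpretation, we can step the equation \(df(t)=-4\cdot f(t)\,dt\) forward numerically and compare with the exact solution \(\exp(-4t)\). This sketch is our own addition; the step size is an arbitrary illustrative choice.
dt = 0.001
t = seq(0, 2, by=dt)
f = numeric(length(t)); f[1] = 1          # initial condition f(0) = 1
for (i in 2:length(t)) f[i] = f[i-1] - 4*f[i-1]*dt   # df = -4 f(t) dt
plot(t, f, type="l")                      # numerical solution
lines(t, exp(-4*t), col=2, lty=2)         # exact solution exp(-4t) in red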
It will probably never be true that data behaves exactly like the exponentially decreasing curve on the previous slide.
Instead we will consider a model where random noise from a Wiener process has been added. Two different simulated realizations (black/blue) can be seen below.
The type of process simulated above is described formally by the equation \[dX_t=-4X_t\,dt+0.1\,dW_t\] This is called a Stochastic Differential Equation (SDE), and the processes simulated above are called solutions of the stochastic differential equation.
The SDE \(dX_t=-4X_t\,dt+0.1\,dW_t\) has two terms:
- \(-4X_t\,dt\) is the drift term.
- \(0.1\,dW_t\) is the diffusion term.
The intuition behind this notation is very similar to the intuition in the equation \(df(t)=-4\cdot f(t)\;dt\) for an ordinary differential equation. When the time is increased by the small amount \(dt\), then the process \(X_t\) is increased by \(-4X_t\,dt\) AND by how much the process \(0.1W_t\) has increased on the time interval \([t,t+dt]\).
So this process has a drift towards zero, but it is also pushed in a random direction (either up or down) by the Wiener process (more precisely, by the process \(0.1W_t\)).
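The same stepping idea carries over to the SDE: in each step we add the drift \(-4X_t\,dt\) and a fresh \(\text{Normal}(0,dt)\) Wiener increment scaled by 0.1. The following Euler-type sketch is our own illustration (starting value, step size and seed are arbitrary choices); packages such as Sim.DiffProc provide ready-made simulators.
set.seed(2)
dt = 0.001
t = seq(0, 2, by=dt)
x = numeric(length(t)); x[1] = 1          # illustrative starting value X_0 = 1
for (i in 2:length(t)) {
  dW = rnorm(1, mean=0, sd=sqrt(dt))      # Wiener increment over [t, t+dt]
  x[i] = x[i-1] - 4*x[i-1]*dt + 0.1*dW    # drift term + diffusion term
}
plot(t, x, type="l")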