Overview
In this article, I'll build a local level model with an explanatory variable for time series data in Stan. Previously, I built a simple local level model in Stan. In practice, we frequently need models that include explanatory variables, so here I extend the simple local level model with one.
As a reference, I'm using the following book. This article deals with chapter 5 of the book.
Data
For this small experiment, I generated the following artificial data.
import numpy as np
# make data
np.random.seed(59)
x = np.random.normal(100, 5, 100)
beta = 10.0
data = []
u = []
u_start = 500.0
data.append(u_start + beta * x[0] + np.random.normal(0, 2))
u.append(u_start)
for i in range(100):
    if i == 0:
        u_temp = u_start
    else:
        next_u = u_temp + np.random.normal(0, 2)
        data.append(next_u + beta * x[i] + np.random.normal(0, 2))
        u.append(next_u)
        u_temp = next_u
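As an aside, the same kind of random-walk-plus-regression data can be generated in a vectorized way with numpy's cumsum. This is a sketch, not the code used above: the variable names mirror the loop version, but a different seed is used here, so the exact draws differ.

```python
import numpy as np

np.random.seed(0)  # any seed; the draws differ from the loop version above
n = 100
x = np.random.normal(100, 5, n)
beta = 10.0
# random walk state: start at 500 and accumulate N(0, 2) disturbances
u = 500.0 + np.concatenate([[0.0], np.random.normal(0, 2, n - 1)]).cumsum()
# observations: state + regression term + observation noise
data = u + beta * x + np.random.normal(0, 2, n)
```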
The variable ‘data’ is time series data driven by the explanatory variable x. Let's visualize it to see what it looks like.
import matplotlib.pyplot as plt
plt.plot(list(range(len(data))), data)
plt.show()
This time, the data has one explanatory variable. Plotting the explained variable against it shows their relation.
plt.scatter(x, data)
plt.show()
In a nutshell, this data is time series data with one explanatory variable.
Stan Modeling
Mathematically, the model can be expressed as follows:

y_t = u_t + beta * x_t + e_t,  e_t ~ Normal(0, s_y)
u_t = u_{t-1} + v_t,           v_t ~ Normal(0, s_u)

The observation y_t is the sum of the state u_t, the regression term beta * x_t, and observation noise; the state follows a random walk whose disturbances have mean zero.
data {
  int N;
  vector[N] X;
  vector[N] Y;
}
parameters {
  vector[N] u;
  real beta;
  real<lower=0> s_y;
  real<lower=0> s_u;
}
transformed parameters {
  vector[N] y_hat;
  y_hat = u + beta * X;
}
model {
  u[2:N] ~ normal(u[1:(N-1)], s_u);
  Y ~ normal(y_hat, s_y);
}
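To make the model block concrete, here is a numpy sketch of the (unnormalized) log density that Stan samples from. The helper `norm_logpdf` and the tiny input arrays are hypothetical, written only for illustration; they are not part of Stan's API.

```python
import numpy as np

def norm_logpdf(y, mu, sigma):
    # log density of Normal(mu, sigma), elementwise
    return -0.5 * np.log(2 * np.pi) - np.log(sigma) - 0.5 * ((y - mu) / sigma) ** 2

def log_posterior(u, beta, s_y, s_u, x, y):
    # state equation: u[t] ~ Normal(u[t-1], s_u)
    lp = norm_logpdf(u[1:], u[:-1], s_u).sum()
    # observation equation: Y[t] ~ Normal(u[t] + beta * x[t], s_y)
    lp += norm_logpdf(y, u + beta * x, s_y).sum()
    return lp

# hypothetical tiny example
x = np.array([1.0, 2.0, 3.0])
y = np.array([11.0, 12.0, 13.0])
lp = log_posterior(u=np.array([1.0, 2.0, 3.0]), beta=10.0,
                   s_y=1.0, s_u=1.0, x=x, y=y)
```

Stan explores the parameter space so that regions where this quantity is high are visited more often.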
Execute
In Python, we can run the sampling.
import pystan
data_feed = {'N': len(data), 'X': x, 'Y': data}
fit = pystan.stan(file='local_level_experiment.stan', data=data_feed, iter=1000)
Check the outcome
With the code below, we can extract the sampled points.
samples = fit.extract(permuted=True)
beta_mean = samples['beta'].mean()
u_mean = samples['u'].mean(axis=0)
data_pred = u_mean + beta_mean * x
By visualizing, we can check whether the fitted values trace the real data.
plt.plot(list(range(len(data))), data)
plt.plot(list(range(len(data))), data_pred)
plt.show()
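The posterior mean is only a point summary; a credible band is often more informative. Here is a sketch of computing an 80% band with np.percentile, demonstrated on a hypothetical array shaped like samples['u'] (draws × time points), since the real fit object is not reproduced here.

```python
import numpy as np

# stand-in for samples['u']: 2000 posterior draws of a length-100 state vector
rng = np.random.default_rng(1)
u_draws = rng.normal(500.0, 3.0, size=(2000, 100))

# 10th and 90th percentiles across draws, per time point
lower, upper = np.percentile(u_draws, [10, 90], axis=0)
u_mean = u_draws.mean(axis=0)
```

With the real output, one would pass samples['u'] instead of u_draws and plot the band with plt.fill_between.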
Prediction
In the article below, I showed how to predict future points by using the generated quantities block in Stan:
Time series analysis to predict future points on Stan
In the same manner, I'll predict the future points here. As a premise, we already have the explanatory variable's values for the future periods.
The model is as follows. In the data block, I added two items: the number of future points and their explanatory values. In the generated quantities block, I generate the future points using the sampled parameters.
data {
  int N;
  int pred_N;
  vector[N] X;
  vector[pred_N] X_future;
  vector[N] Y;
}
parameters {
  vector[N] u;
  real beta;
  real<lower=0> s_y;
  real<lower=0> s_u;
}
transformed parameters {
  vector[N] y_hat;
  y_hat = u + beta * X;
}
model {
  u[2:N] ~ normal(u[1:(N-1)], s_u);
  Y ~ normal(y_hat, s_y);
}
generated quantities {
  vector[N + pred_N] u_pred;
  vector[pred_N] y_pred;
  u_pred[1:N] = u;
  for (i in 1:pred_N) {
    u_pred[N+i] = normal_rng(u_pred[N+i-1], s_u);
    y_pred[i] = normal_rng(u_pred[N+i] + beta * X_future[i], s_y);
  }
}
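The generated quantities block is just a forward simulation of the random walk plus regression term. A numpy sketch of the same recursion for a single posterior draw follows; the values of u_last, beta, s_u, s_y, and x_future are hypothetical stand-ins, not sampled parameters.

```python
import numpy as np

def simulate_future(u_last, beta, s_u, s_y, x_future, rng):
    # walk the state forward and draw an observation at each step,
    # mirroring the loop in the generated quantities block
    y_pred = []
    u = u_last
    for xf in x_future:
        u = rng.normal(u, s_u)                        # u_pred[N+i] ~ Normal(u_pred[N+i-1], s_u)
        y_pred.append(rng.normal(u + beta * xf, s_y)) # observation draw
    return np.array(y_pred)

# hypothetical parameter values for one draw
rng = np.random.default_rng(7)
x_future = rng.normal(100, 5, 50)
y_pred = simulate_future(u_last=500.0, beta=10.0, s_u=2.0, s_y=2.0,
                         x_future=x_future, rng=rng)
```

Running this once per posterior draw and averaging reproduces what Stan returns in y_pred.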
In Python, we can now run the sampling. Before that, I made the future explanatory data.
# future explanatory variable
np.random.seed(32)
x_future = np.random.normal(100, 5, 50)
data_feed = {'N': len(data), 'pred_N': len(x_future), 'X': x, 'X_future': x_future, 'Y': data}
fit_generated = pystan.stan(file='local_level_experiment_generated.stan', data=data_feed, iter=1000)
By visualizing, we can check how it works.
samples_generated = fit_generated.extract(permuted=True)
y_pred = samples_generated['y_pred']
pred_time = list(range(len(data) + len(x_future)))
all_data = list(data_pred) + list(y_pred.mean(axis=0))
plt.plot(list(range(len(data))), data)
plt.plot(pred_time, all_data)
plt.show()