Vector autoregressions
Proud Economy 0007
Looking at “Forecasting: Principles and Practice”, an online textbook by Rob J Hyndman and George Athanasopoulos, I found an example similar to what I was thinking about in a couple of the prior posts,
using the vars R package. It uses multiple time series in a single forecast.
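As a rough illustration of what using multiple time series in one forecast means, here is a base R sketch (not taken from the textbook) that simulates a two-variable VAR(1) and recovers its coefficient matrix by regressing each series on the lagged values of both. The matrix A, the series names a and b, and the sample size are all made up for the example.

```r
# Simulate a two-variable VAR(1): y_t = A %*% y_{t-1} + e_t
# (A, the names "a"/"b" and n are hypothetical, chosen only for illustration)
set.seed(1)
n <- 500
A <- matrix(c(0.5, 0.1,
              0.2, 0.3), nrow = 2, byrow = TRUE)
y <- matrix(0, nrow = n, ncol = 2, dimnames = list(NULL, c("a", "b")))
for (t in 2:n) {
  y[t, ] <- A %*% y[t - 1, ] + rnorm(2)
}

# Each equation of a VAR is an ordinary regression on the lags of every series
lagged  <- y[-n, ]   # observations 1 .. n-1
current <- y[-1, ]   # observations 2 .. n
fit_a <- lm(current[, "a"] ~ lagged)
fit_b <- lm(current[, "b"] ~ lagged)

# Stack the slope estimates (dropping the intercepts) to compare with A
A_hat <- rbind(coef(fit_a)[-1], coef(fit_b)[-1])
print(round(A_hat, 2))
```

The VAR() function from vars used below does essentially this equation-by-equation least squares fit, with more lags and a constant term.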
Using these libraries:
library(fpp)
library(vars)
Their section 9/2, “Vector autoregressions”, contains R code like this:
VARselect(usconsumption, lag.max = 8, type = "const")$selection
## AIC(n) HQ(n) SC(n) FPE(n)
## 5 1 1 5
var <- VAR(usconsumption, p = 3, type = "const")
serial.test(var, lags.pt = 10, type = "PT.asymptotic")
##
## Portmanteau Test (asymptotic)
##
## data: Residuals of VAR object var
## Chi-squared = 33.38, df = 28, p-value = 0.2219
## $serial
##
## Portmanteau Test (asymptotic)
##
## data: Residuals of VAR object var
## Chi-squared = 33.38, df = 28, p-value = 0.2219
summary(var)
##
## VAR Estimation Results:
## =========================
## Endogenous variables: consumption, income
## Deterministic variables: const
## Sample size: 161
## Log Likelihood: -338.797
## Roots of the characteristic polynomial:
## 0.767 0.553 0.524 0.524 0.318 0.318
## Call:
## VAR(y = usconsumption, p = 3, type = "const")
##
##
## Estimation results for equation consumption:
## ============================================
## consumption = consumption.l1 + income.l1 + consumption.l2 + income.l2 + consumption.l3 + income.l3 + const
##
## Estimate Std. Error t value Pr(>|t|)
## consumption.l1 0.2228 0.0858 2.60 0.0103 *
## income.l1 0.0404 0.0623 0.65 0.5180
## consumption.l2 0.2014 0.0900 2.24 0.0267 *
## income.l2 -0.0983 0.0641 -1.53 0.1273
## consumption.l3 0.2351 0.0882 2.66 0.0085 **
## income.l3 -0.0242 0.0614 -0.39 0.6944
## const 0.3197 0.0912 3.51 0.0006 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 0.63 on 154 degrees of freedom
## Multiple R-Squared: 0.218, Adjusted R-squared: 0.188
## F-statistic: 7.17 on 6 and 154 DF, p-value: 9.38e-07
##
##
## Estimation results for equation income:
## =======================================
## income = consumption.l1 + income.l1 + consumption.l2 + income.l2 + consumption.l3 + income.l3 + const
##
## Estimate Std. Error t value Pr(>|t|)
## consumption.l1 0.4871 0.1164 4.19 4.8e-05 ***
## income.l1 -0.2488 0.0845 -2.94 0.00374 **
## consumption.l2 0.0322 0.1221 0.26 0.79213
## income.l2 -0.1111 0.0870 -1.28 0.20317
## consumption.l3 0.4030 0.1197 3.37 0.00096 ***
## income.l3 -0.0915 0.0833 -1.10 0.27348
## const 0.3628 0.1237 2.93 0.00386 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 0.855 on 154 degrees of freedom
## Multiple R-Squared: 0.211, Adjusted R-squared: 0.18
## F-statistic: 6.87 on 6 and 154 DF, p-value: 1.76e-06
##
##
##
## Covariance matrix of residuals:
## consumption income
## consumption 0.397 0.196
## income 0.196 0.731
##
## Correlation matrix of residuals:
## consumption income
## consumption 1.000 0.364
## income 0.364 1.000
fcst <- forecast(var)
plot(fcst, xlab = "Year")
Now to format our own data to do the same, using the datasets referenced by the Conference Board, as in my prior post:
library(xts)
library(Quandl)
# Quandl.auth('yourauthenticationtoken')
# Fetch one series from Quandl as a quarterly xts object and rename its column
getts <- function(series, name) {
  y <- Quandl(series, collapse = "quarterly", type = "xts",
              start_date = "1992-01-01", end_date = "2013-01-31")
  names(y)[1] <- name
  return(y)
}
gdp <- getts("FRED/GDPC1", "gdp")
pce <- getts("FRED/PCE", "pce")
houst <- getts("FRED/HOUST", "houst")
atcgno <- getts("FRED/ATCGNO", "atcgno")
bopgstb <- getts("FRED/BOPGSTB", "bopgstb")
USeconomy <- cbind(gdp, pce, houst, atcgno, bopgstb)
summary(USeconomy)
## Index gdp pce houst
## Min. :1992-03-31 Min. : 8151 Min. : 4156 Min. : 505
## 1st Qu.:1997-06-30 1st Qu.: 9801 1st Qu.: 5527 1st Qu.:1046
## Median :2002-09-30 Median :11587 Median : 7482 Median :1475
## Mean :2002-09-29 Mean :11357 Mean : 7637 Mean :1368
## 3rd Qu.:2007-12-31 3rd Qu.:12949 3rd Qu.: 9745 3rd Qu.:1649
## Max. :2013-01-01 Max. :13726 Max. :11290 Max. :2151
## atcgno bopgstb
## Min. : 44972 Min. :-64214
## 1st Qu.: 59285 1st Qu.:-46860
## Median : 67165 Median :-34269
## Mean : 68753 Mean :-32265
## 3rd Qu.: 80431 3rd Qu.:-10475
## Max. :102161 Max. : -2641
VARselect(USeconomy, lag.max = 8, type = "const")$selection
## AIC(n) HQ(n) SC(n) FPE(n)
## 2 2 2 2
var <- VAR(USeconomy, p = 2, type = "const")
serial.test(var, lags.pt = 10, type = "PT.asymptotic")
##
## Portmanteau Test (asymptotic)
##
## data: Residuals of VAR object var
## Chi-squared = 226, df = 200, p-value = 0.1002
## $serial
##
## Portmanteau Test (asymptotic)
##
## data: Residuals of VAR object var
## Chi-squared = 226, df = 200, p-value = 0.1002
summary(var)
##
## VAR Estimation Results:
## =========================
## Endogenous variables: gdp, pce, houst, atcgno, bopgstb
## Deterministic variables: const
## Sample size: 83
## Log Likelihood: -2971.243
## Roots of the characteristic polynomial:
## 1 0.935 0.935 0.926 0.717 0.717 0.412 0.358 0.358 0.0955
## Call:
## VAR(y = USeconomy, p = 2, type = "const")
##
##
## Estimation results for equation gdp:
## ====================================
## gdp = gdp.l1 + pce.l1 + houst.l1 + atcgno.l1 + bopgstb.l1 + gdp.l2 + pce.l2 + houst.l2 + atcgno.l2 + bopgstb.l2 + const
##
## Estimate Std. Error t value Pr(>|t|)
## gdp.l1 9.33e-01 1.47e-01 6.35 1.7e-08 ***
## pce.l1 4.45e-01 1.39e-01 3.19 0.0021 **
## houst.l1 5.39e-02 7.25e-02 0.74 0.4598
## atcgno.l1 9.84e-04 1.02e-03 0.97 0.3359
## bopgstb.l1 1.62e-03 2.02e-03 0.80 0.4273
## gdp.l2 2.38e-02 1.40e-01 0.17 0.8656
## pce.l2 -3.78e-01 1.39e-01 -2.71 0.0084 **
## houst.l2 7.13e-02 8.09e-02 0.88 0.3807
## atcgno.l2 -1.41e-03 1.02e-03 -1.38 0.1716
## bopgstb.l2 2.42e-03 2.00e-03 1.21 0.2306
## const -1.76e+00 1.80e+02 -0.01 0.9922
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 59.5 on 72 degrees of freedom
## Multiple R-Squared: 0.999, Adjusted R-squared: 0.999
## F-statistic: 6.61e+03 on 10 and 72 DF, p-value: <2e-16
##
##
## Estimation results for equation pce:
## ====================================
## pce = gdp.l1 + pce.l1 + houst.l1 + atcgno.l1 + bopgstb.l1 + gdp.l2 + pce.l2 + houst.l2 + atcgno.l2 + bopgstb.l2 + const
##
## Estimate Std. Error t value Pr(>|t|)
## gdp.l1 3.92e-01 1.60e-01 2.46 0.016 *
## pce.l1 9.83e-01 1.51e-01 6.49 9.6e-09 ***
## houst.l1 -3.33e-02 7.87e-02 -0.42 0.673
## atcgno.l1 2.73e-05 1.10e-03 0.02 0.980
## bopgstb.l1 3.81e-04 2.20e-03 0.17 0.863
## gdp.l2 -3.58e-01 1.53e-01 -2.35 0.021 *
## pce.l2 1.02e-02 1.51e-01 0.07 0.947
## houst.l2 5.79e-02 8.78e-02 0.66 0.512
## atcgno.l2 -1.08e-03 1.11e-03 -0.97 0.335
## bopgstb.l2 6.47e-04 2.17e-03 0.30 0.767
## const -1.94e+02 1.96e+02 -0.99 0.325
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 64.6 on 72 degrees of freedom
## Multiple R-Squared: 0.999, Adjusted R-squared: 0.999
## F-statistic: 9.37e+03 on 10 and 72 DF, p-value: <2e-16
##
##
## Estimation results for equation houst:
## ======================================
## houst = gdp.l1 + pce.l1 + houst.l1 + atcgno.l1 + bopgstb.l1 + gdp.l2 + pce.l2 + houst.l2 + atcgno.l2 + bopgstb.l2 + const
##
## Estimate Std. Error t value Pr(>|t|)
## gdp.l1 5.57e-03 2.57e-01 0.02 0.98
## pce.l1 1.93e-01 2.44e-01 0.79 0.43
## houst.l1 9.12e-01 1.27e-01 7.19 4.8e-10 ***
## atcgno.l1 -1.33e-03 1.78e-03 -0.75 0.46
## bopgstb.l1 3.40e-03 3.54e-03 0.96 0.34
## gdp.l2 4.37e-02 2.46e-01 0.18 0.86
## pce.l2 -2.09e-01 2.44e-01 -0.86 0.39
## houst.l2 6.99e-02 1.41e-01 0.49 0.62
## atcgno.l2 -1.21e-03 1.78e-03 -0.68 0.50
## bopgstb.l2 -6.53e-04 3.50e-03 -0.19 0.85
## const -1.65e+02 3.15e+02 -0.52 0.60
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 104 on 72 degrees of freedom
## Multiple R-Squared: 0.954, Adjusted R-squared: 0.948
## F-statistic: 150 on 10 and 72 DF, p-value: <2e-16
##
##
## Estimation results for equation atcgno:
## =======================================
## atcgno = gdp.l1 + pce.l1 + houst.l1 + atcgno.l1 + bopgstb.l1 + gdp.l2 + pce.l2 + houst.l2 + atcgno.l2 + bopgstb.l2 + const
##
## Estimate Std. Error t value Pr(>|t|)
## gdp.l1 4.12e+00 1.43e+01 0.29 0.77404
## pce.l1 5.84e+01 1.36e+01 4.29 5.4e-05 ***
## houst.l1 -2.00e+01 7.06e+00 -2.83 0.00596 **
## atcgno.l1 3.84e-01 9.90e-02 3.89 0.00023 ***
## bopgstb.l1 -8.74e-03 1.97e-01 -0.04 0.96478
## gdp.l2 -9.36e+00 1.37e+01 -0.68 0.49622
## pce.l2 -5.17e+01 1.36e+01 -3.81 0.00029 ***
## houst.l2 2.63e+01 7.88e+00 3.33 0.00135 **
## atcgno.l2 3.49e-01 9.94e-02 3.51 0.00077 ***
## bopgstb.l2 1.56e-01 1.95e-01 0.80 0.42522
## const 1.89e+04 1.76e+04 1.08 0.28534
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 5800 on 72 degrees of freedom
## Multiple R-Squared: 0.848, Adjusted R-squared: 0.827
## F-statistic: 40.2 on 10 and 72 DF, p-value: <2e-16
##
##
## Estimation results for equation bopgstb:
## ========================================
## bopgstb = gdp.l1 + pce.l1 + houst.l1 + atcgno.l1 + bopgstb.l1 + gdp.l2 + pce.l2 + houst.l2 + atcgno.l2 + bopgstb.l2 + const
##
## Estimate Std. Error t value Pr(>|t|)
## gdp.l1 -4.25e+00 7.09e+00 -0.60 0.55130
## pce.l1 -3.00e+01 6.73e+00 -4.45 3.0e-05 ***
## houst.l1 -2.85e+00 3.50e+00 -0.81 0.41781
## atcgno.l1 -7.33e-02 4.90e-02 -1.49 0.13942
## bopgstb.l1 7.38e-01 9.78e-02 7.55 1.1e-10 ***
## gdp.l2 5.09e-01 6.78e+00 0.08 0.94033
## pce.l2 3.10e+01 6.73e+00 4.61 1.7e-05 ***
## houst.l2 1.01e+00 3.91e+00 0.26 0.79646
## atcgno.l2 1.83e-01 4.93e-02 3.71 0.00041 ***
## bopgstb.l2 7.18e-02 9.67e-02 0.74 0.45974
## const 2.56e+04 8.70e+03 2.94 0.00439 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 2870 on 72 degrees of freedom
## Multiple R-Squared: 0.981, Adjusted R-squared: 0.978
## F-statistic: 369 on 10 and 72 DF, p-value: <2e-16
##
##
##
## Covariance matrix of residuals:
## gdp pce houst atcgno bopgstb
## gdp 3535.64 2448 2171 93577 -6.73e+00
## pce 2448.09 4172 2569 47103 -4.54e+04
## houst 2171.30 2569 10821 21586 3.27e+04
## atcgno 93576.88 47103 21586 33588697 5.13e+04
## bopgstb -6.73 -45358 32702 51264 8.25e+06
##
## Correlation matrix of residuals:
## gdp pce houst atcgno bopgstb
## gdp 1.00e+00 0.637 0.3510 0.27154 -3.94e-05
## pce 6.37e-01 1.000 0.3823 0.12582 -2.45e-01
## houst 3.51e-01 0.382 1.0000 0.03581 1.09e-01
## atcgno 2.72e-01 0.126 0.0358 1.00000 3.08e-03
## bopgstb -3.94e-05 -0.245 0.1095 0.00308 1.00e+00
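One thing stands out in the summary above: the largest root is reported as 1, which puts the fitted model on the edge of stability. That is a plausible consequence of fitting series such as gdp and pce in levels, whereas the usconsumption data in the textbook example are quarterly percentage changes. A minimal sketch of that transformation, using made-up numbers rather than the Quandl data (the names levels and growth are only illustrative):

```r
# Hypothetical level series standing in for a trending column such as gdp
levels <- ts(c(100, 102, 105, 107, 111, 114), start = c(1992, 1), frequency = 4)

# 100 * diff(log(x)) approximates the quarter-on-quarter percentage change
growth <- 100 * diff(log(levels))
print(growth)
```

Log differences only make sense for strictly positive series, so a column such as bopgstb, which is negative throughout, would need a plain diff() instead.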
# str(var)
plot(var)
fcstUSeconomy <- forecast(var)
## Error: non-numeric argument to mathematical function
# plot(fcstUSeconomy)
The results, as far as they got, look like what I was expecting. The error message gives little hint as to why forecast() fails, though.
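One possible explanation, which is a guess and has not been verified here: usconsumption is a ts object that carries start, end and frequency attributes, while USeconomy is built from xts objects, and forecast() may need those ts attributes to date its output. The sketch below uses a made-up two-column matrix in place of the Quandl data (m and USts are hypothetical names) to show how the attributes can be attached with ts(), and falls back on the predict() method that vars itself provides for VAR fits.

```r
# Hypothetical stand-in for the USeconomy data: 85 quarters, two columns
set.seed(42)
m <- cbind(gdp = 8000 + cumsum(rnorm(85, mean = 50, sd = 10)),
           pce = 4000 + cumsum(rnorm(85, mean = 40, sd = 10)))

# A plain matrix carries no time-series attributes
print(is.null(tsp(m)))

# Attach quarterly attributes explicitly: 1992 Q1 onwards
USts <- ts(m, start = c(1992, 1), frequency = 4)
print(tsp(USts))   # start, end, frequency

# If vars is installed, fit on the ts object and forecast with its own predict()
if (requireNamespace("vars", quietly = TRUE)) {
  fit  <- vars::VAR(USts, p = 2, type = "const")
  pred <- predict(fit, n.ahead = 8)
  print(pred$fcst$gdp[, "fcst"])   # point forecasts for the gdp column
}
```

With the real data, the same idea would mean converting USeconomy to a ts object before calling VAR(), or asking Quandl for type = "ts" in the first place.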
Gary Young