Thursday, June 27, 2013

Another shot at multivariate economic model

Another shot at multivariate economic model

Another shot at multivariate economic model

Proud Economy 0004

Well, I didn't get very far last time. I was able to run an auto.arima model with forecast on a single timeseries, the GDP. That's a start with Box-Jenkins, and I'm sure we'll do a lot of those as a rough cut whenever we see a forecast in the future. We'll need to study the foundations behind the methods and learn how to evaluate the results.

Collect the time series again:

library(Quandl)
# Quandl.auth('yourauthenticationtoken')

Here's my modified function to load the data and plot initial summaries. I'm setting the start and end dates so that I have the same number of data points in the linear regression at the bottom. It would be nice if there were a function call to automatically trim data points as needed.

getts <- function(series) {
    y <- Quandl(series, start_date = "1959-01-01", end_date = "2012-12-31", 
        collapse = "quarterly", type = "xts")
    layout(matrix(c(1, 1, 2, 3), 2, 2, byrow = TRUE))
    plot.ts(y, main = series, ylab = series)
    acf(y, main = "Autocorrelations", ylim = c(-1, 1))
    pacf(y, main = "Partial Autocorrelations", ylim = c(-1, 1))
    return(y)
}

I'll look at the Real GDP, Real Consumer Spending and Housing Starts this time.

gdp <- getts("FRED/GDPC1")
str(gdp)
## An 'xts' object on 1959-03-31/2012-10-01 containing:
##   Data: num [1:216, 1] 2708 2776 2773 2783 2845 ...
##  - attr(*, "dimnames")=List of 2
##   ..$ : NULL
##   ..$ : chr "Value"
##   Indexed by objects of class: [Date] TZ: UTC
##   xts Attributes:  
##  NULL
pce <- getts("FRED/PCE")

plot of chunk unnamed-chunk-4

str(pce)
## An 'xts' object on 1959-03-31/2012-12-01 containing:
##   Data: num [1:216, 1] 313 319 325 323 331 ...
##  - attr(*, "dimnames")=List of 2
##   ..$ : NULL
##   ..$ : chr "Value"
##   Indexed by objects of class: [Date] TZ: UTC
##   xts Attributes:  
##  NULL
houst <- getts("FRED/HOUST")

plot of chunk unnamed-chunk-4 plot of chunk unnamed-chunk-4

str(houst)
## An 'xts' object on 1959-03-31/2012-12-01 containing:
##   Data: num [1:216, 1] 1620 1503 1540 1601 1109 ...
##  - attr(*, "dimnames")=List of 2
##   ..$ : NULL
##   ..$ : chr "Value"
##   Indexed by objects of class: [Date] TZ: UTC
##   xts Attributes:  
##  NULL

Look at cross correlation and attempt linear model:

I found several websites, including one at Princeton to follow. I had to do too much fiddling around to get anything to run. There should be automatic ways with all builtin defaults. I had to do explicite data type conversions without understanding what the R packages require.

ccf(as.ts(gdp), as.ts(houst))

plot of chunk unnamed-chunk-5

ccf(as.ts(gdp), as.ts(pce))

plot of chunk unnamed-chunk-5

lmfit <- lm(as.ts(gdp) ~ as.ts(pce) + as.ts(houst))
summary(lmfit)
## 
## Call:
## lm(formula = as.ts(gdp) ~ as.ts(pce) + as.ts(houst))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1353.3  -339.4    99.7   411.8   813.3 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2.62e+03   1.57e+02    16.7  < 2e-16 ***
## as.ts(pce)   1.02e+00   1.08e-02    94.2  < 2e-16 ***
## as.ts(houst) 6.92e-01   9.22e-02     7.5  1.7e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 513 on 213 degrees of freedom
## Multiple R-squared:  0.978,  Adjusted R-squared:  0.978 
## F-statistic: 4.68e+03 on 2 and 213 DF,  p-value: <2e-16
anova(lmfit)
## Analysis of Variance Table
## 
## Response: as.ts(gdp)
##               Df   Sum Sq  Mean Sq F value  Pr(>F)    
## as.ts(pce)     1 2.45e+09 2.45e+09  9300.8 < 2e-16 ***
## as.ts(houst)   1 1.48e+07 1.48e+07    56.3 1.7e-12 ***
## Residuals    213 5.61e+07 2.63e+05                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
plot(lmfit)

plot of chunk unnamed-chunk-5 plot of chunk unnamed-chunk-5 plot of chunk unnamed-chunk-5 plot of chunk unnamed-chunk-5

Well, that's not exactly what I had in mind. I was hoping to figure out an arima type transfer function model, but I didn't find anything I could run without understanding. We'll have to work on that later. There's plenty here to understand, explain, and validate. I wonder what all that residual stuff in the plot is?

To be useful for forecasting, the linear model would have to use prior period values of pce and houst to find the current forecast value of gdp. We need a linear model with more flexibility.

I looked at Time Series Analysis with Applications in R by Jonathan D. Cryer and Kung-Sik Chan and replicated their results.

Gary Young

No comments:

Post a Comment