Re: st: Looping with the tin() function

From   Nick Cox <[email protected]>
To   [email protected]
Subject   Re: st: Looping with the tin() function
Date   Sun, 1 Jul 2012 10:23:32 +0100

Points of general note include:

1. References on Statalist are like references everywhere else.
Journal title, volume number and page numbers do no harm, and their
absence is a loss of information.

2. We can't access your data on your c: drive; it is much better to
replicate a problem with a mutually accessible dataset.

3. You give specific code (good) but don't say what the specific
problem with that code is. We can't comment easily on other problems
that occur with other code that you don't give us.

4. If you want the minimum or the mean, don't use -egen- to put the
constant result in a variable. Use the results from -summarize-. (This
is a style point.)

      su AIC, meanonly
      gen opt_lag_aic = lag if AIC == r(min)
      su opt_lag_aic, meanonly
      scalar aic_lag = r(mean)

5. You should explain where user-written programs you refer to
(-eststo-, -esttab-) come from.
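
For readers following along: -eststo- and -esttab- are part of Ben
Jann's -estout- package, which is distributed through SSC. A minimal
sketch of installing it and checking where the commands live, assuming
an internet connection and a writable ado directory:

      * install (or update) the estout package from SSC
      ssc install estout, replace
      * report the location of each command on the ado-path
      which eststo
      which esttab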

I don't follow your general aim (I rarely do time series forecasting),
but if all else fails, you can evaluate scalars on the fly (as you
did for your -forvalues- loop), so that -tin()- sees their values, not
their names, and you can use -inrange()- as an alternative to -tin()-.

. set obs 10
. gen t = _n
. tsset t
. scalar foo1 = 4
. scalar foo2 = 7

. l if tin(`= foo1', `=foo2')

     | t |
  4. | 4 |
  5. | 5 |
  6. | 6 |
  7. | 7 |

. gen t1 = 4

. gen t2 = 7

. l if inrange(t, t1, t2)

     | t   t1   t2 |
  4. | 4    4    7 |
  5. | 5    4    7 |
  6. | 6    4    7 |
  7. | 7    4    7 |
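
Applied to the loop in the question, the same idea might look like the
sketch below. It is only an illustration under stated assumptions, not
a tested solution: the variable names date, gdpc96_yoy and
f_cast_gdp_def are taken from the original post, the data are assumed
to be -tsset date-, the lag length is fixed at a hypothetical 4 rather
than chosen by -varsoc-, and -regress- stands in for -arima- to keep
the example short. The local macro `t' is evaluated before -inrange()-
sees it, so no -preserve-, -drop- or -restore- is needed.

      * variable that collects the one-step-ahead forecasts
      gen double f_cast_gdp_def = .
      forvalues t = `=tq(1970q1)'/`=tq(2010q3)' {
            * estimate on 1960q1 through quarter `t' only
            quietly regress gdpc96_yoy L(1/4).gdpc96_yoy ///
                  if inrange(date, tq(1960q1), `t')
            * linear prediction for the next quarter, `t' + 1
            quietly predict double yhat_tmp if date == `t' + 1
            quietly replace f_cast_gdp_def = yhat_tmp if date == `t' + 1
            drop yhat_tmp
      }

The prediction for quarter `t' + 1 uses only lags observed up to
quarter `t', so each stored value is a pseudo-out-of-sample forecast.
A lag length selected in each pass could replace the fixed 4 by
evaluating a scalar on the fly, as in L(1/`=aic_lag').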


On Fri, Jun 29, 2012 at 10:08 PM, Autria Mazda <[email protected]> wrote:

> I'm a first-timer and this is a re-post of a previously poorly explained problem I'm having, my apologies.
> (I'm trying to replicate Stock and Watson 2007, "Why Has Inflation Become Harder to Forecast" in case anyone is familiar with the paper).
> Data: I have a quarterly time series of gdp deflator data from 1960-2010.
> Objective: Ultimately, I'm trying to run a pseudo-out-of-sample forecast.
> What I'm currently doing:
> 1) Obtain the optimal lag length (based on AIC) over an initial sub-sample (e.g. 1960q1-1970q1) using varsoc.
> 2) Once I obtain the optimal lag length, I feed that into an AR(n) model. (Technically, n can change every quarter based on the varsoc results.)
> 3) I am able to do the above steps successfully but not efficiently with the code below. (Right now I'm just deleting the observations that are
> outside the sub-sample period before I run varsoc.)
> Problem:
> I need to run the regression over the sub-sample (1960-1970q1) use those parameters and model specification to forecast for the next quarter (1970q2).
> Then I need to store the forecast in a variable (e.g. f_cast_gdp_def) as the observation for 1970q2.
> Next I need to repeat steps 1 and 2 above and re-run the regression (this time over the sample period 1960q1-1970q2).
> Then forecast for 1970q3 and save that as observation 2 in variable f_cast_gdp_def, and so on and so forth for every quarter until the end of the sample.
> I looked into trying the rolling reg command but from my understanding of the documentation I don't think I can keep changing the model specification at each iteration.
> I think if I can figure out how to have tin() accept a variable, scalar or my loop var (t), then I can get what I need. My code is below.
> I've tried many syntax variants and either I get an error message: invalid syntax or stata just ignores the tin() function altogether and uses the entire sample.
> Any tips or advice would be greatly appreciated. Thanks again!!!
> Autria Christensen
> [email protected]
> use "C:\Stata Files\Thesis\Data\SW_Econ_Activity_Vars_Tab1-3.dta", clear
> forvalues t = `=tq(1970q1)'/`=tq(2010q4)' {
>      preserve
>      *set date range for optimal lag length selection
>      drop if date < tq(1960q1)
>      drop if date > `t'
>      set more off
>      varsoc gdpc96_yoy, maxlag(20)
>      *Select optimal lag length and save as new scalar AIC_LAG
>      matrix D = r(stats)
>      svmat D, name(col)
>      egen min_aic = min(AIC)
>      gen opt_lag_aic = lag if min_aic == AIC
>      egen store_a = mean(opt_lag_aic)
>      scalar aic_lag = store_a
>      disp "OPTIMAL LAG (BASED ON AIC) = " aic_lag
>      *Run regression and store/output parameter estimates
>      eststo one: arima gdpc96_yoy L(0/`=aic_lag').gdpc96_yoy, robust
>      esttab one using "C:\Stata Files\Thesis\Data\Stats\test_`=end_date'.csv", cells("b se z p ci") stats(N ll chi2) nomtitle nonumber replace wide plain
>      *Now I need to forecast for the period t+1 using parameter estimates and lags from period t
>      gen fcast_date = `t' + 1
>      scalar f_date = fcast_date
>      *I've tried using fcast_date and f_date but neither work the "predict" code below which doesn't really work
>      predict static_yhat_AR if tin(`t', `t')
>      predict dynamic_yhat_AR if tin(1970q2,2010q2), dyn(tq(1970q1))
>      est clear
>      restore
> }