Re: st: Question on Dfactor and Gaps in Time Series

From	Richard Gates <[email protected]>
To	[email protected]
Subject	Re: st: Question on Dfactor and Gaps in Time Series
Date	Mon, 11 Oct 2010 16:01:16 -0500

Degas Wright is getting the "gaps in the time series" error from -dfactor-.

Degas wrote that

> I am using the dfactor command and have run into the gap in time series
> error. My data is price (p), volume (v) and earnings yield (ep) and I am
> trying to develop a dynamic factor model using the dfactor command. My
> code is:
>
> tsset
>         time variable:  date, 2008w25 to 2010w40
>                 delta:  1 week
>
> . dfactor(D.(p v ep)=,noconstant)(f=,ar(1/2))
> gaps in the time series are not allowed
> r(459);
>

-dfactor- cannot use datasets that contain gaps in the data.

A gap in the data occurs when there is a missing observation in the middle of a
time series.

-tsfill- fills in gaps in the time variable, not in the other variables in the
dataset.  I explain the difference below.

Degas can use -dfactor- if he is willing to impose an additional assumption
that removes the gaps, as explained below.

Now I fill in the details.

I have some simulated data.  The variables have the same names as those in
Degas' example, but the values are arbitrary.

I begin by using the data and running -tsset- on the time variable -t-.

. use mydata

. tsset 
        time variable:  t, 2008w25 to 2010w40, but with gaps
                delta:  1 week


The output from -tsset- informs us that there are gaps in the data.  We just
happen to know that the missing observations occur in week 52 of each year.
(In my simulated data, the research team takes vacation the last week of the
year, so there is no data for week 52.) We use this knowledge to list out the
data around the missing observations.

. list t if week(dofw(t)) > 50 | week(dofw(t)) < 2 , separator(2)

     +---------+
     |       t |
     |---------|
 27. | 2008w51 |
 28. |  2009w1 |
     |---------|
 78. | 2009w51 |
 79. |  2010w1 |
     +---------+

(The week() function returns the week of the year for a date stored in daily
format.  The dofw() function converts a time variable in weekly format to a
time variable in daily format.)
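
When the location of the gaps is not known in advance, they can be found from
the time variable itself.  The following is only a minimal sketch on a small
made-up weekly series; the starting week, the number of observations, and the
variable name gap are assumptions chosen for illustration and are not part of
Degas' data or of my simulated data.

```stata
* toy weekly series: 2008w48 through 2009w5, with week 52 removed
clear
set obs 10
generate t = tw(2008w48) + _n - 1
format t %tw
drop if week(dofw(t)) == 52

* dofw() gives the first day of the week; week() gives the week of that day
display %td dofw(tw(2009w1))
display week(dofw(tw(2008w51)))

* flag observations whose predecessor is more than one week earlier
sort t
generate gap = t - t[_n-1] if _n > 1
list t gap if gap > 1 & !missing(gap)
```

On this toy series the first -display- shows 01jan2009, the second shows 51,
and the listing contains the single observation 2009w1, with gap equal to 2.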

We cannot use -dfactor- on this data because there are gaps in the data. 

If we use -tsfill- on this data, it inserts observations for the missing
periods, but only the time variable will be nonmissing.  We illustrate this
point below.

. tsfill, full

. list t p v ep if week(dofw(t)) > 50 | week(dofw(t)) < 2 , separator(2)

     +----------------------------------------------+
     |       t           p           v           ep |
     |----------------------------------------------|
 27. | 2008w51   11.143581   16.772175   -9.7874321 |
 28. | 2008w52           .           .            . |
     |----------------------------------------------|
 29. |  2009w1   11.894791   19.141332   -10.752946 |
 79. | 2009w51   11.682969   28.361535    -13.24529 |
     |----------------------------------------------|
 80. | 2009w52           .           .            . |
 81. |  2010w1   10.297958   27.970307   -13.730162 |
     +----------------------------------------------+

. tsset
        time variable:  t, 2008w25 to 2010w40
                delta:  1 week

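-tsset- no longer reports gaps in the time variable, but the observations that
-tsfill- inserted are missing p, v, and ep.  There are still gaps in the data,
so we still cannot use -dfactor- on this data.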
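
One way to confirm that -tsfill- created only placeholder observations is to
count the observations with missing values afterward.  This is a sketch on a
made-up series, not on my simulated data; the values of p are arbitrary and
the dataset is built only for illustration.

```stata
* toy weekly series with one missing week (2008w52)
clear
set obs 10
generate t = tw(2008w48) + _n - 1
format t %tw
generate p = _n
drop if week(dofw(t)) == 52

* declare the time variable, then fill in the missing period
tsset t
tsfill

* the inserted observation has a time value but no data
count if missing(p)
list t p if missing(p)
```

Here -count- reports one observation, the inserted 2008w52, for which p is
missing.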

Now I assume that week 1 immediately follows week 51.  In my example, this
assumption holds because the researchers take off the last week in the year.

I implement this assumption by (1) dropping the two observations for which the
week is 52, (2) creating a new time variable that goes from 1 to the number of
observations in the sample, and (3) using -tsset- to declare the new time
variable.  By construction, there are no missing time periods.

. drop if week(dofw(t)) == 52
(2 observations deleted)

. generate t2 = _n

. tsset t2
        time variable:  t2, 1 to 118
                delta:  1 unit
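
To see what this assumption does to the time-series operators, the sketch
below compares D.p under the calendar time variable and under the consecutive
index.  The data are made up for illustration, and the names dp_calendar and
dp_index are hypothetical.

```stata
* toy weekly series with one missing week (2008w52)
clear
set obs 10
generate t = tw(2008w48) + _n - 1
format t %tw
generate p = _n^2
drop if week(dofw(t)) == 52

* with the calendar time variable, D.p is missing right after the gap
tsset t
generate dp_calendar = D.p

* with the consecutive index, 2009w1 is differenced against 2008w51
sort t
generate t2 = _n
tsset t2
generate dp_index = D.p

list t t2 p dp_calendar dp_index
```

In the listing, dp_calendar is missing for 2009w1, while dp_index for 2009w1
is the change from 2008w51.  Whether treating those two weeks as adjacent is
reasonable depends on the data at hand.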

Having removed the gaps in the data by imposing an additional assumption on 
our model, we can use -dfactor- to estimate the parameters. 

. dfactor(D.(p v ep)=,noconstant)(f=,ar(1/2))
searching for initial values ...........
(setting technique to bhhh)
Iteration 0:   log likelihood = -329.57244  
Iteration 1:   log likelihood = -324.04945  
Iteration 2:   log likelihood = -322.84128  
Iteration 3:   log likelihood = -322.38332  
Iteration 4:   log likelihood = -322.10539  
(switching technique to nr)
Iteration 5:   log likelihood = -322.05329  
Iteration 6:   log likelihood = -321.89974  
Iteration 7:   log likelihood = -321.89735  
Iteration 8:   log likelihood = -321.89735  
Refining estimates:
Iteration 0:   log likelihood = -321.89735  
Iteration 1:   log likelihood = -321.89735  

Dynamic-factor model

Sample: 2 - 118                                   Number of obs   =        117
                                                  Wald chi2(5)    =     378.91
Log likelihood = -321.89735                       Prob > chi2     =     0.0000
------------------------------------------------------------------------------
             |                 OIM
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
f            |
           f |
         L1. |   1.302608    .205203     6.35   0.000     .9004174    1.704798
         L2. |  -.5836402   .1829143    -3.19   0.001    -.9421456   -.2251347
-------------+----------------------------------------------------------------
D.p          |
           f |  -.1543837   .0445771    -3.46   0.001    -.2417532   -.0670142
-------------+----------------------------------------------------------------
D.v          |
           f |  -.0692728   .0441276    -1.57   0.116    -.1557613    .0172156
-------------+----------------------------------------------------------------
D.ep         |
           f |   .1194667   .0516863     2.31   0.021     .0181635    .2207699
-------------+----------------------------------------------------------------
var(De.p)    |   .1336773   .0247502     5.40   0.000     .0851679    .1821868
var(De.v)    |   .4937034    .066499     7.42   0.000     .3633677    .6240391
var(De.ep)   |   .4552322   .0655476     6.95   0.000     .3267612    .5837031
------------------------------------------------------------------------------
Note: Tests of variances against zero are conservative and are provided only
      for reference.

(Of course the parameter estimates are for our simulated data.)

I hope this helps.


-Rich
[email protected]