RE: st: Question on Dfactor and Gaps in Time Series


From   "Degas Wright" <dwright@cornerstoneadvice.com>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Question on Dfactor and Gaps in Time Series
Date   Mon, 11 Oct 2010 17:17:23 -0400

Richard,
Thank you for stepping me through this solution.  

Degas A. Wright, CFA
Chief Investment Officer
Decatur Capital Management, Inc.
250 East Ponce De Leon Avenue, Suite 325
Decatur, Georgia  30030
Voice: 404.270.9838
Fax:404.270.9840
Website: www.decaturcapital.com

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Richard Gates
Sent: Monday, October 11, 2010 5:01 PM
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Question on Dfactor and Gaps in Time Series

Degas Wright is getting the "gaps in the time series" error from -dfactor-.

Degas wrote that

> I am using the dfactor command and have run into the gap in time series
> error. My data is price (p), volume (v) and earnings yield (ep) and I am
> trying to develop a dynamic factor model using the dfactor command. My
> code is:
>
> tsset
>         time variable:  date, 2008w25 to 2010w40
>                 delta:  1 week
>
> . dfactor(D.(p v ep)=,noconstant)(f=,ar(1/2))
> gaps in the time series are not allowed
> r(459);
>

-dfactor- cannot use datasets that contain gaps in the data.

A gap in the data occurs when there is a missing observation in the middle of
a time series.

-tsfill- fills in gaps in the time variable, not in the other variables in the
dataset.  I explain the difference below.

Degas can use -dfactor- if he is willing to impose an additional assumption
that removes the gaps, as explained below.

Now I fill in the details.

I have some simulated data.  The variables have the same names as those in
Degas' example, but the values are arbitrary.

I begin by using the data and running -tsset- on the time variable -t-.

. use mydata

. tsset 
        time variable:  t, 2008w25 to 2010w40, but with gaps
                delta:  1 week


The output from -tsset- informs us that there are gaps in the data.  We just
happen to know that the missing observations occur in week 52 of each year.
(In my simulated data, the research team takes vacation the last week of the
year, so there is no data for week 52.)  We use this knowledge to list out the
data around the missing observations.

. list t if week(dofw(t)) > 50 | week(dofw(t)) < 2 , separator(2)

     +---------+
     |       t |
     |---------|
 27. | 2008w51 |
 28. |  2009w1 |
     |---------|
 78. | 2009w51 |
 79. |  2010w1 |
     +---------+

(The week() function returns the week of the year from a time variable stored
in daily format.  The dofw() function converts a time variable in weekly
format to a time variable in daily format.)
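
If we did not already know where the missing weeks were, one way to find them
is to look for jumps of more than one period between neighboring observations.
The sketch below is only an illustration: it builds a toy stand-in for my
dataset (the 120 weeks and the variable name -aftergap- are my assumptions
here, not part of Degas' data) and lists the first observation after each gap.

```stata
* Toy stand-in for mydata (hypothetical): weekly dates with week 52 removed
clear
set obs 120
generate t = tw(2008w25) + _n - 1
format t %tw
drop if week(dofw(t)) == 52

* A jump of more than one period between neighbors marks a gap
sort t
generate byte aftergap = (t - t[_n-1] > 1) if _n > 1
list t if aftergap == 1
```

The comparison uses the raw weekly values, so it does not depend on knowing
which calendar week is missing.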

We cannot use -dfactor- on this data because there are gaps in the data.


If we use -tsfill- on this data, it inserts observations for the missing
periods, but only the time variable will be nonmissing.  We illustrate this
point below.

. tsfill, full

. list t p v ep if week(dofw(t)) > 50 | week(dofw(t)) < 2 , separator(2)

     +----------------------------------------------+
     |       t           p           v           ep |
     |----------------------------------------------|
 27. | 2008w51   11.143581   16.772175   -9.7874321 |
 28. | 2008w52           .           .            . |
     |----------------------------------------------|
 29. |  2009w1   11.894791   19.141332   -10.752946 |
 79. | 2009w51   11.682969   28.361535    -13.24529 |
     |----------------------------------------------|
 80. | 2009w52           .           .            . |
 81. |  2010w1   10.297958   27.970307   -13.730162 |
     +----------------------------------------------+

. tsset
        time variable:  t, 2008w25 to 2010w40
                delta:  1 week

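-tsset- no longer reports gaps in the time variable, but p, v, and ep are
missing in the observations that -tsfill- added.  There are still gaps in
this data, so we still cannot use -dfactor- on this data.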
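
A quick way to confirm this is to count the missing values left behind by
-tsfill-.  This is a minimal sketch on toy data, assuming a single
hypothetical series p filled with arbitrary random values; it is not my
original dataset.

```stata
* Toy stand-in for mydata (hypothetical), with week 52 removed
clear
set obs 120
generate t = tw(2008w25) + _n - 1
format t %tw
generate p = rnormal()
drop if week(dofw(t)) == 52
tsset t

* -tsfill- restores the missing dates but leaves p missing in them
tsfill
count if missing(p)
```

A nonzero count means the time variable is complete while the series itself
still has holes.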

Now I assume that week 1 comes immediately after week 51.  In my example, this
assumption holds because the researchers take off the last week of the year.

I implement this assumption by (1) dropping the two observations for which the
week is 52, (2) creating a new time variable that goes from 1 to the number of
observations in the sample, and (3) using -tsset- to declare the new time
variable.  By construction, there are no missing time periods.

. drop if week(dofw(t)) == 52
(2 observations deleted)

. generate t2 = _n

. tsset t2
        time variable:  t2, 1 to 118
                delta:  1 unit
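
The three steps can also be run end to end on toy data, with a check that the
new index really has no gaps.  This is a sketch under my own assumptions (120
hypothetical weeks, one arbitrary series p), not a replay of the log above.
The explicit -sort- is there because -generate t2 = _n- numbers observations
in their current order, so the data must be in calendar order first.

```stata
* Toy stand-in for mydata (hypothetical)
clear
set obs 120
generate t = tw(2008w25) + _n - 1
format t %tw
generate p = rnormal()

* (1) drop week 52, (2) build a consecutive index, (3) declare it
drop if week(dofw(t)) == 52
sort t
generate t2 = _n
tsset t2

* Every value of t2 should be exactly one more than the previous one
assert t2 == t2[_n-1] + 1 if _n > 1

* Keep t in the dataset so each t2 can be mapped back to a calendar week
list t t2 in 26/29
```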

Having removed the gaps in the data by imposing an additional assumption on
our model, we can use -dfactor- to estimate the parameters.

. dfactor(D.(p v ep)=,noconstant)(f=,ar(1/2))
searching for initial values ...........
(setting technique to bhhh)
Iteration 0:   log likelihood = -329.57244  
Iteration 1:   log likelihood = -324.04945  
Iteration 2:   log likelihood = -322.84128  
Iteration 3:   log likelihood = -322.38332  
Iteration 4:   log likelihood = -322.10539  
(switching technique to nr)
Iteration 5:   log likelihood = -322.05329  
Iteration 6:   log likelihood = -321.89974  
Iteration 7:   log likelihood = -321.89735  
Iteration 8:   log likelihood = -321.89735  
Refining estimates:
Iteration 0:   log likelihood = -321.89735  
Iteration 1:   log likelihood = -321.89735  

Dynamic-factor model

Sample: 2 - 118                                   Number of obs   =        117
                                                  Wald chi2(5)    =     378.91
Log likelihood = -321.89735                       Prob > chi2     =     0.0000
------------------------------------------------------------------------------
             |                 OIM
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
f            |
           f |
         L1. |   1.302608    .205203     6.35   0.000     .9004174    1.704798
         L2. |  -.5836402   .1829143    -3.19   0.001    -.9421456   -.2251347
-------------+----------------------------------------------------------------
D.p          |
           f |  -.1543837   .0445771    -3.46   0.001    -.2417532   -.0670142
-------------+----------------------------------------------------------------
D.v          |
           f |  -.0692728   .0441276    -1.57   0.116    -.1557613    .0172156
-------------+----------------------------------------------------------------
D.ep         |
           f |   .1194667   .0516863     2.31   0.021     .0181635    .2207699
-------------+----------------------------------------------------------------
var(De.p)    |   .1336773   .0247502     5.40   0.000     .0851679    .1821868
var(De.v)    |   .4937034    .066499     7.42   0.000     .3633677    .6240391
var(De.ep)   |   .4552322   .0655476     6.95   0.000     .3267612    .5837031
------------------------------------------------------------------------------
Note: Tests of variances against zero are conservative and are provided only
      for reference.

(Of course the parameter estimates are for our simulated data.)

I hope this helps.


-Rich
rgates@stata.com
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

