

From: "Degas Wright" <dwright@cornerstoneadvice.com>
To: <statalist@hsphsun2.harvard.edu>
Subject: RE: st: Question on Dfactor and Gaps in Time Series
Date: Mon, 11 Oct 2010 17:17:23 -0400

Richard,

Thank you for stepping me through this solution.

Degas A. Wright, CFA
Chief Investment Officer
Decatur Capital Management, Inc.
250 East Ponce De Leon Avenue, Suite 325
Decatur, Georgia 30030
Voice: 404.270.9838
Fax: 404.270.9840
Website: www.decaturcapital.com

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Richard Gates
Sent: Monday, October 11, 2010 5:01 PM
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Question on Dfactor and Gaps in Time Series

Degas Wright is getting the "gaps in the time series" error from -dfactor-. Degas wrote that

> I am using the dfactor command and have run into the gap in time series
> error. My data is price (p), volume (v) and earnings yield (ep) and I am
> trying to develop a dynamic factor model using the dfactor command. My
> code is:
>
> tsset
>         time variable:  date, 2008w25 to 2010w40
>                 delta:  1 week
>
> . dfactor(D.(p v ep)=,noconstant)(f=,ar(1/2))
> gaps in the time series are not allowed
> r(459);
>

-dfactor- cannot use datasets that contain gaps in the data. A gap in the data occurs when there is a missing observation in the middle of a time series. -tsfill- fills in gaps in the time variable, not in the other variables in the dataset. I explain the difference below.

Degas can use -dfactor- if he is willing to impose an additional assumption that removes the gaps, as explained below.

Now I fill in the details.

I have some simulated data. The variables have the same names as those in Degas' example, but the values are arbitrary. I begin by using the data and running -tsset- on the time variable -t-.

. use mydata

. tsset
        time variable:  t, 2008w25 to 2010w40, but with gaps
                delta:  1 week

The output from -tsset- informs us that there are gaps in the data. We just happen to know that the missing observations occur in week 52 of each year. (In my simulated data, the research team takes vacation the last week of the year, so there is no data for week 52.) We use this knowledge to list the data around the missing observations.

. list t if week(dofw(t)) > 50 | week(dofw(t)) < 2, separator(2)

     +---------+
     |       t |
     |---------|
 27. | 2008w51 |
 28. |  2009w1 |
     |---------|
 78. | 2009w51 |
 79. |  2010w1 |
     +---------+

(The week() function returns the week of the year from a time variable stored in daily format. The dofw() function converts a time variable in weekly format to a time variable in daily format.)

We cannot use -dfactor- on this data because there are gaps in the data. If we use -tsfill- on this data, it inserts observations for the missing periods, but only the time variable will be nonmissing. We illustrate this point below.

. tsfill, full

. list t p v ep if week(dofw(t)) > 50 | week(dofw(t)) < 2, separator(2)

     +----------------------------------------------+
     |       t           p           v           ep |
     |----------------------------------------------|
 27. | 2008w51   11.143581   16.772175   -9.7874321 |
 28. | 2008w52           .           .            . |
     |----------------------------------------------|
 29. |  2009w1   11.894791   19.141332   -10.752946 |
 79. | 2009w51   11.682969   28.361535    -13.24529 |
     |----------------------------------------------|
 80. | 2009w52           .           .            . |
 81. |  2010w1   10.297958   27.970307   -13.730162 |
     +----------------------------------------------+

. tsset
        time variable:  t, 2008w25 to 2010w40
                delta:  1 week

The time variable is now complete, but p, v, and ep are still missing in week 52. There are still gaps in this data, so we still cannot use -dfactor- on it.

Now, I suppose that week 1 actually comes after week 51. In my example, this assumption holds because the researchers take off the last week in the year. I implement this assumption by (1) dropping the two observations for which the week is 52, (2) creating a new time variable that goes from 1 to the number of observations in the sample, and (3) using -tsset- to declare the new time variable. By construction, there are no missing time periods.

. drop if week(dofw(t)) == 52
(2 observations deleted)

. generate t2 = _n

. tsset t2
        time variable:  t2, 1 to 118
                delta:  1 unit

Having removed the gaps in the data by imposing an additional assumption on our model, we can use -dfactor- to estimate the parameters.

. dfactor(D.(p v ep)=,noconstant)(f=,ar(1/2))
searching for initial values ...........
(setting technique to bhhh)
Iteration 0:   log likelihood = -329.57244
Iteration 1:   log likelihood = -324.04945
Iteration 2:   log likelihood = -322.84128
Iteration 3:   log likelihood = -322.38332
Iteration 4:   log likelihood = -322.10539
(switching technique to nr)
Iteration 5:   log likelihood = -322.05329
Iteration 6:   log likelihood = -321.89974
Iteration 7:   log likelihood = -321.89735
Iteration 8:   log likelihood = -321.89735
Refining estimates:
Iteration 0:   log likelihood = -321.89735
Iteration 1:   log likelihood = -321.89735

Dynamic-factor model

Sample: 2 - 118                                   Number of obs   =        117
                                                  Wald chi2(5)    =     378.91
Log likelihood = -321.89735                       Prob > chi2     =     0.0000
------------------------------------------------------------------------------
             |                 OIM
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
f            |
           f |
         L1. |   1.302608    .205203     6.35   0.000     .9004174    1.704798
         L2. |  -.5836402   .1829143    -3.19   0.001    -.9421456   -.2251347
-------------+----------------------------------------------------------------
D.p          |
           f |  -.1543837   .0445771    -3.46   0.001    -.2417532   -.0670142
-------------+----------------------------------------------------------------
D.v          |
           f |  -.0692728   .0441276    -1.57   0.116    -.1557613    .0172156
-------------+----------------------------------------------------------------
D.ep         |
           f |   .1194667   .0516863     2.31   0.021     .0181635    .2207699
-------------+----------------------------------------------------------------
   var(De.p) |   .1336773   .0247502     5.40   0.000     .0851679    .1821868
   var(De.v) |   .4937034    .066499     7.42   0.000     .3633677    .6240391
  var(De.ep) |   .4552322   .0655476     6.95   0.000     .3267612    .5837031
------------------------------------------------------------------------------
Note: Tests of variances against zero are conservative and are provided only
      for reference.

(Of course, the parameter estimates are for our simulated data.)

I hope this helps.

-Rich
rgates@stata.com

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
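
For anyone who wants to reproduce the workaround without the original dataset, here is a minimal do-file sketch. It is not Rich's simulation: the seed, the loadings, the AR(1) coefficient, and the helper variable -common- are all arbitrary assumptions made up for illustration, and only the names p, v, ep, t, and t2 come from the thread. It builds weekly data from 2008w25 to 2010w40, deletes week 52 to create the gaps, and then re-indexes time so that week 1 is treated as directly following week 51.

```stata
* Hypothetical simulated data; values are arbitrary and will not match the thread
clear
set seed 12345
set obs 120
generate t = tw(2008w25) + _n - 1        // 2008w25 through 2010w40
format t %tw

* Assumed data-generating process: one AR(1) common component drives the changes
generate double common = rnormal()
replace common = 0.7*common[_n-1] + rnormal() in 2/L
generate double p  = sum( 0.5*common + rnormal())
generate double v  = sum( 0.3*common + rnormal())
generate double ep = sum(-0.4*common + rnormal())
drop common

* Create the gaps: no observations in week 52 of 2008 and 2009
drop if week(dofw(t)) == 52
tsset t
tsreport                                 // reports the gaps in t

* Impose the assumption that week 1 directly follows week 51
generate t2 = _n
tsset t2
tsreport                                 // no gaps in t2 by construction

* Same model as in the thread; convergence on arbitrary data is not guaranteed
dfactor (D.(p v ep) = , noconstant) (f = , ar(1/2))
```

Note that after re-indexing, D.p between 2008w51 and 2009w1 is treated as a one-period change. That is reasonable only if nothing happens in the omitted week, which is the assumption Rich spells out above.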

**References**: **Re: st: Question on Dfactor and Gaps in Time Series** *From:* Richard Gates <rgates@stata.com>
