Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down on April 23, and its replacement, **statalist.org**, is already up and running.


**From**: Nick Cox <n.j.cox@durham.ac.uk>

**To**: "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>

**Subject**: RE: st: getting Stata to read a bizarre sequence of dates

**Date**: Wed, 13 Jun 2012 11:32:38 +0100

There are no contradictions here. Many dates round to the same value of floor(dailydate/28). The rounded date does not remember what it was originally.

-tsfill- only adds observations with missing values at times that were gaps. It does not interpolate.

Nick
n.j.cox@durham.ac.uk

joales salbdralor

Thank you for your reply on my question. I want to comment on two things.

1) You mentioned that "There is a gap at 638 as the following reveals."

    . format edate1 %td

    . tab edate1

         edate1 |      Freq.     Percent        Cum.
    ------------+-----------------------------------
      23nov2008 |          2       14.29       14.29
      28dec2008 |          2       14.29       28.57
      25jan2009 |          2       14.29       42.86
      22feb2009 |          2       14.29       57.14
      29mar2009 |          2       14.29       71.43
      26apr2009 |          2       14.29       85.71
      24may2009 |          2       14.29      100.00
    ------------+-----------------------------------
          Total |         14      100.00

    . tab edate2

         edate2 |      Freq.     Percent        Cum.
    ------------+-----------------------------------
            637 |          2       14.29       14.29
            639 |          2       14.29       28.57
            640 |          2       14.29       42.86
            641 |          2       14.29       57.14
            642 |          2       14.29       71.43
            643 |          2       14.29       85.71
            644 |          2       14.29      100.00
    ------------+-----------------------------------
          Total |         14      100.00

But there is also another gap of 35 days going from 22/02/09 to 29/03/09 which is not taken into account by the above table. Why is this the case? You can see the same problem by looking at the data editor after having typed

    gen edate1 = date(dates1, "DM20Y")
    gen edate2 = floor(edate1/28)
    tsset id edate2
    tsfill, full

2) My opinion is that by using the -tsfill, full- as a solution to "fix" the gap problem I do not gain much, because even if I issue the commands

    gen edate1 = date(dates1, "DM20Y")
    gen edate2 = floor(edate1/28)
    tsset id edate2
    tsfill, full
    egen nonmiss = rownonmiss(id)

and base my analysis on

    ..if rownonmiss(2)

nothing will change. I just sort out the data in a more clear way. As you mentioned, interpolation may be the best solution.

thanks again

On 6/13/12, Nick Cox <njcoxstata@gmail.com> wrote:
> If "joales salbdralor" is really "stef salvez" what is going on?
> Do note the firm request that Statalist posters use their real names.
>
> If you have gaps in your data, most commands can cope. There is no
> catch-all solution, as whether gaps are a problem depends on what you
> want to do. Some kind of interpolation would be necessary for some
> commands to be used.
>
> Nick
>
> On Wed, Jun 13, 2012 at 8:31 AM, joales salbdralor
> <joalessakafiora@googlemail.com> wrote:
>> thanks Nick for your reply. you are right (as always) that Stata can
>> read these string dates and it is converting them to numeric dates.
>> The only problem that I have is how to avoid having the "with gaps"
>> comment, Put differently, how can I correct this problem?. Or is it ok
>> by just issuing the previously mentioned commands, that is:
>>
>> gen edate1 = date(dates1, "DM20Y")
>> gen edate2 = floor(edate1/28)
>> tsset id edate2
>>
>> Regarding your reasonable question why I am posting mysteriously
>> similar questions is that I want to play around with different and
>> "bizarre" sequence of dates as this is the first step if I want to
>> start doing correct data/econometric analysis .
>>
>> thank you very much again
>>
>> On 6/13/12, Nick Cox <njcoxstata@gmail.com> wrote:
>>> You've posted numerous questions on this kind of data over the last
>>> few weeks, to which there have been numerous answers, so why you ask
>>> this question seems especially mysterious.
>>>
>>> First, Stata _is_ reading your string dates and it _is_ converting
>>> them to numeric dates. So the implication that Stata can't read these
>>> data is quite wrong.
>>>
>>> The only problem evident is that there is one gap in this dataset even
>>> when you restructure it as a series with spacing 28 days.
>>>
>>> So, what -tsset- reports is essentially "fair comment". There is a gap
>>> at 638 as the following reveals.
>>>
>>> . format edate1 %td
>>>
>>> . tab edate1
>>>
>>>      edate1 |      Freq.     Percent        Cum.
>>> ------------+-----------------------------------
>>>   23nov2008 |          2       14.29       14.29
>>>   28dec2008 |          2       14.29       28.57
>>>   25jan2009 |          2       14.29       42.86
>>>   22feb2009 |          2       14.29       57.14
>>>   29mar2009 |          2       14.29       71.43
>>>   26apr2009 |          2       14.29       85.71
>>>   24may2009 |          2       14.29      100.00
>>> ------------+-----------------------------------
>>>       Total |         14      100.00
>>>
>>> . tab edate2
>>>
>>>      edate2 |      Freq.     Percent        Cum.
>>> ------------+-----------------------------------
>>>         637 |          2       14.29       14.29
>>>         639 |          2       14.29       28.57
>>>         640 |          2       14.29       42.86
>>>         641 |          2       14.29       57.14
>>>         642 |          2       14.29       71.43
>>>         643 |          2       14.29       85.71
>>>         644 |          2       14.29      100.00
>>> ------------+-----------------------------------
>>>       Total |         14      100.00
>>>
>>> By the way, posting code that others can run is an excellent idea, but
>>> cut the -cd d:- which assumes that users have a d: drive, which may be
>>> quite wrong.
>>>
>>> Nick
>>>
>>> On Wed, Jun 13, 2012 at 12:37 AM, stef salvez <loggyedy@googlemail.com>
>>> wrote:
>>>
>>>> I have the following panel data set
>>>>
>>>> clear all
>>>> cd d:\
>>>> input str8 (dates1) id
>>>> "23/11/08" 1
>>>> "28/12/08" 1
>>>> "25/01/09" 1
>>>> "22/02/09" 1
>>>> "29/03/09" 1
>>>> "26/04/09" 1
>>>> "24/05/09" 1
>>>> "23/11/08" 2
>>>> "28/12/08" 2
>>>> "25/01/09" 2
>>>> "22/02/09" 2
>>>> "29/03/09" 2
>>>> "26/04/09" 2
>>>> "24/05/09" 2
>>>>
>>>> end
>>>>
>>>> the difference (in days) between successive dates is 35 28 28 35 28 28
>>>>
>>>> The problem is that I do not know how to convert these string dates
>>>> into numeric variables given the fact that I have 2 jumps (35 days)
>>>> if i issue the commands
>>>>
>>>> gen edate1 = date(dates1, "DM20Y")
>>>> gen edate2 = floor(edate1/28)
>>>> tsset id edate2
>>>>
>>>> I get the following error message
>>>>
>>>> panel variable: id (strongly balanced)
>>>> time variable: edate2, 637 to 644, but with gaps
>>>> delta: 1 unit
>>>>
>>>> Is there any code that could fix that so as to get Stata to "read"
>>>> that "bizarre"sequence of dates?
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
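The point about floor(dailydate/28) can be checked directly with the data from the original post. What follows is a minimal sketch, not code from the thread: the variable names `bin28`, `offset`, `daygap` and `binstep` are made up for illustration, and the -cd- line is left out. It suggests why only the first 35-day jump leaves a gap: a 35-day jump moves a date forward by one or by two 28-day bins, depending on how far into its bin the earlier date already sits.

```stata
* Sketch only: bin28, offset, daygap and binstep are illustrative names,
* not variables used in the thread.
clear
input str8 dates1 id
"23/11/08" 1
"28/12/08" 1
"25/01/09" 1
"22/02/09" 1
"29/03/09" 1
"26/04/09" 1
"24/05/09" 1
"23/11/08" 2
"28/12/08" 2
"25/01/09" 2
"22/02/09" 2
"29/03/09" 2
"26/04/09" 2
"24/05/09" 2
end

gen edate1 = date(dates1, "DM20Y")
format edate1 %td

* 28-day bin, and how many days into that bin each date falls
gen bin28  = floor(edate1/28)
gen offset = mod(edate1, 28)

* day-to-day and bin-to-bin differences within each panel
bysort id (edate1): gen daygap  = edate1 - edate1[_n-1]
by id:              gen binstep = bin28 - bin28[_n-1]

list dates1 edate1 bin28 offset daygap binstep if id == 1, noobs

* First 35-day jump: offset 23, and 23 + 35 = 58 = 2*28 + 2, so bin 638 is skipped
assert offset == 23 if dates1 == "23/11/08"
assert daygap == 35 & binstep == 2 if dates1 == "28/12/08"

* Second 35-day jump: offset 2, and 2 + 35 = 37 = 28 + 9, so no bin is skipped
assert offset == 2 if dates1 == "22/02/09"
assert daygap == 35 & binstep == 1 if dates1 == "29/03/09"

* -tsfill- adds one observation per panel at 638; edate1 is missing there
tsset id bin28
tsfill, full
count if bin28 == 638 & missing(edate1)
assert r(N) == 2
```

If the -assert- lines pass silently, the hand arithmetic in the comments holds. The last step illustrates the remark about -tsfill-: the new observations at 638 carry missing values and nothing is interpolated, so any interpolation would need a separate step such as -ipolate- applied to whatever variable is being analysed.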

**Follow-Ups**:

- **Re: st: getting Stata to read a bizarre sequence of dates**, *From:* stef salvez <loggyedy@googlemail.com>

**References**:

- **st: getting Stata to read a bizarre sequence of dates**, *From:* stef salvez <loggyedy@googlemail.com>
- **Re: st: getting Stata to read a bizarre sequence of dates**, *From:* Nick Cox <njcoxstata@gmail.com>
- **Re: st: getting Stata to read a bizarre sequence of dates**, *From:* joales salbdralor <joalessakafiora@googlemail.com>
- **Re: st: getting Stata to read a bizarre sequence of dates**, *From:* Nick Cox <njcoxstata@gmail.com>
- **Re: st: getting Stata to read a bizarre sequence of dates**, *From:* joales salbdralor <joalessakafiora@googlemail.com>
