Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# RE: st: getting Stata to read a bizarre sequence of dates

 From Nick Cox To "'statalist@hsphsun2.harvard.edu'" Subject RE: st: getting Stata to read a bizarre sequence of dates Date Wed, 13 Jun 2012 11:32:38 +0100

```
There are no contradictions here. Many dates round to the same value of floor(dailydate/28). The rounded date does not remember what it was originally.

-tsfill- only adds observations with missing values at times that were gaps. It does not interpolate.

Nick
n.j.cox@durham.ac.uk

joales salbdralor

I want to comment on two things.

1) You mentioned that "There is a gap at 638 as the following reveals":

. format edate1 %td

. tab edate1

     edate1 |      Freq.     Percent        Cum.
------------+-----------------------------------
  23nov2008 |          2       14.29       14.29
  28dec2008 |          2       14.29       28.57
  25jan2009 |          2       14.29       42.86
  22feb2009 |          2       14.29       57.14
  29mar2009 |          2       14.29       71.43
  26apr2009 |          2       14.29       85.71
  24may2009 |          2       14.29      100.00
------------+-----------------------------------
      Total |         14      100.00

. tab edate2

     edate2 |      Freq.     Percent        Cum.
------------+-----------------------------------
        637 |          2       14.29       14.29
        639 |          2       14.29       28.57
        640 |          2       14.29       42.86
        641 |          2       14.29       57.14
        642 |          2       14.29       71.43
        643 |          2       14.29       85.71
        644 |          2       14.29      100.00
------------+-----------------------------------
      Total |         14      100.00

But there is also another gap of 35 days, from 22/02/09 to 29/03/09,
which the table above does not show. Why is this the case?
You can see the same problem in the Data Editor after typing

gen edate1 = date(dates1, "DM20Y")
gen edate2 = floor(edate1/28)
tsset id edate2
tsfill, full

2) In my opinion, using -tsfill, full- to "fix" the gap problem does
not gain me much, because even if I issue the commands

gen edate1 = date(dates1, "DM20Y")
gen edate2 = floor(edate1/28)
tsset id edate2
tsfill, full
egen nonmiss = rownonmiss(id)

and base my analysis on ..if rownonmiss(2), nothing will change. I
just arrange the data more clearly. As you mentioned,
interpolation may be the best solution.

Thanks again

On 6/13/12, Nick Cox <njcoxstata@gmail.com> wrote:
> If "joales salbdralor" is really "stef salvez" what is going on? Do
> note the firm request that Statalist posters use their real names.
>
> If you have gaps in your data, most commands can cope. There is no
> catch-all solution, as whether gaps are a problem depends on what you
> want to do. Some kind of interpolation would be necessary for some
> commands to be used.
>
> Nick
>
> On Wed, Jun 13, 2012 at 8:31 AM, joales salbdralor
>> Thanks, Nick, for your reply. You are right (as always) that Stata can
>> read these string dates and is converting them to numeric dates.
>> The only problem I have is how to avoid the "with gaps" comment.
>> Put differently, how can I correct this problem? Or is it OK
>> just to issue the previously mentioned commands, that is:
>>
>> gen edate1 = date(dates1, "DM20Y")
>> gen edate2 = floor(edate1/28)
>> tsset id edate2
>>
>>
>> Regarding your reasonable question about why I am posting mysteriously
>> similar questions: I want to experiment with different and "bizarre"
>> sequences of dates, as this is the first step toward doing correct
>> data/econometric analysis.
>>
>> Thank you very much again.
>>
>> On 6/13/12, Nick Cox <njcoxstata@gmail.com> wrote:
>>> You've posted numerous questions on this kind of data over the last
>>> few weeks, to which there have been numerous answers, so why you ask
>>> this question seems especially mysterious.
>>>
>>> First, Stata _is_ reading your string dates and it _is_ converting
>>> them to numeric dates. So the implication that Stata can't read these
>>> data is quite wrong.
>>>
>>> The only problem evident is that there is one gap in this dataset even
>>> when you restructure it as a series with spacing 28 days.
>>>
>>> So, what -tsset- reports is essentially "fair comment". There is a gap
>>> at 638 as the following reveals.
>>>
>>> . format edate1 %td
>>>
>>> . tab edate1
>>>
>>>      edate1 |      Freq.     Percent        Cum.
>>> ------------+-----------------------------------
>>>   23nov2008 |          2       14.29       14.29
>>>   28dec2008 |          2       14.29       28.57
>>>   25jan2009 |          2       14.29       42.86
>>>   22feb2009 |          2       14.29       57.14
>>>   29mar2009 |          2       14.29       71.43
>>>   26apr2009 |          2       14.29       85.71
>>>   24may2009 |          2       14.29      100.00
>>> ------------+-----------------------------------
>>>       Total |         14      100.00
>>>
>>> . tab edate2
>>>
>>>      edate2 |      Freq.     Percent        Cum.
>>> ------------+-----------------------------------
>>>         637 |          2       14.29       14.29
>>>         639 |          2       14.29       28.57
>>>         640 |          2       14.29       42.86
>>>         641 |          2       14.29       57.14
>>>         642 |          2       14.29       71.43
>>>         643 |          2       14.29       85.71
>>>         644 |          2       14.29      100.00
>>> ------------+-----------------------------------
>>>       Total |         14      100.00
>>>
>>> By the way, posting code that others can run is an excellent idea, but
>>> cut the -cd d:- which assumes that users have a d: drive, which may be
>>> quite wrong.
>>>
>>> Nick
>>>
>>> On Wed, Jun 13, 2012 at 12:37 AM, stef salvez <loggyedy@googlemail.com>
>>> wrote:
>>>
>>>> I have the following panel data set
>>>>
>>>> clear all
>>>> cd d:\
>>>> input str8 (dates1) id
>>>> "23/11/08" 1
>>>> "28/12/08" 1
>>>> "25/01/09" 1
>>>> "22/02/09" 1
>>>> "29/03/09" 1
>>>> "26/04/09" 1
>>>> "24/05/09" 1
>>>> "23/11/08" 2
>>>> "28/12/08" 2
>>>> "25/01/09" 2
>>>> "22/02/09" 2
>>>> "29/03/09" 2
>>>> "26/04/09" 2
>>>> "24/05/09" 2
>>>>
>>>> end
>>>>
>>>> The difference (in days) between successive dates is 35 28 28 35 28 28.
>>>>
>>>> The problem is that I do not know how to convert these string dates
>>>> into numeric variables, given that I have 2 jumps (35 days).
>>>> If I issue the commands
>>>>
>>>> gen edate1 = date(dates1, "DM20Y")
>>>> gen edate2 = floor(edate1/28)
>>>> tsset id edate2
>>>>
>>>> I get the following error message
>>>>
>>>>        panel variable:  id (strongly balanced)
>>>>        time variable:  edate2, 637 to 644, but with gaps
>>>>                delta:  1 unit
>>>>
>>>> Is there any code that could fix that so as to get Stata to "read"
>>>> that "bizarre" sequence of dates?
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>

```
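
The point about floor(dailydate/28) can be checked by looking at where each date falls inside its 28-day bin. The sketch below rebuilds the thread's data for one panel only; the variables `offset`, `step`, `y`, and `y_lin` are illustrative additions that are not in the original posts, and the values of `y` are made up. A 35-day step skips a bin only when the starting date sits 21 or more days into its bin (offset + 35 >= 56). The first step starts at offset 23, so bin 638 is skipped; the second starts at offset 2, so 22feb2009 and 29mar2009 land in adjacent bins 641 and 642 and -tsset- sees no gap there. The last lines show that -tsfill- only adds an observation with missing values for bin 638, and that a separate command such as -ipolate- would be needed to interpolate.

```stata
* Sketch only: one panel from the thread's example, plus a hypothetical
* variable y (not in the original post) so there is something to interpolate.
clear
input str8 dates1 id y
"23/11/08" 1 10
"28/12/08" 1 12
"25/01/09" 1 13
"22/02/09" 1 15
"29/03/09" 1 18
"26/04/09" 1 19
"24/05/09" 1 21
end

gen edate1 = date(dates1, "DM20Y")
format edate1 %td
gen edate2 = floor(edate1/28)        // index of the 28-day bin
gen offset = mod(edate1, 28)         // days into the bin (0 to 27)
gen step   = edate1 - edate1[_n-1]   // days since the previous observation
list dates1 edate1 edate2 offset step, noobs

* Both 35-day steps show up in step, but only the first one skips a bin.

tsset id edate2
tsfill                               // adds bin 638 with y missing; no interpolation
bysort id (edate2): ipolate y edate2, generate(y_lin)   // linear interpolation of y
list id edate2 y y_lin, noobs
```

Whether interpolating is sensible depends on the analysis; the sketch only illustrates the mechanics and does not recover the original daily dates, which floor() has discarded.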