



RE: st: re: data row transformation for irregular consecutive days


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: re: data row transformation for irregular consecutive days
Date   Tue, 23 Feb 2010 11:59:52 -0000

As Kit underlines, -tsspell- (from SSC) requires -tsset- data, but they must be correctly -tsset-!

As your data are panel data, you need to declare both identifier and time variables. This example shows that, once you have done that, -tsspell- with a condition taken from an example in its help file will identify spells defined by consecutive times.

. l

     +-----------+
     | id   time |
     |-----------|
  1. |  1      1 |
  2. |  1      2 |
  3. |  1      3 |
  4. |  1      5 |
  5. |  1      6 |
     |-----------|
  6. |  2      1 |
  7. |  2      2 |
  8. |  2      8 |
  9. |  2      9 |
     +-----------+

. tsset id time
       panel variable:  id (unbalanced)
        time variable:  time, 1 to 9, but with gaps
                delta:  1 unit

. tsspell, f(L.time == .)

. l

     +----------------------------------+
     | id   time   _spell   _seq   _end |
     |----------------------------------|
  1. |  1      1        1      1      0 |
  2. |  1      2        1      2      0 |
  3. |  1      3        1      3      1 |
  4. |  1      5        2      1      0 |
  5. |  1      6        2      2      1 |
     |----------------------------------|
  6. |  2      1        1      1      0 |
  7. |  2      2        1      2      1 |
  8. |  2      8        2      1      0 |
  9. |  2      9        2      2      1 |
     +----------------------------------+
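The same steps might be applied to data like those in the quoted message below. There, -tsset en- declares the observation number as the time variable, so D.date is also taken across the boundary between symbols. For the first 888 row, D.date is 16933 - 18298, which is negative and so not > 1; hence no new spell starts and that row joins the 3IN spell. What follows is only a sketch. It assumes string variables symbol and days as in that listing; id is a name introduced here for illustration, and the use of -egen- to carry spell start and end dates is one choice among several.

```stata
* Sketch only: assumes string variables symbol and days.
* tsspell is user-written: ssc install tsspell
gen date = date(days, "DMY")
egen id = group(symbol)          // tsset needs a numeric panel identifier
tsset id date
tsspell, f(L.date == .)          // new spell whenever the previous day is absent
egen sdate = min(date), by(id _spell)
egen ndate = max(date), by(id _spell)
collapse (first) sdate ndate, by(symbol _spell)
format sdate ndate %td
list
```

Because the data are -tsset- by panel, L.date is missing at the first observation of each symbol, so spells can never run from one symbol into the next.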

Nick 
n.j.cox@durham.ac.uk 

Kaspar Dardas wrote:

Hi Kit & Nick,

Thanks a lot. The solution almost worked. However, some _spell values
cover too many observations. As you can see, the first _spell has
three observations (1 1 1), but it should have only two (1 1). I
cannot explain why some dates are grouped into the same _spell. Most
are grouped correctly, but some are not. Did I do something wrong?
(I sorted my data by symbol and date, and used the code below.)


symbol  days        date   en  _spell  _seq  _end

3IN     04/02/2010  18297   1       1     1     0
3IN     05/02/2010  18298   2       1     2     0
888     12/05/2006  16933   3       1     3     1
888     15/05/2006  16936   4       2     1     1
888     25/09/2006  17069   5       3     1     0
888     26/09/2006  17070   6       3     2     0
888     27/09/2006  17071   7       3     3     0
888     28/09/2006  17072   8       3     4     1
888     03/10/2006  17077   9       4     1     0


gen date = date(days, "DMY")
sort symbol date
g en = _n
tsset en
tsspell date, fcond(D.date>1)
bys _spell: g sdate = date if _seq==1
bys _spell: g ndate = date if _end
collapse sdate ndate, by(symbol _spell)

Thanks,

Kaspar


2010/2/22 Kit Baum <baum@bc.edu>:
> <>
> Kaspar said
>
> Is there a fast way in Stata 11 to do this data transformation?
>
> What I have:
> symbol  days
> AAL     04-10-2004
> AAL     10-01-2005
> AAL     11-01-2005
> AAL     12-01-2005
> AAL     01-04-2005
> AAL     04-04-2005
> AAL     06-06-2005
> AAL     07-06-2005
> AAL     08-06-2005
>
> What I need:
> AAL     04-10-2004 04-10-2004
> AAL     10-01-2005 12-01-2005
> AAL     01-04-2005 01-04-2005
> AAL     04-04-2005 04-04-2005
> AAL     06-06-2005 08-06-2005
>
>
> g date = date(var2,"DMY")
> g en = _n
> tsset en
> // requires N J Cox -tsspell- from SSC (findit tsspell)
> tsspell date, fcond(D.date>1)
> bys _spell: g sdate = date if _seq==1
> bys _spell: g ndate = date if _end
> l
> collapse sdate ndate, by(var1 _spell)
> format sdate %td
> format ndate %td
> l
