The Stata listserver

st: Re: transforming panel data


From   Christopher F Baum <[email protected]>
To   [email protected]
Subject   st: Re: transforming panel data
Date   Sat, 28 Sep 2002 08:18:39 -0400

--On Saturday, September 28, 2002 2:33 -0400 Paulo wrote:

Hi Stata users,

I need help changing the format of the dataset below. Note that the
reshape command alone is not enough to do this transformation.

            Id          period      others variables
  1.         1          1996
  2.         1          1996
  3.         1          1996
  4.         1          1997
  5.         1          1997
  6.         1          1997
  7.         1          1998
  8.         1          1998
  8.         1          1999
  9.         1          1999
 10.         1          2000
 11.         1          2000
 12          1          2000
 13          1          2001
 14          1          2001
 15          2          1996
 16.         2          1996
 17.         2          1996
 18.         2          1997
 19          2          1997
 20          2          1997
 21          2          1998
 22          2          1998
 23          .          1999
 24          .          1999
 25          2          2000
 26          2          2000
 27          2          2000
 28          2          2001
 29          2          2001
 30          3          1996
 31          3          1996
 32          3          1996
 33          3          1997
 34          3          1997
 35          3          1997
 36          3          1998
 37          3          1998
 38          3          1999
 39          3          1999
 40          3          2000
 41          3          2000
 42          3          2000
 43          .          2001
 44          .          2001
 45          4          1996
 46          4          1996
 47          4          1996
 48          4          1997
 49          4          1997
 50          4          1997
 51          4          1998
 52          4          1998
 53          4          1999
 54          4          1999
 55          4          2000
 56          4          2000
 57          4          2000
 58          4          2001
 59          4          2001


***** I need the data in this format (with the year controlling the time series operator) *****

            id         period      others variables
  1.         1          1996
  2.         1          1997
  3.         1          1998
  4.         1          1999
  5.         1          2000
  6.         1          2001
  7.         1          1996
  8.         1          1997
  9.         1          1998
 10.         1          1999
 11.         1          2000
 12.         1          2001
 13.         1          1996
 14.         1          1997
 18          .          1998
 19          .          1999
 20          1          2000
 21          .          2001

There is no way of knowing why the multiple observations for 1996 in the original data should be placed in the order of the transformed data. E.g., if those observations were

            id      period      class ...
  1.         1        1996          a
  2.         1        1996          b
  3.         1        1996          c
  4.         1        1997          a
  5.         1        1997          b
  6.         1        1997          c

then the desired result would follow from 'sort id class period'. If this is the case (and it must be, since there are multiple observations per period for each id), then to 'tsset' these data and use the time-series operators, your panel variable is not merely id: it is the combination of id and class.

You must generate a new integer indicator that uniquely identifies each such combination, so that, e.g., obs. 1 and 4 in my example are marked as belonging to the same group. Starting with (imagine var5 contains your actual data of interest):


          var2      var3      var4      var5
  1.         1      1996         a         1
  2.         1      1996         b         2
  3.         1      1996         c         3
  4.         1      1997         a         4
  5.         1      1997         b         5
  6.         1      1997         c         6
  7.         2      1996         a         7
  8.         2      1996         b         8
  9.         2      1996         c         9
 10.         2      1997         a        10
 11.         2      1997         b        11
 12.         2      1997         c        12



. * encode is required to generate an integer for tsset
. encode var4, gen(class)

. drop var4

. egen tsid=group(var2 class)

. sort tsid var3

. tsset tsid var3
panel variable: tsid, 1 to 6
time variable: var3, 1996 to 1997


. list

          var2      var3      var5     class      tsid
  1.         1      1996         1         a         1
  2.         1      1997         4         a         1
  3.         1      1996         2         b         2
  4.         1      1997         5         b         2
  5.         1      1996         3         c         3
  6.         1      1997         6         c         3
  7.         2      1996         7         a         4
  8.         2      1997        10         a         4
  9.         2      1996         8         b         5
 10.         2      1997        11         b         5
 11.         2      1996         9         c         6
 12.         2      1997        12         c         6
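
For anyone who wants to rerun the steps, here is a minimal do-file sketch of the same example. The data are the twelve illustrative observations from this message, typed in with input; the variable lag_var5 is a name made up here purely to show that the time-series lag operator works once the data are tsset.

```stata
* Sketch only: rebuilds the 12-observation example from this message.
clear
input var2 var3 str1 var4 var5
1 1996 "a"  1
1 1996 "b"  2
1 1996 "c"  3
1 1997 "a"  4
1 1997 "b"  5
1 1997 "c"  6
2 1996 "a"  7
2 1996 "b"  8
2 1996 "c"  9
2 1997 "a" 10
2 1997 "b" 11
2 1997 "c" 12
end

* Turn the string class into a labelled integer
encode var4, gen(class)
drop var4

* One panel identifier per (var2, class) combination
egen tsid = group(var2 class)

* Declare the panel structure; tsset also sorts by tsid var3
tsset tsid var3

* lag_var5 is a hypothetical name: previous year's var5 within each tsid
gen lag_var5 = L.var5

list var2 var3 var5 class tsid lag_var5, sepby(tsid)
```

The lag is missing for every 1996 row, since no earlier year exists within any tsid, and for each 1997 row it picks up the 1996 value from the same tsid.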

I believe this is doing what you describe.
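
If the real data hold no variable like class, one possible workaround is to number the observations within each id and period and use that number in its place. This is only a sketch, and it rests on a strong assumption: that the order of the rows within each id and period in the file is meaningful. The names obsno, seq and x, and the six data rows, are invented for illustration.

```stata
* Sketch only: hypothetical data with no class variable.
clear
input id period x
1 1996 10
1 1996 20
1 1996 30
1 1997 11
1 1997 21
1 1998 12
end

* Record the original file order so the sort below is reproducible
gen long obsno = _n

* ASSUMPTION: row order within each id-period identifies the series
bysort id period (obsno): gen seq = _n

* One panel identifier per (id, seq) combination
egen tsid = group(id seq)

* Series with fewer observations simply have gaps in some years
tsset tsid period

list id seq period x tsid, sepby(tsid)
```

Here the third series (seq 3) has an observation only for 1996, so its later years are gaps rather than rows. If the row order carries no meaning, this numbering is arbitrary and a real identifying variable is needed instead.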

Kit





