Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# st: RE: Identifying first observation in each panel of unbalanced panel

From: Nick Cox <[email protected]>
To: "'[email protected]'" <[email protected]>
Subject: st: RE: Identifying first observation in each panel of unbalanced panel
Date: Mon, 4 Jun 2012 18:21:58 +0100

Your code looks fine to me, so I have difficulty understanding why you think it doesn't work.

The -sort- option on your second command is unnecessary given the previous -sort- command, but I don't see that it will change the sort order.
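As a minimal sketch of that point, using the poster's variable names (gvkey, year): the two forms below should produce the same flag. The names flag1 and flag2 are hypothetical, and the flags are created as 0/1 indicators rather than the 1/missing coding in the original attempt.

```stata
* Form 1: explicit -sort-, then a plain -by- prefix with no sort option.
sort gvkey year
by gvkey : gen byte flag1 = _n == 1

* Form 2: one line; (year) sorts within gvkey without defining the by-groups.
bysort gvkey (year) : gen byte flag2 = _n == 1

* The two flags should agree on every observation.
assert flag1 == flag2
```

Putting year in parentheses states the within-panel order in the command itself, so the result does not depend on how the data happened to be sorted beforehand.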

You can check the logic with this example:

. webuse grunfeld

. su year

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        year |       200      1944.5    5.780751       1935       1954

. drop if year == 1935 & mod(company, 2)
(5 observations deleted)

. tab year

       year |      Freq.     Percent        Cum.
------------+-----------------------------------
       1935 |          5        2.56        2.56
       1936 |         10        5.13        7.69
       1937 |         10        5.13       12.82
       1938 |         10        5.13       17.95
       1939 |         10        5.13       23.08
       1940 |         10        5.13       28.21
       1941 |         10        5.13       33.33
       1942 |         10        5.13       38.46
       1943 |         10        5.13       43.59
       1944 |         10        5.13       48.72
       1945 |         10        5.13       53.85
       1946 |         10        5.13       58.97
       1947 |         10        5.13       64.10
       1948 |         10        5.13       69.23
       1949 |         10        5.13       74.36
       1950 |         10        5.13       79.49
       1951 |         10        5.13       84.62
       1952 |         10        5.13       89.74
       1953 |         10        5.13       94.87
       1954 |         10        5.13      100.00
------------+-----------------------------------
      Total |        195      100.00

. bysort company (year) : gen first = _n == 1

. l company year  if first

     +----------------+
     | company   year |
     |----------------|
  1. |       1   1936 |
 20. |       2   1935 |
 40. |       3   1936 |
 59. |       4   1935 |
 79. |       5   1936 |
     |----------------|
 98. |       6   1935 |
118. |       7   1936 |
137. |       8   1935 |
157. |       9   1936 |
176. |      10   1935 |
     +----------------+
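The listing can also be checked programmatically rather than by eye. The following is only a sketch, assuming the grunfeld example above has just been run so that -first- exists; firstyear and nfirst are hypothetical helper variables.

```stata
* Earliest year observed for each company.
egen firstyear = min(year), by(company)

* The flag should be 1 exactly where year equals that earliest year.
assert first == (year == firstyear)

* Each company should have exactly one flagged observation.
egen nfirst = total(first), by(company)
assert nfirst == 1
```

If either -assert- fails, Stata stops with an error and reports how many observations contradict the condition.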

Nick
[email protected]

Ivan Png wrote:

I am analyzing an unbalanced panel of company data, organized by
company (gvkey) and year. I want to create a flag for the first
observation of each company in the panel. I tried

. sort gvkey year
. by gvkey , sort: gen flag = 1 if  _n == 1

However, this only set flag = 1 if a company was present in the first
year of the panel. It missed any company that first appeared in a later year.

I searched statalist and found this:
http://www.stata.com/statalist/archive/2005-04/msg00334.html

But it doesn't work.  I'd be grateful for any relevant help.
