Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: Nick Cox <njcoxstata@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Identifying first observation in each panel after regression
Date: Mon, 4 Jun 2012 23:31:17 +0100

Which bit don't you understand?

On Mon, Jun 4, 2012 at 11:16 PM, Ivan Png <iplpng@gmail.com> wrote:
> Dear Nick--
>
> Many thanks for your hint. I found the solution. I execute
> . by gvkey , sort: gen flag = 1 if _n == 1
> before the regression.
>
> Then, after the regression, I execute
> . gen regsample = 1 if e(sample)
>
> And, to identify the first observation of each company in the
> regression sample, I use
> regsample == 1 & flag == 1
>
> However, I still don't understand the reason it works.
>
>
> On 4 June 2012 14:24, Nick Cox <njcoxstata@gmail.com> wrote:
>> What code do you mean by "the code below"?
>>
>> I suspect there's something else up with your dataset that leads to
>> what you see. Examine the data omitted by
>>
>> . edit if !e(sample)
>>
>> after your -xtreg- command.
>>
>> Nick
>>
>> On Mon, Jun 4, 2012 at 6:44 PM, Ivan Png <iplpng@gmail.com> wrote:
>>> Many thanks, Nick. Incidentally, thanks for the yeoman service to all
>>> STATAlisters.
>>>
>>> The discrepancy I found was by using xtreg to run a fixed-effects
>>> regression on the sample. xtreg reported 2773 companies. Yet, when I
>>> used the code below on the regression sample, I got only 1048
>>> companies. So, the only reason I could think of was that the flag
>>> identified only companies that were present in year 1.
>>
>> On 4 June 2012 13:21, Nick Cox <n.j.cox@durham.ac.uk> wrote:
>>
>>>> Your code looks fine to me, so I have difficulty understanding why you think it doesn't work.
>>>>
>>>> The -sort- on the second command is unnecessary given the previous command, but I don't see that it will change the sort order.
>>>>
>>>> You can check logic in terms of this example:
>>>>
>>>> . webuse grunfeld
>>>>
>>>> . su year
>>>>
>>>>     Variable |       Obs        Mean    Std. Dev.       Min        Max
>>>> -------------+--------------------------------------------------------
>>>>         year |       200      1944.5    5.780751       1935       1954
>>>>
>>>> . drop if year == 1935 & mod(company, 2)
>>>> (5 observations deleted)
>>>>
>>>> . tab year
>>>>
>>>>        year |      Freq.     Percent        Cum.
>>>> ------------+-----------------------------------
>>>>        1935 |          5        2.56        2.56
>>>>        1936 |         10        5.13        7.69
>>>>        1937 |         10        5.13       12.82
>>>>        1938 |         10        5.13       17.95
>>>>        1939 |         10        5.13       23.08
>>>>        1940 |         10        5.13       28.21
>>>>        1941 |         10        5.13       33.33
>>>>        1942 |         10        5.13       38.46
>>>>        1943 |         10        5.13       43.59
>>>>        1944 |         10        5.13       48.72
>>>>        1945 |         10        5.13       53.85
>>>>        1946 |         10        5.13       58.97
>>>>        1947 |         10        5.13       64.10
>>>>        1948 |         10        5.13       69.23
>>>>        1949 |         10        5.13       74.36
>>>>        1950 |         10        5.13       79.49
>>>>        1951 |         10        5.13       84.62
>>>>        1952 |         10        5.13       89.74
>>>>        1953 |         10        5.13       94.87
>>>>        1954 |         10        5.13      100.00
>>>> ------------+-----------------------------------
>>>>       Total |        195      100.00
>>>>
>>>> . bysort company (year) : gen first = _n == 1
>>>>
>>>> . l company year if first
>>>>
>>>>      +----------------+
>>>>      | company   year |
>>>>      |----------------|
>>>>   1. |       1   1936 |
>>>>  20. |       2   1935 |
>>>>  40. |       3   1936 |
>>>>  59. |       4   1935 |
>>>>  79. |       5   1936 |
>>>>      |----------------|
>>>>  98. |       6   1935 |
>>>> 118. |       7   1936 |
>>>> 137. |       8   1935 |
>>>> 157. |       9   1936 |
>>>> 176. |      10   1935 |
>>>>      +----------------+
>>>>
>>>> Nick
>>>> n.j.cox@durham.ac.uk
>>>>
>>>> Ivan Png
>>>>
>>>> I am analyzing an unbalanced panel of company data, organized by
>>>> company (gvkey) and year. I want to create a flag to the first
>>>> observation of each company in the panel. I tried
>>>>
>>>> . sort gvkey year
>>>> . by gvkey , sort: gen flag = 1 if _n == 1
>>>>
>>>> However, this only flagged flag = 1 if a company was present in year 1
>>>> of the panel. It missed any company that appeared in later years.
>>>>
>>>> I searched statalist and found this:
>>>> http://www.stata.com/statalist/archive/2005-04/msg00334.html
>>>>
>>>> But it doesn't work. I'd be grateful for any relevant help.
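
One possible reason the counts in this thread disagree: a flag built on the full dataset marks each company's first observation overall, and that observation may be dropped from the estimation sample (for example, because of missing values), so the company then has no flagged observation inside the sample. A minimal sketch of flagging the first observation within the estimation sample itself, using the Grunfeld data from Nick's example rather than the poster's data. The variable names regsample and firstreg are made up for illustration, and the regression specification is only an assumption chosen to have something to estimate.

```stata
* Sketch only: Grunfeld example data, not the poster's dataset
webuse grunfeld, clear
drop if year == 1935 & mod(company, 2)

* any fixed-effects regression will do; this specification is illustrative
xtset company year
xtreg invest mvalue kstock, fe

* 1 if the observation was used in the regression, 0 otherwise
gen byte regsample = e(sample)

* within each company, sort by year separately for used and unused observations,
* then flag the earliest used observation
bysort company regsample (year) : gen byte firstreg = regsample & _n == 1

* the count should match the number of groups that -xtreg- reports
count if firstreg
list company year if firstreg
```

Because the flag is created after the regression and within the e(sample) observations, every company that contributes to the estimates gets exactly one flagged observation, whichever year it first appears in the sample.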