The Stata listserver

st: RE: elementary panel data management question


From   "Scott Merryman" <smerryman@kc.rr.com>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: elementary panel data management question
Date   Fri, 22 Apr 2005 10:02:50 -0500

Here is one way, using the "grunfeld.dta" data set with investment as the
variable of interest:

. webuse grunfeld

. sort year invest

. gen top5 = .
(200 missing values generated)

. by year: replace top5 = 1 if _n > _N - 5 & invest < .
(100 real changes made)

. replace top5 = 0 if top5 == .
(100 real changes made)

. egen total = sum(invest) if top5 == 1, by(year)
(100 missing values generated)
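
The sort-based approach above relies on position within each year, so tied values at the cutoff are split arbitrarily, and a year with missing values of invest ends up with fewer than 5 firms flagged. A rank-based variant is sketched below as one possible alternative, not as the method used above. The names rank_inv, top5 and top10 are illustrative. With the -field- option the largest value gets rank 1 and tied values share a rank, so a tie at the cutoff can flag more than k firms. Note that grunfeld has only 10 companies per year, so top10 is 1 for every observation here.

```stata
webuse grunfeld, clear

* Rank invest within each year; -field- gives the largest value rank 1.
* Observations with missing invest get a missing rank.
bysort year: egen rank_inv = rank(invest), field

* One dummy per cutoff; left missing where invest itself is missing.
foreach k of numlist 5 10 {
    gen byte top`k' = rank_inv <= `k' if !missing(invest)
}

* Quick check of how many firms are flagged in each year.
tabulate year top5
```

Other cutoffs (for example 20, on a dataset with enough firms per year) can be added to the numlist.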


Hope this helps,
Scott

> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Crystal Lopez
> Sent: Friday, April 22, 2005 9:37 AM
> To: statalist@hsphsun2.harvard.edu
> Subject: st: elementary panel data management question
> 
> Hi,
> 
> My first 2 questions to the stata list:
> 
> I have a large panel dataset, with entries for each
> company and for each year. In other words, I have one
> variable called "company" and one variable called
> "year", so that I have one observation for each
> company for each year, and each observation has
> several other variables.
> 
> Basically what I want to do is to identify the 5
> companies that have the highest profits for each year.
> I then want to create a dummy variable (call it top5)
> which indicates, for each observation, whether that
> company for that year is one of the 5 most profitable.
> The 5 would of course tend to be different for
> every year. I would end up with a variable which is 1
> if that company is among the 5 most profitable for
> that year, 0 otherwise. (I would then like to do this
> for top 10 and top 20 as well, but I guess I can
> figure that out once I have the above).
> 
> The reason that I want to create such a variable is
> that I am doing panel data regressions and one of the
> independent variables that I want to throw in is a
> dummy like this, to indicate whether or not a given
> company is among the top 5 for that year. I also want
> to be able to exclude from my regression any company
> that is among the top 5 in a given year.
> 
> A second, related question is how I can get the total
> profits of the top 5 banks in every year. I guess once
> I have created the dummy this shouldn't be too
> difficult - probably I can get some kind of table that
> sums up the profit variable for the top 5 (i.e., where
> the dummy=1) by year??
> 
> Sorry for these simple questions but I've been trying
> to figure it out without success.
> 
> Thanks,
> Crystal
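
For the two follow-up uses mentioned in the question (a table of yearly totals for the top 5, and leaving the top 5 out of a regression), a minimal sketch on the grunfeld data might look like the following. It assumes invest stands in for the profit variable, and the regression specification (invest on mvalue and kstock with fixed effects) is only a placeholder. The if condition drops firm-years in which a company is in the top 5; dropping a company from every year if it is ever in the top 5 would need a different condition.

```stata
webuse grunfeld, clear

* Flag the 5 largest values of invest within each year.
sort year invest
by year: gen byte top5 = _n > _N - 5 & invest < .

* Yearly totals of invest among the flagged firms.
tabstat invest if top5 == 1, by(year) statistics(sum)

* Panel regression excluding firm-years flagged as top 5.
xtset company year
xtreg invest mvalue kstock if top5 == 0, fe

* The dummy can also enter as a regressor instead.
xtreg invest mvalue kstock top5, fe
```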


