st: elementary panel data management question
My first 2 questions to the Stata list:
I have a large panel dataset, with entries for each
company and for each year. In other words, I have one
variable called "company" and one variable called
"year", so that I have one observation for each
company for each year, and each observation has
several other variables.
Basically what I want to do is to identify the 5
companies that have the highest profits for each year.
I then want to create a dummy variable (call it top5)
which indicates, for each observation, whether that
company for that year is one of the 5 most profitable.
The 5 would of course tend to be different for
every year. I would end up with a variable which is 1
if that company is among the 5 most profitable for
that year, 0 otherwise. (I would then like to do this
for top 10 and top 20 as well, but I guess I can
figure that out once I have the above).
The reason that I want to create such a variable is
that I am doing panel data regressions and one of the
independent variables that I want to throw in is a
dummy like this, to indicate whether or not a given
company is among the top 5 for that year. I also want
to be able to exclude from my regression any company
that is among the top 5 in a given year.
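A minimal sketch of one way the dummy described above might be built in Stata. It assumes the profit variable is called "profit" (a name not given in this post), and profit_rank is a helper variable introduced only for illustration. Ties are broken arbitrarily here, which may or may not be what is wanted.

```stata
* Sketch only: "profit" is an assumed name for the profit variable.
* Rank companies within each year, most profitable first; the unique
* option breaks ties arbitrarily so no two companies share a rank.
egen profit_rank = rank(-profit), by(year) unique

* 1 if among the 5 most profitable that year, 0 otherwise;
* left missing where profit itself is missing.
generate byte top5 = profit_rank <= 5 if !missing(profit)

* The same rank gives the top 10 and top 20 dummies.
generate byte top10 = profit_rank <= 10 if !missing(profit)
generate byte top20 = profit_rank <= 20 if !missing(profit)
```

With such a dummy in place, top5 can enter a regression as an ordinary independent variable, and adding the qualifier "if top5 == 0" to the regression command would leave out the company-years that are in the top 5.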
A second, related question is how I can get the total
profits of the top 5 banks in every year. I guess once
I have created the dummy this shouldn't be too
difficult - probably I can get some kind of table that
sums up the profit variable for the top 5 (i.e. where
the dummy=1) by year?
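A minimal sketch of how such a by-year total might be obtained, assuming a 0/1 dummy called top5 already exists and the profit variable is called "profit" (an assumed name); top5_profit is a hypothetical name for the new variable.

```stata
* Sketch only: assumes "profit" and a 0/1 "top5" dummy already exist.
* Table of profit summed over the top 5, one row per year.
tabstat profit if top5 == 1, by(year) statistics(sum)

* Or keep the yearly top 5 total in the data as a new variable,
* repeated on every observation of that year.
egen top5_profit = total(profit * top5), by(year)
```

The tabstat line only displays the sums, while the egen line stores them, which may be more convenient if the total is needed later in a regression.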
Sorry for these simple questions but I've been trying
to figure it out without success.