The Stata listserver

st: Re: panel data management - dividing into quartiles

From   Crystal Lopez <[email protected]>
To   [email protected]
Subject   st: Re: panel data management - dividing into quartiles
Date   Sun, 15 May 2005 14:54:16 -0700 (PDT)

Thank you both Scott and David for your quick replies.

Regarding Scott's reply, assets is not actually my
dependent variable. I have used it in calculating
some of my variables, but only as part of a ratio;
the variable "assets" itself is not in my regression
as such. I just want to use it to split my dataset
up so that I can estimate my regression separately
for different company sizes. Would Scott's comments
still apply in this case, seeing as it's not the
dependent variable being "truncated"? And if so, can
qreg be used with panel data at all? I tried, but it
doesn't seem to work.
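
A minimal sketch of estimating the same model separately for each size group. It assumes a variable named quartile (coded 1 to 4 for each company-year) has already been created; y, x1, x2, and firm are hypothetical names, and the fixed-effects estimator is only an illustrative choice, not something stated in this thread.

```stata
* Sketch only: quartile, y, x1, x2, and firm are hypothetical names.
* Declare the panel structure (xtset needs Stata 10 or later;
* earlier releases would use: tsset firm year)
xtset firm year

* Fit the same model within each assets quartile
forvalues q = 1/4 {
    display "Assets quartile `q'"
    xtreg y x1 x2 if quartile == `q', fe
}
```

Because a company can move between quartiles from year to year, each subsample is generally an unbalanced panel.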

As for David's suggestion, I guess it would work,
but as he says, it is "clunky": I have 50 years to
deal with, and if I understand his suggestion
correctly, it would mean using xtile for every year,
and then manually entering cutoff points for a dummy
for every quartile individually, and separately for
every year, one by one. I'm sure (hoping!) there must
be an easier/more efficient way of doing this - any more

Thanks a lot!!!


Scott wrote:

I would not recommend you do this, but instead use quantile regression
(-qreg-) to examine the conditional distribution of

Partitioning the dependent variable into quartiles could produce
incorrect results.  Koenker and Hallock (2001, "Quantile Regression",
Journal of Economic Perspectives, 15:4, pages 143-156) refer to this
as "truncation on the dependent variable."  This method of truncation
is vulnerable to selection bias, since one is truncating the full
sample based on the dependent variable in the model. This can produce
both biased and inconsistent estimates.
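
For illustration, a minimal sketch of the quantile-regression approach Scott describes. The names y, x1, and x2 are hypothetical, and the three quantiles are only examples. Note that -qreg- pools all company-years and does not use the panel structure.

```stata
* Sketch only: y, x1, and x2 are hypothetical variable names.
* Each call models one conditional quantile of y on the full sample,
* rather than splitting the sample by the value of y.
qreg y x1 x2, quantile(.25)
qreg y x1 x2, quantile(.50)
qreg y x1 x2, quantile(.75)
```

The .50 call is the conditional median, which is what -qreg- estimates when no quantile() option is given.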


David wrote:

The command xtile listed in the manual under pctile
will give you the quartiles or quintiles you need.
Then a clunky way to find the highest value in a
quartile is to use the list command with if conditions
after sorting the values of the variable of interest.
Finally, construct a dummy in the usual way with a
generate command modified by if statements.

I'll bet the experts can come up with a more
economical way, but this should work.

Dave Jacobs
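
David's steps can be automated so that no cutoff points are entered by hand. The following is a minimal sketch, assuming the variables are named assets and year as in the original question and that levelsof is available (Stata 9 or later); the variable name quartile follows the original question.

```stata
* Sketch only: assumes variables named assets and year exist.
generate byte quartile = .

* Collect the distinct years, then run xtile once per year
levelsof year, local(years)
foreach y of local years {
    tempvar q
    * xtile numbers groups from the lowest values (1) to the highest (4)
    xtile `q' = assets if year == `y', nquantiles(4)
    replace quartile = `q' if year == `y'
    drop `q'
}

* Optional: flip the coding so that 1 marks the largest companies
replace quartile = 5 - quartile
```

Changing nquantiles(4) to nquantiles(5), and the final line to 6 - quartile, would give quintiles instead.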


--- Crystal Lopez <[email protected]> wrote:
> Hi,
> I have a panel data set, with data on a number of
> companies over a number of years. Each observation is
> a particular company and year, and has various data
> for that observation.
> I would like to divide my data into 4 quartiles or 5
> quintiles, on the basis of one of my variables,
> assets. So I would like to create a variable, call it
> quartile, that has the value 1 if that observation is
> in the top quartile of assets for that year, has the
> value 2 if that observation is in the second quartile
> of assets for that year, etc. (So of course a company
> might have different values of "quartile" in different
> years, depending on which quartile of assets it fits
> into in a particular year. The quartiles will probably
> contain somewhat different companies in each year.)
> The reason that I want to do this is so that I can run
> regressions separately for each quartile.
> I was thinking that I would need to use the by var:
> command, as in "by assets: ", but I'm not sure how to
> do it, although it seems like it should be simple.
> Thanks in advance for any help!
> Crystal

