Since I have not received a reply to this, I am posting it again in case
anyone can help. Thanks!
I have two questions about the use of weights in regression.
First, a question about using aweights in regression. As I
understand from this: http://www.stata.com/support/faqs/stat/crc36.html
the (slope) coefficients and standard errors from a regression using
aweights (=n) should be the same as those from a regression on variables
transformed by multiplying by sqrt(n). However, the results I get are
not the same. Maybe I have misunderstood the above page. If so, how and
why? I have attached the code and results at the end of this mail.
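For reference, the equivalence in question can be sketched numerically. The block below is only an illustration in Python (not Stata), using made-up cell data and two hypothetical helpers, solve and wls. It compares (a) a weighted fit with weights rescaled to sum to the number of observations, as Stata does with aweights, (b) an unweighted fit on variables multiplied by sqrt(n) in which the constant is also multiplied by sqrt(n), and (c) the same transformed fit with an ordinary constant. One possible source of a mismatch, though not necessarily the one here, is running (c) instead of (b).

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    k = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(m[r][c]))  # partial pivoting
        m[c], m[p] = m[p], m[c]
        for r in range(k):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [u - f * v for u, v in zip(m[r], m[c])]
    return [m[i][k] / m[i][i] for i in range(k)]

def wls(X, y, w):
    """Weighted least squares: returns (coefficients, standard errors)."""
    n, k = len(y), len(X[0])
    xtwx = [[sum(w[i] * X[i][p] * X[i][q] for i in range(n)) for q in range(k)]
            for p in range(k)]
    xtwy = [sum(w[i] * X[i][p] * y[i] for i in range(n)) for p in range(k)]
    beta = solve(xtwx, xtwy)
    resid = [y[i] - sum(X[i][p] * beta[p] for p in range(k)) for i in range(n)]
    s2 = sum(w[i] * resid[i] ** 2 for i in range(n)) / (n - k)
    se = []
    for p in range(k):
        unit = [1.0 if q == p else 0.0 for q in range(k)]
        # p-th diagonal element of inverse(X'WX), times the residual variance
        se.append(math.sqrt(s2 * solve(xtwx, unit)[p]))
    return beta, se

# Made-up cell-level data (hypothetical, not the data from this post).
y = [1.2, 2.9, 3.1, 4.8, 5.1, 7.3]
x1 = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
x2 = [1.0, 0.0, 1.0, 0.0, 1.0, 1.0]
cells = [10, 25, 5, 40, 15, 30]          # n: observations behind each mean
N = len(y)
ones = [1.0] * N

# (a) aweight-style fit: weights proportional to n, rescaled to sum to N.
w = [c * N / sum(cells) for c in cells]
X = [[1.0, x1[i], x2[i]] for i in range(N)]
b_aw, se_aw = wls(X, y, w)

# (b) unweighted fit on sqrt(n)-scaled variables; the constant is scaled too.
s = [math.sqrt(c) for c in cells]
ys = [s[i] * y[i] for i in range(N)]
Xs = [[s[i], s[i] * x1[i], s[i] * x2[i]] for i in range(N)]
b_tr, se_tr = wls(Xs, ys, ones)

# (c) same transformed variables, but with an ordinary (unscaled) constant.
Xbad = [[1.0, s[i] * x1[i], s[i] * x2[i]] for i in range(N)]
b_bad, se_bad = wls(Xbad, ys, ones)

print("aweight-style:      ", b_aw, se_aw)
print("scaled constant:    ", b_tr, se_tr)
print("unscaled constant:  ", b_bad, se_bad)
```

In Stata terms, (b) would correspond to generating sqrt(n) as its own variable, including it as a regressor, and specifying the noconstant option. In (b) the coefficient on that variable plays the role of _cons.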
My second question is as follows: I am working with US census data,
pooling data from 1960 to 2000. Because the data set is huge, and
following other researchers who have worked with it, I am going to run
certain regressions on group-mean data, with aweight=cellsize (the
number of original observations each mean is averaged from).
First, I should expect a loss of efficiency; am I correct?
Second, the individual observations carry a person weight due to the
survey design, especially after 1990. One suggestion is to use this
person weight (as pweight) to calculate the cell means and then use
aweight=cellsize in the regression on the cell means, where cellsize is
the number of observations the means are derived from, ignoring the
person weight.
I would like to ask whether this is a good approach, and whether there
is a better way to deal with this situation. For example, should we take
the person weight into account when constructing the weight for the
regression stage?
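To make the suggested procedure concrete, here is a minimal Python sketch of the collapse step on made-up person records. The names people, mean_y, cellsize and sum_perwt are hypothetical. Each cell gets a mean weighted by the person weight, plus both the raw observation count (the proposed aweight) and the sum of person weights, so that either could be tried as the regression weight. It only illustrates the bookkeeping and does not say which weight is preferable.

```python
from collections import defaultdict

# Hypothetical person-level records: (year, cell id, person weight, y).
people = [
    (1990, "a", 1.0, 10.0),
    (1990, "a", 3.0, 20.0),
    (1990, "b", 2.0, 30.0),
    (2000, "a", 1.5, 12.0),
    (2000, "a", 0.5, 16.0),
    (2000, "a", 2.0, 14.0),
]

# Accumulate, per (year, cell): sum of w*y, sum of w, and raw count.
acc = defaultdict(lambda: [0.0, 0.0, 0])
for year, cell, w, y in people:
    a = acc[(year, cell)]
    a[0] += w * y
    a[1] += w
    a[2] += 1

# One row per cell: person-weighted mean and two candidate regression weights.
cells = {
    key: {"mean_y": swy / sw, "cellsize": count, "sum_perwt": sw}
    for key, (swy, sw, count) in acc.items()
}

for key in sorted(cells):
    print(key, cells[key])
```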
One more question that I did not ask last time: will it generally be
more efficient if I collapse into a larger number of cells to run the
regression?
Thank you very much for your assistance and opinion!
Regards,
Tak Wai
The code I used is:
. reg y x1 x2 [aw=celsize]
(sum of wgt is 3.0000e+02)
Number of obs = 20
(something omitted...)
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   2.130015   .5890904     3.62   0.002     .8871432    3.372887
          x2 |   1.364704   .7330079     1.86   0.080     -.181808    2.911215
       _cons |   1.127888   .1742532     6.47   0.000     .7602464    1.495531
------------------------------------------------------------------------------