The Stata listserver

st: RE: egen and computing fixed effects


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: egen and computing fixed effects
Date   Tue, 22 Jun 2004 20:42:57 +0100

-egen- calculates the mean from the 
non-missing values of its argument 
only, but it attaches that mean to 
every observation in the group, 
including observations for which 
the argument is missing. 

You can see this for yourself
with e.g. 

. sysuse auto
. egen mean = mean(rep78), by(foreign) 
. edit foreign rep78 mean 
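
A non-interactive variant of the same check, as a sketch only: it 
lists just the observations with missing rep78, so the attached 
group mean is easy to spot. The variable name mean_rep78 is an 
illustrative choice, not something from the original post. 

```stata
* Assumes the auto dataset shipped with Stata; mean_rep78 is a hypothetical name.
sysuse auto, clear
egen mean_rep78 = mean(rep78), by(foreign)

* Show only the observations where rep78 is missing:
* mean_rep78 is nevertheless filled in with the group mean.
list make foreign rep78 mean_rep78 if missing(rep78)
```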

Your d2 variables should set things 
straight again, however. Whenever 
a response is missing, 

response - mean = missing - mean = missing

irrespective of what is in mean. 
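
The same point can be verified on the auto data. This is a minimal 
sketch; mean_rep78 and d_rep78 are hypothetical names used only 
for illustration. 

```stata
* Assumes the auto dataset shipped with Stata.
sysuse auto, clear
egen mean_rep78 = mean(rep78), by(foreign)

* Deviation from the group mean; missing rep78 propagates to the result.
generate d_rep78 = rep78 - mean_rep78

* The deviation is missing exactly where rep78 is missing,
* even though mean_rep78 itself is never missing here.
assert missing(d_rep78) == missing(rep78)
count if missing(d_rep78)
```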

The explanation lies elsewhere, I guess. 
Is -student- ever missing? 
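
Two checks that might help narrow this down, offered as a sketch 
under assumptions rather than a diagnosis. First, -bysort- puts 
all observations with missing -student- into one group, whereas 
-areg- drops observations whose absorb() variable is missing. 
Second, each mean above is taken over the non-missing values of 
that one variable, while -regress- uses only observations 
complete on all model variables; restricting the means to the 
complete cases would be consistent with Tim's report that 
dropping incomplete observations makes the slopes agree. The 
names complete, _cm and d3_ are hypothetical, and the code uses 
the default line delimiter, not the semicolons in Tim's do-file. 

```stata
* Assumes Tim's variables (student, nrtrgain, charter, nschools, chgschl)
* are in memory; all new variable names below are hypothetical.

* Check 1: is the panel identifier ever missing?
count if missing(student)

* Check 2: demean within the complete-case sample only.
generate byte complete = !missing(student, nrtrgain, charter, nschools, chgschl)

foreach v of varlist nrtrgain charter nschools chgschl {
    * Mean over complete cases only; missing where complete == 0
    bysort student: egen `v'_cm = mean(`v') if complete
    generate d3_`v' = `v' - `v'_cm
}

* Slopes can then be compared with those from -areg-
regress d3_nrtrgain d3_charter d3_nschools d3_chgschl
```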

Nick 
n.j.cox@durham.ac.uk 

Tim R. Sass
> 
> I am trying to "manually" compute a fixed-effects estimator by taking the 
> differences from means of all variables and then running reg on the 
> demeaned data.  You may ask why in the world I would want to do that, but 
> that's for another post.
> 
> I have a panel of student-level data over three years.  I demean the data 
> as follows:
> 
> bysort student:egen nrtrgain_m = mean(nrtrgain);
> bysort student:egen charter_m = mean(charter);
> bysort student:egen nschools_m = mean(nschools);
> bysort student:egen chgschl_m = mean(chgschl);
> 
> gen d2_nrtrgain = nrtrgain - nrtrgain_m;
> gen d2_charter = charter - charter_m;
> gen d2_nschools = nschools - nschools_m;
> gen d2_chgschl = chgschl - chgschl_m;
> 
> I then run the following models:
> 
> areg  nrtrgain charter nschools chgschl,
>                   absorb(student) ;
> 
> reg   d2_nrtrgain d2_charter d2_nschools d2_chgschl ;
> 
> xtdata  nrtrgain charter nschools chgschl, fe clear;
> reg  nrtrgain charter nschools chgschl;
> 
> 
> The first and third models yield the same estimated coefficients (except 
> for the constant, of course), but the second model (using reg on the 
> demeaned variables) yields different coefficients.  However, 
> when I eliminate all observations with missing values for any of the 
> variables in the model, all three models yield identical estimated slope 
> coefficients.
> 
> I'm guessing the problem has something to do with how egen computes the 
> mean for each student when there are missing observations.  I have read 
> through the manual and searched the archives, but still can't figure out 
> what is going on.  Any help would be greatly appreciated.
> 
> Tim
> 
> 
> Tim R. Sass
> Professor                               Voice:   (850)644-7087
> Department of Economics         Fax:      (850)644-4535
> Florida State University                E-mail:   tsass@coss.fsu.edu
> Tallahassee, FL  32306-2180     Internet: http://garnet.acns.fsu.edu/~tsass


*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/




© Copyright 1996–2014 StataCorp LP