
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



st: Problem with cross sectional regression, different results from xtreg, xi


From   huyen le <cbt_fx@yahoo.com>
To   statalist@hsphsun2.harvard.edu
Subject   st: Problem with cross sectional regression, different results from xtreg, xi
Date   Tue, 19 Jul 2011 09:56:14 +0100 (BST)

Dear Stata Friends,

I'm facing a serious problem with a cross-sectional regression. I've searched
around but haven't been able to solve it, so I'd be deeply grateful for your
help.
  
I'm running a cross-sectional regression on about 2000 funds to examine the
relationship between fund returns and age, expense, and size. For each fund I
have data on returns, age, etc. for 10 years, from 2000 to 2010.
Since I want a yearly cross-sectional regression, I regress returns across the
2000 funds while controlling for the time variable.
I have tried the following 3 ways. Sadly, I get 3 different results, and I'm
not sure which one is correct.

1st way: Separate regressions
. regress returns   age   expense   size if year==2000, robust
. regress returns   age   expense   size if year==2001,  robust
...
. regress returns   age   expense   size if year==2010, robust

Hence, I get 10 coefficients for each independent variable from 2000 to 2010.
Then I calculate their average and check its statistical significance with
ttest. For example:
coeff_age = (coeff_age2000+coeff_age2001+ ... + coeff_age2010) / 10
ttest coeff_age==0    /*check for statistical significance*/
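
The year-by-year procedure above can be sketched with statsby, which runs the
regression once per value of year and stores the coefficients as a new dataset
with one observation per year. This is only a sketch, assuming the variables
are named returns, age, expense, size and year as in the commands above; the
_b_* variable names are the ones statsby creates by default.

```stata
* Sketch only: assumes returns, age, expense, size and year are in memory
preserve

* One regression per year; keep only the coefficients (_b)
statsby _b, by(year) clear: regress returns age expense size, robust

* The data now hold one row per year with _b_age, _b_expense, _b_size, _b_cons
summarize _b_age _b_expense _b_size

* Test whether the average yearly coefficient differs from zero
ttest _b_age == 0
ttest _b_expense == 0
ttest _b_size == 0

restore
```

Because by(year) takes the years from the data, the number of coefficients
averaged equals the number of distinct years actually present.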


2nd way: Using dummy variables

. xi: regress returns   age   expense   size i.year, robust

3rd way: Using xtreg

. xtreg returns   age   expense   size, fe   i(year)   robust

Without the robust option, the 2nd and 3rd ways give the same p-values. But
because of heteroskedasticity I have to use robust standard errors, and then
the two ways give totally different p-values.
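
One thing that may be worth checking, offered as an assumption rather than a
diagnosis: according to the xtreg documentation, vce(robust) with the fe
estimator is treated as vce(cluster panelvar), so with i(year) the standard
errors would be clustered on year, whereas regress with robust is not
clustered. A minimal sketch of that comparison, using the variable names from
the commands above:

```stata
* Sketch only: compares the two ways under the same clustering assumption

* 2nd way, but with standard errors clustered on year
regress returns age expense size i.year, vce(cluster year)

* 3rd way as posted; robust here is expected to cluster on the panel variable
xtreg returns age expense size, fe i(year) vce(robust)

* 2nd way as posted, for reference: heteroskedasticity-robust, not clustered
regress returns age expense size i.year, vce(robust)
```

If the first two sets of standard errors are close to each other and both
differ from the third, the gap would come from clustering on year rather than
from the estimator. Small differences between the first two could still remain
because of different degrees-of-freedom adjustments.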

Could you please tell me which way is correct and how I can solve this
problem?
Thank you very much!


