Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Nahla Betelmal <nahlaib@gmail.com> |
To | "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu> |
Subject | Re: st: inconsistent results for two-dimensions fixed effects regressions using xtreg reg areg ivreg2 |
Date | Thu, 15 Aug 2013 10:36:24 +0100 |
Hi Mike,

Thanks for the reply and help again.

Yes, I understand that at the industry level I have many firm observations, so I didn't set the panel at industry. I tried to get around that by creating a group variable for industry-year, so that each firm-year observation belongs to only one industry-year identifier. I got the idea from this thread:
http://www.talkstats.com/showthread.php/26900-how-can-I-include-two-fixed-effect-in-one-model
I modified the variable to be industry-year instead (because, as you said, a firm cannot belong to different industries).

So, to use the xt commands, I set the panel at the group variable industry-year only and then included the year dummy in the regression (I hope this does not contradict sound reasoning):

egen industry_year= group(industry year)
xtset industry_year
xtreg IV DV, fe

and the results were the same as using

reg IV DV i.year i.industry
areg IV DV i.year , absorb (industry)

Regarding your kind comment about fixed effects and clustering at the same level (say the firm level, as is widely done): in the finance field, papers seem to include both year and firm dummies as well as clustering by firm. The most common wordings are "reported t-statistics adjusted for heteroskedasticity (White, 1980) and firm-level clustering" and "The t-values are computed using Roger’s robust standard errors correcting for firm clusters", with dummies for years and dummies for firms already included in the regression. For example, see table 4 in this paper:
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1030359

Also, I found this thread, which hints that fixed effects and clustering at the panel variable are widely reported together:
http://www.stata.com/statalist/archive/2006-09/msg00782.html

However, I totally got your point that the fixed effects will control for correlation of error terms within clusters. I wonder why a large number of finance papers published in well-known journals use firm clustering as well!
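
One way to look at the clustering question is to fit the same fixed-effects model twice and compare the standard errors. The sketch below is not from the thread. It simulates a hypothetical firm-year panel (200 firms, 10 years, AR(1) shocks within firm; every number here is an assumption made for illustration) and uses only built-in commands.

```stata
* Illustrative only: simulated data, not the data discussed in this thread.
clear
set seed 2013
set obs 200
gen firm = _n
* firm-specific intercept (the "fixed effect")
gen alpha = rnormal()
* 10 yearly observations per firm
expand 10
bysort firm: gen year = 2002 + _n

* regressor and error both follow an AR(1) within firm, so some
* within-firm correlation remains after the firm effect is removed
gen IV = rnormal()
gen e  = rnormal()
bysort firm (year): replace IV = 0.7 * IV[_n-1] + rnormal() if _n > 1
bysort firm (year): replace e  = 0.7 * e[_n-1]  + rnormal() if _n > 1
gen DV = 5 + 0.4 * IV + alpha + e

* firm and year fixed effects, heteroskedasticity-robust standard errors
areg DV IV i.year, absorb(firm) vce(robust)

* same fixed effects, standard errors clustered by firm
xtset firm year
xtreg DV IV i.year, fe vce(cluster firm)
```

Both fits should report the same coefficient on IV, and only the standard errors change. How far apart they are in real data is an empirical matter, so this comparison is a check to run rather than a rule.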
Thanks a million, and I am sure that other users appreciate your previous detailed explanation and help.

many thanks

Nahla

On 14 August 2013 19:57, Michael Barker <mdb96statalist@gmail.com> wrote:
> Hi Nahla,
>
> You are having trouble with the xtcommands, because you're not really
> doing a panel-data analysis. Panel data implies one observation per
> unit per year. You are analyzing this data using industry and year, so
> you have many observations (firms) per unit (industry) per year. That
> is why you got the error about repeated time values within panel. Your
> data may actually be panel data, at the firm-year level, but you are
> analyzing it as clustered data, not panel data.
>
> You said that you were using two-dimension fixed-effects, so I would
> keep industry and year as separate groups of dummy variables, rather
> than creating a single interaction. The results may come out the same,
> I'm not sure about that, but I think it is easier conceptually.
>
> Lastly, if you are including fixed effects at the industry-level, you
> don't have to compute clustered standard errors at the same level. You
> can just use the typical robust standard error estimator. The cluster
> fixed effects will control for correlation of error-terms within
> clusters.
>
> So I think you should use one of these two commands:
> reg IV DV i.year i.industry, robust
> areg IV DV i.year , absorb (industry) robust
>
> About the ivreg2 command, it is used for instrumental variables. I
> think your "IV" stands for independent variable, not instrumental
> variable, so it is not relevant to your topic. ivreg2 will not help
> you with a fixed-effects analysis.
>
> Mike
>
>
>
> On Wed, Aug 14, 2013 at 10:51 AM, Nahla Betelmal <nahlaib@gmail.com> wrote:
>> Thank you so much Mike, your detailed comments are great help. I do
>> appreciate it.
>>
>> As I am looking for industry year fixed effects rather than firm year,
>> I tried to set the panel accordingly, but did not work due to repeated
>> time values within panel.
>>
>> So, this time I grouped based on industry-year (thanks to ur note
>> about no repeated firms in different industries). I hope this time I
>> did it in the right way. Kindly let me know please. I got identical
>> coefficients for IV.
>>
>> Also, could you please explain more your comment about ivreg2 or give
>> and an example how to execute it right to get fixed effects please.
>>
>> the command are :
>> 1) egen industry_year= group(industry year) then xtset industry_year
>> then xtreg IV DV, fe vce (cluster industry)
>> 2) xi: reg IV DV i.year i.industry
>> 3)areg IV DV , absorb ( industry_year ) cluster (industry)
>>
>> In the first command , I could not put i.year as it is omitted because
>> of collinearity.
>> In the second, I could not apply cluster (industry) option as F-test
>> became missing.
>> The third command gave almost identical results to the previous two
>> with and without the cluster option. However, it gave slightly
>> different R-Square 0.645 than that of regress 0.621. Is this OK or
>> they should be identical.
>>
>>
>>
>> egen industry_year= group(industry year)
>> xtset industry_year
>> xtreg DV IV, fe vce (cluster industry )
>>
>> Fixed-effects (within) regression               Number of obs      =     23830
>> Group variable: industry_year                   Number of groups   =      1179
>>
>> R-sq:  within  = 0.5516                         Obs per group: min =         1
>>        between = 0.5262                                        avg =      20.2
>>        overall = 0.4955                                        max =       155
>>
>>                                                 F(1,57)            =   2233.13
>> corr(u_i, Xb)  = -0.1260                        Prob > F           =    0.0000
>>
>>                               (Std. Err. adjusted for 58 clusters in industry)
>> ------------------------------------------------------------------------------
>>              |               Robust
>>           DV |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
>> -------------+----------------------------------------------------------------
>>           IV |   .4393407    .009297    47.26   0.000     .4207237    .4579577
>>        _cons |   5.675498   .0673395    84.28   0.000     5.540653    5.810343
>> -------------+----------------------------------------------------------------
>>      sigma_u |  .40739078
>>      sigma_e |  .58512671
>>          rho |  .32648834   (fraction of variance due to u_i)
>> ------------------------------------------------------------------------------
>>
>>
>>
>> Also I tried
>> xi: reg DV IV i.year i.industry
>>
>> without a cluster(industry) as F-test became missing
>>
>> IV= .4397811 and SE= .0026298
>> If I run xtreg without the cluster option, I get the same SE= .0026322
>>
>> the output is too long
>>
>> In addition
>> areg DV IV, absorb ( industry_year ) cluster (industry)
>>
>> Linear regression, absorbing indicators         Number of obs   =     23830
>>                                                 F(   1,    57)  =   2122.73
>>                                                 Prob > F        =    0.0000
>>                                                 R-squared       =    0.6458
>>                                                 Adj R-squared   =    0.6274
>>                                                 Root MSE        =    0.5851
>>
>>                               (Std. Err. adjusted for 58 clusters in industry)
>> ------------------------------------------------------------------------------
>>              |               Robust
>>           DV |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
>> -------------+----------------------------------------------------------------
>>           IV |   .4393407   .0095357    46.07   0.000     .4202457    .4584357
>>        _cons |   5.675498   .0690685    82.17   0.000     5.537191    5.813805
>> -------------+----------------------------------------------------------------
>> industry_year |   absorbed                                   (1179 categories)
>>
>>
>> Many thanks again
>>
>> Nahla
>>
>>
>> On 14 August 2013 14:29, Michael Barker <mdb96statalist@gmail.com> wrote:
>>> Hi Nahla,
>>>
>>> You are actually running several different models there. I'll describe
>>> each one below, so you can see how they differ:
>>>
>>>> 1) xi: reg DV IV i.year, vce (cluster industry)
>>> - Year fixed effects only.
>>> - Include one dummy variable for each year:
>>>
>>>> 2) xtset firm year then xtreg DV IV i.year, fe vce (cluster industry)
>>> - Year and firm fixed effects
>>> - Equivalent to including one dummy for each year and one dummy for each firm.
>>> - xtreg includes fixed effects for the panel variable, firm and you
>>> include year dummies manually
>>>
>>>> 3) egen industry_firm= group (industry firm) then xtset industry_firm year then xtreg DV IV i.year, fe vce (cluster industry)
>>> - year and industry-firm level fixed effects
>>> - equivalent to including one dummy for each year and one dummy for
>>> each industry-firm combination
>>> - apparently no firm is in multiple industries, so this regression is
>>> equivalent to regression 2.
>>>
>>>> 4) tsset industry_firm year then ivreg2 DV IV,cluster ( industry_firm year)
>>> - No fixed effects
>>> - You didn't specify the endogenous / IV variables, so this is just a
>>> regular regression with clustered standard errors
>>> - This is equivalent to "reg DV IV,cluster ( industry_firm year)"
>>>
>>>> 5) areg DV IV, absorb ( year ) cluster (industry)
>>> - Year fixed effects only
>>> - Equivalent to regression 1, without reporting year coefficients
>>> - Notice that the coefficient and standard error estimates are the
>>> same as the first regression.
>>>>
>>>
>>> If you want firm and year fixed effects, I would use regression 2. If
>>> you want to see equivalent results with alternative regressions, try
>>> these:
>>> xi: reg DV IV i.year i.firm, vce (cluster industry)
>>> areg DV IV i.year, absorb (firm) cluster (industry)
>>>
>>> The first suggestion might not run, since you will have to include
>>> many dummy variables for all of your firms. You may exceed the maximum
>>> number of variables allowed, depending on your version of Stata.
>>>
>>> Mike
>>>
>>>
>>>
>>>
>>> On Wed, Aug 14, 2013 at 8:22 AM, Nahla Betelmal <nahlaib@gmail.com> wrote:
>>>> Hi Statalist,
>>>>
>>>> I have a panel data of firms and years, however, I would like to
>>>> perform industry and year fixed effect regression. using different
>>>> approaches, I got different IV coefficient and standard error,
>>>> although it should be identical if I am doing it right. I would highly
>>>> appreciate it if someone kindly explain what I am doing wrong and what
>>>> is the right way to get industry and year fixed effects.
>>>>
>>>> the commands I used are:
>>>>
>>>> 1) xi: reg DV IV i.year, vce (cluster industry)
>>>>
>>>> 2) xtset firm year then xtreg DV IV i.year, fe vce (cluster industry)
>>>>
>>>> 3) egen industry_firm= group (industry firm) then xtset industry_firm
>>>> year then xtreg DV IV i.year, fe vce (cluster industry)
>>>>
>>>> 4) tsset industry_firm year then ivreg2 DV IV,cluster ( industry_firm year)
>>>>
>>>> 5) areg DV IV, absorb ( year ) cluster (industry)
>>>>
>>>>
>>>> under reg command: IV = 0.386 with SE= 0.022
>>>> under xtreg command with firm year panel set: IV = .418 with SE= .0241
>>>> under xtreg command with industry-firm year panel set: IV = .418 with SE= .024
>>>> under ivreg2 command: IV = .410 with SE= .007
>>>> under areg command: IV = 0.386 with SE= 0.022
>>>>
>>>>
>>>> . xi: reg DV IV i.year, vce (cluster industry)
>>>> i.year            _Iyear_1992-2012    (naturally coded; _Iyear_1992 omitted)
>>>>
>>>> Linear regression                                Number of obs   =     23830
>>>>                                                  F(  21,    57)  =    768.66
>>>>                                                  Prob > F        =    0.0000
>>>>                                                  R-squared       =    0.5461
>>>>                                                  Root MSE        =     .6461
>>>>
>>>>                                (Std. Err. adjusted for 58 clusters in industry)
>>>> -------------------------------------------------------------------------------
>>>>               |               Robust
>>>>            DV |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
>>>> --------------+----------------------------------------------------------------
>>>>            IV |   .3869693   .0225831    17.14   0.000     .3417475    .4321911
>>>>   _Iyear_1993 |    .150389   .0239546     6.28   0.000     .1024208    .1983573
>>>>   _Iyear_1994 |   .2857099   .0271864    10.51   0.000     .2312702    .3401496
>>>>   _Iyear_1995 |   .2927993   .0307951     9.51   0.000     .2311331    .3544654
>>>>   _Iyear_1996 |   .4353512   .0304859    14.28   0.000     .3743044    .4963981
>>>>   _Iyear_1997 |   .5286896   .0292151    18.10   0.000     .4701874    .5871917
>>>>   _Iyear_1998 |   .5852497   .0337522    17.34   0.000     .5176621    .6528374
>>>>   _Iyear_1999 |   .6969439   .0523892    13.30   0.000     .5920364    .8018514
>>>>   _Iyear_2000 |   .8019949   .0666928    12.03   0.000     .6684448    .9355449
>>>>   _Iyear_2001 |   .7710818   .0486744    15.84   0.000      .673613    .8685507
>>>>   _Iyear_2002 |   .6978223   .0325914    21.41   0.000     .6325592    .7630854
>>>>   _Iyear_2003 |   .6427671   .0347611    18.49   0.000     .5731593     .712375
>>>>   _Iyear_2004 |   .7757021   .0394535    19.66   0.000     .6966978    .8547064
>>>>   _Iyear_2005 |   .7806429   .0418054    18.67   0.000     .6969291    .8643566
>>>>   _Iyear_2006 |   .7746051   .0462916    16.73   0.000     .6819076    .8673025
>>>>   _Iyear_2007 |   .7758041   .0484202    16.02   0.000     .6788444    .8727639
>>>>   _Iyear_2008 |   .7734638   .0508533    15.21   0.000     .6716317    .8752958
>>>>   _Iyear_2009 |   .7319797   .0564072    12.98   0.000     .6190263    .8449332
>>>>   _Iyear_2010 |   .8741285   .0506573    17.26   0.000      .772689     .975568
>>>>   _Iyear_2011 |   .8889354   .0532101    16.71   0.000      .782384    .9954869
>>>>   _Iyear_2012 |   .8979328   .0565989    15.86   0.000     .7845956     1.01127
>>>>         _cons |   5.403047   .1238831    43.61   0.000     5.154975    5.651118
>>>> -------------------------------------------------------------------------------
>>>>
>>>>
>>>>
>>>>
>>>> xtset firm year
>>>>        panel variable:  firm (unbalanced)
>>>>         time variable:  year, 1992 to 2012, but with gaps
>>>>                 delta:  1 unit
>>>>
>>>> . xtreg DV IV i.year, fe vce (cluster industry)
>>>>
>>>> Fixed-effects (within) regression               Number of obs      =     23830
>>>> Group variable: firm                            Number of groups   =      2312
>>>>
>>>> R-sq:  within  = 0.4113                         Obs per group: min =         1
>>>>        between = 0.5998                                        avg =      10.3
>>>>        overall = 0.5456                                        max =        21
>>>>
>>>>                                                 F(21,57)           =    463.93
>>>> corr(u_i, Xb)  = -0.0970                        Prob > F           =    0.0000
>>>>
>>>>                               (Std. Err. adjusted for 58 clusters in industry)
>>>> ------------------------------------------------------------------------------
>>>>              |               Robust
>>>>           DV |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
>>>> -------------+----------------------------------------------------------------
>>>>           IV |   .4183645   .0241281    17.34   0.000     .3700488    .4666802
>>>>              |
>>>>         year |
>>>>         1993 |   .1560772   .0200202     7.80   0.000     .1159874     .196167
>>>>         1994 |   .2929982   .0224807    13.03   0.000     .2479813    .3380151
>>>>         1995 |   .3019359   .0268163    11.26   0.000     .2482373    .3556345
>>>>         1996 |   .4272691   .0264501    16.15   0.000     .3743038    .4802344
>>>>         1997 |   .5209287   .0266063    19.58   0.000     .4676506    .5742069
>>>>         1998 |   .5877827   .0276877    21.23   0.000     .5323391    .6432264
>>>>         1999 |   .6989115   .0427304    16.36   0.000     .6133453    .7844777
>>>>         2000 |   .7988406   .0477286    16.74   0.000     .7032657    .8944154
>>>>         2001 |   .7589164   .0375573    20.21   0.000     .6837091    .8341236
>>>>         2002 |    .687617    .034973    19.66   0.000     .6175848    .7576492
>>>>         2003 |   .6310008   .0488884    12.91   0.000     .5331035    .7288982
>>>>         2004 |   .7611996   .0507837    14.99   0.000      .659507    .8628921
>>>>         2005 |   .7687923   .0552525    13.91   0.000     .6581511    .8794336
>>>>         2006 |   .7524079   .0609127    12.35   0.000     .6304324    .8743834
>>>>         2007 |   .7519399   .0642041    11.71   0.000     .6233734    .8805064
>>>>         2008 |    .750493   .0684401    10.97   0.000     .6134441     .887542
>>>>         2009 |   .7118027    .067056    10.62   0.000     .5775254    .8460799
>>>>         2010 |   .8504969   .0632919    13.44   0.000     .7237569    .9772368
>>>>         2011 |   .8674839   .0664437    13.06   0.000     .7344328    1.000535
>>>>         2012 |    .863437   .0733127    11.78   0.000     .7166308    1.010243
>>>>              |
>>>>        _cons |    5.18669    .152373    34.04   0.000     4.881568    5.491812
>>>> -------------+----------------------------------------------------------------
>>>>      sigma_u |   .4935113
>>>>      sigma_e |  .47151369
>>>>          rho |  .52278302   (fraction of variance due to u_i)
>>>> ------------------------------------------------------------------------------
>>>>
>>>>
>>>> . egen industry_firm= group (industry firm)
>>>>
>>>> . xtset industry_firm year
>>>>        panel variable:  industry_firm (unbalanced)
>>>>         time variable:  year, 1992 to 2012, but with gaps
>>>>                 delta:  1 unit
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> . xtreg DV IV i.year, fe vce (cluster industry)
>>>>
>>>> Fixed-effects (within) regression               Number of obs      =     23830
>>>> Group variable: industry_firm                   Number of groups   =      2312
>>>>
>>>> R-sq:  within  = 0.4113                         Obs per group: min =         1
>>>>        between = 0.5998                                        avg =      10.3
>>>>        overall = 0.5456                                        max =        21
>>>>
>>>>                                                 F(21,57)           =    463.93
>>>> corr(u_i, Xb)  = -0.0970                        Prob > F           =    0.0000
>>>>
>>>>                               (Std. Err. adjusted for 58 clusters in industry)
>>>> ------------------------------------------------------------------------------
>>>>              |               Robust
>>>>           DV |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
>>>> -------------+----------------------------------------------------------------
>>>>           IV |   .4183645   .0241281    17.34   0.000     .3700488    .4666802
>>>>              |
>>>>         year |
>>>>         1993 |   .1560772   .0200202     7.80   0.000     .1159874     .196167
>>>>         1994 |   .2929982   .0224807    13.03   0.000     .2479813    .3380151
>>>>         1995 |   .3019359   .0268163    11.26   0.000     .2482373    .3556345
>>>>         1996 |   .4272691   .0264501    16.15   0.000     .3743038    .4802344
>>>>         1997 |   .5209287   .0266063    19.58   0.000     .4676506    .5742069
>>>>         1998 |   .5877827   .0276877    21.23   0.000     .5323391    .6432264
>>>>         1999 |   .6989115   .0427304    16.36   0.000     .6133453    .7844777
>>>>         2000 |   .7988406   .0477286    16.74   0.000     .7032657    .8944154
>>>>         2001 |   .7589164   .0375573    20.21   0.000     .6837091    .8341236
>>>>         2002 |    .687617    .034973    19.66   0.000     .6175848    .7576492
>>>>         2003 |   .6310008   .0488884    12.91   0.000     .5331035    .7288982
>>>>         2004 |   .7611996   .0507837    14.99   0.000      .659507    .8628921
>>>>         2005 |   .7687923   .0552525    13.91   0.000     .6581511    .8794336
>>>>         2006 |   .7524079   .0609127    12.35   0.000     .6304324    .8743834
>>>>         2007 |   .7519399   .0642041    11.71   0.000     .6233734    .8805064
>>>>         2008 |    .750493   .0684401    10.97   0.000     .6134441     .887542
>>>>         2009 |   .7118027    .067056    10.62   0.000     .5775254    .8460799
>>>>         2010 |   .8504969   .0632919    13.44   0.000     .7237569    .9772368
>>>>         2011 |   .8674839   .0664437    13.06   0.000     .7344328    1.000535
>>>>         2012 |    .863437   .0733127    11.78   0.000     .7166308    1.010243
>>>>              |
>>>>        _cons |    5.18669    .152373    34.04   0.000     4.881568    5.491812
>>>> -------------+----------------------------------------------------------------
>>>>      sigma_u |   .4935113
>>>>      sigma_e |  .47151369
>>>>          rho |  .52278302   (fraction of variance due to u_i)
>>>> ------------------------------------------------------------------------------
>>>>
>>>>
>>>>
>>>> ivreg2 DV IV,cluster ( industry_firm year)
>>>>
>>>> OLS estimation
>>>> --------------
>>>>
>>>> Estimates efficient for homoskedasticity only
>>>> Statistics robust to heteroskedasticity and clustering on
>>>> industry_firm and fyear2
>>>>
>>>> Number of clusters (industry_firm) = 2312         Number of obs =    23830
>>>> Number of clusters (fyear2) = 21                  F(  1,    20) =  2849.29
>>>>                                                   Prob > F      =   0.0000
>>>> Total (centered) SS     =  21896.66904            Centered R2   =   0.4955
>>>> Total (uncentered) SS   =  1891568.745            Uncentered R2 =   0.9942
>>>> Residual SS             =   11046.6797            Root MSE      =    .6809
>>>>
>>>> ------------------------------------------------------------------------------
>>>>              |               Robust
>>>>           DV |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
>>>> -------------+----------------------------------------------------------------
>>>>           IV |    .410624   .0075071    54.70   0.000     .3959104    .4253377
>>>>        _cons |   5.883496   .0562149   104.66   0.000     5.773317    5.993675
>>>> ------------------------------------------------------------------------------
>>>> Included instruments: IV
>>>>
>>>>
>>>>
>>>>
>>>> areg DV IV, absorb ( year ) cluster (industry)
>>>>
>>>> Linear regression, absorbing indicators         Number of obs   =     23830
>>>>                                                 F(   1,    57)  =    293.62
>>>>                                                 Prob > F        =    0.0000
>>>>                                                 R-squared       =    0.5461
>>>>                                                 Adj R-squared   =    0.5457
>>>>                                                 Root MSE        =    0.6461
>>>>
>>>>                               (Std. Err. adjusted for 58 clusters in twodigit)
>>>> ------------------------------------------------------------------------------
>>>>              |               Robust
>>>>           DV |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
>>>> -------------+----------------------------------------------------------------
>>>>           IV |   .3869693   .0225831    17.14   0.000     .3417475    .4321911
>>>>        _cons |    6.05483   .1337655    45.26   0.000     5.786969    6.322691
>>>> -------------+----------------------------------------------------------------
>>>>         year |   absorbed                                     (21 categories)
>>>>
>>>>
>>>>
>>>> Many thanks in advance,
>>>>
>>>> Nahla Betelmal
>>>> *
>>>> * For searches and help try:
>>>> * http://www.stata.com/help.cgi?search
>>>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>>>> * http://www.ats.ucla.edu/stat/stata/
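
A note on the R-squared question raised in the thread (0.645 from areg versus 0.621 from regress): the two commands were not fitting the same model. Absorbing industry_year gives one effect per industry-year cell (1179 categories in the output above), while i.year i.industry gives separate, additive industry and year effects. That would also fit the slightly different IV coefficients reported (.4393407 versus .4397811). The sketch below shows the distinction on simulated data; the variable names follow the thread, but the data and all numbers are assumptions made for illustration.

```stata
* Illustrative only: simulated data, not the data discussed in this thread.
clear
set seed 12345
set obs 5000
* 10 hypothetical industries and 5 hypothetical years
gen industry = 1 + floor(10 * runiform())
gen year     = 1992 + floor(5 * runiform())
gen IV = rnormal() + 0.1 * industry
gen DV = 5 + 0.4 * IV + 0.2 * industry + 0.05 * (year - 1992) + rnormal()

* (a) additive industry and year effects: the same model written two ways
regress DV IV i.year i.industry, vce(robust)
areg DV IV i.year, absorb(industry) vce(robust)

* (b) one effect per industry-year cell: a richer model than (a)
egen industry_year = group(industry year)
areg DV IV, absorb(industry_year) vce(robust)
```

The two fits in (a) should agree on the IV coefficient and on R-squared. The fit in (b) nests (a), so its R-squared can only be equal or higher, and its IV coefficient will generally differ a little.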