Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down on April 23, and its replacement, **statalist.org**, is already up and running.


From: Rebecca Pope <rebecca.a.pope@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: comparing equality of coefficients from two subsamples
Date: Thu, 21 Feb 2013 11:24:25 -0600

The FAQ link was intended to be helpful in a "first-principles" sense. I sent it because you seemed not to understand what Jay was saying about constraining variances, and it provided a simple introduction. You won't be able to use those exact steps with your problem, however, not least because -xtreg, fe- requires -aweight-s to be constant within panel, which is the error you ran into.

Now, let's try to clarify what you want before proceeding any further, because I want to make sure that we're clear on the use of "interaction". Say, for example, that you are interested in the model

    log(wage) = intercept + tenure + tenure^2 + not_smsa + wks_ue

where not_smsa indicates that the respondent doesn't live in a metropolitan area and wks_ue is the number of weeks she was unemployed in the previous year. These data are from -webuse nlswork-, the example given with -xtreg-.

Now, suppose that you think that the effect of wks_ue differs by whether or not the respondent lives in an urban area. For this, you have a simple interaction term. (You can think of this like your policy indicator.) The Stata syntax for this is:

    xtreg ln_w tenure c.tenure#c.tenure i.not_smsa##c.wks_ue, fe

Now, suppose you further hypothesize that the model above does not apply equally to southern areas. The model could differ in multiple ways, but the two that are of interest concern the south somehow moderating the effect of unemployment and rural residence. You can approach this in one of two ways. The first is to simply model the difference with respect to not_smsa and wks_ue; all other effects are the same. The second does not constrain any of the coefficients to be equal across groups, here south/not south.

The first syntax is:

    xtreg ln_w tenure c.tenure#c.tenure i.south##i.not_smsa##c.wks_ue, fe

This is what you say you want in your most recent post.
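As a minimal sketch of how the first specification could be used to test the hypothesis, assuming the -nlswork- example data: the -xtset- and -test- lines below are illustrative additions, not something from the earlier posts, and which terms you test should follow from your own hypothesis.

```stata
* Sketch only: -nlswork- is the example dataset shipped for -xtreg-.
webuse nlswork, clear
xtset idcode year

* Approach 1: only the not_smsa/wks_ue terms are allowed to differ by south.
xtreg ln_w tenure c.tenure#c.tenure i.south##i.not_smsa##c.wks_ue, fe

* Does the rural-by-unemployment interaction differ in the south?
test 1.south#1.not_smsa#c.wks_ue

* Jointly: does south moderate the wks_ue effect at all?
test 1.south#c.wks_ue 1.south#1.not_smsa#c.wks_ue
```

Because tenure and tenure^2 enter without a south interaction here, their coefficients are constrained to be equal in both groups; the tests only concern the terms that were interacted.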
The second approach, though, is what you have written:

    xtreg ln_w tenure c.tenure#c.tenure i.not_smsa##c.wks_ue if south==0, fe
    xtreg ln_w tenure c.tenure#c.tenure i.not_smsa##c.wks_ue if south==1, fe

If you estimate these equations, you get different parameter estimates for _all_ terms by "south". This is why I said that you were working with a fully-interacted model. To understand this, note that you must estimate the two equations above _as one_ in order to test whether rural unemployment differs in the south (or your government policy differs by firm type). The correct Stata syntax is:

    xtreg ln_w i.south#c.tenure i.south#c.tenure#c.tenure i.south##i.not_smsa##c.wks_ue, fe

Do not take "simply" above to mean that the first approach is somehow inferior. I just mean that the model has fewer parameters to estimate. Your choice of specification must be theory-driven. If you think that approach 2 is incorrect, then nothing stops you from using approach 1. However, that isn't what you indicated you were estimating when you wrote 2 separate equations.

With all of these approaches, you get 1 error term for both groups. Is this a problem? It depends on your groups. You have to look at your data and decide. If you decide you shouldn't constrain the variance, you'll need to choose an appropriate approach at that point.

Now, what do you observe with respect to the coefficients? Probably that the "pooled" regression does not exactly reproduce the coefficients of the separate regressions with -xtreg, fe-. This shouldn't surprise you. -xtreg, fe- is estimating a model on the "demeaned" data, the so-called "within" estimator. When you pool the observations, you alter the calculation of the mean within the j-th unit. This occurs because there are some respondents, in this example, who have lived in and out of the south. If that weren't the case, "1.south" would be dropped from the FE part of our model when we pooled results and we would be left with a single overall intercept.
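To make the point about units that change group concrete, here is a rough sketch, again assuming the -nlswork- example data. The variable names sd_south and first_obs are hypothetical helpers introduced only for illustration, and the -test- lines are one possible way to compare coefficients after the pooled, fully interacted fit.

```stata
* Sketch only: uses the -nlswork- example data.
webuse nlswork, clear
xtset idcode year

* Flag respondents whose south status changes over time ("movers").
* The within-person SD of south is zero for stayers, positive for movers.
egen double sd_south = sd(south), by(idcode)
egen byte first_obs = tag(idcode)
count if first_obs & sd_south > 0 & !missing(sd_south)

* Pooled, fully interacted model: every slope may differ by south.
xtreg ln_w i.south#c.tenure i.south#c.tenure#c.tenure i.south##i.not_smsa##c.wks_ue, fe

* Does the rural-by-unemployment interaction differ in the south?
test 1.south#1.not_smsa#c.wks_ue

* Is the tenure slope the same in both groups?
test 0.south#c.tenure = 1.south#c.tenure
```

If the count of movers is nonzero, the person-level means used for demeaning in the pooled fit are taken over observations from both groups, so the pooled coefficients need not match those from the two subsample regressions.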
Quite apart from that, if you submitted

    xtreg ln_w tenure c.tenure#c.tenure i.south##i.not_smsa##c.wks_ue, fe

thinking you were going to get the same results as:

    xtreg ln_w tenure c.tenure#c.tenure i.not_smsa##c.wks_ue if south==0, fe
    xtreg ln_w tenure c.tenure#c.tenure i.not_smsa##c.wks_ue if south==1, fe

you would be wrong, even if you were using simple linear regression because you are working with fundamentally different views of how your grouping variable relates to the other covariates.

I hope this helps,
Rebecca

On Wed, Feb 20, 2013 at 4:10 PM, Mario Jose <mariojose276@gmail.com> wrote:
> Thank you Rebecca for the links, they were very useful to understand
> the previous Jay's comment.
> I have implemented the strategy of Bill Gould (allowing for different
> variances), but it appeared the message of error "weight must be
> constant within id"... Anyway I do not want to introduce interactions
> with all independent variables but to only one.
>
> Below I expose what the specific problem I have.
>
> I have a panel sample of firms, and in the middle of the period
> (2004) it was implemented by the government a specific fiscal
> measure. I want to test whether this measure had impacts on the
> profits reported by firms. As I think that the measure had impacts in
> a specific subsample of firms, I divided the sample in two subsamples
> - group1 group2 (splitted according the debt/assets ratio of firms).
>
> I run the model for the two groups separately:
> xtreg, Y x1 control1 control2 ... i.pos i.pos#c.x1 if group==1, fe
> xtreg, Y x1 control1 control2 ... i.pos i.pos#c.x1 if group==2, fe.
>
> (pos is binary taking value 1 for years after the implementation of the policy)
>
> and I obtain the following estimates for group 1 and 2, respectively:
>
> *******output excerpt************
>
> -----------------------------------------------------------------------------------
>                   |               Robust
>                 Y |      Coef.   Std. Err.        t    P>|t|    [95% Conf. Interval]
> ------------------+----------------------------------------------------------------
>                x1 |  -2.053274    .5641935    -3.64   0.000    -3.159248   -.9473006
>          control1 |   .5904103    .0267907    22.04   0.000     .5378933    .6429273
>          control2 |   .0947558    .0233539     4.06   0.000     .0489758    .1405358
>               ... |  -.0234459    .2617354    -0.09   0.929    -.5365189    .4896271
>        year dum.. |
>             1.pos |  -.5814072    .1512517    -3.84   0.000     -.877902   -.2849124
>        1.pos#c.x1 |   1.256448    .4183398     3.00   0.003     .4363875    2.076508
>             _cons |  -6.099231    1.766059    -3.45   0.001    -9.561191   -2.637272
> ------------------+----------------------------------------------------------------
>           sigma_u |  2.1744991
>           sigma_e |  .77651905
>               rho |  .88690051   (fraction of variance due to u_i)
> -----------------------------------------------------------------------------------
>
>
> -----------------------------------------------------------------------------------
>                   |               Robust
>                 Y |      Coef.   Std. Err.        t    P>|t|    [95% Conf. Interval]
> ------------------+----------------------------------------------------------------
>                x1 |  -2.047585    .6997248    -2.93   0.003     -3.41921   -.6759593
>          control1 |   .4552402    .0232387    19.59   0.000     .4096868    .5007936
>          control2 |    .028412    .0110095     2.58   0.010     .0068306    .0499933
>               ... |
>       year dum .. |
>             1.pos |  -.4291118    .1817098    -2.36   0.018    -.7853059    -.072917
>        1.pos#c.x1 |   .6220617    .5078439     1.22   0.221    -.3734318    1.617555
>              cons |  -7.341474    1.606579    -4.57   0.000    -10.49075   -4.192201
> ------------------+----------------------------------------------------------------
>           sigma_u |  2.4369753
>           sigma_e |  .70849863
>               rho |  .92206421   (fraction of variance due to u_i)
> -----------------------------------------------------------------------------------
>
> **********end of excerpt*************
>
> These results are in the direction of the predicted, but when I pooled
> the sample for me to compare the coefs, the estimates appear to be
> significantly different. They are as follows:
>
> *******output excerpt************
> --------------------------------------------------------------------------------------------------
>                                  |               Robust
>                                Y |      Coef.   Std. Err.        t    P>|t|    [95% Conf. Interval]
> ---------------------------------+----------------------------------------------------------------
>                               x1 |  -1.601963    .5324727    -3.01   0.003    -2.645681   -.5582453
>                         control1 |   .5435240    .0232387    19.59   0.000     .4096868    .5007936
>                         control2 |     .03976    .0110095     2.58   0.010     .0068306    .0499933
>                              ... |
>                      year dum .. |
>                            1.pos |   -.382873    .1487651    -2.57   0.010    -.6744726   -.0912734
>                         pos#c.x1 |   .5273469    .4331443     1.22   0.223    -.3216739    1.376368
>                          1.group |      .2575     .175552     1.47   0.142    -.0866054         .60
>                     1.group#c.x1 |  -.8550352    .5470408    -1.56   0.118    -1.927308     .217238
>                      1.group#pos |  -.2539677    .1681945    -1.51   0.131    -.5836514     .075716
>                  1.goup#pos#c.x1 |   .8948809     .528096     1.69   0.090     -.140258     1.93002
>                            _cons |  -6.485282    1.161574    -5.58   0.000    -8.762123   -4.208441
> ---------------------------------+----------------------------------------------------------------
>                          sigma_u |  2.2954577
>                          sigma_e |  .76123454
>                              rho |  .90092029   (fraction of variance due to u_i)
>
> **********end of excerpt*************
>
> Do you find something wrong with the last equation?
>
> I would appreciate any help.
> Best
> MJ
>
> 2013/2/20 Rebecca Pope <rebecca.a.pope@gmail.com>:
>> Jay has given you important advice as it pertains to the group
>> residual variances.
>>
>> You are correct that Wooldridge gives an explanation of interaction
>> terms. He also notes that a fully interacted model (as I assume you
>> will be estimating since your initial post seemed to suggest that you
>> expect different coefficients for all covariates for males and
>> females) assumes group error homogeneity (pg 245 of the 4th ed).
>> Unfortunately, there doesn't appear to be any discussion, at least in
>> that section, of how to address heteroskedasticity between the groups.
>> I didn't read through the rest of the book
>>
>> You might want to take a look at this FAQ by Bill Gould:
>> http://www.stata.com/support/faqs/statistics/pooling-data-and-chow-tests/
>>
>> And these slides from a talk by Bobby Gutierrez:
>> http://www.stata.com/meeting/fnasug08/gutierrez.pdf
>>
>> Only you can see your data and judge whether the constrained variance
>> model is appropriate or not. I wouldn't just dismiss the issue out of
>> hand though.
>>
>> Rebecca
>>
>> On Wed, Feb 20, 2013 at 5:47 AM, Mario Jose <mariojose276@gmail.com> wrote:
>>> Thanks you for comments. Testing for equality of coefficients from
>>> different subsamples, as suggested by Marteen, can be solved by
>>> interactions.
>>> There is an excellent explanation of the procedure in Wooldridge:
>>> Introd.Econometrics ModernApproach; pp. 243-246 and pp. 449-450 and in
>>> the following link:
>>> http://www.stata.com/support/faqs/statistics/chow-tests/
>>>
>>> Best,
>>> MJ
>>>
>>> 2013/2/18 JVerkuilen (Gmail) <jvverkuilen@gmail.com>:
>>>> As someone else indicated, your syntax is odd.
>>>>
>>>> The main question I have is whether you want to allow for different
>>>> group residual variances. If not, interaction. If so, then I guess the
>>>> easiest approach would be -suest-.
>>>>
>>>> On Mon, Feb 18, 2013 at 11:15 AM, Mario Jose <mariojose276@gmail.com> wrote:
>>>>> Dear Statalisters,
>>>>>
>>>>> I have tryed to solve the question below, searching for help in the
>>>>> Stata Archiv without too much success...
>>>>>
>>>>> I have estimated a fixed effects linear regression for two different
>>>>> groups on my sample (say, sex male/female), using this strategy:
>>>>> xtreg dv iv, if sex==male
>>>>> xtreg dv iv, if sex==female
>>>>>
>>>>> I am interested in testing whether or not the coefficient b1 is
>>>>> identical to each other in the two subsamples.
>>>>>
>>>>> I would really appreciate any help.
>>>>> Regards
>>>>> MJ
>>>>
>>>> --
>>>> JVVerkuilen, PhD
>>>> jvverkuilen@gmail.com
>>>>
>>>> http://lesswrong.com/
>>>>
>>>> "Everybody loves progress but nobody likes change." ---Fortune cookie, 1/13/13.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/

**Follow-Ups**:**Re: st: comparing equality of coefficients from two subsamples***From:*Mario Jose <mariojose276@gmail.com>

**References**:**st: comparing equality of coefficients from two subsamples***From:*Mario Jose <mariojose276@gmail.com>

**Re: st: comparing equality of coefficients from two subsamples***From:*"JVerkuilen (Gmail)" <jvverkuilen@gmail.com>

**Re: st: comparing equality of coefficients from two subsamples***From:*Mario Jose <mariojose276@gmail.com>

**Re: st: comparing equality of coefficients from two subsamples***From:*Rebecca Pope <rebecca.a.pope@gmail.com>

**Re: st: comparing equality of coefficients from two subsamples***From:*Mario Jose <mariojose276@gmail.com>
