
From: austin nichols <[email protected]>
To: [email protected]
Subject: Re: st: regression
Date: Wed, 28 Dec 2005 08:57:18 -0500

The mathematical explanation is quite easy, but it's not clear to me what is confusing you, so I'm not sure I can explain it in a way that makes it clearer.

By including a dummy variable (AKA indicator variable) like hvptb, you are allowing the constant term (AKA intercept) to differ across the two groups defined by hvptb=0 and hvptb=1. By including interactions with hvptb, you are allowing the slope with respect to time, and any other interacted terms, to differ across those two groups as well. If you include the interaction of hvptb with every other variable, it is almost the same as estimating two separate models (e.g. -reg y $xvars if hvptb==0- and -reg y $xvars if hvptb==1-): the coefficient estimates are the same, though the standard errors generally differ, since the pooled model estimates a single error variance for both groups.

If you think the relationship between y [or log(crh) as you call it] and time is nonlinear, then I guess you should be including at least a linear and a quadratic term (a la the Taylor series expansion of the presumably unknown nonlinear function) in both of those models. That means the model using both groups (hvptb=0 and hvptb=1) has to include all the interactions.

The significance of any one coefficient in such a model is nearly irrelevant, since what you care about is whether linear combinations of coefficients are significant (the obvious test is whether b3=0 and b4=0 and b5=0, given by -test hvptb txhvptb t2xhvptb- or somesuch) in a model of the form:

  E(y) = b0 + b1*time + b2*time^2 + b3*hvptb + b4*(time*hvptb) + b5*(time^2*hvptb)

estimated by, e.g.,

  . reg y t t2 hvptb txhvptb t2xhvptb

In general, you should try thinking through the marginal effect of each relevant variable for each relevant subgroup to interpret the coefficients. In your "Model-1" below, the marginal effect of time for the hvptb=0 subgroup is b1+2*b2*time and the marginal effect of time for the hvptb=1 subgroup is b1+b4+2*b2*time, which shows you that b2 is capturing an effect of time constrained to be the same across the two groups.
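A quick way to convince yourself of the point about interactions is to simulate some data and compare the fits. With every interaction in the model, the sum of squared residuals splits into one piece that involves only the hvptb==0 observations and one that involves only the hvptb==1 observations, so least squares in effect fits each group on its own; constrain any coefficient to be common to both groups and the two pieces are tied together. The sketch below is only an illustration: the data are made up, the seed, sample size, and coefficient values are arbitrary, the variable names simply follow the -reg- command above, and it assumes a version of Stata that has runiform() and rnormal().

```stata
* Illustration only: simulated data, arbitrary seed and coefficient values
clear
set seed 12345
set obs 400
gen hvptb = mod(_n, 2)          // two groups of 200 observations each
gen t     = 10*runiform()       // made-up "time" variable
gen t2    = t^2
gen y     = 1 + 0.5*t - 0.03*t2 + hvptb*(0.4 + 0.2*t - 0.02*t2) + rnormal()
gen txhvptb  = t*hvptb
gen t2xhvptb = t2*hvptb

* Separate regressions, one per group
reg y t t2 if hvptb==0
reg y t t2 if hvptb==1

* Pooled model with all the interactions
reg y t t2 hvptb txhvptb t2xhvptb

* _cons, t, and t2 here should match the hvptb==0 regression;
* the sums below should match the hvptb==1 coefficients
lincom _cons + hvptb
lincom t + txhvptb
lincom t2 + t2xhvptb

* Joint test that the two groups share one equation (b3=b4=b5=0)
test hvptb txhvptb t2xhvptb

* Restricted model with a common quadratic term (the form of "Model-1"):
* _cons and t need no longer match the hvptb==0 regression
reg y t t2 hvptb txhvptb
```

Swapping a different dummy in for hvptb in the last regression changes which observations help determine the common coefficient on t2, and with it the estimated constant and linear term for the reference group.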
Obviously, including a quadratic term for (almost) any variable will result in a different estimate of the coefficient on the linear term, and including a quadratic term that is the same for both groups will result in different estimates of the coefficients on the linear terms for each group than would including quadratic terms that differ across groups.

On 12/26/05, [email protected] <[email protected]> wrote:
> Model-1: Log(crh)= b0 + b1*time + b2*time^2 + b3*hvptb + b4*(time*hvptb).
> Model-2: Log(crh)= b0 + b1*time + b2*time^2 + b3*hvinf + b4*(time*hvinf).
>
> On estimating the models I find that the values of b0 and b1 are not the same for the two models. Hence the prediction equation for the normal people are different in the two models. The quadratic time term is significant in the model.
>
> Austin Nichols wrote back to me that the coefficients will be the same only if I include all the relevant interaction terms "time^2*hvptb" and "time^2*hvinf" also in the model. I verified this but I couldn't find a mathematical explanation to why this is so.
>
> On including the interaction terms for time^2, I lose significance on those terms as well as the other terms. So if I leave out the interaction terms I need to explain why the normal people have different equations in the two models.
> I'd appreciate any help on this issue.

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
