

From: jaweria seth <jaweriaseth@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Looping over variables in more than one group
Date: Wed, 7 Mar 2012 10:31:10 -0600

Thanks J,

You are correct. In theory, I expect a 'production/size' variable to significantly affect my dependent variable; however, I wanted to let the regression spit out which of the variables in that category are most significant (since they are somewhat similar). In that case, I am looking at the t-statistics of the independent variables in the model. Is that not correct?

On Wed, Mar 7, 2012 at 10:00 AM, Joerg Luedicke <joerg.luedicke@gmail.com> wrote:
> You should probably rather think about what covariates make the most
> sense to include with respect to your theory and research question.
> Digging up variables to cook up good looking p-values and then
> interpreting these p-values in the usual way is a questionable
> endeavor, to say the least. However, if you are rather interested in
> something like a prediction model, and not in hypothesis testing, you
> could just use straight data mining techniques right away, for example
> boosted regression (-findit boost-).
>
> J.
>
> On Wed, Mar 7, 2012 at 7:12 AM, jaweria seth <jaweriaseth@gmail.com> wrote:
>> Thanks Nick,
>> I understand this would result in a large number of models..
>> however, I wouldn't be combining variables of the same category/group,
>> as this would bring up the issue of multicollinearity.
>> for example, I know for sure I need to add one variable each from
>> groups 1 and 2. group 1 contains variables that measure the
>> size/production of a business, and I am wondering which of those
>> variables would be most significant in a multi-variate model. I am
>> looking at t-stats in the regression output: if even one of the
>> variables included is not significant at the 10%, that model gets
>> dropped..( and as im running the regressions manually, i find that the
>> majority of the combos are not significant).
>>
>> Does this make sense? If so, how can I implement it?
>> The way I am doing it right now: Holding one variable from group2
>> constant and looping through group 1/size variables to find
>> significance. however, this gets tricky when I try to include a third
>> variable.
>>
>>
>> Thanks,
>>
>> On Wed, Mar 7, 2012 at 2:34 AM, Nick Cox <njcoxstata@gmail.com> wrote:
>>> Before you even think of how to implement this, do the combinatorics
>>> of how many models this implies.
>>>
>>> So, for example,
>>>
>>> . di 30^4
>>> 810000
>>>
>>> . di 5^4
>>> 625
>>>
>>> Then bump up those numbers adding in the null choices, i.e. no
>>> variable from each group, as well.
>>>
>>> So you would need not only to do the looping but to ponder what it
>>> implies in terms of gathering results from thousands of models,
>>> finding the "best", whatever that means, including the implications
>>> for how you think about the resulting P-values, etc.
>>>
>>> Nick
>>>
>>> On Tue, Mar 6, 2012 at 10:01 PM, jaweria seth <jaweriaseth@gmail.com> wrote:
>>>
>>>> I would like to run regressions with up to 4 different variables. My
>>>> variables are separated into 4 groups with 5-30 variables in each
>>>> group. I would like to run regression combos of different variables to
>>>> find the best model:
>>>> How do I regress my y variable on 1 variable from group 1 and 1 from
>>>> group 2 and loop through different combos of each?
>>>> for ex:
>>>> regress Yvariable Group1 Group2
>>>>
>>>> Then I would like to add a variable from group 3, and so on..
>>> *
>>> * For searches and help try:
>>> * http://www.stata.com/help.cgi?search
>>> * http://www.stata.com/support/statalist/faq
>>> * http://www.ats.ucla.edu/stat/stata/

--
Jaweria Seth
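
For readers wondering what the looping asked about in this thread might look like, here is a minimal sketch, not anyone's posted solution. It assumes Stata's shipped auto dataset as a stand-in, and the contents of `group1` and `group2` are made-up placeholders for the poster's variable groups. It counts the implied models first, as Nick suggests, then fits every one-from-each-group pair and stores the p-values with -postfile-. Joerg's caution still applies: p-values picked out of a search like this cannot be read in the usual way.

```stata
* Sketch only: auto.dta and the two groups below are placeholders,
* not the poster's data or variable names.
sysuse auto, clear

local y      price
local group1 weight length displacement
local group2 mpg headroom trunk

* Do the combinatorics first: one pick from each group
local n1 : word count `group1'
local n2 : word count `group2'
display "models to fit: " `n1' * `n2'

* One row of results per fitted model
tempname results
tempfile modelstats
postfile `results' str32 x1 str32 x2 double p1 double p2 double r2_a ///
    using "`modelstats'"

foreach x1 of local group1 {
    foreach x2 of local group2 {
        quietly regress `y' `x1' `x2'
        * two-sided p-values from the t statistics
        local p1 = 2 * ttail(e(df_r), abs(_b[`x1'] / _se[`x1']))
        local p2 = 2 * ttail(e(df_r), abs(_b[`x2'] / _se[`x2']))
        post `results' ("`x1'") ("`x2'") (`p1') (`p2') (e(r2_a))
    }
}
postclose `results'

* Inspect the collected results; allsig marks models in which both
* covariates have p < 0.10 (the 10% rule described in the thread)
use "`modelstats'", clear
generate byte allsig = p1 < 0.10 & p2 < 0.10
list, sepby(x1)
```

Adding a third group means a third nested -foreach- and one more pair of posted columns, so the number of models multiplies by the size of that group. The "null choice" Nick mentions (no variable from a group) is not handled here and would need its own branch.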

**Follow-Ups**:
- **Re: st: Looping over variables in more than one group** *From:* Joerg Luedicke <joerg.luedicke@gmail.com>

**References**:
- **st: Looping over variables in more than one group** *From:* jaweria seth <jaweriaseth@gmail.com>
- **Re: st: Looping over variables in more than one group** *From:* Nick Cox <njcoxstata@gmail.com>
- **Re: st: Looping over variables in more than one group** *From:* jaweria seth <jaweriaseth@gmail.com>
- **Re: st: Looping over variables in more than one group** *From:* Joerg Luedicke <joerg.luedicke@gmail.com>
