



RE: st: Looping over variables in more than one group


From   Nick Cox <n.j.cox@durham.ac.uk>
To   "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Looping over variables in more than one group
Date   Wed, 7 Mar 2012 16:39:59 +0000

Thanks for the extra detail. I don't think it contradicts my main point. 

In essence, I don't think what you want to do is a good idea, so I have no interest personally in even thinking about how to do it in Stata. Of course, that is no more than a personal opinion. 

Nick 
n.j.cox@durham.ac.uk 

jaweria seth

Thanks Nick,
I understand this would result in a large number of models.
However, I wouldn't be combining variables of the same category/group,
as this would bring up the issue of multicollinearity.
For example, I know for sure I need to add one variable each from
groups 1 and 2. Group 1 contains variables that measure the
size/production of a business, and I am wondering which of those
variables would be most significant in a multivariate model. I am
looking at t-stats in the regression output: if even one of the
variables included is not significant at the 10% level, that model gets
dropped (and as I'm running the regressions manually, I find that the
majority of the combos are not significant).

Does this make sense? If so, how can I implement it?
The way I am doing it right now: holding one variable from group 2
fixed and looping through the group 1 (size) variables to check
significance. However, this gets tricky when I try to include a third
variable.


Thanks,

On Wed, Mar 7, 2012 at 2:34 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> Before you even think of how to implement this, do the combinatorics
> of how many models this implies.
>
> So, for example,
>
> . di 30^4
> 810000
>
> . di 5^4
> 625
>
> Then bump up those numbers adding in the null choices, i.e. no
> variable from each group, as well.
>
> So you would need not only to do the looping but to ponder what it
> implies in terms of gathering results from thousands of models,
> finding the "best", whatever that means, including the implications
> for how you think about the resulting P-values, etc.
>
> Nick
>
> On Tue, Mar 6, 2012 at 10:01 PM, jaweria seth <jaweriaseth@gmail.com> wrote:
>
>> I would like to run regressions with up to 4 different variables. My
>> variables are separated into 4 groups with 5-30 variables in each
>> group. I would like to run regression combos of different variables to
>> find the best model:
>> How do I regress my y variable on 1 variable from group 1 and 1 from
>> group 2 and loop through different combos of each?
>> For example:
>> regress Yvariable Group1 Group2
>>
>> Then I would like to add a variable from group 3, and so on.
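
For reference, the two-group loop described in the original question could be sketched in Stata roughly as follows. This is only a minimal sketch under assumptions: the names y, size1-size3, and prod1-prod2 are hypothetical placeholders and do not come from the thread. It also keeps a running count of fitted models, since that count grows multiplicatively with each added group, as the combinatorics quoted above show.

```stata
* Hypothetical variable names: replace y, size1-size3, prod1-prod2 with your own
local group1 size1 size2 size3
local group2 prod1 prod2

local nmodels = 0
foreach v1 of local group1 {
    foreach v2 of local group2 {
        * One regression per (group 1, group 2) pair
        regress y `v1' `v2'
        local ++nmodels
    }
}

* With 3 and 2 candidates this fits 3 x 2 models
display "Models fitted: `nmodels'"
```

A third group would add one more nested foreach and multiply the model count again, which is where the concerns about collecting results and interpreting the resulting P-values come in.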
