Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down on April 23, and its replacement, **statalist.org**, is already up and running.


From: Zeynep Ozkok <zeynepozkok@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: creating loops using combinations of variables
Date: Thu, 16 Feb 2012 23:25:47 +0100

Thanks a lot for all your explanations, Nick. I'll try to think of other ways of indexing. The nested loops sound very complicated to construct and time-consuming as well.

Thanks a lot again.

Zeynep

On Thu, Feb 16, 2012 at 1:33 PM, Nick Cox <n.j.cox@durham.ac.uk> wrote:
> Thanks for your clarification.
>
> My impression is that the combinatorial explosion here makes this impracticable for 27 variables and in fact a bad idea in principle. If you are trying out what are literally millions of different models, your inferences have to be adjusted for that selection; otherwise no P-values can be taken very seriously. How to program it is a side-issue, except that trying to nest loops would be time-consuming and stressful.
>
> A quite separate issue is that ln(1 + anything) looks like a fudge for the fact that mostly you want to work with logarithms but that you know zeros are possible. Using ln(1 + anything) divides statistical people right down the middle, as various threads on this list have shown. The pessimistic view is that if you do this, you throw away most of what is useful and interpretable about ln(anything).
>
> Nick
> n.j.cox@durham.ac.uk
>
> Zeynep Ozkok
>
> Thank you very much for your comment Nick.
>
> Let me try to clarify the issue a bit by taking three variables as you suggested. The three variables are: var1, var2, and var3.
>
> What I would like to do is the following:
>
> Step 1: Generate two variables called lex1 and lex2 such that lex1 = var1 and lex2 = var2 + var3.
> Generate two indices index1 and index2 such that index1 = ln(1 + lex1) and index2 = ln(1 + lex2).
>
> Run a regression of the following form: Y_i,s,t = alpha_i + alpha_s + alpha_t + beta*(index1)_i,t + lambda*(index2)_i,t + error_i,s,t
>
> Save the coefficients for index1 and index2, and the R-squared.
>
> Clear lex1, lex2, index1, index2.
>
> Step 2: Generate two variables called lex1 and lex2 such that lex1 = var2 and lex2 = var1 + var3.
> Generate two indices index1 and index2 such that index1 = ln(1 + lex1) and index2 = ln(1 + lex2).
>
> Run a regression of the following form: Y_i,s,t = alpha_i + alpha_s + alpha_t + beta*(index1)_i,t + lambda*(index2)_i,t + error_i,s,t
>
> Save the coefficients for index1 and index2, and the R-squared.
>
> Clear lex1, lex2, index1, index2.
>
> Step 3: Generate two variables called lex1 and lex2 such that lex1 = var3 and lex2 = var1 + var2.
> Generate two indices index1 and index2 such that index1 = ln(1 + lex1) and index2 = ln(1 + lex2).
>
> Run a regression of the following form: Y_i,s,t = alpha_i + alpha_s + alpha_t + beta*(index1)_i,t + lambda*(index2)_i,t + error_i,s,t
>
> Save the coefficients for index1 and index2, and the R-squared.
>
> Clear lex1, lex2, index1, index2.
>
> Unfortunately the order of the variables included in the index measures is important. I should be able to tell which significant indices include which variables. To me that seems almost impossible when considering 27 variables. Is there a way to construct a loop to run this entire process?
>
> Thank you so much for all your help.
>
> Zeynep
>
>> On Thu, Feb 16, 2012 at 11:41 AM, Nick Cox <n.j.cox@durham.ac.uk> wrote:
>>> "All possible combinations" would usually mean, for 27 variables, 27 ways of selecting just one, comb(27, 2) = 351 ways of selecting two, ..., up to comb(27, 27) = 1 way of selecting them all. In total that means 2^27 - 1 ~ 10^8 combinations. That is, precisely, 134,217,727 combinations.
>>>
>>> My suggestion is to set aside the fact that you have 27 variables. Show us exactly what you would do with just 3 variables, say.
>>>
>>> Nick
>>> n.j.cox@durham.ac.uk
>>>
>>> Zeynep Ozkok
>>>
>>> I have a question on how to create loops for combinations of different variables. I have 27 variables that I would like to put in two different indices.
>>>
>>> The indices can be constructed in two steps:
>>>
>>> Lex1 = sum(of different variables out of 27). This variable should be able to take on 1 to 27 variables, so it should allow for all possible combinations. It could be equal to only 1 variable, or it could be equal to the sum of different variables.
>>>
>>> Index1 = ln(1 + lex1). This index is then dependent on what values lex1 takes on.
>>>
>>> Similarly
>>>
>>> Lex2 = sum(of all the variables that are not accounted for in lex1). Again this could take on one variable, or more than one, depending on the structure of lex1.
>>>
>>> Index2 = ln(1 + lex2). This index is once again dependent on what values lex2 takes on, which is dependent on the values that lex1 takes on.
>>>
>>> Then these two indices will simultaneously be used in fixed effects regressions as follows:
>>>
>>> Y_i,s,t = alpha_i + alpha_s + alpha_t + beta*(index1)_i,t + lambda*(index2)_i,t + error_i,s,t
>>>
>>> The loop must go on until all possibilities/combinations are completed. I need to check the results of the beta and lambda coefficients and their corresponding R-squared values for each regression. Since there are numerous possibilities in constructing each index I need to create a loop. However I don't even know how to start out a loop that depends on combinations of variables. Could you possibly help me out in writing and solving this problem?
>>>
>>> *
>>> * For searches and help try:
>>> * http://www.stata.com/help.cgi?search
>>> * http://www.stata.com/support/statalist/faq
>>> * http://www.ats.ucla.edu/stat/stata/
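
The counts quoted in the thread can be checked, and the three-variable case enumerated, with a short sketch. It is written in Python rather than Stata purely as a language-neutral illustration and is not anyone's posted solution. The function names `count_nonempty_subsets` and `ordered_splits` are hypothetical, and the sketch assumes that every variable goes into exactly one of lex1 or lex2, that neither group may be empty, and that swapping the two groups counts as a different model.

```python
from itertools import combinations
from math import comb


def count_nonempty_subsets(n):
    """Number of ways to pick at least one of n variables: comb(n,1) + ... + comb(n,n)."""
    return sum(comb(n, k) for k in range(1, n + 1))


def ordered_splits(variables):
    """Yield (lex1_vars, lex2_vars) pairs.

    Assumption: each variable lands in exactly one group and both groups
    are non-empty, so lex1 can hold 1 to n-1 of the n variables.
    """
    n = len(variables)
    for k in range(1, n):
        for chosen in combinations(variables, k):
            rest = tuple(v for v in variables if v not in chosen)
            yield chosen, rest


# Three-variable case from the thread (variable names as in the post).
splits = list(ordered_splits(("var1", "var2", "var3")))
for lex1_vars, lex2_vars in splits:
    print("lex1 =", " + ".join(lex1_vars), "| lex2 =", " + ".join(lex2_vars))

print(comb(27, 2))                 # ways of selecting two of 27 variables
print(count_nonempty_subsets(27))  # equals 2**27 - 1
```

With three variables this lists six splits: the three steps described in the post, plus their mirror images in which lex1 holds two variables. Under the same assumptions, 27 variables give 2^27 - 2 = 134,217,726 ordered splits, which is why enumerating and fitting every model is impractical whatever the looping construct.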

**References**:

- **st: creating loops using combinations of variables** *From:* Zeynep Ozkok <zeynepozkok@gmail.com>
- **st: RE: creating loops using combinations of variables** *From:* Nick Cox <n.j.cox@durham.ac.uk>
- **Re: st: RE: creating loops using combinations of variables** *From:* Zeynep Ozkok <zeynepozkok@gmail.com>
- **Re: st: RE: creating loops using combinations of variables** *From:* Zeynep Ozkok <zeynepozkok@gmail.com>
- **RE: st: RE: creating loops using combinations of variables** *From:* Nick Cox <n.j.cox@durham.ac.uk>
