Nice to see that Nick is an active member of the "anti-stepwise regression club." In that regard, I strongly suggest taking a look at:

Flom, P. L., & Cassell, D. L. (2007). Stopping stepwise: Why stepwise and similar selection methods are bad, and what you should use. NESUG 2007: Statistics and Data Analysis. http://www.nesug.org/proceedings/nesug07/sa/sa07.pdf

Huberty, C. J. (1989). Problems with stepwise methods—Better alternatives. In B. Thompson (Ed.), Advances in social science methodology (Vol. 1, pp. 43–70). Greenwich, CT: JAI Press. http://education.gsu.edu/coshima/EPRS8550/Oshima%20Problem.pdf

Thompson, B. (2001). Significance, effect sizes, stepwise methods, and other issues: Strong arguments move the field. The Journal of Experimental Education, 70(1), 80–93. http://web.me.com/rsbalkin/Site/Research_Methods_and_Statistics_files/Strong%20arguments%20move%20the%20field--Thompson.pdf

Thompson, B. (1995). Stepwise regression and stepwise discriminant analysis need not apply here: A guidelines editorial. Educational and Psychological Measurement, 55(4), 525–534.

Thompson, B. (1989). Why won't stepwise methods die? Measurement and Evaluation in Counseling and Development, 21(4), 146–148. http://web.me.com/rsbalkin/Site/Research_Methods_and_Statistics_files/why%20won't%20stepwise%20methods%20die.pdf

Some additional references are in the FAQ Nick mentioned. To be sure, I'm not against data mining in general.

Cam

> Date: Mon, 13 Aug 2012 21:33:01 -0400
> Subject: Re: st: tuples, stepwise and counting types of variables
> From: sohnesen@gmail.com
> To: statalist@hsphsun2.harvard.edu
>
> Thanks Nick
>
> My question is: how do I generate the "used" list after using stepwise
> regression? Stepwise (or another automated variable selection method)
> decides which variables stay in the model. I've counted the number of
> variables in e(df_m), but I believe I need to save the actual names of
> the variables that stay in the regression to use your suggested
> approach.
>
> thanks again
> Thomas
>
> On Mon, Aug 13, 2012 at 8:36 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>> I can't comment on analogues to MAXR as I am not familiar with SAS.
>>
>> For counting how many of a list are in another list, you can find the
>> intersection of two lists using
>>
>> : list a & b
>>
>> as documented at -help macrolists-, and then count them.
>>
>> For example,
>>
>> local availablex "x1 x2 x3"
>> local usedx "x2"
>> local inter : list availablex & usedx
>> di `: word count `inter''
>>
>> Nick
>>
>> On Tue, Aug 14, 2012 at 1:24 AM, Thomas Sohnesen <sohnesen@gmail.com> wrote:
>>> Thanks Nick
>>>
>>> For this exercise I'm not interested in the coefficients or their
>>> meaning; I'm looking to find a parsimonious model for predictions.
>>> Any advice on a better alternative than stepwise? Doing it manually
>>> is not really an option as we will be running a lot of different
>>> models. Further, though my data is organized in blocks, I would like to
>>> keep single variables if they are highly correlated with my dependent
>>> variable. I believe SAS has an alternative in MAXR. Do you know if
>>> Stata has a similar alternative?
>>>
>>> Finally, no matter which alternative we end up using, I still have the
>>> challenge of counting the number of variables from each block in the
>>> final model. Any insights on that?
>>>
>>> thanks and best
>>>
>>> Thomas
>>>
>>>
>>> On Mon, Aug 13, 2012 at 5:30 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>>>> I belong to a club which is dedicated to advising people against using
>>>> -stepwise-. A -search- will find an FAQ on this question.
>>>>
>>>> I'd look at -nestreg- instead.
>>>>
>>>> Nick
>>>>
>>>> On Mon, Aug 13, 2012 at 10:18 PM, Thomas Sohnesen <sohnesen@gmail.com> wrote:
>>>>
>>>>> I have a number of "groups" of variables as exemplified below.
>>>>>
>>>>> local gr1 x1 x2 x3 x4
>>>>> local gr2 x5 x6 x7 x8
>>>>> local gr3 x9 x10 x11 x12 x13 x14 x15
>>>>> local gr4 x16 x17
>>>>>
>>>>> I run stepwise regressions for all the combinations of these groups
>>>>> using tuples.
>>>>>
>>>>> tuples "`gr1'" "`gr2'" "`gr3'" "`gr4'" , display
>>>>>
>>>>> forval i = 1/`ntuples' {
>>>>>     qui stepwise, pr(0.05): regress y `tuple`i''
>>>>> }
>>>>>
>>>>> Now I would like to count how many variables from each group
>>>>> stayed in the stepwise model.
>>>>>
>>>>> For instance, in the stepwise regression of gr1 and gr2 (i.e. x1 x2 x3
>>>>> x4 x5 x6 x7 x8), only x3 x4 x5 were included in the regression. I
>>>>> would then like an output along the lines of:
>>>>>
>>>>> Model     num_var_gr1  num_var_gr2  num_var_gr3  num_var_gr4
>>>>> gr2 gr3   1            2            0            0
>>>>> gr2 gr4
>>>>> gr1 gr2
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/statalist/faq
>> * http://www.ats.ucla.edu/stat/stata/
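
On the open question of how to build the "used" list after -stepwise-: nobody in the thread answers it, so what follows is only a minimal sketch of one possible approach, not something Nick or Cam proposed. It assumes that the column names of e(b) after the final fit are the retained regressors plus _cons, and that y and x1-x17 are in memory. The macro names used, cons, kept and n1-n4 are made up for illustration.

```stata
* Hypothetical sketch: y and x1-x17 are assumed to exist in memory
local gr1 x1 x2 x3 x4
local gr2 x5 x6 x7 x8
local gr3 x9 x10 x11 x12 x13 x14 x15
local gr4 x16 x17

* One stepwise fit on the candidates from gr1 and gr2
quietly stepwise, pr(0.05): regress y `gr1' `gr2'

* Column names of the coefficient vector: retained regressors plus _cons
local used : colnames e(b)

* Drop the constant so that only variable names remain
local cons "_cons"
local used : list used - cons

* Intersect each group with the retained names and count the overlap
forvalues g = 1/4 {
    local kept : list gr`g' & used
    local n`g' : word count `kept'
}

display "gr1: `n1'  gr2: `n2'  gr3: `n3'  gr4: `n4'"
```

The same steps could presumably go inside the -tuples- loop, right after each -stepwise- call, with the counts stored per model. If -regress- flags a collinear variable as omitted, its column name in e(b) may carry an o. prefix and would then not match the group lists, so the counts are worth checking against e(df_m).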

