Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down on April 23, and its replacement, **statalist.org**, is already up and running.


From: hind lazrak <hindstata@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: how to index regressions inside a foreach loop in order to avoid writing over the estimates
Date: Fri, 14 Oct 2011 02:19:25 -0700

Dear Nick,

Thank you for your relevant comments. I should always be careful in how I word my approach (perhaps a mix of language barrier and theory confusion...).

I'd like to address a few points that you raised.

#1: The variables that I would like to control for are indeed theory-guided for the first two (applying physical principles), and the last one is more of a "common sense" one. I have not examined these variables in any statistical way. I will include them in the regression regardless of any significance test results. I like to think of my approach as being on the side of those who do not let p-values decide for them.

#2: I agree, and this is a struggle that I will have to face. Reading up on corrections for multiple testing is a step that I will take (I vaguely know of the Bonferroni correction). Meanwhile, I am curious to see what this exploratory analysis shows even without correction.

#4: When I get to the writing phase, this is something that I need to keep in mind: striking the right balance between too much explanation and results on the one hand and helping the reader follow the steps on the other.

As for the 3rd comment, I really don't know what to say. I agree that the frequentist approach can seem arbitrary. The way I see it, the question is which type of error I am more willing to accept. Here we are examining exposure metrics. I'd rather say that the metric is not measuring what the true exposure is (type I error) and have more conservative estimates when the time for an epidemiological study comes. But this is not the focus of this study, for which I was asking for help on the coding part.

Best,
Hind

On Fri, Oct 14, 2011 at 1:44 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> I'd advise strongly against this for several reasons. Here are some of them.
>
> 1. This is mixing crude and subtle in a strange way. You have subject-matter (perhaps theory-guided) thinking telling you that some confounders deserve to be in the model, but otherwise it appears that you are going to let significance tests do all the work of deciding what else should be in the model or what is worth thinking about. Many people do that, but many disapprove too.
>
> 2. Multiple tests at the same critical level have shifted your real critical level in a way that is difficult to handle. This divides up any field from people who don't care much to those who have a strong belief that not confronting this is a major technical error. The problem goes under different names in different literatures.
>
> 3. Your critical level is 0.95 now, was 0.1 in your first posting. Although I guess the mention of 0.95 is just confusing significance level and confidence level, this illustrates a major difficulty with this approach: the threshold is arbitrary. You then have to argue with both those who want a different threshold and those who don't believe you should use just significance tests for your decision-making here.
>
> 4. A reviewer of your work is likely to have some favourite variable(s) that they think should be tried out. If your story is going to be "Oh yes, I tried that but it wasn't significant, so it's not in the Table" that is not going to impress. Most reviewers want access to all the results in principle; how much time they spend scanning them is their capricious decision.
>
> Note that #4 can bite you even if you discount #1, #2, #3.
>
> Nick
>
> On Fri, Oct 14, 2011 at 4:46 AM, hind lazrak <hindstata@gmail.com> wrote:
>
>> Thank you for taking the time to respond to the question I posted.
>>
>> I made the example simpler in my post for more clarity.
>>
>> In the first step I ran the pwcorr, sig to capture the list of variables that I ran in the loop.
>> In fact the simple linear regression does include three other variables that may act as either modifier or confounder. So I need to control for them.
>>
>> So this brings me back to the original question. Is there any way to get a table of coeffs that are statistically significant at the 95% level?
>
> Richard Williams
>>> At 04:54 PM 10/13/2011, hind lazrak wrote:
>
>>>> I am using Stata Version 10 on Windows Vista.
>>>> The analysis I am conducting is exploratory and involves a long list of independent variables I am testing using simple linear regression. In order to see which variables are "promising" I'd like to find a way to store each model estimate and ideally figure out how to tabulate only those that have a p-value<0.1.
>>>>
>>>> The code I used is as follows
>>>>
>>>> foreach var of varlist [list of 55 vars] {
>>>> qui reg y1 `var' */ first set of regressions looking at Y1
>>>> eststo model1`var'
>>>>
>>>> qui reg y2 `var' */ second set of regressions looking at Y2
>>>> eststo model2`var'
>>>> }
>>>> estimates table model1`var' model`var', beta not
>>>>
>>>> This code is not working because it overwrites all the estimates in each regression and only keeps the last one. Also I did not figure out how to only show those with p-val<0.1
>>>
>>> The line
>>>
>>> estimates table model1`var' model`var', beta not
>>>
>>> should probably be
>>>
>>> estimates table model1`var' model2`var', beta not
>>>
>>> And, it should come before the }, not afterwards.
>>>
>>> This is just a bunch of bivariate regressions, right? Why not something like
>>>
>>> pwcorr y1 y2 x1-x55, star(.10)
>>>
>>> You could probably also fiddle around with the ereturned results and make the estimates table command conditional on one p value or the other being significant.

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
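On the coding part of the question, one possible way to keep results from being overwritten is to index each stored model with a loop counter and to collect the names of the models whose coefficient passes the p-value screen. The following is a minimal sketch, not code taken from the thread. The names `y1`, `y2`, `x1-x55`, and the three control variables `z1 z2 z3` are placeholders, and it assumes the official `estimates store` command is acceptable in place of `eststo`. Stata limits how many results can be stored at once, so a very long variable list may need to be handled in batches.

```stata
* Sketch only: y1, y2, x1-x55 and z1-z3 are placeholder (assumed) names.
local i = 0
local keep1
local keep2
foreach var of varlist x1-x55 {
    local ++i

    * First outcome: store under an indexed name so nothing is overwritten
    quietly regress y1 `var' z1 z2 z3
    estimates store m1_`i'
    * Two-sided p-value for the coefficient on the variable of interest
    local p = 2*ttail(e(df_r), abs(_b[`var']/_se[`var']))
    if `p' < 0.10 {
        local keep1 `keep1' m1_`i'
    }

    * Second outcome: same steps with a different name prefix
    quietly regress y2 `var' z1 z2 z3
    estimates store m2_`i'
    local p = 2*ttail(e(df_r), abs(_b[`var']/_se[`var']))
    if `p' < 0.10 {
        local keep2 `keep2' m2_`i'
    }
}

* Tabulate only the models that passed the screen, if any did
if "`keep1'" != "" {
    estimates table `keep1', b(%9.3f) se p
}
if "`keep2'" != "" {
    estimates table `keep2', b(%9.3f) se p
}
```

The threshold of 0.10 is the one mentioned in the original posting. Because every model is stored, not only the ones listed in `keep1` and `keep2`, the full set of results remains available for anyone who asks about a variable that did not pass the screen.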

**Follow-Ups**:
- **Re: st: how to index regressions inside a foreach loop in order to avoid writing over the estimates** *From:* Nick Cox <njcoxstata@gmail.com>

**References**:
- **st: how to index regressions inside a foreach loop in order to avoid writing over the estimates** *From:* hind lazrak <hindstata@gmail.com>
- **Re: st: how to index regressions inside a foreach loop in order to avoid writing over the estimates** *From:* Richard Williams <richardwilliams.ndu@gmail.com>
- **Re: st: how to index regressions inside a foreach loop in order to avoid writing over the estimates** *From:* hind lazrak <hindstata@gmail.com>
- **Re: st: how to index regressions inside a foreach loop in order to avoid writing over the estimates** *From:* Nick Cox <njcoxstata@gmail.com>
