Re: st: how to index regressions inside a foreach loop in order to avoid writing over the estimates


From   hind lazrak <hindstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: how to index regressions inside a foreach loop in order to avoid writing over the estimates
Date   Fri, 14 Oct 2011 02:19:25 -0700

Dear Nick

Thank you for your relevant comments. I should always be careful in
how I word my approach (perhaps a mix of language barrier and theory
confusion...).
I'd like to address a few of the points that you raised.
#1: The variables that I would like to control for are indeed theory
guided for the first two (applying physical principles), and the last
one is more a "common sense" one. I have not examined these variables
in any statistical way. I will include them in the regression
regardless of any significance test results. I like to think of my
approach as siding with those who do not let p-values decide for
them.

#2: I agree, and this is a struggle that I will have to face. Reading
up on corrections for multiple testing is a step that I will take (I
vaguely know of the Bonferroni correction). Meanwhile, I am curious to
see what this exploratory analysis shows even without correction.

#4: When I get to the writing phase, this is something that I need to
keep in mind: striking the right balance between giving too much
explanation and too many results and helping the reader follow the
steps.

As for the 3rd comment, I really don't know what to say. I agree, the
frequentist approach relies on a threshold that may seem arbitrary.
The way I see it, the question is which type of error I am more
willing to accept. Here we are examining exposure metrics. I'd rather
wrongly conclude that a metric does not measure the true exposure (a
type II error) and have more conservative estimates when the time for
an epidemiological study comes - but this is not the focus of this
study, for which I was asking for help on the coding part.

Best,
Hind

On Fri, Oct 14, 2011 at 1:44 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> I'd advise strongly against this for several reasons. Here are some of them.
>
> 1. This is mixing crude and subtle in a strange way. You have
> subject-matter (perhaps theory-guided) thinking telling you that some
> confounders deserve to be in the model, but otherwise it appears that
> you are going to let significance tests do all the work of deciding
> what else should be in the model or what is worth thinking about. Many
> people do that, but many disapprove too.
>
> 2. Multiple tests at the same critical level have shifted your real
> critical level in a way that is difficult to handle. This divides up
> any field from people who don't care much to those who have a strong
> belief that not confronting this is a major technical error. The
> problem goes under different names in different literatures.
>
> 3. Your critical level is 0.95 now, was 0.1 in your first posting.
> Although I guess the mention of 0.95 is just confusing significance
> level and confidence level, this illustrates a major difficulty with
> this approach: the threshold is arbitrary. You then have to argue with
> both those who want a different threshold and those who don't believe
> you should use just significance tests for your decision-making here.
>
> 4. A reviewer of your work is likely to have some favourite
> variable(s) that they think should be tried out. If your story is
> going to be "Oh yes, I tried that but it wasn't significant, so it's
> not in the Table" that is not going to impress. Most reviewers want
> access to all the results in principle; how much time they spend
> scanning them is their capricious decision.
>
> Note that #4 can bite you even if you discount #1, #2, #3.
>
> Nick
>
> On Fri, Oct 14, 2011 at 4:46 AM, hind lazrak <hindstata@gmail.com> wrote:
>
>> Thank you for taking the time to respond to the question I posted.
>>
>> I made the example simpler in my post for more clarity.
>>
>> In the first step I ran the pwcorr, sig to capture the list of
>> variables that I ran in the loop.
>> In fact the simple linear regression does include three other
>> variables that may act as either modifier or confounder. So I need to
>> control for them.
>>
>> So this brings me back to the original question. Is there any way to
>> get a table of coeffs that are statistically significant at the 95%
>> level?
>
> Richard Williams
>>> At 04:54 PM 10/13/2011, hind lazrak wrote:
>
>>>> I am using Stata Version 10 on Windows Vista.
>>>> The analysis I am conducting is exploratory and involves a long list
>>>> of independent variables I am testing using simple linear regression.
>>>> In order to see which variables are "promising" I'd like to find a way
>>>> to store each model estimate and ideally figure out how to tabulate
>>>> only those that have a p-value<0.1.
>>>>
>>>> The code I used is as follows
>>>>
>>>> foreach var of varlist [list of 55 vars] {
>>>> qui reg y1 `var'   // first set of regressions looking at Y1
>>>> eststo model1`var'
>>>>
>>>> qui reg y2 `var'  // second set of regressions looking at Y2
>>>> eststo model2`var'
>>>> }
>>>> estimates table model1`var' model`var', beta not
>>>>
>>>> This code is not working because it overwrites all the estimates in
>>>> each regression and only keeps the last one. Also I did not figure out
>>>> how to only show those with p-val<0.1
>>>
>>> The line
>>>
>>> estimates table model1`var' model`var', beta not
>>>
>>> should probably be
>>>
>>> estimates table model1`var' model2`var', beta not
>>>
>>> And, it should come before the }, not afterwards.
>>>
>>> This is just a bunch of bivariate regressions, right? Why not something like
>>>
>>> pwcorr y1 y2 x1-x55, star(.10)
>>>
>>> You could probably also fiddle around with the ereturned results and make
>>> the estimates table command conditional on one p value or the other being
>>> significant.
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>
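
The coding question that started this thread (storing each regression
under its own name, then tabulating only those with p < 0.1) might be
sketched as below. This is an illustration, not a tested solution from
any of the posters. Assumptions: y1, y2 and x1-x55 are placeholder
names; official -estimates store- is swapped in for -eststo- so that
no user-written package is needed; and each outcome/predictor pair
yields a valid, short enough name for the stored results.

```stata
* Sketch only: y1, y2 and x1-x55 are placeholder names (assumptions).
* keep collects the names of the stored models that pass the screen.
local keep
foreach var of varlist x1-x55 {
    foreach y in y1 y2 {
        quietly regress `y' `var'
        * two-sided p-value for the slope: t = b/se on e(df_r) d.f.
        local p = 2 * ttail(e(df_r), abs(_b[`var'] / _se[`var']))
        if `p' < 0.10 {
            * one distinct name per outcome and predictor,
            * so no set of estimates is overwritten
            estimates store `y'_`var'
            local keep `keep' `y'_`var'
        }
    }
}
* tabulate only the models that passed the 0.10 screen, if any did
if "`keep'" != "" {
    estimates table `keep', b(%9.3f) p(%6.4f)
}
```

Nick Cox's objections quoted above still apply: the 0.10 cutoff is
arbitrary, and none of the p-values is adjusted for the 110
regressions (55 predictors times 2 outcomes) that are run. Storing
every model and filtering only at the display stage would keep all
results available to a reviewer.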
