Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Joerg Luedicke <joerg.luedicke@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Very high t- statistics and very small standard errors
Date: Tue, 1 May 2012 07:41:39 -0700

Laurie,

Let's have a look at a simple difference in means, by regressing mpg on foreign in the auto dataset:

. sysuse auto, clear
. reg mpg foreign

We can see that the difference in means is 4.95. If we were interested in significance testing, we could calculate the t-value, which simply measures how many standard errors the difference between the two groups is away from zero:

. di 4.945804/1.362162
3.6308486

and then attach a p-value by assuming some probability distribution. However, the point is that whatever test you use, the result will depend on your t-value, which in turn depends on your standard error.

Now, how is the standard error calculated? Say we were only interested in the standard error of one mean (to build a confidence interval, for example); then the standard error is simply the sample standard deviation divided by the square root of the sample size. For example, if we look at the mpg variable again:

. sum mpg

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         mpg |        74     21.2973    5.785503         12         41

we can calculate the SE:

. di 5.785503/sqrt(74)
.67255106

which is what you would get by invoking Stata's -mean-:

. mean mpg

Mean estimation                          Number of obs    =      74

--------------------------------------------------------------
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
         mpg |    21.2973    .6725511      19.9569    22.63769
--------------------------------------------------------------

So, what you can clearly see now is how the SE depends on your sample size. Imagine the auto dataset had 1 million cases, but mpg had the same sample standard deviation:

. di 5.785503/sqrt(1000000)
.0057855

Note how small the standard error would be in that case.
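[The same arithmetic can be checked outside Stata; here is a quick sketch in Python that just replicates the numbers quoted above (the difference, its standard error, and the SD of mpg all come from the Stata output, nothing here re-estimates the model):]

```python
import math

# Figures taken from the Stata output above
diff = 4.945804      # difference in mean mpg, foreign vs. domestic
se_diff = 1.362162   # standard error of that difference (from -regress-)

# t-value: how many standard errors the difference is away from zero
t = diff / se_diff
print(round(t, 7))          # matches di 4.945804/1.362162

# SE of a single mean: sample SD divided by sqrt(n)
sd, n = 5.785503, 74
print(round(sd / math.sqrt(n), 7))           # matches -mean mpg-

# Same SD, but one million observations: the SE collapses
print(round(sd / math.sqrt(1_000_000), 7))
```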
Now, this is all very basic, highly introductory stuff. If you lack these basics, I would strongly advise doing yourself a favor and attending some introductory courses and/or reading some introductory textbooks before doing any (serious) data analysis.

Joerg

On Mon, Apr 30, 2012 at 6:18 PM, Laurie Molina <molinalaurie@gmail.com> wrote:
> It is not the first time I have heard people say that when you have a lot of
> observations everything is significant... Is it because the length of
> the confidence intervals is inversely related to the number of
> observations considered? Or could you tell me what the logic is behind
> saying that with a lot of observations everything is statistically
> significant?
> Thank you very much again!
>
> On Mon, Apr 30, 2012 at 9:10 PM, Richard Williams
> <richardwilliams.ndu@gmail.com> wrote:
>> At 07:54 PM 4/30/2012, Laurie Molina wrote:
>>>
>>> Hi everybody,
>>> I'm running some OLS with around 4 million observations and 6
>>> explanatory variables. My coefficients are always significant, with
>>> very high t-statistics and very low standard errors, for example
>>> t-statistic = 20.6 and standard error = .000023. This is a cross-sectional
>>> data set.
>>> I have run the VIF test, and for all the variables the variance
>>> inflation factor is less than 3.
>>> I have also run the Durbin test, creating an index variable (_n) to see
>>> whether there is some sort of correlation in the error terms of my
>>> regression, but there is not.
>>> Should I be concerned about the significance of my coefficients? Is
>>> there any problem with getting such large t-statistics and small
>>> standard errors?
>>> Thank you all in advance and best regards!!
>>
>>
>> With 4 million cases it is hard not to get statistically significant
>> results. Whether they are worth caring about is another matter. For example,
>> a $2 difference in the incomes of men and women may be statistically
>> significant. $2 is not the same as $0.
>> But how much you should care is
>> another matter. So, if everything is highly significant, I would ask myself
>> what the substantive significance of the findings is. (Actually, I would do
>> that even if the results were not so significant - I think many people do
>> not pay enough attention to "So what?" sorts of questions.)
>>
>> -------------------------------------------
>> Richard Williams, Notre Dame Dept of Sociology
>> OFFICE: (574)631-6668, (574)631-6463
>> HOME: (574)289-5227
>> EMAIL: Richard.A.Williams.5@ND.Edu
>> WWW: http://www.nd.edu/~rwilliam
>>
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/statalist/faq
>> * http://www.ats.ucla.edu/stat/stata/
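[Laurie's intuition - that with enough observations everything becomes significant - can be sketched numerically. With a fixed effect size and a fixed within-group SD, the t-statistic for a difference in two means grows like sqrt(n). The numbers below are made up for illustration and are not from the thread (a $50 income gap, within-group SD of $5,000, equal group sizes n):]

```python
import math

# Made-up numbers: the gap and SD never change, only n does.
gap, sd = 50.0, 5000.0
for n in (1_000, 100_000, 2_000_000):
    # SE of a difference in two means with equal group sizes n
    se = sd * math.sqrt(2 / n)
    # t = gap / se grows like sqrt(n): from clearly non-significant
    # at n = 1,000 to overwhelmingly significant at n = 2,000,000
    print(f"n={n:>9,}  t={gap / se:.3f}")
```

The gap itself ($50) is the same in every row; only the sample size changes, which is why large-n studies need the "So what?" question Richard raises.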
