RE: st: Difference of means and t-test

From   Richard Williams
Subject   RE: st: Difference of means and t-test
Date   Tue, 15 Jun 2010 08:43:22 -0500

At 02:25 PM 6/14/2010, Nick Cox wrote:
I don't think our views are contradictory. It is clearly true that you
can get results from summary statistics alone. But erecting fake
Gaussians with those summaries is not equivalent to reconstructing the
original data. That is my point, and no more. It is akin to arguments at
a higher level about "sufficient statistics". If something is normal,
then it is sufficient to know mean and sd, but there isn't a reverse

At 11:19 AM 6/14/2010, Nick Cox wrote:
>-- except that will surely overstate the strength of the conclusions,
>so far as the real distributions are unlikely to be exactly Gaussian.

Still, it is incorrect to say that constructing fake Gaussians "will surely overstate the strength of the conclusions." The p values are based on various assumptions, e.g. normally distributed, homoskedastic errors. If the assumptions are wrong, the p values are wrong. But whether the assumptions are correct or not, the calculation of the test statistics and coefficients is the same. That is, for regression-type problems, if you've got the means, correlations and standard deviations, there are all sorts of things you can compute without having the rest of the data. Run a regression or Anova with the "fake" data and you'll get exactly the same results as with the real data.
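To make that claim concrete, here is a minimal sketch in Python (standard library only, not Stata). It assumes a two-group comparison with a pooled-variance t test, which is the same thing as regressing the outcome on a group dummy. The "real" values are made up for illustration, and the helper names (t_from_summary, t_from_regression, fake_sample) are hypothetical. Note that the fake samples here are just evenly spaced values rescaled to match the summaries; they are not even Gaussian, which is the point: only n, mean and sd enter the test statistic.

```python
import math
import statistics

def t_from_summary(n1, m1, s1, n2, m2, s2):
    """Pooled-variance two-sample t statistic from summary statistics only."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se

def t_from_regression(a, b):
    """t statistic for the slope when regressing the outcome on a 0/1 group dummy."""
    y = list(a) + list(b)
    g = [1.0] * len(a) + [0.0] * len(b)
    n = len(y)
    gbar, ybar = sum(g) / n, sum(y) / n
    sxx = sum((gi - gbar) ** 2 for gi in g)
    sxy = sum((gi - gbar) * (yi - ybar) for gi, yi in zip(g, y))
    slope = sxy / sxx
    intercept = ybar - slope * gbar
    rss = sum((yi - intercept - slope * gi) ** 2 for gi, yi in zip(g, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope / se

def fake_sample(n, mean, sd):
    """Build n values whose sample mean and sd match the requested ones."""
    raw = [float(i) for i in range(n)]  # any non-constant sequence works
    m, s = statistics.mean(raw), statistics.stdev(raw)
    return [mean + sd * (v - m) / s for v in raw]

# Made-up, deliberately skewed "real" data for two groups.
real_a = [1.0, 1.2, 1.5, 2.0, 9.0]
real_b = [0.5, 0.7, 0.8, 1.1, 1.3, 6.0]

summ_a = (len(real_a), statistics.mean(real_a), statistics.stdev(real_a))
summ_b = (len(real_b), statistics.mean(real_b), statistics.stdev(real_b))

# "Fake" data that share only n, mean and sd with the real data.
fake_a = fake_sample(*summ_a)
fake_b = fake_sample(*summ_b)

t_summary = t_from_summary(*summ_a, *summ_b)
t_real = t_from_regression(real_a, real_b)
t_fake = t_from_regression(fake_a, fake_b)

print(t_summary, t_real, t_fake)  # the three values agree up to rounding error
```

The p value attached to that t statistic is a separate matter: it is only as good as the distributional assumptions, whichever data set was used to compute the statistic.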

Of course, without having the original data, you can't, say, do diagnostic tests of assumptions, analyze subsets of the data, add an x^2 term, etc. So, yes, you would greatly prefer having the real data! But if the real data aren't available there is still a lot you can do. I don't know why the original poster was using ttesti instead of ttest. If it was because he only had summary statistics available to him, then he could run an Anova the way I suggested, and the numbers he would get would be the same as if he had the real data. There probably wouldn't be a whole lot else he could do, though; e.g. the predict command and most other post-estimation commands won't be of much use without the real data.
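The limitation described above can be sketched the same way. This is an illustrative Python example (standard library only) with made-up numbers and a hypothetical skewness helper: a sample rebuilt from the mean and sd reproduces those two summaries, but a diagnostic such as skewness comes out as whatever the reconstruction happened to impose, not what the real data had.

```python
import statistics

def skewness(xs):
    """Moment-based sample skewness: m3 / m2**1.5."""
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m3 = sum((x - m) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

# Made-up, right-skewed "real" data.
real = [1.0, 1.2, 1.5, 2.0, 9.0]
mean, sd = statistics.mean(real), statistics.stdev(real)

# Rebuild a sample from the summaries alone, using a symmetric template.
template = [-2.0, -1.0, 0.0, 1.0, 2.0]
template_sd = statistics.stdev(template)
fake = [mean + sd * v / template_sd for v in template]

print(statistics.mean(fake), statistics.stdev(fake))  # match the real mean and sd
print(skewness(real), skewness(fake))  # clearly positive vs. essentially zero
```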

Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME:   (574)289-5227
EMAIL:  Richard.A.Williams.5@ND.Edu
