
From: "Lachenbruch, Peter" <Peter.Lachenbruch@oregonstate.edu>
To: <statalist@hsphsun2.harvard.edu>
Subject: RE: RE: st: significance of mean and median
Date: Wed, 26 Nov 2008 10:02:19 -0800

In general, I've found that bad skewness/asymmetry messes up significance tests more than heavy tails. I know I read this somewhere long ago, and it seems to work pretty well. When you have skewness, looking for transformations is a good idea. Where you can get messed up is when the skewness is caused by a lumping at a value - e.g. the number of subjects who have 0 days of hospitalization. Then no transformation will help - and it's probably better to fit separate models for whether there is any response and for the response given that it's greater than 0. This might be a two-part or hurdle model or a mixture of distributions (such as zip or zinb).

Tony

Peter A. Lachenbruch
Department of Public Health
Oregon State University
Corvallis, OR 97330
Phone: 541-737-3832
FAX: 541-737-4001

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Maarten buis
Sent: Wednesday, November 26, 2008 6:49 AM
To: statalist@hsphsun2.harvard.edu
Subject: Re: RE: st: significance of mean and median

--- Bastian Steingros <Steingros@gmx.de> wrote:
> using
> sysuse auto, clear
> reg mpg, nohe
> mean mpg
> ttest mpg==0
>
> displays the same results. However, how do these tests deal with the
> assumption, that mpg has to normal distributed?
> More precisely , how important is the fact that mpg is normal
> distributed? Most of the variables in my sample are left or right
> skewed...
> Is ttest also in this case reliable it?

One way to find this out is simulation, using -simulate-. You declare your data to be the population and repeatedly test a true hypothesis on a random sample from your "population" (N out of N with replacement, just like the bootstrap), and then you look at whether the p-value follows a uniform distribution, and whether you reject the null in only 5% of the samples. See the example below and http://ideas.repec.org/p/boc/nsug08/14.html .

*-------------- begin example --------------------
capture program drop sim
program define sim, rclass
    // treat the auto data as the "population"
    sysuse auto, clear
    // center mpg so that the null hypothesis (mean = 0) is true
    sum mpg, meanonly
    replace mpg = mpg - r(mean)
    // draw N out of N with replacement
    bsample
    ttest mpg == 0
    return scalar p = r(p)
end

simulate p=r(p), reps(5000): sim
hist p        // should be a uniform distribution
gen sig = p < .05
sum sig       // mean should be .05
*--------------- end example --------------------
(For more on how to use examples I sent to the Statalist, see
http://home.fsw.vu.nl/m.buis/stata/exampleFAQ.html )

> by the way, median mpg require a option. So, how can I test if the
> median of a var. is significant without using this command? Because I
> have no idea which by-option would make sense in my sample.

I think that the term "significant" has done more harm than good because it hides the null hypothesis. As a consequence too many nonsensical hypotheses are being tested. What you need to do is to specify a null hypothesis and justify why anyone should care about this hypothesis. The hypothesis that the mean or the median of a variable is zero is almost never of interest, and thus should almost never be tested. It is usually much more interesting to compare the mean/median between groups, for example men and women. So this is probably why it never occurred to someone (or no one thought it was worth their time) to implement a test of whether or not the median is equal to a certain fixed value.

> Nick Cox seems not to be fully agreed with LAD/qreg...

Nick can speak for himself, but I got the impression that he wasn't negative about -qreg-, but just noted that -qreg- did not have a neat test equivalent like -regress- and -ttest-.

-- Maarten

-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room N515

+31 20 5986715

http://home.fsw.vu.nl/m.buis/
-----------------------------------------

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
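
The two-part idea mentioned at the top of this message (one model for "no response" and another for the response given that it is greater than 0) might look roughly like the sketch below. This is only an illustration under stated assumptions: the data are simulated, and the variable names (age, admitted, days, anydays) are hypothetical and do not come from anyone's data in this thread.

```stata
* simulated data; every variable name here is hypothetical
clear
set seed 20081126
set obs 500
generate age = 20 + floor(60*runiform())
generate byte admitted = runiform() < invlogit(-2 + .03*age)
generate days = cond(admitted, rpoisson(exp(.5 + .01*age)), 0)

* part 1: is there any response at all?
generate byte anydays = days > 0
logit anydays age

* part 2: size of the response, given that it is greater than 0
glm days age if days > 0, family(gamma) link(log)

* alternative: a mixture model for counts with excess zeros
zip days age, inflate(age)
```

If the counts are also overdispersed, -zinb- takes the same inflate() option. Which of these is appropriate depends on how the zeros arise, so treat this as a starting point rather than a recommendation.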

**Follow-Ups**:
- **R: RE: st: significance of mean and median** *From:* "Carlo Lazzaro" <carlo.lazzaro@tiscalinet.it>
- **RE: RE: st: significance of mean and median** *From:* Maarten buis <maartenbuis@yahoo.co.uk>

**References**:
- **Re: RE: st: significance of mean and median** *From:* "Bastian Steingros" <Steingros@gmx.de>
- **Re: RE: st: significance of mean and median** *From:* Maarten buis <maartenbuis@yahoo.co.uk>

