
Re: RE: st: significance of mean and median

From	Maarten buis <[email protected]>
To	[email protected]
Subject	Re: RE: st: significance of mean and median
Date	Wed, 26 Nov 2008 14:48:36 +0000 (GMT)

--- Bastian Steingros <[email protected]> wrote:
> using 
> sysuse auto, clear
> reg mpg, nohe
> mean mpg
> ttest mpg==0
> 
> displays the same results. However, how do these tests deal with the
> assumption that mpg has to be normally distributed? 
> More precisely, how important is it that mpg is normally
> distributed? Most of the variables in my sample are left or right
> skewed... 
> Is ttest still reliable in this case? 

One way to figure this out is to use simulation, for example with
-simulate-. You declare your data to be the population and
repeatedly test a true hypothesis on a random sample from your
"population" (N out of N with replacement, just like the bootstrap),
and then you look at whether the p-values follow a uniform distribution,
and whether you reject the null in only 5% of the samples. See the
example below and http://ideas.repec.org/p/boc/nsug08/14.html .

*-------------- begin example --------------------
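capture program drop sim
program define sim, rclass
	// treat the auto data as the "population"
	sysuse auto, clear
	// center mpg so that the null hypothesis (mean = 0) is true
	sum mpg, meanonly
	replace mpg = mpg - r(mean)
	// draw N out of N observations with replacement
	bsample
	ttest mpg == 0
	return scalar p = r(p)
end
// collect the p-value from each of 5000 replications
simulate p=r(p), reps(5000): sim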

hist p // should be a uniform distribution

gen sig = p < .05

sum sig // mean should be .05
*--------------- end example --------------------
(For more on how to use examples I sent to the Statalist, see
http://home.fsw.vu.nl/m.buis/stata/exampleFAQ.html )
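
When reading the results, keep in mind that the simulated rejection rate
is itself an estimate. As a rough sketch, assuming the 5000 simulated
p-values from the example above are still in memory: if the true
rejection rate were .05, the Monte Carlo standard error would be about
.003, so deviations of that size are just simulation noise. The
-ksmirnov- line is only one possible formal check of uniformity, not
the only sensible one.

*-------------- begin example --------------------
// Monte Carlo standard error of the rejection rate
// if the true rate is .05 and there are 5000 replications
di sqrt(.05*.95/5000)

// compare p with the uniform(0,1) distribution,
// whose CDF evaluated at p is just p
ksmirnov p = p
*--------------- end example --------------------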

> By the way, median mpg requires a by() option. So how can I test whether
> the median of a variable is significant without using this command?
> Because I have no idea which by() option would make sense in my sample.

I think that the term "significant" has done more harm than good
because it hides the null hypothesis. As a consequence too many
nonsensical hypotheses are being tested. What you need to do is to
specify a null hypothesis and justify why anyone should care about this
hypothesis. The hypothesis that the mean or the median of a variable is
zero is almost never of interest, and thus should almost never be
tested. It is usually much more interesting to compare the mean/median
between groups, for example men and women. So this is probably why it
never occurred to anyone (or no one thought it was worth their time) to
implement a test whether or not the median is equal to a certain fixed
value. 
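
A minimal sketch of what such a group comparison could look like, using
the auto data again. The choice of foreign as the grouping variable is
only for illustration; in your own data the grouping should follow from
your substantive question.

*-------------- begin example --------------------
sysuse auto, clear

// compare mean mpg between domestic and foreign cars
ttest mpg, by(foreign)

// nonparametric test of equal medians across the two groups
median mpg, by(foreign)

// median regression: the coefficient of foreign estimates
// the difference in median mpg between the groups
qreg mpg foreign
*--------------- end example --------------------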

> Nick Cox seems not to fully agree with LAD/qreg...

Nick can speak for himself, but I got the impression that he wasn't
negative about -qreg-, but just noted that -qreg- did not have a neat
test equivalent like -regress- and -ttest-. 

-- Maarten

-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room N515

+31 20 5986715

http://home.fsw.vu.nl/m.buis/
-----------------------------------------

