
st: AW: Simulate a skewed variable in stata, sample vs. population skewness

From	"Martin Weiss" <[email protected]>
To	<[email protected]>
Subject	st: AW: Simulate a skewed variable in stata, sample vs. population skewness
Date	Mon, 7 Dec 2009 11:46:18 +0100


"In case my question is unclear the following simple example may help
illustrate the gist of my problem. Let's assume that we want to study
how the OLS-estimator performs in small samples when the error terms
are skewed."


You may want to look at http://www.stata-press.com/books/mus.html, section
4.6.1, where a similar exercise is conducted.


Re your -program-, you can shorten it to


********
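* drop any earlier definition so that the code can be re-run
capture program drop skewchi
program define skewchi
	version 9.2
	drop _all
	set obs 10
	* a squared standard normal draw is chi-square with 1 degree of freedom
	gen double x2=(invnorm(uniform()))^2
	* skewchi is not rclass, so the r() results of -summarize- stay available
	qui sum x2, detail
end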

simulate mean=r(mean) var=r(Var) skew=r(skewness), ///
reps(1000) seed(1) dots: skewchi 

sum
********
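
If it helps, here is a minimal sketch of a variant that takes the number
of observations as an argument, so that you can watch the average sample
skewness move toward the population value of about 2.83 as the samples
get larger. The program name -skewchin-, its argument -nobs- and the
sample sizes in the loop are my own choices for illustration; they are
not part of your code.

********
capture program drop skewchin
program define skewchin, rclass
	version 9.2
	* nobs: number of observations per simulated sample (hypothetical argument)
	args nobs
	drop _all
	set obs `nobs'
	* squared standard normal draws, i.e. chi-square with 1 degree of freedom
	gen double x2=(invnorm(uniform()))^2
	quietly summarize x2, detail
	return scalar skew=r(skewness)
end

* average the sample skewness over 1000 replications for each sample size
foreach n of numlist 10 100 1000 10000 {
	quietly simulate skew=r(skew), reps(1000) seed(1): skewchin `n'
	quietly summarize skew
	display "n = " %6.0f `n' "   mean sample skewness = " %5.3f r(mean)
}
********

Note that -simulate- replaces the data in memory with the replication
results, so run this only when nothing unsaved is loaded.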

HTH
Martin


-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Karl-Oskar
Lindgren
Sent: Monday, 7 December 2009 11:06
To: [email protected]
Subject: st: Simulate a skewed variable in stata, sample vs. population
skewness

Dear listusers,

I have a question that I guess is partly statistical and partly
philosophical. In a paper that uses Monte Carlo simulations to study
the small-sample performance of an estimator I was asked by a referee
to investigate how the estimator performs when the error terms are
skewed.

When trying to implement this suggestion I realized that the sample
skewness as reported by Stata can differ considerably from the
skewness of the underlying population (although both the sample mean
and the sample variance of the variable remain close to their population
counterparts). My question is therefore whether it is the sample skewness
or the population skewness that should be kept constant when
examining the small-sample performance of a statistical estimator.

In case my question is unclear the following simple example may help
illustrate the gist of my problem. Let's assume that we want to study
how the OLS estimator performs in small samples when the error terms
are skewed. In order to do this we decide to generate 10 error terms
from a chi-square distribution with 1 degree of freedom. The
population skewness should then be 2^(3/2), i.e., about 2.8. But if I
generate 1000 samples from such a distribution in Stata, the average
skewness across these 1000 samples turns out to be about 1.3 (see the
example code below). I understand that the reason for the discrepancy
is that measures of skewness tend to be biased in small samples when
the variables are non-normal (indeed the sample skewness
approaches its theoretical level as we increase the number of
observations in the example below).
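
For reference, the population figure quoted above can be checked from
the central moments of a chi-square variable with k degrees of freedom
(mean k, variance 2k, third central moment 8k). Written out in LaTeX
notation, with \gamma_1 denoting the population skewness:

\[
\gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} = \frac{8k}{(2k)^{3/2}} = \sqrt{8/k},
\qquad k = 1 \;\Rightarrow\; \gamma_1 = 2^{3/2} \approx 2.83 .
\]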

My question, however, concerns whether it is the sample skewness or
the population skewness that I should keep constant in my
replications when I vary the other parameters of the model. If it is
the population skewness, the implementation is straightforward since
the skewness in the population is known. But if it is the sample
skewness that should be kept constant I would appreciate any hints
about appropriate methods to accomplish this.


**Example code to illustrate the bias of r(skewness)
program define skewchi, rclass
	version 9.2
	drop _all
	set obs 10
	gen double x=invnorm(uniform())
	gen double x2=x^2
	
	sum x2, detail
	return scalar mean=r(mean)
	return scalar var=r(Var) 
	return scalar skew=r(skewness)
	
end

simulate mean=r(mean) var=r(var) skew=r(skew), ///
	reps(1000) seed(1) dots: skewchi
sum
        
Best wishes,

Karl-Oskar Lindgren
Department of Government
Uppsala University
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


