

From: Maarten Buis <maartenlbuis@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Questions for random data generation and value label
Date: Thu, 14 Mar 2013 10:26:10 +0100

On Thu, Mar 14, 2013 at 12:50 AM, Yu Xue wrote:
> http://galton.uchicago.edu/~collins/resources/stata/stata-commands.html,
> which shows how to generate random data with some specific parameters
> without mentioning the type of distribution.

You misunderstood that website: it does specify the distribution from which it draws the random numbers. In the first example it draws the random variables from a uniform distribution and in the second example from a normal distribution.

So first you need to tell us which distribution you want to draw from. The mean, standard deviation, min and max are not sufficient to define a distribution; see the discussion and example below. So the answer you need to give us is either "normal", or "uniform", or "gamma", or "Laplace", or "beta", or ...

> If I have to specify the type of distribution in order for you to
> answer my question, I will specify a normal distribution.

As we have explained before, you cannot have a normal distribution and specify bounds. By definition, the normal distribution is a distribution for variables that can take values between -infinity and +infinity.

The way you write, it seems like you just picked one distribution that sounded familiar. That is not a good criterion. You really need to consider what you want to use your random draws for and what their properties should be.

> Min in "seq_num" and "seq_num1" are very different, which is what I
> called "not accurate" before.

As we have said before, if you want strict adherence to the minimum and maximum you could consider drawing from a beta distribution. However, the fact that the mean, standard deviation, min and max correspond to the values you specify is not enough to guarantee that the distribution is appropriate, as can be seen in the example below. The example requires the -qplot- package, which you can find and install using -findit qplot-.
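The `alpha' and `beta' lines in the example that follows appear to match the first two moments of a beta distribution after rescaling price to the unit interval. As a sketch of that algebra, using m for the rescaled mean and v for the rescaled variance (the square of `sd' in the code; the symbol v is a label chosen here, not one used in the post):

```latex
% X ~ Beta(alpha, beta) on [0, 1]
% m = (mean - min) / (max - min),  v = (sd / (max - min))^2
\begin{align*}
m &= \frac{\alpha}{\alpha+\beta}, &
v &= \frac{m(1-m)}{\alpha+\beta+1} \\[4pt]
\alpha+\beta &= \frac{m(1-m)}{v} - 1 \\[4pt]
\alpha &= m\left(\frac{m(1-m)}{v} - 1\right), &
\beta &= (1-m)\left(\frac{m(1-m)}{v} - 1\right)
\end{align*}
```

Both parameters are positive only when v < m(1-m). A draw B from this beta distribution is then mapped back to the original scale as min + (max - min)*B, which is why the simulated values cannot fall outside the observed minimum and maximum.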
*------------------ begin example ------------------
sysuse auto, clear
sum price

tempname m sd min max
scalar `m' = ( r(mean) - r(min) ) / ( r(max) - r(min) )
scalar `sd' = r(sd) / ( r(max) - r(min) )
scalar `min' = r(min)
scalar `max' = r(max)

tempname alpha beta
scalar `alpha' = `m'*((`m'*(1-`m'))/(`sd'^2)-1)
scalar `beta' = (1-`m')*((`m'*(1-`m'))/(`sd'^2)-1)

forvalues i = 1/19 {
    gen sim`i' = rbeta(`alpha', `beta')*(`max' - `min') + `min'
}

// means and standard deviations differ as much as one would
// expect with random draws and the min and max are strictly
// maintained
sum price sim*

// still the distribution of the simulated variables differs
// considerably from the distribution of price
qplot price sim*, ///
    trscale(invibeta(`alpha',`beta',@)*(`max' - `min') + `min') ///
    ms(oh none ..) c(. l ..) lc(gs10 ..) legend(off)
*------------------- end example -------------------
(For more on examples I sent to the Statalist see:
http://www.maartenbuis.nl/example_faq )

---------------------------------
Maarten L. Buis
WZB
Reichpietschufer 50
10785 Berlin
Germany

http://www.maartenbuis.nl
---------------------------------

