Re: st: - uniform() - random variable not uniformly distributed over interval


From   "Michael S. Hanson" <[email protected]>
To   [email protected]
Subject   Re: st: - uniform() - random variable not uniformly distributed over interval
Date   Wed, 9 Mar 2005 21:05:28 -0500

On Mar 9, 2005, at 8:19 PM, Deborah Garvey wrote:

I am generating a random number with the goal of (randomly) assigning
children who report multiple races to a single race category for
purposes of analysis.  I'd like to evenly distribute children across
single race categories.

I confess I don't fully understand what you are doing, but in light of your message I have a comment below.



When I use uniform() to generate a random number, I don't get an even split:

. gen r = uniform() if k_race == 801
(363374 missing values generated)

. su r

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
           r |      2309    .4995691    .2914078   .0007709   .9993796

. gen byte k_race2 = .
(365683 missing values generated)

. replace k_race2 = 100 if r <= 0.5
(1151 real changes made)

. replace k_race2 = 200 if r > 0.5 & r <= 1
k_race2 was byte now int
(1158 real changes made)

I am puzzled. I would expect median = mean = 0.5 for a uniform number
defined over [0,1).

It does equal 1/2 -- within the sampling error inherent in drawing random numbers from a given population distribution. If you always got *exactly* 1/2 of a "random" sample from a uniform distribution on either side of the mean/median, you should be very, very skeptical of your random number generator!
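
To put a rough number on that sampling error, here is a minimal sketch using only the figures from the log above (2309 draws, 1151 of them at or below 0.5, mean .4995691). The scalar names are made up for illustration, and it assumes the draws are independent. The binomial standard deviation of the count comes out near 24, so falling 3.5 short of the expected 1154.5 is well under one standard deviation; the sample mean is likewise a small fraction of a standard error away from 0.5.

```stata
* Sketch: is a 1151/1158 split consistent with Pr(r <= 0.5) = 0.5?
* Counts are taken from the log above; scalar names are illustrative only.
scalar n_obs = 2309
scalar k_low = 1151
scalar p0    = 0.5

* Expected count and binomial standard deviation of the count
scalar e_low  = n_obs*p0
scalar sd_low = sqrt(n_obs*p0*(1 - p0))
display "expected = " e_low "   sd = " %6.3f sd_low
display "z(count) = " %6.3f (k_low - e_low)/sd_low

* Standard error of the mean of n_obs uniform draws: sqrt(1/12)/sqrt(n)
scalar se_mean = sqrt(1/12)/sqrt(n_obs)
display "se(mean) = " %8.6f se_mean
display "z(mean)  = " %6.3f (.4995691 - 0.5)/se_mean

* Exact binomial test of H0: Pr(r <= 0.5) = 0.5
bitesti 2309 1151 0.5
```

Both z values are far inside the usual +/-2 range, which is the sense in which the observed split agrees with 1/2.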

Thus, my quick comment: don't expect your "randomly" re-assigned race categories to reproduce the distribution of the remaining race categories in your larger sample -- that's not how random variables (should) work. Even worse would be to find some way to force the "randomly" re-assigned individuals to follow the sample distribution rather than let them be chosen randomly given your population assumptions -- assuming you truly desire random assignment. As I said above, I don't understand your application well enough to advise you here. HTH.

-- Mike
