
From: "Lachenbruch, Peter" <Peter.Lachenbruch@oregonstate.edu>
To: <statalist@hsphsun2.harvard.edu>
Subject: RE: st: Verify randomization in a large sample
Date: Wed, 1 Oct 2008 08:29:09 -0700

The tests Dr. Nichols notes assume normality. That is not much of an issue for the univariate tests unless there is bad skewness, but the multivariate test based on Hotelling could be a problem, as it isn't quite as robust to non-normality.

The testing of balance after randomization is often done in the pharmaceutical industry, but Senn had a good article in Statistics in Medicine on this about 10 years ago. It's not sensible: all it does is verify whether you did the job right, and if you didn't, what then? Others have suggested using the test to determine whether you should adjust for the variables that aren't balanced. This allows the data to determine the analysis, and is completely exploratory. If you are planning to adjust for covariables, you should specify these a priori, and usually these are chosen for their potential effect on the response.

*******************
*                 *
*    SOAP BOX     *
*                 *
*******************

Tony

Peter A. Lachenbruch
Department of Public Health
Oregon State University
Corvallis, OR 97330
Phone: 541-737-3832
FAX: 541-737-4001

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Austin Nichols
Sent: Tuesday, September 30, 2008 7:05 PM
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Verify randomization in a large sample

José Luis Chávez Calva <josechc@gmail.com>:
The only way to verify randomization is to observe the randomization mechanism. But you can check the balance by comparing means of several variables in the dataset like age, gender, language, etc. across categories. For example, if you have treatment and control groups defined by a variable t (=0 for control and =1 for treatment), you can do

  hotelling age gender language etc, by(t)

or

  reg t age gender language etc

to get an F test of the null that all means are the same. Assuming variances may differ, you can

  reg t age gender language etc, r

and for alternative models you can run logit or probit instead (to get a chi2 test).
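A minimal sketch of the balance checks named above, run on simulated data so that it is self-contained. The dataset, seed, sample size, and the variables female and language are made up for illustration; only the commands themselves (-hotelling-, -regress-, -logit-) come from the message, and the sketch assumes a binary t.

```stata
* Sketch only: simulated data; seed, sample size, and variable names are hypothetical
clear
set seed 12345
set obs 1000
generate byte t = runiform() < 0.5          // randomized 0/1 treatment indicator
generate age = 18 + floor(60*runiform())    // covariates drawn independently of t
generate byte female = runiform() < 0.5
generate byte language = runiform() < 0.3

* Joint test that the covariate means are equal across the two arms
hotelling age female language, by(t)

* Regress t on the covariates; the overall F statistic tests joint balance
regress t age female language

* Same regression with robust standard errors (variances may differ)
regress t age female language, vce(robust)

* Logit alternative; the model chi2 tests the same joint null
logit t age female language
```

Because t is generated independently of the covariates here, none of these tests should reject much more often than the nominal level across repeated seeds.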
Presumably, for a categorical t you could run

  mlogit t age gender language etc

or -mprobit- assuming a specific error distribution under the null of randomization (in which case the X vars should not help you predict t). All of that is just for comparisons of means; for higher moments you will need tests of equality of distributions (e.g. -ksmirnov-) or graphical methods (e.g. -qqplot-).

On Tue, Sep 30, 2008 at 8:18 PM, José Luis Chávez Calva <josechc@gmail.com> wrote:
> Dear Stata users:
>
> I have a dataset on household income with a large number of
> individuals. The set contains one variable indicating the locality
> where each individual lives and another one indicating the household
> to which this individual belongs. I would like to know how to
> verify randomization both at locality and household level using
> several variables in the dataset like age, gender, language, etc.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
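A minimal sketch of the multi-group and distributional checks mentioned in the quoted message, again on simulated data. The three-arm variable arm, the seed, and the covariates are hypothetical; the commands (-mlogit-, -ksmirnov-, -qqplot-) are the ones named in the message, and -separate- is used here only to split age by group for the plot.

```stata
* Sketch only: simulated data; seed, sample size, and variable names are hypothetical
clear
set seed 2008
set obs 1500
generate byte arm = 1 + floor(3*runiform())   // three randomized groups coded 1, 2, 3
generate age = 18 + floor(60*runiform())      // covariates drawn independently of arm
generate byte female = runiform() < 0.5

* Under randomization the covariates should not predict group membership;
* the model chi2 tests that jointly
mlogit arm age female

* Two-sample Kolmogorov-Smirnov test; by() must identify exactly two groups
ksmirnov age if inlist(arm, 1, 2), by(arm)

* Quantile-quantile plot of age in group 1 against age in group 2
separate age, by(arm)                         // creates age1, age2, age3
qqplot age1 age2
```

With more than two groups, -ksmirnov- has to be run pairwise, which raises the usual multiple-comparison concern on top of the objections to balance testing raised in the reply.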

**Follow-Ups**: **Re: st: Verify randomization in a large sample**, *From:* "Austin Nichols" <austinnichols@gmail.com>

