
# st: How to do Pseudo-class draws in STATA?

 From ChengShi Shiu To "statalist@hsphsun2.harvard.edu" Subject st: How to do Pseudo-class draws in STATA? Date Thu, 18 Apr 2013 03:45:46 +0000

Hi Statalist members,

I would like to do pseudo-class draws in Stata. I have some ideas about how to do it, but I am not sure how to implement them in Stata.

By "pseudo-class draws", I mean the procedure described in Bandeen-Roche, K. et al. (1997). Latent variable regression for multiple discrete outcomes, JASA, 92(440): 1375-1386, and Wang, C.-P. et al. (2005). Residual diagnostics for growth mixture models, JASA, 100(471): 1054-1075. A more concrete example can be found in Petras, H. & Masyn, K. (2010). General growth mixture analysis with antecedents and consequences of change, Handbook of Quantitative Criminology, pp. 69-100. In short, if a person has probabilities 0.5, 0.3, and 0.2 of belonging to the 1st, 2nd, and 3rd classes respectively, then 100 independent random draws should yield approximately 50 "class 1", 30 "class 2", and 20 "class 3". Because the draws are random, the exact counts may not match the probabilities perfectly, and the order of the drawn classes will be random too.
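To make the example concrete, here is a minimal sketch of 100 independent draws for a single person with class probabilities 0.5, 0.3, and 0.2. It compares a uniform random number with the cumulative probabilities 0.5 and 0.8. The seed and the variable names `u` and `class` are arbitrary choices for illustration and do not come from the cited papers.

```stata
clear
set seed 12345            // arbitrary seed, only for reproducibility
set obs 100               // one observation per draw

* u is a uniform random number on the unit interval
generate double u = runiform()

* cut u at the cumulative probabilities: below 0.5 is class 1,
* from 0.5 up to 0.8 is class 2, the rest is class 3
generate byte class = cond(u < 0.5, 1, cond(u < 0.8, 2, 3))

* the counts should be near 50/30/20 but will vary with the seed
tabulate class
```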

So I would like to do this in Stata.

Say I want to make 20 draws (as recommended in Wang et al.'s paper) for each person in the dataset, and thus create 20 datasets, each holding one draw per person. This is roughly parallel to multiple imputation. But I am not sure how to do it in Stata.

The following is how I think I might do it, but I am not sure how to implement it:
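A minimal sketch of the "20 datasets" idea, assuming the class probabilities sit in variables named `p1`, `p2`, and `p3` that sum to 1 in every row. Those variable names, the `pclass` variable, the `pcdraw` file names, and the fake 500-person dataset are all hypothetical and only there to make the example self-contained. Each pass draws one class per person directly from that person's probabilities and saves the result as its own file.

```stata
clear
set seed 2013                 // arbitrary seed for reproducibility
set obs 500                   // stand-in for the real dataset of 500 people
generate id = _n

* hypothetical class probabilities; the real ones would come from the fitted model
generate double p1 = 0.5
generate double p2 = 0.3
generate double p3 = 0.2

forvalues d = 1/20 {
    * one independent draw per person from that person's own probabilities
    generate double u = runiform()
    generate byte pclass = cond(u < p1, 1, cond(u < p1 + p2, 2, 3))
    drop u

    * writes pcdraw1.dta ... pcdraw20.dta to the current directory
    save pcdraw`d', replace
    drop pclass
}
```

Each saved file then holds the original variables plus one drawn class, much like one completed dataset in multiple imputation.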

1. I will create a variable for each person. Each variable will have 100 observations: using the example above, 50 "1", 30 "2", and 20 "3" (as the probabilities for classes 1, 2, and 3 are 0.5, 0.3, and 0.2 respectively). So, if there are 500 people in my dataset, I will need to create 500 new variables.

2. I will randomly sort each of these new variables.

3. Then I will select the first 20 observations.

4. I will transpose the data selected in step 3, so that each row represents one of the original people/observations and the columns hold the 20 new draws for each individual.
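The four steps above could be sketched roughly as follows, using `expand` to build 100 rows per person in long form rather than 500 separate variables. This is only one possible reading of the plan. The variable names `id`, `p1`, `p2`, `p3`, `class`, and `draw`, and the fake 500-person dataset, are assumptions made for illustration.

```stata
clear
set seed 2013                 // arbitrary seed for reproducibility
set obs 500                   // stand-in for the real dataset of 500 people
generate id = _n
generate double p1 = 0.5      // hypothetical class probabilities
generate double p2 = 0.3
generate double p3 = 0.2

* Step 1: 100 rows per person, filled with classes in proportion to p1-p3
expand 100
bysort id: generate byte class = ///
    cond(_n <= round(100*p1), 1, cond(_n <= round(100*(p1 + p2)), 2, 3))

* Step 2: shuffle the rows within each person
generate double u = runiform()
sort id u

* Step 3: keep the first 20 shuffled rows per person and number them
by id: keep if _n <= 20
by id: generate draw = _n
drop u

* Step 4: one row per person, with the draws in class1-class20
reshape wide class, i(id) j(draw)
```

Note that shuffling a fixed pool of 50/30/20 values and keeping the first 20 samples without replacement, so the 20 values are not strictly independent draws. Comparing a fresh uniform number with the cumulative probabilities for each draw would give independent draws instead.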

This is what I would like to do, but I can't figure out the first step: how do I create this new variable containing replicated numbers according to other variables that hold the class probabilities? (In this case, I would have three other variables containing the probabilities for each of the three classes.)

That is why I have come here for help.

Any suggestions would be much appreciated.

Or, if there is any other way to do the pseudo-class draws, I would love to know that.

Thank you so much.

Best,

Koh

Cheenghee M. Koh, MSW, MS(c), PhD(c)
University of Chicago