



Re: st: Generate random missing values within a set of variables


From   Eric Booth <eric.a.booth@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Generate random missing values within a set of variables
Date   Sun, 15 Apr 2012 12:08:41 -0500


Others have offered solutions with the data in wide format; here is another approach that first reshapes the data to long (to me, this is the easier route):

*******************! watch for wrapping below:
*-- make some fake data
clear
set seed 12345  //optional; any fixed seed makes the draws reproducible
set obs 1000
forvalues i = 1/6 {
      gen x`i' = rnormal()
}
g i = _n
reshape long x, i(i) j(j)
tempvar rand rand2




*--1. "I would like to randomly generate one missing value in one of the 6 variables per line/observation"
*     give every row a random number, sort within i by it, and flag the j that lands first
bys i: gen `rand' = runiform()
bys i (`rand'): gen missingone = j if _n==1

*--2. "then in another set of variables randomly generate 2 missing values per line/observation - 2 out of the 6 variables"
*     same idea with a fresh random number, flagging the first two j within each i
bys i: gen `rand2' = runiform()
bys i (`rand2'): gen missingtwo = j if inlist(_n, 1, 2)




*--3. Make one or two values missing, as described
clonevar x_two = x  //x_two is for your second condition (2 missing values per i)
lab var x_two "same as x, but will have 2 missing values per i"
replace x = . if !mi(missingone)
replace x_two = . if !mi(missingtwo)
sort i j
ta miss*  //two-way table of missingone by missingtwo



*--reshape back if you want this data to be wide again
drop __*  missing*  //__* matches the tempvars created above
reshape wide x x_two, i(i) j(j)
*******************!
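
For comparison, here is a rough sketch of how the same result might be obtained without reshaping, by drawing random column indices per row. This is only an illustration under my own assumptions, not necessarily what the other posters suggested; the names k1, a, b, nmiss_one and nmiss_two are made up for the example, and the seed is arbitrary. The final asserts check that each row ends up with exactly one (or two) missing values.

```stata
*-- hypothetical wide-format sketch (illustrative names, arbitrary seed)
clear
set seed 20120415
set obs 1000
forvalues i = 1/6 {
      gen x`i' = rnormal()
      gen x_two`i' = x`i'          //copy that will receive two missing values
}

*-- one missing value per row: draw a column index from 1..6
gen byte k1 = 1 + floor(6*runiform())
forvalues i = 1/6 {
      replace x`i' = . if k1 == `i'
}

*-- two missing values per row: draw two distinct column indices
gen byte a = 1 + floor(6*runiform())   //first index, 1..6
gen byte b = 1 + floor(5*runiform())   //second index among the remaining five
replace b = b + 1 if b >= a            //shift past the first index so a != b
forvalues i = 1/6 {
      replace x_two`i' = . if inlist(`i', a, b)
}

*-- check: count missing values across each row
egen nmiss_one = rowmiss(x?)           //x? matches x1-x6 only
egen nmiss_two = rowmiss(x_two?)
assert nmiss_one == 1
assert nmiss_two == 2
```

The `x?` and `x_two?` wildcards are used instead of a range such as x1-x6 because a range depends on variable order in the dataset, which here interleaves the two sets.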

- Eric


__
Eric A. Booth
Public Policy Research Institute 
Texas A&M University
ebooth@ppri.tamu.edu
+979.845.6754

On Apr 15, 2012, at 7:06 AM, Sofia Ramiro wrote:

> Dear all, 
> 
> I have 6 variables (without missings) and I would like to randomly generate one missing value in one of the 6 variables per line/observation (and then in another set of variables randomly generate 2 missing values per line/observation - 2 out of the 6 variables). 
> I know that with the runiform command we manage to choose observations randomly within one variable (so I could generate random missing values within one variable), but how can I choose randomly one variable out of the 6 to be transformed into missing and make sure that one of them is being transformed per observation?
> 
> I appreciate your help.
> 
> Thanks!
> 
> Sofia
> 		 	   		  
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/



