Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | "Thirukumaran, Caroline Pinto" <cpt8913@gmail.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: RE: Check for coding of new variables using data patterns |
Date | Tue, 22 Oct 2013 15:19:36 -0400 |
Joe,

I'm sorry for not mentioning the -collapse- code I used. Here it is:

. collapse (first) var1 var2 var3 var4 var5, by(newvar)

I tried several -collapse- statistics, including count and first, but none of them gave me the output I needed. The -groups- command suggested by David and the code provided by Sergiy work perfectly! Thank you.

On Tue, Oct 22, 2013 at 2:48 PM, Joe Canner <jcanner1@jhmi.edu> wrote:
> Caroline,
>
> Yes, it would be nice to have something in Stata like the SAS "/ list" option. In the meantime, you don't say what exactly you did with -collapse-, so it's hard to say why that doesn't work. If you have a variable (or can create one) that is nonmissing when var1-var5 are nonmissing, you could do:
>
> . collapse (count) somevar, by(newvar var1-var5)
>
> Regards,
> Joe Canner
> Johns Hopkins University School of Medicine
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Thirukumaran, Caroline Pinto
> Sent: Tuesday, October 22, 2013 2:19 PM
> To: statalist@hsphsun2.harvard.edu
> Subject: st: Check for coding of new variables using data patterns
>
> Hi,
> Is there a way to check the coding of a newly created variable using the data patterns of the variables that were used to create it?
>
> As an example, newvar is a variable that has been created based on the values of var1-var5. The code used to create newvar is as follows:
> egen newvar=rsum(var1 var2 var3 var4 var5)
>
> To check that newvar has been correctly coded, it would be helpful to have output like the table below:
>
> newvar  var1  var2  var3  var4  var5  Frequency
>      0     0     0     0     0     0         10
>      1     0     0     0     0     1         20
>      1     0     1     0     0     0         40
>      2     0     0     1     1     0         70
>      2     0     1     1     0     0         80
>      3     2     0     1     0     0        110
>      3     1     0     1     0     1        120
>      4     1     0     1     2     0        130
>
> -collapse- gives acceptable output (though without the frequency count) only when var1-var5 are binary.
>
> I am using Stata 12.1 for Windows.
>
> I get the output tabulated above from SAS using the following code:
> proc freq data=abc;
> tables newvar*var1*var2*var3*var4*var5 / list missing;
> run;
>
> Many thanks in advance,
> Caroline Thirukumaran
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/
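
For readers who find this thread later: one way to get a pattern-by-frequency listing like the one requested, using only official Stata commands, is -contract-. The following is a minimal sketch and is not the code posted by David or Sergiy, which is not reproduced in this message. The toy data (200 observations, values 0 to 2, the seed) are made up purely for illustration; only the variable names newvar and var1-var5 come from the thread.

```stata
* Sketch only: hypothetical toy data standing in for the real dataset.
clear
set seed 12345
set obs 200
forvalues i = 1/5 {
    generate var`i' = floor(3*runiform())   // integers 0, 1, or 2
}

* rowtotal() is the documented egen function that replaced rsum().
egen newvar = rowtotal(var1-var5)

* -contract- replaces the data in memory with one row per distinct
* combination of the listed variables, plus a frequency variable.
preserve
contract newvar var1-var5, freq(Frequency)
list, sepby(newvar) noobs
restore
```

Because -contract- overwrites the dataset in memory, the sketch wraps it in -preserve- and -restore- so the original observations come back afterwards. Unlike -collapse (first)-, this keeps every distinct combination of var1-var5 within each value of newvar, which is what makes it usable when the source variables are not binary.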