Thank you so much Eric!! It worked.
But I am curious why LONG format is better. I thought wide would be better, given the condition on the last two years.

Rituparna

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Eric Booth
Sent: Wednesday, April 11, 2012 6:08 PM
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: another coding que to group vars
Importance: High

It's probably easier to work with these data in long (rather than wide) format, but here's one way to do it:

clear
************************!
clear
inp ID str1( Y1 Y2 Y3 Y4 Y5 Y6)
1 x x x x x x
2 y y y y y y
3 x x x x . x
4 x y y x x x
5 y y y x x x
6 x . y . . x
7 x y x x . .
8 x x x x . .
9 y y y y . .
end

* recode the groups as strings "1" (x) and "0" (y), then convert to numeric
foreach v of varlist Y* {
    replace `v' = "0" if `v' == "y"
    replace `v' = "1" if `v' == "x"
}
destring, replace

g type = .
* x = number of years in group x; y = number of years in group y
egen x = rowtotal(Y1-Y6)
egen y = rownonmiss(Y1-Y6)
replace y = y - x
* first = first nonmissing year's value (1 if x, 0 if y)
egen first = rowfirst(Y1-Y6)

replace type = 1 if x > 0 & !y
replace type = 2 if !x & y>0
replace type = 3 if x>0 & y>0 & first
replace type = 4 if x>0 & y>0 & !first
* types 5 and 6 override 1 and 2 when the last two years are missing
replace type = 5 if x > 0 & !y & mi(Y5) & mi(Y6)
replace type = 6 if !x & y>0 & mi(Y5) & mi(Y6)
************************!

- Eric

__
Eric A. Booth
Public Policy Research Institute
Texas A&M University
ebooth@ppri.tamu.edu
Office: +979.845.6754

On Apr 11, 2012, at 5:50 PM, Rituparna Basu wrote:

> Thanks Eric! It looks like the 'group' command is similar to 'concat'.
> I used 'group' but it generated more than 1000 groups.
> My aim is to create the 'TYPE' variable such that X & Y are grouped into six broad categories
> (type 1 is group X throughout, regardless of missing values; type 2 is group Y throughout, regardless of missing values; type 3 starts as group X and then changes to Y at any time point, regardless of missing values; type 4 starts as Y and then changes to X at any time point, regardless of missing values; types 5 & 6 have either X or Y throughout, regardless of missing values, except for the last TWO years).
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Eric Booth
> Sent: Wednesday, April 11, 2012 3:29 PM
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: another coding que to group vars
> Importance: High
>
> <>
>
> Take a look at -egen- function group()
>
> e.g., egen TYPE = group(Y1-Y6)
>
> - Eric
>
> __
> Eric A. Booth
> Public Policy Research Institute
> Texas A&M University
> ebooth@ppri.tamu.edu
> Office: +979.845.6754
>
> On Apr 11, 2012, at 2:12 PM, Rituparna Basu wrote:
>
>> Dear Statalist Scholars,
>>
>> I have yet another coding question to ask. Please see the mock data below:
>>
>> ID  Y1  Y2  Y3  Y4  Y5  Y6  TYPE
>> 1   x   x   x   x   x   x   type1
>> 2   y   y   y   y   y   y   type2
>> 3   x   x   x   x   .   x   type1
>> 4   x   y   y   x   x   x   type3
>> 5   y   y   y   x   x   x   type4
>> 6   x   .   y   .   .   x   type3
>> 7   x   y   x   x   .   .   type3
>> 8   x   x   x   x   .   .   type5
>> 9   y   y   y   y   .   .   type6
>>
>> Here ID = Patient ID
>> Y1-Y6 = years
>> X & Y are two types of group
>> TYPE = This is the variable I would like to CREATE by grouping x & y into six categories (type 1 is group X throughout; type 2 is group Y throughout; type 3 starts as group X and then changes to Y at any time point; type 4 starts as Y and then changes to X at any time point; types 5 & 6 have either X or Y throughout except for the last TWO years). The types include missing values.
>>
>> Any help is very much appreciated! Please let me know if more clarification is needed.
>>
>> Thank you in advance!
>>
>> Regards,
>>
>> RB

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
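
On the question of why long format might be easier: the sketch below is one possible long-format version of the same classification, not Eric's own code. It assumes the same mock data and the same six-year layout; the variable names first, anyx, anyy and lastyr are illustrative only. In long format the first observed group is simply Y[1] within each ID, and "missing in the last two years" becomes "no observed year after year 4", so the rules do not need to name Y1-Y6 individually.

```stata
clear
input ID str1(Y1 Y2 Y3 Y4 Y5 Y6)
1 x x x x x x
2 y y y y y y
3 x x x x . x
4 x y y x x x
5 y y y x x x
6 x . y . . x
7 x y x x . .
8 x x x x . .
9 y y y y . .
end

* wide -> long: one row per patient-year
reshape long Y, i(ID) j(year)

* "." was read in as a string, so drop those rows; only observed years remain
drop if Y == "." | Y == ""

* first observed group within each patient
bysort ID (year): gen first = Y[1]

* does the patient ever appear in x / in y, and what is the last observed year?
bysort ID: egen anyx   = max(Y == "x")
bysort ID: egen anyy   = max(Y == "y")
bysort ID: egen lastyr = max(year)

gen type = .
replace type = 1 if anyx & !anyy
replace type = 2 if !anyx & anyy
replace type = 3 if anyx & anyy & first == "x"
replace type = 4 if anyx & anyy & first == "y"
* assumption: "last two years" means years 5 and 6 of a six-year panel
replace type = 5 if type == 1 & lastyr <= 4
replace type = 6 if type == 2 & lastyr <= 4

* back to one row per patient
bysort ID: keep if _n == 1
list ID type
```

On the mock data this should reproduce the TYPE column from the original question. If the number of years changes, only the cutoff in the lastyr condition would need adjusting.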

