Thanks Nick,

Those were very simple and straightforward ways of combining variables. In the first option, does 'max' indicate the maximum possible value, i.e. 1 in this case?

Both ways suggested by you give me the same total of 80,346, but I was expecting a total of 81,360. Could some of the subjects with multiple diagnoses be counted just once, i.e. the first time they appear coded as 1?

. tab gestht

     gestht |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |  2,911,110       97.31       97.31
          1 |     80,346        2.69      100.00
------------+-----------------------------------
      Total |  2,991,456      100.00

. egen gesthtx = rowmax(gestht1 gestht2 gestht3 gestht4)

. tab gesthtx

    gesthtx |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |  2,911,110       97.31       97.31
          1 |     80,346        2.69      100.00
------------+-----------------------------------
      Total |  2,991,456      100.00

Thanks,
/Amal.
________________________________________
From: owner-statalist@hsphsun2.harvard.edu [owner-statalist@hsphsun2.harvard.edu] on behalf of Nick Cox [njcoxstata@gmail.com]
Sent: 15 August 2012 12:48
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Combining four variables into one

At a guess, you should not -replace- any of these variables as they all might be useful and needed for something else.

Consider

gen gestht = max(gestht1, gestht2, gestht3, gestht4)

or

egen gestht = rowmax(gestht1 gestht2 gestht3 gestht4)

On a small point of English: one diagnosis, two diagnoses.

This kind of question bolsters my prejudice that the functions (including -egen- functions) are one of the most neglected parts of Stata. See also

Cox, N.J. 2011. Speaking Stata: Fun and fluency with functions. The Stata Journal 11(3): 460-471

Abstract. Functions are the unsung heroes of Stata. This column is a tour of functions that might easily be missed or underestimated, with a potpourri of tips, tricks, and examples for a wide range of basic problems.
Nick

On Wed, Aug 15, 2012 at 11:31 AM, Amal Khanolkar <Amal.Khanolkar@ki.se> wrote:
> I have the following four variables, where 1 indicates a diagnosis of a particular type of hypertension. As I don't have a sufficient number of cases when I take into account my covariates, I now need to combine these four variables to create one variable: 0 = no diagnosis of hypertension, and 1 = diagnosis of any type of hypertension (in any of the four variables). Some subjects might have multiple diagnoses. Is there a better and easier way to do this than using the replace command?
>
> I would also like to create a variation of the combined variable such that each subject is entered only once if she has multiple diagnoses, to compare it with the other combined variable, where all multiple diagnoses for a subject are included.
>
> > tab gestht1
> >
> > Chronic/ess |
> >      ential |
> > hypertensio |
> > n O10-O11 & |
> >   642A-C, H |      Freq.     Percent        Cum.
> > ------------+-----------------------------------
> >           0 |  2,986,530       99.84       99.84
> >           1 |      4,926        0.16      100.00
> > ------------+-----------------------------------
> >       Total |  2,991,456      100.00
> >
> > . tab gestht2
> >
> > Gestational |
> > hypertensio |
> >     n O13 & |
> >  642D, 642X |      Freq.     Percent        Cum.
> > ------------+-----------------------------------
> >           0 |  2,970,036       99.28       99.28
> >           1 |     21,420        0.72      100.00
> > ------------+-----------------------------------
> >       Total |  2,991,456      100.00
> >
> > . tab gestht3
> >
> > Preeclampsi |
> >        a or |
> >   eclampsia |
> >  O14, O15 & |
> >      642E-G |      Freq.     Percent        Cum.
> > ------------+-----------------------------------
> >           0 |  2,936,962       98.18       98.18
> >           1 |     54,494        1.82      100.00
> > ------------+-----------------------------------
> >       Total |  2,991,456      100.00
> >
> > . tab gestht4
> >
> > Preexisting |
> > hypertensio |
> >      n with |
> > preeclampsi |
> >     a O11 & |
> >        642H |      Freq.     Percent        Cum.
> > ------------+-----------------------------------
> >           0 |  2,990,936       99.98       99.98
> >           1 |        520        0.02      100.00
> > ------------+-----------------------------------
> >       Total |  2,991,456      100.00
>
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
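
On the gap between 80,346 and 81,360 raised at the top of this message: the four one-way tables quoted above sum to 4,926 + 21,420 + 54,494 + 520 = 81,360, which is a count of diagnoses, whereas max() and rowmax() return one value per subject. The difference of 1,014 would be consistent with some subjects carrying more than one diagnosis, although nothing in the thread confirms that. The following is only a minimal sketch on made-up toy data (four subjects, not the real file); the names anyht and ndiag are hypothetical and chosen here for illustration.

```stata
* Toy data, invented for illustration only: four subjects, four 0/1 indicators
clear
input byte(gestht1 gestht2 gestht3 gestht4)
0 0 0 0
1 0 0 0
0 1 1 0
1 0 1 1
end

* One value per subject: 1 if any of the four indicators is 1
egen byte anyht = rowmax(gestht1 gestht2 gestht3 gestht4)

* Number of diagnoses per subject (hypothetical helper variable)
egen byte ndiag = rowtotal(gestht1 gestht2 gestht3 gestht4)

* Subjects with at least one diagnosis
count if anyht == 1

* Total number of diagnoses across all subjects
summarize ndiag, meanonly
display r(sum)

* Distribution of diagnoses per subject; rows with ndiag > 1 explain the gap
tab ndiag
```

On the real data, tabulating a variable built like ndiag would show how many subjects have two or more diagnoses; if the explanation above holds, the sum of that variable would be 81,360 and the count of subjects with any diagnosis would be 80,346.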

