Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down on April 23, and its replacement, **statalist.org**, is already up and running.


From: Amal Khanolkar <Amal.Khanolkar@ki.se>

To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>

Subject: RE: st: Combining four variables into one

Date: Wed, 15 Aug 2012 11:07:55 +0000

Thanks Nick,

Those were very simple and straightforward ways of combining variables. In the first option, does 'max' indicate the max possible value, i.e. 1 in this case?

Both ways suggested by you give me the same total of 80,346, but I was expecting a total of 81,360. Could some of the subjects with multiple diagnoses be counted just once, i.e. the first time they appear coded as 1?

. tab gestht

     gestht |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |  2,911,110       97.31       97.31
          1 |     80,346        2.69      100.00
------------+-----------------------------------
      Total |  2,991,456      100.00

. egen gesthtx = rowmax(gestht1 gestht2 gestht3 gestht4)

. tab gesthtx

    gesthtx |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |  2,911,110       97.31       97.31
          1 |     80,346        2.69      100.00
------------+-----------------------------------
      Total |  2,991,456      100.00

Thanks,

/Amal.

________________________________________
From: owner-statalist@hsphsun2.harvard.edu [owner-statalist@hsphsun2.harvard.edu] on behalf of Nick Cox [njcoxstata@gmail.com]
Sent: 15 August 2012 12:48
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Combining four variables into one

At a guess, you should not -replace- any of these variables as they all might be useful and needed for something else. Consider

gen gestht = max(gestht1, gestht2, gestht3, gestht4)

or

egen gesht = rowmax(gestht1 gestht2 gestht3 gestht4)

On a small point of English: one diagnosis, two diagnoses.

This kind of question bolsters my prejudice that the functions (including -egen- functions) are one of the most neglected parts of Stata. See also

Cox, N.J. 2011. Speaking Stata: Fun and fluency with functions. The Stata Journal 11(3): 460-471

Abstract. Functions are the unsung heroes of Stata. This column is a tour of functions that might easily be missed or underestimated, with a potpourri of tips, tricks, and examples for a wide range of basic problems.
Nick

On Wed, Aug 15, 2012 at 11:31 AM, Amal Khanolkar <Amal.Khanolkar@ki.se> wrote:

> I have the following four variables, where 1 indicates diagnoses for a particular type of hypertension. As I don't have a sufficient number of cases when I take into account my covariates, I now need to combine these four variables to create 1 variable; 0=no diagnoses of hypertension, and 1=diagnoses of any type of hypertension (in any of the four variables). Some subjects might have multiple diagnoses. Is there a better and easier way to do this than using the replace command?
>
> I would also like to create a variation of the combined variable such that each subject is entered only once if she has multiple diagnoses, to compare it with the other combined variable, where all multiple diagnoses for a subject are included.
>
> tab gestht1
>
> Chronic/ess |
>      ential |
> hypertensio |
> n O10-O11 & |
>   642A-C, H |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>           0 |  2,986,530       99.84       99.84
>           1 |      4,926        0.16      100.00
> ------------+-----------------------------------
>       Total |  2,991,456      100.00
>
> . tab gestht2
>
> Gestational |
> hypertensio |
>     n O13 & |
>  642D, 642X |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>           0 |  2,970,036       99.28       99.28
>           1 |     21,420        0.72      100.00
> ------------+-----------------------------------
>       Total |  2,991,456      100.00
>
> . tab gestht3
>
> Preeclampsi |
>        a or |
>   eclampsia |
>  O14, O15 & |
>      642E-G |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>           0 |  2,936,962       98.18       98.18
>           1 |     54,494        1.82      100.00
> ------------+-----------------------------------
>       Total |  2,991,456      100.00
>
> . tab gestht4
>
> Preexisting |
> hypertensio |
>      n with |
> preeclampsi |
>     a O11 & |
>        642H |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>           0 |  2,990,936       99.98       99.98
>           1 |        520        0.02      100.00
> ------------+-----------------------------------
>       Total |  2,991,456      100.00

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
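
On the two questions raised in this message: max() returns the largest of its arguments within each observation, so with 0/1 indicators it is 1 whenever any of the four is 1, and rowmax() does the same across a variable list. The expected 81,360 equals the sum of the four separate counts (4,926 + 21,420 + 54,494 + 520), which counts diagnoses, whereas 80,346 counts subjects with at least one diagnosis. The gap of 1,014 would be consistent with some subjects carrying more than one diagnosis. A minimal sketch on toy data shows the two counts side by side; the five subjects are hypothetical and the names any_ht and n_ht are made up for illustration:

```stata
* Toy data: five hypothetical subjects, four 0/1 diagnosis indicators
clear
input byte(gestht1 gestht2 gestht3 gestht4)
0 0 0 0
1 0 0 0
0 1 1 0
0 0 1 0
1 0 1 1
end

* 1 if any of the four indicators is 1, otherwise 0
egen any_ht = rowmax(gestht1 gestht2 gestht3 gestht4)

* number of diagnoses recorded for each subject
egen n_ht = rowtotal(gestht1 gestht2 gestht3 gestht4)

* subjects with at least one diagnosis (each subject counted once)
count if any_ht == 1

* diagnoses summed over subjects (multiple diagnoses all counted)
summarize n_ht, meanonly
display r(sum)

* distribution of the number of diagnoses per subject
tabulate n_ht
```

In the toy data, 4 subjects have at least one diagnosis but 7 diagnoses are recorded in total; the difference comes from the two subjects with more than one diagnosis. On the real data, tabulating a rowtotal() variable in the same way would show how many subjects have 2, 3 or 4 diagnoses.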

**Follow-Ups**:

- **Re: st: Combining four variables into one** *From:* Nick Cox <njcoxstata@gmail.com>

**References**:

- **st: Combining four variables into one** *From:* Amal Khanolkar <Amal.Khanolkar@ki.se>
- **Re: st: Combining four variables into one** *From:* Nick Cox <njcoxstata@gmail.com>
