



RE: st: Combining four variables into one


From   Amal Khanolkar <Amal.Khanolkar@ki.se>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Combining four variables into one
Date   Wed, 15 Aug 2012 11:07:55 +0000

Thanks Nick,

Those were very simple and straightforward ways of combining variables. In the first option, does 'max' indicate the maximum possible value, i.e. 1 in this case?

Both of the ways you suggested give me the same total of 80,346, but I was expecting a total of 81,360, the sum of the four separate counts. Could it be that subjects with multiple diagnoses are counted just once, i.e. only the first time they are coded as 1?

. tab gestht

     gestht |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |  2,911,110       97.31       97.31
          1 |     80,346        2.69      100.00
------------+-----------------------------------
      Total |  2,991,456      100.00



. egen gesthtx = rowmax(gestht1 gestht2 gestht3 gestht4)

. tab gesthtx

    gesthtx |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |  2,911,110       97.31       97.31
          1 |     80,346        2.69      100.00
------------+-----------------------------------
      Total |  2,991,456      100.00
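
The gap between the two totals can be checked directly. The four separate tabulations sum to 4,926 + 21,420 + 54,494 + 520 = 81,360 diagnoses, whereas the combined variable counts subjects, so the difference of 1,014 would be the extra diagnoses among subjects with more than one. A minimal sketch on hypothetical toy data (three made-up subjects; the names ndiag and anydiag are illustrative and not from the thread), using egen's rowtotal() to count diagnoses per subject:

```stata
* Hypothetical toy data: the second subject has two diagnoses
clear
input gestht1 gestht2 gestht3 gestht4
0 0 0 0
0 1 1 0
0 0 1 0
end

* Number of diagnoses per subject (rowtotal() treats missing as 0)
egen ndiag = rowtotal(gestht1 gestht2 gestht3 gestht4)

* Any diagnosis: each subject contributes at most 1
egen anydiag = rowmax(gestht1 gestht2 gestht3 gestht4)

* The sum of ndiag counts diagnoses; the sum of anydiag counts subjects
tabstat ndiag anydiag, statistics(sum)

* Subjects with ndiag of 2 or more account for the gap between the sums
tabulate ndiag
```

On the real data, tabulating a variable like ndiag should show how many subjects carry two or more diagnoses.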

Thanks,

/Amal.
________________________________________
From: owner-statalist@hsphsun2.harvard.edu [owner-statalist@hsphsun2.harvard.edu] on behalf of Nick Cox [njcoxstata@gmail.com]
Sent: 15 August 2012 12:48
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Combining four variables into one

At a guess, you should not -replace- any of these variables as they
all might be useful and needed for something else. Consider

gen gestht = max(gestht1, gestht2, gestht3, gestht4)

or

egen gestht = rowmax(gestht1 gestht2 gestht3 gestht4)
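
As a hedged illustration of why the two forms agree: max() returns the largest of its arguments within each observation, so with 0/1 indicators the result is 1 whenever any indicator is 1. Both max() and rowmax() ignore missing values unless every argument is missing. The sketch below uses hypothetical toy data, and the names any_gen and any_egen are made up for the example:

```stata
* Hypothetical toy data, including missing values
clear
input gestht1 gestht2 gestht3 gestht4
0 0 0 0
1 0 1 0
. 0 0 1
. . . .
end

* Function form: arguments separated by commas
generate byte any_gen = max(gestht1, gestht2, gestht3, gestht4)

* egen form: takes a varlist, so no commas
egen byte any_egen = rowmax(gestht1 gestht2 gestht3 gestht4)

* The two results match in every observation, including the all-missing row
assert any_gen == any_egen
list
```

Because the egen form takes a varlist, a wildcard or range of variable names can also be used there.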

On a small point of English: one diagnosis, two diagnoses.

This kind of question bolsters my prejudice that the functions
(including -egen- functions) are one of the most neglected parts of
Stata. See also

Cox, N.J. 2011. Speaking Stata: Fun and fluency with functions.
The Stata Journal 11(3): 460-471

Abstract.  Functions are the unsung heroes of Stata. This column is a
tour of functions that might easily be missed or underestimated, with
a potpourri of tips, tricks, and examples for a wide range of basic
problems.

Nick

On Wed, Aug 15, 2012 at 11:31 AM, Amal Khanolkar <Amal.Khanolkar@ki.se> wrote:

> I have the following four variables, where 1 indicates diagnoses for a particular type of hypertension. As I don't have a sufficient number of cases when I take my covariates into account, I now need to combine these four variables into one variable: 0=no diagnoses of hypertension, and 1=diagnoses of any type of hypertension (in any of the four variables). Some subjects might have multiple diagnoses. Is there a better and easier way to do this than using the replace command?
>
>
>
> I would also like to create a variation of the combined variable in which each subject is entered only once if she has multiple diagnoses, to compare it with the other combined variable, in which all multiple diagnoses for a subject are included.
>
>
>
> . tab gestht1
>
> Chronic/ess |
>      ential |
> hypertensio |
> n O10-O11 & |
>   642A-C, H |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>           0 |  2,986,530       99.84       99.84
>           1 |      4,926        0.16      100.00
> ------------+-----------------------------------
>       Total |  2,991,456      100.00
>
> . tab gestht2
>
> Gestational |
> hypertensio |
>     n O13 & |
>  642D, 642X |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>           0 |  2,970,036       99.28       99.28
>           1 |     21,420        0.72      100.00
> ------------+-----------------------------------
>       Total |  2,991,456      100.00
>
> . tab gestht3
>
> Preeclampsi |
>        a or |
>   eclampsia |
>  O14, O15 & |
>      642E-G |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>           0 |  2,936,962       98.18       98.18
>           1 |     54,494        1.82      100.00
> ------------+-----------------------------------
>       Total |  2,991,456      100.00
>
> . tab gestht4
>
> Preexisting |
> hypertensio |
>      n with |
> preeclampsi |
>     a O11 & |
>        642H |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>           0 |  2,990,936       99.98       99.98
>           1 |        520        0.02      100.00
> ------------+-----------------------------------
>       Total |  2,991,456      100.00
>

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


