



Re: st: Combining four variables into one


From   Nick Cox <[email protected]>
To   [email protected]
Subject   Re: st: Combining four variables into one
Date   Wed, 15 Aug 2012 12:21:33 +0100

The first question you can and should answer yourself. As I explained
in the article referenced, functions are documented

. help max()

and you can experiment using -display- to see what they do

. di max(2,3,5,42, .)
42

This example shows that missings are ignored (unless all arguments are
missing).
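
A further small experiment along the same lines (a sketch added for illustration, not part of the original post) checks the all-missing case directly and then applies max() observation by observation with -generate-. The toy variables a and b are hypothetical.

```stata
* Missing arguments are ignored when at least one argument is nonmissing
display max(2, ., 5)      // shows 5

* When every argument is missing, the result is missing
display max(., .)         // shows .

* Same behaviour observation by observation (toy data, hypothetical names)
clear
input a b
0 1
. 1
. .
end
generate m = max(a, b)
list
```

In the toy data, m is 1 in the first two observations and missing only in the third, where both a and b are missing.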

On your second question I am at a disadvantage because I know nothing
about your data and you don't explain why you expect 81360. But, for
example, someone with

0 1 1 1

as diagnoses will have a maximum of 1 (not 3). So the total of 1s on
-gestht- will be less than the total of 1s summed over -gestht1- to
-gestht4- to the extent that there are multiple diagnoses.
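
The difference between counting subjects and counting diagnoses can be sketched on toy data. This is an illustration under assumptions, not the poster's dataset: the four observations and the new variable names anydx and ndx are hypothetical. The last two lines only add up the frequencies of 1s reported in the four tables quoted further down and compare them with the 80,346 in the combined variable.

```stata
* Toy data (hypothetical): four 0/1 indicators, one row per subject
clear
input gestht1 gestht2 gestht3 gestht4
0 0 0 0
1 0 0 0
0 1 1 1
0 0 1 0
end

* Any diagnosis: 1 if at least one indicator is 1
egen anydx = rowmax(gestht1 gestht2 gestht3 gestht4)

* Number of diagnoses per subject
egen ndx = rowtotal(gestht1 gestht2 gestht3 gestht4)

* Subjects with at least one diagnosis
count if anydx == 1

* Diagnoses in total, which exceeds the count above when any ndx > 1
quietly summarize ndx
display r(sum)

* Frequencies of 1s from the four tables quoted in this thread
display 4926 + 21420 + 54494 + 520      // shows 81360
display 81360 - 80346                   // shows 1014
```

In the toy data three subjects have a diagnosis but there are five diagnoses in total, because the subject with 0 1 1 1 contributes three. On real data, -tabulate ndx- would show how many subjects have two or more diagnoses.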

On Wed, Aug 15, 2012 at 12:07 PM, Amal Khanolkar <[email protected]> wrote:
> Thanks Nick,
>
> Those were very simple and straightforward ways of combining variables. In the first option, does 'max' indicate the max possible value i.e. 1 in this case?
>
> Both ways suggested by you give me the same total of 80,346. But I was expecting a total of 81,360. Could some of the subjects with multiple diagnoses be counted just once, i.e. the first time they appear as coded as 1?
>
> . tab gestht
>
>      gestht |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>           0 |  2,911,110       97.31       97.31
>           1 |     80,346        2.69      100.00
> ------------+-----------------------------------
>       Total |  2,991,456      100.00
>
>
>
> . egen gesthtx = rowmax(gestht1 gestht2 gestht3 gestht4)
>
> . tab gesthtx
>
>     gesthtx |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>           0 |  2,911,110       97.31       97.31
>           1 |     80,346        2.69      100.00
> ------------+-----------------------------------
>       Total |  2,991,456      100.00
>
> Thanks,
>
> /Amal.
> ________________________________________
> From: [email protected] [[email protected]] on behalf of Nick Cox [[email protected]]
> Sent: 15 August 2012 12:48
> To: [email protected]
> Subject: Re: st: Combining four variables into one
>
> At a guess, you should not -replace- any of these variables as they
> all might be useful and needed for something else. Consider
>
> gen gestht = max(gestht1, gestht2, gestht3, gestht4)
>
> or
>
> egen gestht = rowmax(gestht1 gestht2 gestht3 gestht4)
>
> On a small point of English: one diagnosis, two diagnoses.
>
> This kind of question bolsters my prejudice that the functions
> (including -egen- functions) are one of the most neglected parts of
> Stata. See also
>
> Cox, N.J. 2011. Speaking Stata: Fun and fluency with functions.
> The Stata Journal 11(3): 460-471
>
> Abstract.  Functions are the unsung heroes of Stata. This column is a
> tour of functions that might easily be missed or underestimated, with
> a potpourri of tips, tricks, and examples for a wide range of basic
> problems.
>
> Nick
>
> On Wed, Aug 15, 2012 at 11:31 AM, Amal Khanolkar <[email protected]> wrote:
>
>> I have the following four variables, where 1 indicates diagnoses for a particular type of hypertension. As I don't have a sufficient number of cases when I take into account my covariates, I now need to combine these four variables to create 1 variable; 0=no diagnoses of hypertension, and 1=diagnoses of any type of hypertension (in any of the four variables). Some subjects might have multiple diagnoses. Is there a better and easier way to do this than using the replace command?
>>
>>
>>
>> I would also like to create a variation of the combined variable such that each subject is entered only once if she has multiple diagnoses, to compare it with the other combined variable, where all multiple diagnoses for a subject are included.
>>
>>
>>
>> . tab gestht1
>>
>> Chronic/ess |
>>      ential |
>> hypertensio |
>> n O10-O11 & |
>>   642A-C, H |      Freq.     Percent        Cum.
>> ------------+-----------------------------------
>>           0 |  2,986,530       99.84       99.84
>>           1 |      4,926        0.16      100.00
>> ------------+-----------------------------------
>>       Total |  2,991,456      100.00
>>
>>
>>
>> . tab gestht2
>>
>> Gestational |
>> hypertensio |
>>     n O13 & |
>>  642D, 642X |      Freq.     Percent        Cum.
>> ------------+-----------------------------------
>>           0 |  2,970,036       99.28       99.28
>>           1 |     21,420        0.72      100.00
>> ------------+-----------------------------------
>>       Total |  2,991,456      100.00
>>
>>
>>
>> . tab gestht3
>>
>> Preeclampsi |
>>        a or |
>>   eclampsia |
>>  O14, O15 & |
>>      642E-G |      Freq.     Percent        Cum.
>> ------------+-----------------------------------
>>           0 |  2,936,962       98.18       98.18
>>           1 |     54,494        1.82      100.00
>> ------------+-----------------------------------
>>       Total |  2,991,456      100.00
>>
>>
>>
>> . tab gestht4
>>
>> Preexisting |
>> hypertensio |
>>      n with |
>> preeclampsi |
>>     a O11 & |
>>        642H |      Freq.     Percent        Cum.
>> ------------+-----------------------------------
>>           0 |  2,990,936       99.98       99.98
>>           1 |        520        0.02      100.00
>> ------------+-----------------------------------
>>       Total |  2,991,456      100.00
>>


