
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: generating indicators


From   Jordan Hoolachan <[email protected]>
To   "[email protected]" <[email protected]>
Subject   Re: st: generating indicators
Date   Fri, 10 Sep 2010 17:01:39 -0400

Hey Wassim,

Is there any reason in particular that you're avoiding having to
reshape your data? I can't quite picture how the indicators you're
looking for would work with data that is in long format.

Jordan
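
For reference, one way indicators might be built while the data stay in long format is to take a within-person maximum of a true/false expression. This is only a sketch: varA and varB stand for the id and ICD9 variables described in the thread, and the code "250.00", the indicator name, and the helper variable first are made up for illustration.

```stata
* Sketch only: varA = person id, varB = string ICD9 code (names assumed).
* (varB == "250.00") is 1 on matching records and 0 otherwise, so the
* within-person maximum is 1 if the person ever reported that code.
egen byte ICD25000_indic = max(varB == "250.00"), by(varA)

* Optional: flag one record per person to inspect the result
egen byte first = tag(varA)
list varA ICD25000_indic if first
```

Every record for a person carries the same indicator value, so repeated reports of a condition do no harm.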


On Sep 10, 2010, at 4:48 PM, Wassim Tarraf <[email protected]> wrote:

> Hi Jordan- Yes the data is in long format. I was actually trying to avoid reshaping the data.
>
> Thanks,
> Wassim
>
>
>
> Jordan Hoolachan wrote:
>> Hi, Wassim
>>
>> It sounds like your data is currently in long format, is that right?
>> There may be a way to generate the indicators that you want with the
>> data in long format, but I personally would first reshape it to wide.
>> Another question: do you have a list of all the different ICD9 codes
>> that appear in your data set?  If you do, ignore this next part of
>> code.
>>
>> If you don't have a list of all ICD9 codes that are present, do the following:
>>
>> 1. sort varB
>> 2. by varB: gen code=_n
>> 3. list varB if code==1
>>
>> The above code will give you a list of the unique ICD9 codes that
>> appear in your dataset...you'll need this for the next step.
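>>
>> A shorter route to the same list may be -levelsof-, which can also
>> store the distinct values in a local macro for later use in a loop.
>> A minimal sketch, assuming varB is the string ICD9 variable (the macro
>> name codes is arbitrary):
>>
>> ```stata
>> * list the distinct values of varB and keep them in the local "codes"
>> levelsof varB, local(codes)
>> * show what was stored (compound quotes because the values are strings)
>> display `"`codes'"'
>> ```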
>>
>> Now, reshape your data from long to wide with -reshape wide- (this
>> needs a within-person record counter to serve as the j() variable).
>> With the data in wide format, you can use the egen function -rany()-
>> to produce the indicators that you want (-findit egenmore- if you
>> haven't already downloaded it).  If you check out its help file,
>> you'll see that it lets you specify a condition and then indicates
>> with a 0/1 whether that condition is met at least once over a list of
>> variables.  You can use a -foreach- loop to create all the indicators
>> in one fell swoop.  The code would look something like this:
>>
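>> foreach x in <the list of ICD9 codes printed out before> {
>>     * the ICD prefix keeps the name legal when a code starts with a digit;
>>     * codes containing "." must be cleaned first (e.g. with strtoname())
>>     egen ICD`x'_indic = rany(<the variables containing the ICD9 codes>), cond(@ == "`x'")
>> }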
>>
>>
>> The above code will produce an indicator of the form "ICDx_indic" for
>> each of the ICD codes that appear in your data set.
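>>
>> Put together, the whole sequence might look like the sketch below.  It
>> is untested and rests on several assumptions: varA and varB are the
>> only variables in the data, seq is a made-up name for the record
>> counter, the macro name codes and the ICD prefix are arbitrary, and
>> -rany()- comes from the egenmore package.
>>
>> ```stata
>> * collect the distinct codes before reshaping
>> levelsof varB, local(codes)
>>
>> * number the records within each person, then go wide
>> bysort varA: gen seq = _n
>> reshape wide varB, i(varA) j(seq)
>>
>> * one 0/1 indicator per code; strtoname() turns the code into a
>> * legal variable name (e.g. "." becomes "_")
>> foreach x of local codes {
>>     local v = strtoname("ICD`x'_indic")
>>     egen `v' = rany(varB*), cond(@ == "`x'")
>> }
>> ```
>>
>> People with fewer records than the maximum have empty strings in the
>> unused varB columns, which never match a code.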
>>
>> Hopefully that is clear enough. Let me know if you have any questions.
>>
>> Jordan
>>
>>
>>
>>
>> Jordan Hoolachan
>> ScM Candidate
>> Department of Biostatistics
>> Johns Hopkins Bloomberg School of Public Health
>> 410-294-3670
>>
>>
>>
>> On Fri, Sep 10, 2010 at 3:45 PM, Wassim Tarraf <[email protected]> wrote:
>>
>>> Dear Stata list members- I have a dataset that includes a variable A, which
>>> is an identifier (nonconsecutive person id numbers), and a variable B, which
>>> holds medical (ICD9) conditions as strings. Each person (identified by
>>> A) has as many records as reported conditions (a condition could be reported
>>> more than once). I would appreciate suggestions on an efficient way to
>>> generate condition indicators (coded 0/1) that record whether a
>>> specific individual reported a certain condition or not.
>>>
>>> Thanks,
>>> Wassim
>>> *
>>> *   For searches and help try:
>>> *   http://www.stata.com/help.cgi?search
>>> *   http://www.stata.com/support/statalist/faq
>>> *   http://www.ats.ucla.edu/stat/stata/
>>>
>>>
>>
>>
>>
>


© Copyright 1996–2018 StataCorp LLC