
From: Joseph McDonnell <jockmcdock@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Grouping records by a STRING datatype
Date: Wed, 1 Jul 2009 10:56:24 +0930

Hi Sam,

As I see it, you have 4 sets of requirements:

1) DX1 has to be one of a subset of the 481, 482, 483, 485, 486 categories
2) age>=2 months
3) PR1 has to begin with "860"
4) PR2, PR3 etc. have to be blank.

You don't mention how many of these PR fields you have, but presumably it's a relatively small number.

You CAN do this in one step (undoubtedly more efficient), but I'd advocate doing it in several steps for the sake of readability. So here's my suggestion...

    * initially include no patients
    . gen IsIn=0

    * generate a variable which marks those with the correct DX1
    . replace IsIn=1 if inlist(substr(DX1,1,3),"481","483","485","486") | inlist(substr(DX1,1,5),"482.1","482.3","482.9")

    * successively unmark those within this group which don't fit the other criteria
    . replace IsIn=0 if agemonths<2
    . replace IsIn=0 if substr(PR1,1,3)!="860"
    . replace IsIn=0 if trim(PR2)!="" | trim(PR3)!="" | trim(PR4)!=""

At the end of this, those who are marked are the ones you wish to have. Hopefully.

I've used the trim function because sometimes a space gets entered into a text field, and such stray spaces are difficult to spot.

As I said, you can combine these, but it becomes pretty unreadable. If there are more PRs, you might want to investigate loops. Worth doing in any case if you find you're doing repetitive programming.

Hope this helps.

Cheers
Joseph

On Wed, Jul 1, 2009 at 9:21 AM, Sam Lu<alamoboy@gmail.com> wrote:
> Hi All,
>
> New user to STATA here.
>
> I just started learning STATA this past week, though I do have some
> experience with R project and MySQL. My research advisor has asked
> that I use STATA so here I am today.
>
> My research is medical in nature, so pardon if I use some jargon that's
> not familiar to everyone. I'm attempting to group various diagnoses by
> their ICD-9 code. For example, the general category of "asthma" (or
> another disease) has a 3-digit ICD-9 code of 493. A more specific
> diagnosis of asthma builds on the 3-digit code. Thus, "extrinsic
> asthma" would be 493.0 while "extrinsic asthma with status
> asthmaticus" would be 493.01. ICD-9 codes stop at the fifth digit or
> what math-types would normally call the hundredths place.
>
> I have not converted the ICD-9 codes from a string datatype to a
> numerical one because there are some ICD-9 codes that start with a
> zero (0), and I fear that converting them to a numerical value may not
> faithfully preserve the true code. So far, I can group a major ICD-9
> category if there is only one ICD-9 code. For example, when I bin
> asthma I use the following code (note that "DX1" is the principal
> diagnosis):
>
> generate ACS = "Asthma" if regexm(DX1, "493+")
>
> The above code bins ICD9 codes that have 493 as their first three
> digits, and appears to work fine.
>
> However, there are other predefined illness categories that have
> multiple ICD9 codes plus other constraints (e.g., age, procedure
> performed) that complicate matters. For example, "bacterial
> pneumonia" encompasses a DX1 of 481.XX or 482.1X or 482.3X or 482.9X
> or 483.XX or 485.XX or 486.XX. In addition, only patients with an age
> >= 2 months are included, and the secondary diagnosis cannot be 282.6X
> ("X" can be any number); also, the primary procedure (PR1) performed
> must equal 860.XX while there cannot be any other procedures performed
> (i.e., the fields for PR2, PR3, etc. must be blank).
>
> So, how do I code that monster of a query in STATA?
>
>
> Thanks for any help,
>
> Sam
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
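
The reply's suggestion to "investigate loops" for the PR fields could look something like the sketch below. It is only an illustration, not code from the thread: the toy data are made up, PR2-PR4 are assumed to be the only extra procedure fields and to sit next to each other in the dataset, and DX2 is an assumed name for the secondary-diagnosis variable (the question excludes 282.6X there, which the step-by-step reply does not cover). It also assumes the codes are stored with the decimal point, as written in the thread.

```stata
* Sketch under stated assumptions: made-up rows; DX2 is a hypothetical name
clear
input str6 DX1 str6 DX2 agemonths str6 PR1 str6 PR2 str6 PR3 str6 PR4
"481"    ""       12 "860.1" "" "" ""
"482.31" ""        6 "860"   "" "" ""
"482.2"  ""        6 "860"   "" "" ""
"486"    ""        1 "860"   "" "" ""
"485"    ""       24 "861"   "" "" ""
"483.0"  ""       30 "860.2" "" "" "33.24"
"486"    "282.61" 30 "860"   "" "" ""
end

* mark records whose principal diagnosis is in the wanted set
gen byte IsIn = inlist(substr(DX1,1,3),"481","483","485","486") | inlist(substr(DX1,1,5),"482.1","482.3","482.9")

* unmark by age; missing() is an added guard, since a missing age is not < 2 in Stata
replace IsIn = 0 if agemonths < 2 | missing(agemonths)

* unmark if the secondary diagnosis is 282.6X (DX2 is an assumed variable name)
replace IsIn = 0 if substr(DX2,1,5) == "282.6"

* unmark if the primary procedure does not begin with "860"
replace IsIn = 0 if substr(PR1,1,3) != "860"

* loop over the remaining procedure fields instead of writing one condition per field
foreach v of varlist PR2-PR4 {
    replace IsIn = 0 if trim(`v') != ""
}

list DX1 DX2 agemonths PR1-PR4 IsIn
```

With more procedure fields, only the varlist in the foreach line would need to change (for example PR2-PR10, if such variables exist and are stored contiguously).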

